1# Souin Response Truncation Bug Analysis
2
3## Issue Summary
4Large file downloads (e.g., 3MB tar.gz) are sometimes cached as truncated versions (~4KB) in Souin's OTTER backend.
5
6**Behavior:**
7- First request (cache miss): Downloads full 2.55MB ✓
8- Second request (cache hit): Downloads only 3792 bytes (4KB) ✗
9
10## Root Cause
11
12The issue is a **race condition in Souin's response buffering during streaming**:
13
141. **Application Handler (pgs/web_asset_handler.go:304):**
15 ```go
16 w.WriteHeader(status)
17 _, err := io.Copy(w, contents) // Streaming large file to ResponseWriter
18 ```
19
202. **Souin Middleware Flow (middleware.go:1085-1117):**
21 ```go
22 // Upstream() runs in a goroutine
23 go func(vr *http.Request, cw *CustomWriter) {
24 errorCacheCh <- s.Upstream(cw, vr, next, ...) // <- blocks until complete
25 }(req, customWriter)
26
27 select {
28 case <-req.Context().Done():
29 // ...
30 case v := <-errorCacheCh:
31 _, _ = customWriter.Send() // <- Called AFTER Upstream returns
32 }
33 ```
34
353. **Upstream Function (middleware.go:542-580):**
36 ```go
37 sfValue, err, shared := s.singleflightPool.Do(singleflightCacheKey, func() (interface{}, error) {
38 if e := next(customWriter, rq); e != nil { // <- Calls app handler
39 // ...
40 }
41 // ...
42 err := s.Store(customWriter, rq, ...) // <- Caches response HERE
43 defer customWriter.handleBuffer(func(b *bytes.Buffer) {
44 b.Reset() // <- Resets buffer after Store completes
45 })
46 return singleflightValue{
47 body: customWriter.Buf.Bytes(), // <- Captured BEFORE reset
48 // ...
49 }, err
50 })
51 ```
52
534. **Store Function (middleware.go:378-402):**
54 ```go
55 customWriter.mutex.Lock()
56 b := customWriter.Buf.Bytes() // <- Reads buffer contents
57 bLen := customWriter.Buf.Len()
58 customWriter.mutex.Unlock()
59
60 res := http.Response{
61 Body: io.NopCloser(bytes.NewBuffer(b)), // <- Creates response with buffer snapshot
62 }
63 // ...
64 response, err := httputil.DumpResponse(&res, true) // <- Serializes for cache storage
65 // Store response to cache
66 ```
67
68## The Problem
69
70**Timing Issue with Streaming:**
71
72When the application streams data via `io.Copy(w, contents)`, this happens **concurrently** with Souin's caching:
73
741. **Time T0:** `io.Copy()` starts streaming file data to CustomWriter
75 - Data arrives in chunks via `Write()` calls
76 - Souin buffers all data in `customWriter.Buf`
77
782. **Time T1:** First chunk (~4KB) is buffered
79 - Souin's `Store()` is called (inside Upstream singleflight closure)
80 - Reads `customWriter.Buf.Bytes()` at this moment
81 - **If `io.Copy()` is still executing, only partial data is in the buffer!**
82
833. **Time T2:** The rest of the file continues streaming
84 - But Store has already captured only 4KB
85 - Cache stores truncated response
86
87## Why It's Intermittent
88
89- **Sometimes works:** If the entire file is buffered BEFORE Store is called
90- **Sometimes fails:** If Store reads the buffer while io.Copy is mid-stream
91
92The intermittency depends on:
93- File size
94- Network speed
95- System load (goroutine scheduling)
96- HTTP/2 vs HTTP/1.1 framing
97
98## Solution Applied
99
100**In pgs/web_asset_handler.go (L303-311):**
101
102```go
103w.WriteHeader(status)
104_, err := io.Copy(w, contents)
105
106if err != nil {
107 logger.Error("io copy", "err", err.Error())
108}
109
110// Ensure all data is flushed to the client and cache before closing connection
111// This prevents Souin's response buffer from being truncated mid-stream
112if flusher, ok := w.(http.Flusher); ok {
113 flusher.Flush()
114}
115```
116
117**Why this helps:**
118- `Flush()` forces HTTP/2 frame boundaries to be sent
119- Tells Souin the response is complete before the handler returns
120- Prevents Souin from caching partial responses
121
122## Proper Long-Term Fix (in Souin)
123
124Souin should implement one of:
125
1261. **Wait for io.Copy() to complete before reading buffer:**
127 - Don't call `Store()` until the handler function fully returns
128 - Currently Store is called during Upstream while app handler may still be executing
129
1302. **Use a completion signal:**
131 - Application signals when response is complete (e.g., CloseNotifier)
132 - Souin only snapshots buffer after this signal
133
1343. **Buffer all data before storing:**
135 - Ensure CustomWriter fully buffers before any Store operation
136 - Requires synchronization changes
137
1384. **Implement streaming cache validation:**
139 - Store only cacheablecomplete responses
140 - Validate Content-Length matches actual data before caching
141
142## Files Affected
143
144- `/home/erock/dev/pico/souin/pkg/middleware/writer.go` - CustomWriter
145- `/home/erock/dev/pico/souin/pkg/middleware/middleware.go` - Upstream/Store logic
146- `/home/erock/dev/pico/pkg/apps/pgs/web_asset_handler.go` - Application handler (FIXED)