New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[bug] blank page cached on context canceled #337
Comments
For reference, a working key contains this:
Meaning it really comes from the backend server. |
Unrelated: why the HTTP/0.0 ? |
Hey @frederichoule that's weird. Do you have a reproducible repository/example please? What I think at the moment is on the first time a user tried to load the page but the backend returned nothing with an HTTP 200 OK status code. Btw I don't know why your server returns an HTTP/0.0 protocol version, I don't deal with it and don't try to override it. |
1- I'll try to reproduce the issue locally and let you know. 2- The output is clearly not from the backend, since there's no 3- Backend returns HTTP/1.1, Caddy returns HTTP/1.1, but stored key is HTTP/0.0 - This is a non-problem, I was just wondering. |
Here is the Caddyfile:
|
Unable to reproduce the problem locally for now. I am closing this for now. |
That could be something like cache poisoning (I wonder if @jenaye could help us about that). It could explain the
|
It's a brand new website, not really public yet. I think it might have to do with robot scanning the website, generating incorrect cached page. I'll investigate it more thoroughly soon. |
I was able to pinpoint when the blank page bug happens. Whenever I have a blank page in cache, it comes from a request that contains this in the logs:
Any idea where it's coming from? I'll investigate after lunch, but if you have a hint for me, let me know! |
Nice, that means I don't handle correctly the client disconnections. It should be quite easy to reproduce if that's this case. Let's reopen this issue then 🙂 |
Have you been able to reproduce the issue yet @darkweak ? I'll give it a try tomorrow if you didn't have time yet. |
Reproducible with the following code // runner/requester.go
package main
import (
"context"
"fmt"
"net/http"
"time"
)
func main() {
ctx, _ := context.WithTimeout(context.Background(), 10*time.Millisecond)
rq, _ := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:4443/handler", nil)
res, err := http.DefaultClient.Do(rq)
fmt.Printf("-----\nres => %+v\nerr => %+v\n-----\n", res, err)
} // handlers/handler.go
package main
import (
"fmt"
"net/http"
"time"
)
func main() {
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Println("Sleep...")
time.Sleep(3 * time.Second)
fmt.Println("Wake up")
w.Write([]byte("Hello after sleep"))
})
http.ListenAndServe(":80", nil)
} With the given caddyfile
./caddy run
go run handler/handler.go
go run runner/requester.go
curl -I http://localhost:4443/handler
# Should return the right headers without content |
To keep you updated, I'll work on that tomorrow. |
Amazing, thanks! Haven't had time to take a look yet, sorry about that. |
It should work using @frederichoule xcaddy build --with github.com/darkweak/souin/plugins/caddy@47ea558146d978d6be0ec6a42804f679200d6d70 --with github.com/darkweak/souin@47ea558146d978d6be0ec6a42804f679200d6d70 |
Amazing! Will try it ASAP. |
use the commit |
Testing right away. |
First page load works, second page load throws a panic and the server stops.
|
I hope that this commit fix your issue |
Seems OK locally. I just pushed to prod for a small segment of our network and will let you know if we still get blank pages. Thanks! |
I still have blank pages. I'm unable to debug right now as my internet connection is pretty weak, but I'll give you more informations later. |
Reopened again 😅 |
Caddy crashed completely twice in the last 2 days. The last line in the log each time is related to context canceled. I don't know how to reproduce the crash locally yet. I'll find a way to make Caddy log the crash output somewhere.
|
Do you have some debug logs please ? Because these are not really explicit. I will write the patch to add more debug/info logs during the request process in the next days. |
This is all I got from Caddy. I'll read the doc to see what i can do to get more. |
When you're running caddy you can set the |
It is already on |
What are the previous logs before the crash ? |
Only regular requests logs - but I had to reboot the server and didn't take a copy of the logs beforehand. I will do it at the next crash (so far looks like it's once per day... my TTL is 24 hours.. maybe related to stale?) |
Crashed again, almost exactly 24h after last reboot (which is my TTL) - again, nothing in the log, but Caddy is not accepting requests anymore. Can't see what make it crash in the logs. Will lower the TTL and try to debug that. |
Got it.
|
Nice, now we know where, I have to check why and how to reproduce. |
Hey sorry for the delay. I tried many time with in memory storages but that doesn't occur. I wonder if that could come from the reconnection in redis/etcd/olric storages |
I haven't had time to dig deeper on that so far. We use Redis as storage, so maybe we can investigate on that side. |
Hey, I'm not able to reproduce your issue, even if I try to cut the connection to redis and let Souin reconnect itself. I think it comes from the replacer caddy code. |
Let's close this for now. I had to move to another project for now so I didn't have time to dive deeper on that. Thanks for your help @darkweak |
Hey Souin sometimes caches blank pages. Here my setup: xcaddy build --with github.com/caddyserver/cache-handler Caddyfile (cache part)
I believe this is the relevant log:
More specifically it starts with this error:
Going deeper into the log, I also found an interesting error before the one above.
Interestingly, it has "context canceled" message, the same as OP. |
To summarize the common factor msg: "context canceled" I deduced that this is rare because most people use Souin for cache HTTP/1 upstream. I can share my upstream server with you so you can test Please note that the error seems random because I have yet to see the exact pattern. |
Hello @midnight-wonderer thank you for your feedback. I think the workflow to reproduce is
I will try to make a reproducible example in the E2E tests but that's quite hard to make it reliable. |
I forgot to tell crucial information: In my case, the blank page was triggered by Google upstream with path This means that regardless of what triggers |
Reproduced that again. I believe that this time, no Google front-end requests were directly involved in the rapid clicks. I am not sure if it is the one, but I found someone is able to reproduce the issue here: |
I looked at the HAR file saved from the cached blank page and have a question. Saved HAR file{
"log": {
"version": "1.2",
"creator": {
"name": "WebInspector",
"version": "537.36"
},
"pages": [],
"entries": [
{
"_initiator": {
"type": "other"
},
"_priority": "VeryHigh",
"_resourceType": "document",
"cache": {},
"connection": "6031",
"request": {
"method": "GET",
"url": "https://souin.example.com/projects/452860/dashboard",
"httpVersion": "http/2.0",
"headers": [
{
"name": ":authority",
"value": "souin.example.com"
},
{
"name": ":method",
"value": "GET"
},
{
"name": ":path",
"value": "/projects/452860/dashboard"
},
{
"name": ":scheme",
"value": "https"
},
{
"name": "accept",
"value": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"
},
{
"name": "accept-encoding",
"value": "gzip, deflate, br"
},
{
"name": "accept-language",
"value": "en-IN,en;q=0.7"
},
{
"name": "cache-control",
"value": "max-age=0"
},
{
"name": "cookie",
"value": "_document_session=IbSV8KRfMWf%2krZ5MYpZww4%2Fy4TSoe2qxshLVwEGyL6BxAyLwysVI4lEAN4nCOjUrsGom7qHrUWXrU5KXavO5c3w0ig5L1WuKcppCx5dXbEomCuw4MGJG4S%2BfwY4Nu3qLOweRMk5IlYH%2FPNubW9RAIKrxqGGksoNiGzmXCLFaj%2FG9940SnBIX%2B4VZB1DOzAdwd23gw3Q%2FSaZTlB%2BMKpbbmBNFwhvaeCbg7KOZb0dISSnCtQ5y0C92mjUFQI1%2Fwoblj1uTLK%2FFQBeKkz2wbRnjUzy3UDfWny1X5e9H7TFcJ2993MHq9OqhA5o7RxiM5HrdFe%2Fc%2F4ikwoB7b7ZKsUVkeGoJTinw2aM81fki86NEhDJp7Lxfs0yYwd8GlZqI36zUV2%2Fhm2fuhQRBSvVrBnu0azCaDw8fVm5vWzfHRyaiJb93YjzrlPBYQxeXMfqaxrkrmO%2BfGVYezXifx--FoCU8E2Lfmi4R%2Bun--DK7FYC5C9DC%2Fhug2bwVP3Q%3D%3D"
},
{
"name": "referer",
"value": "https://souin.example.com/dashboard"
},
{
"name": "sec-ch-ua",
"value": "\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""
},
{
"name": "sec-ch-ua-mobile",
"value": "?1"
},
{
"name": "sec-ch-ua-platform",
"value": "\"Android\""
},
{
"name": "sec-fetch-dest",
"value": "document"
},
{
"name": "sec-fetch-mode",
"value": "navigate"
},
{
"name": "sec-fetch-site",
"value": "same-origin"
},
{
"name": "sec-fetch-user",
"value": "?1"
},
{
"name": "sec-gpc",
"value": "1"
},
{
"name": "upgrade-insecure-requests",
"value": "1"
},
{
"name": "user-agent",
"value": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"
}
],
"queryString": [],
"cookies": [
{
"name": "_thai_document_session",
"value": "IbSV8KRfMWf%2BEZHVoIGz2cx7eSYwkrZ5MYpZww4%2F4TSoe2qxsIzvLVwEGyL6BxAyLwysVI4lEAN4nCOjUrsGom7qX%2FB%2BrU5KXavO5c3w0ig5L1WuKcB5FppCx5dXbEomCuw4MGJG4S%2BfwY4Nu3qLOweRMk5IlYH%2FPNubW9RAIKrxqGGksoNiGzmXCLFaj%2FG9940SnBIX%2B4VZB1DOzAdwd23gw3Q%2FSaZTlB%2BMKpbbmBNFwhvaeCbg7KOZb0dISSnCtQ5y0C92mjUFQI1%2Fwoblj1uTLK%2FFQBeKkz2wbRnjUzy3UDfWny1X5e9H7TFcJ2993MHq9OqhA5o7RxiM5HrdFe%2Fc%2F4ikwoB7b7ZKsUVkeGoJTinw2aM81fki86NEhDJp7Lxfs0yYwd8GlZqI36zUV2%2Fhm2fuhQRBSvVrBnu0azCaDw8fVm5vWzHRyaiJb93YjzrlPBYQxeXMfqaxrwN8D5jI3krmO%2BfGVYezXifx--FoCU8E2Lfmi4R%2Bun--DK7FYC5C9DC%2Fhug2bwVP3Q%3D%3D",
"path": "/",
"domain": "souin.example.com",
"expires": "1969-12-31T23:59:59.000Z",
"httpOnly": true,
"secure": false,
"sameSite": "Lax"
}
],
"headersSize": -1,
"bodySize": 0
},
"response": {
"status": 200,
"statusText": "",
"httpVersion": "http/2.0",
"headers": [
{
"name": "age",
"value": "2264"
},
{
"name": "alt-svc",
"value": "h3=\":443\"; ma=2592000"
},
{
"name": "cache-status",
"value": "Souin; hit; ttl=84136; key="
},
{
"name": "content-length",
"value": "0"
},
{
"name": "date",
"value": "Thu, 21 Dec 2023 15:19:34 GMT"
},
{
"name": "server",
"value": "Caddy"
}
],
"cookies": [],
"content": {
"size": 0,
"mimeType": "text/plain"
},
"redirectURL": "",
"headersSize": -1,
"bodySize": -1,
"_transferSize": 108,
"_error": null
},
"serverIPAddress": "[2a02:4f8:c3c:1393:5755:95a8:b7cd:21f7]",
"startedDateTime": "2023-12-21T15:57:15.862Z",
"time": 922.765999999191,
"timings": {
"blocked": 26.87999999921309,
"dns": 148.13,
"ssl": 208.81199999999995,
"connect": 546.8190000000001,
"send": 1.9639999999999418,
"wait": 196.54300000042872,
"receive": 2.4299999995491817,
"_blocked_queueing": 22.69099999921309
}
}
]
}
} If I set P.S. Sorry if my comments are too rapid; I typed what is in my mind at the moment. Please let me know if you mind, and I will try to delay them. |
@darkweak is this resolved or not? I am curious since issue is closed, yet other people are reporting issues. |
@midnight-wonderer @fliespl the possible fix would be to not store blank responses in cache, idk. Anyway, I'm reopening this issue. |
@midnight-wonderer thank you for pointing the go pkg issue 👍 |
I have been struggling with a similar problem as @frederichoule had: The problem is that there is not check in I worked on a way to reproduce it, and I have this python script import threading
import socket
from h2.connection import H2Connection
from h2.events import RequestReceived, StreamReset
from h2.config import H2Configuration
import ssl
from h2.errors import ErrorCodes
def root_function(url='localhost'):
while True:
try:
# Create a TCP connection
sock = socket.create_connection((url, 4433))
print(f"Create a TCP connection")
# Wrap the socket for TLS
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
ctx.set_alpn_protocols(['h2'])
sock = ctx.wrap_socket(sock, server_hostname=url)
print(f"Wrap socket for TLS")
# Make sure we're using HTTP/2
assert sock.selected_alpn_protocol() == 'h2'
# Create HTTP/2 connection
config = H2Configuration(client_side=True)
conn = H2Connection(config=config)
conn.initiate_connection()
sock.sendall(conn.data_to_send())
# # Create a new stream
# stream_id = conn.get_next_available_stream_id()
# conn.send_headers(
# stream_id,
# [(':method', 'GET'), (':authority', url), (':path', '/'), (':scheme', 'https')],
# )
# sock.sendall(conn.data_to_send())
# Read some data
while True:
stream_id = conn.get_next_available_stream_id()
conn.send_headers(
stream_id,
[(':method', 'GET'), (':authority', url), (':path', '/mycachepath'), (':scheme', 'https')],
)
sock.sendall(conn.data_to_send())
#data = sock.recv(65535)
#if not data:
#break
conn.reset_stream(stream_id, error_code=ErrorCodes.CANCEL)
events = conn.receive_data(data)
for event in events:
if isinstance(event, RequestReceived):
# Cancel the stream with error code for CANCEL
conn.reset_stream(event.stream_id, error_code=ErrorCodes.CANCEL)
elif isinstance(event, StreamReset):
print(f"Stream {event.stream_id} cancelled.")
sock.sendall(conn.data_to_send())
except Exception as e:
print(f"An error occurred: {e}")
# Create 50 threads running root_function
threads = []
for i in range(1):
thread = threading.Thread(target=root_function)
thread.start()
threads.append(thread)
print(f"Created thread {i}")
# Keep the main thread alive
for thread in threads:
thread.join()% The caddy file is
The git diff for the fix is this: diff --git a/pkg/middleware/writer.go b/pkg/middleware/writer.go
index a786c8f..c108c98 100644
--- a/pkg/middleware/writer.go
+++ b/pkg/middleware/writer.go
@@ -43,6 +43,10 @@ type CustomWriter struct {
func (r *CustomWriter) Header() http.Header {
r.mutex.Lock()
defer r.mutex.Unlock()
+
+ if (r.Req.Context().Err() != nil) {
+ return http.Header{}
+ }
if r.headersSent {
return http.Header{}
} The problem is that now I get another error
And it is always in Hope this helps you |
@rzv-me what is your running version? Did you try with the
|
Hi @darkweak, I am building with the latest version from master, which seems to be the same as the commit hash you sent Go version is 1.21.5 and caddy v2.7.6 |
I have a recurring error where the homepage of the website is blank.
When I check the key value in redis, it looks like this:
What strikes to me is the key name, which is
GET-https-www.website.com-/
, which all the other keys have{-VARY-}
appended to them. If I delete the key, and refresh the website, the new key that appears isGET-https-www.website.com-/{-VARY-}Accept-Encoding:gzip, deflate, br
, and now the page isn't blank anymore.What would cause that?
The text was updated successfully, but these errors were encountered: