Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] blank page cached on context canceled #337

Open
frederichoule opened this issue Apr 21, 2023 · 53 comments · Fixed by #347
Open

[bug] blank page cached on context canceled #337

frederichoule opened this issue Apr 21, 2023 · 53 comments · Fixed by #347

Comments

@frederichoule
Copy link
Contributor

frederichoule commented Apr 21, 2023

I have a recurring error where the homepage of the website is blank.

When I check the key value in redis, it looks like this:

HTTP/0.0 200 OK
Date: Fri, 21 Apr 2023 18:44:19 GMT
X-Souin-Stored-Ttl: 24h0m0s
Content-Length: 0

What strikes to me is the key name, which is GET-https-www.website.com-/, which all the other keys have {-VARY-} appended to them. If I delete the key, and refresh the website, the new key that appears is GET-https-www.website.com-/{-VARY-}Accept-Encoding:gzip, deflate, br, and now the page isn't blank anymore.

What would cause that?

@frederichoule frederichoule changed the title Cache key + {-VARY-} Blank page x {-VARY-} Apr 21, 2023
@frederichoule
Copy link
Contributor Author

For reference, a working key contains this:

HTTP/0.0 200 OK
Cache-Control: public, max-age=86400, must-revalidate
Content-Encoding: gzip
Content-Type: text/html; charset=UTF-8
Date: Fri, 21 Apr 2023 18:58:53 GMT
Server: Caddy
Server: Google Frontend
Vary: Accept-Encoding
X-Cloud-Trace-Context: 63177d3330a11c9b0a44a650162499a1

Meaning it really comes from the backend server.

@frederichoule
Copy link
Contributor Author

Unrelated: why the HTTP/0.0 ?

@darkweak
Copy link
Owner

Hey @frederichoule that's weird. Do you have a reproducible repository/example please?
To explain how the vary works under the hood, it will list all keys starting with (in your case) GET-https-www.website.com-/($|\{VARY\}), either the key ends by the trailing or has the vary separator.
If the first case is the first key to be returned, the key doesn't have any vary headers to validate, so it will match the client expectations and returns the content to the user. It will loop over the other keys and try to validate the varied headers.

What I think at the moment is on the first time a user tried to load the page but the backend returned nothing with an HTTP 200 OK status code.

Btw I don't know why your server returns an HTTP/0.0 protocol version, I don't deal with it and don't try to override it.
Can you post at least your caddy configuration?

@frederichoule
Copy link
Contributor Author

1- I'll try to reproduce the issue locally and let you know.

2- The output is clearly not from the backend, since there's no Server: Google Frontend or X-Cloud-Trace-Context.

3- Backend returns HTTP/1.1, Caddy returns HTTP/1.1, but stored key is HTTP/0.0 - This is a non-problem, I was just wondering.

@frederichoule
Copy link
Contributor Author

Here is the Caddyfile:

{
	order cache before handle
	email xxx@xxx.xxx

	admin off

	on_demand_tls {
		ask http://localhost:5555/__ask
		interval 2m
		burst 5
	}

	cache {
		cache_name Tarmac
		default_cache_control public, max-age=86400, must-revalidate
		allowed_http_verbs GET
		ttl 86400s
		redis {
			url 127.0.0.1:6379
		}
		key {
			hide
		}
	}

}

(cache_regular) {
	cache {
		key {
			disable_body
			disable_query
		}
	}
}

(cache_qs) {
	cache {
		key {
			disable_body
		}
	}
}

(upstream) {

	request_header -Cache-Control

	tls {
		on_demand
	}

	header {
		-X-Cloud-Trace-Context
		Server Tarmac
	}

	@notwww {
		not host localhost
		not header_regexp www Host ^www\.[a-zA-Z0-9-]+\.[a-zA-Z]{2,}$
	}

	redir @notwww https://www.{host}{uri} permanent

	@get {
		method GET
	}

	@hits {
		path /__hits
		query id=*
		method GET
	}

	@redis {
		path /__redis
		method POST
	}

	@query_caliente query caliente=1
	
	@query_search {
		path /search
		query q=*
	}

	@query_after query after=*

	handle @query_caliente {
		import cache_qs
	}

	handle @query_search {
		import cache_qs
	}

	handle @query_after {
		import cache_qs
	}

	handle @hits {
		reverse_proxy http://127.0.0.1:5555
	}

	handle @redis {
		reverse_proxy http://127.0.0.1:5555
	}

	handle @get {
		import cache_regular
	}

	handle {
		abort
	}

	reverse_proxy backend.com {
		header_down -Cache-Control
		header_up -Cache-Control
		header_up Host {upstream_hostport}
		header_up X-Backend-Hostname {host}		
	}

}

localhost {
	tls internal
	import upstream
}

https:// {
	import upstream
}

http:// {
	redir https://{host}{uri} permanent
}

@frederichoule
Copy link
Contributor Author

Unable to reproduce the problem locally for now. I am closing this for now.

@darkweak
Copy link
Owner

That could be something like cache poisoning (I wonder if @jenaye could help us about that). It could explain the

Server: Google Frontend
X-Cloud-Trace-Context: 63177d3330a11c9b0a44a650162499a1

@frederichoule
Copy link
Contributor Author

It's a brand new website, not really public yet. I think it might have to do with robot scanning the website, generating incorrect cached page. I'll investigate it more thoroughly soon.

@frederichoule
Copy link
Contributor Author

I was able to pinpoint when the blank page bug happens.

Whenever I have a blank page in cache, it comes from a request that contains this in the logs:

"error": "context canceled"

Any idea where it's coming from? I'll investigate after lunch, but if you have a hint for me, let me know!

@darkweak
Copy link
Owner

Nice, that means I don't handle correctly the client disconnections. It should be quite easy to reproduce if that's this case. Let's reopen this issue then 🙂

@darkweak darkweak reopened this May 17, 2023
@frederichoule frederichoule changed the title Blank page x {-VARY-} [bug] blank page cached on context canceled May 17, 2023
@frederichoule
Copy link
Contributor Author

Have you been able to reproduce the issue yet @darkweak ? I'll give it a try tomorrow if you didn't have time yet.

@darkweak
Copy link
Owner

Reproducible with the following code

// runner/requester.go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func main() {
	ctx, _ := context.WithTimeout(context.Background(), 10*time.Millisecond)
	rq, _ := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:4443/handler", nil)
	res, err := http.DefaultClient.Do(rq)

	fmt.Printf("-----\nres => %+v\nerr => %+v\n-----\n", res, err)
}
// handlers/handler.go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Println("Sleep...")
		time.Sleep(3 * time.Second)
		fmt.Println("Wake up")
		w.Write([]byte("Hello after sleep"))
	})
	http.ListenAndServe(":80", nil)
}

With the given caddyfile

{
    order cache before rewrite
    debug
    log {
        level debug
    }
    cache {
      ttl 100s
    }
}

:4443

route /handler {
    cache {
        ttl 10s
    }
    reverse_proxy 127.0.0.1:80
}
./caddy run
go run handler/handler.go
go run runner/requester.go
curl -I http://localhost:4443/handler
# Should return the right headers without content

@darkweak
Copy link
Owner

To keep you updated, I'll work on that tomorrow.

@frederichoule
Copy link
Contributor Author

Amazing, thanks! Haven't had time to take a look yet, sorry about that.

@darkweak
Copy link
Owner

It should work using @frederichoule

xcaddy build --with github.com/darkweak/souin/plugins/caddy@47ea558146d978d6be0ec6a42804f679200d6d70 --with github.com/darkweak/souin@47ea558146d978d6be0ec6a42804f679200d6d70

@frederichoule
Copy link
Contributor Author

Amazing! Will try it ASAP.

@darkweak
Copy link
Owner

use the commit 2cffe5ac9c989e963b3e28c327e6b6c63eab371d instead of the previous one. I pushed a fix on the uncached responses.

@frederichoule
Copy link
Contributor Author

Testing right away.

@frederichoule
Copy link
Contributor Author

frederichoule commented May 25, 2023

First page load works, second page load throws a panic and the server stops.

2023/05/25 19:42:38.124	ERROR	http.log.error	context canceled	{...}, "duration": 0.628182}
panic: Header called after Handler finished

goroutine 56 [running]:
net/http.(*http2responseWriter).Header(0x40003584c0?)
	net/http/h2_bundle.go:6569 +0x80
github.com/darkweak/souin/pkg/middleware.(*CustomWriter).Header(...)
	github.com/darkweak/souin@v1.6.39-0.20230524064538-2cffe5ac9c98/pkg/middleware/writer.go:46
github.com/darkweak/souin/pkg/middleware.(*SouinBaseHandler).Upstream(0x4000990640, 0x40003584c0, 0x4000656c00, 0x40005948e0, 0x4000287c80?, {0x4000144660, 0x22})
	github.com/darkweak/souin@v1.6.39-0.20230524064538-2cffe5ac9c98/pkg/middleware/middleware.go:274 +0x404
github.com/darkweak/souin/pkg/middleware.(*SouinBaseHandler).ServeHTTP.func2()
	github.com/darkweak/souin@v1.6.39-0.20230524064538-2cffe5ac9c98/pkg/middleware/middleware.go:478 +0x40
created by github.com/darkweak/souin/pkg/middleware.(*SouinBaseHandler).ServeHTTP
	github.com/darkweak/souin@v1.6.39-0.20230524064538-2cffe5ac9c98/pkg/middleware/middleware.go:477 +0x1324

@darkweak
Copy link
Owner

I hope that this commit fix your issue f9f8ab6da7e98e2e7898d10122766865e0b4884c

@frederichoule
Copy link
Contributor Author

Seems OK locally. I just pushed to prod for a small segment of our network and will let you know if we still get blank pages. Thanks!

@frederichoule
Copy link
Contributor Author

I still have blank pages. I'm unable to debug right now as my internet connection is pretty weak, but I'll give you more informations later.

@darkweak darkweak reopened this May 28, 2023
@darkweak
Copy link
Owner

Reopened again 😅

@frederichoule
Copy link
Contributor Author

Caddy crashed completely twice in the last 2 days. The last line in the log each time is related to context canceled. I don't know how to reproduce the crash locally yet. I'll find a way to make Caddy log the crash output somewhere.

{"level":"error","ts":1685643137.1032906,"logger":"http.log.error","msg":"context canceled","request":{},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"www.website.com"}},"duration":4.829884199}
{"level":"debug","ts":1685643137.1034744,"logger":"http.handlers.reverse_proxy","msg":"upstream roundtrip","upstream":"upstream.com:80","duration":4.251552925,"request":{},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"www.website.com"}},"error":"context canceled"}

@frederichoule frederichoule reopened this Jun 1, 2023
@darkweak
Copy link
Owner

darkweak commented Jun 1, 2023

Do you have some debug logs please ? Because these are not really explicit. I will write the patch to add more debug/info logs during the request process in the next days.

@frederichoule
Copy link
Contributor Author

This is all I got from Caddy. I'll read the doc to see what i can do to get more.

@darkweak
Copy link
Owner

darkweak commented Jun 1, 2023

When you're running caddy you can set the debug global directive in your caddy file and Souin will run as debug mode and give you the debug logs.

@frederichoule
Copy link
Contributor Author

It is already on debug, those logs in the last comments are what I get from the debug mode, yet I can't see what makes Caddy crash. Here I was able to see it since it happened locally: #337 (comment)

@darkweak
Copy link
Owner

darkweak commented Jun 1, 2023

What are the previous logs before the crash ?

@frederichoule
Copy link
Contributor Author

Only regular requests logs - but I had to reboot the server and didn't take a copy of the logs beforehand. I will do it at the next crash (so far looks like it's once per day... my TTL is 24 hours.. maybe related to stale?)

@frederichoule
Copy link
Contributor Author

frederichoule commented Jun 2, 2023

Crashed again, almost exactly 24h after last reboot (which is my TTL) - again, nothing in the log, but Caddy is not accepting requests anymore. Can't see what make it crash in the logs.

Will lower the TTL and try to debug that.

@frederichoule
Copy link
Contributor Author

Got it.

fatal error: concurrent map writes

goroutine 234338 [running]:
github.com/caddyserver/caddy/v2.(*Replacer).Set(...)
	github.com/caddyserver/caddy/v2@v2.6.4/replacer.go:67
github.com/caddyserver/caddy/v2/modules/caddyhttp/reverseproxy.(*Handler).ServeHTTP.func1()
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/reverseproxy/reverseproxy.go:490 +0x70
github.com/caddyserver/caddy/v2/modules/caddyhttp/reverseproxy.(*Handler).ServeHTTP(0xc000898380, {0x24fd6a0, 0xc00002c980}, 0xc000d47400, {0x24f4de0, 0x2254038})
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/reverseproxy/reverseproxy.go:512 +0x45b
github.com/caddyserver/caddy/v2/modules/caddyhttp.wrapMiddleware.func1.1({0x24fd6a0?, 0xc00002c980?}, 0x24f4de0?)
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/routes.go:290 +0x42
github.com/caddyserver/caddy/v2/modules/caddyhttp.HandlerFunc.ServeHTTP(0x24f4de0?, {0x24fd6a0?, 0xc00002c980?}, 0x40dea8?)
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/caddyhttp.go:58 +0x2f
github.com/caddyserver/caddy/v2/modules/caddyhttp.wrapRoute.func1.1({0x24fd6a0, 0xc00002c980}, 0xc000d47400)
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/routes.go:259 +0x3a8
github.com/caddyserver/caddy/v2/modules/caddyhttp.HandlerFunc.ServeHTTP(0x1d16de0?, {0x24fd6a0?, 0xc00002c980?}, 0x7?)
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/caddyhttp.go:58 +0x2f
github.com/caddyserver/caddy/v2/modules/caddyhttp.wrapRoute.func1.1({0x24fd6a0, 0xc00002c980}, 0xc000d47400)
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/routes.go:238 +0x219
github.com/caddyserver/caddy/v2/modules/caddyhttp.HandlerFunc.ServeHTTP(0x40be65?, {0x24fd6a0?, 0xc00002c980?}, 0xc000a8ef00?)
	github.com/caddyserver/caddy/v2@v2.6.4/modules/caddyhttp/caddyhttp.go:58 +0x2f
github.com/darkweak/souin/plugins/caddy.(*SouinCaddyMiddleware).ServeHTTP.func1({0x24fd6a0?, 0xc00002c980?}, 0x203dbc4?)
	github.com/darkweak/souin/plugins/caddy@v0.0.0-20230525203934-f9f8ab6da7e9/httpcache.go:83 +0x39
github.com/darkweak/souin/pkg/middleware.(*SouinBaseHandler).Upstream(0xc00013b400, 0xc00002c980, 0xc000d47b00, 0xc00021c820, 0x443145?, {0xc0010c6330, 0x2a})
	github.com/darkweak/souin@v1.6.39-0.20230525203934-f9f8ab6da7e9/pkg/middleware/middleware.go:258 +0x1a6
github.com/darkweak/souin/pkg/middleware.(*SouinBaseHandler).ServeHTTP.func3()
	github.com/darkweak/souin@v1.6.39-0.20230525203934-f9f8ab6da7e9/pkg/middleware/middleware.go:484 +0x3e
created by github.com/darkweak/souin/pkg/middleware.(*SouinBaseHandler).ServeHTTP
	github.com/darkweak/souin@v1.6.39-0.20230525203934-f9f8ab6da7e9/pkg/middleware/middleware.go:483 +0x19b3

@darkweak
Copy link
Owner

darkweak commented Jun 3, 2023

Nice, now we know where, I have to check why and how to reproduce.

@darkweak
Copy link
Owner

Hey sorry for the delay. I tried many time with in memory storages but that doesn't occur. I wonder if that could come from the reconnection in redis/etcd/olric storages

@frederichoule
Copy link
Contributor Author

I haven't had time to dig deeper on that so far. We use Redis as storage, so maybe we can investigate on that side.

@darkweak
Copy link
Owner

Hey, I'm not able to reproduce your issue, even if I try to cut the connection to redis and let Souin reconnect itself. I think it comes from the replacer caddy code.

@frederichoule
Copy link
Contributor Author

Let's close this for now. I had to move to another project for now so I didn't have time to dive deeper on that. Thanks for your help @darkweak

@midnight-wonderer
Copy link

Hey
I have the same issue, and I used the default storage. (Badger)

Souin sometimes caches blank pages.

Here my setup:
Caddy build with:

xcaddy build --with github.com/caddyserver/cache-handler

Caddyfile (cache part)

cache {
  mode bypass_request
  ttl 86400s
  key {
    hide
  }
  default_cache_control no-store
  badger {
    configuration {
      Dir /home/caddy/badger-cache
      ValueDir /home/caddy/badger-cache
    }
  }
}

I believe this is the relevant log:

Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3248205,"msg":"using provided configuration","config_file":"/home/caddy/ingress/Caddyfile","config_adapter":""}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3338363,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["//localhost:2019","//[::1]:2019","//127.0.0.1:2019"]}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3360136,"logger":"http.auto_https","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.336098,"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"warn","ts":1703044329.3361273,"logger":"http.auto_https","msg":"server is listening only on the HTTP port, so no automatic HTTPS will be applied to this server","server_name":"srv1","http_port":80}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.33631,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc000466f00"}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3727183,"logger":"http.handlers.cache","msg":"Set nextTxnTs to 0"}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3728356,"logger":"http.handlers.cache","msg":"Set backend timeout to 10s"}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3728437,"logger":"http.handlers.cache","msg":"Set cache timeout to 10s"}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3728487,"logger":"http.handlers.cache","msg":"Souin configuration is now loaded."}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3753047,"logger":"http.log","msg":"server running","name":"srv1","protocols":["h1","h2","h3"]}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.375385,"logger":"http","msg":"enabling HTTP/3 listener","addr":":443"}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3775697,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
Dec 20 03:52:09 ingress.example.com caddy[718]: {"level":"info","ts":1703044329.3775969,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["souin.example.com"]}
Dec 20 03:52:11 ingress.example.com caddy[718]: {"level":"info","ts":1703044331.5345001,"msg":"autosaved config (load with --resume flag)","file":"/home/caddy/.config/caddy/autosave.json"}
Dec 20 03:52:11 ingress.example.com caddy[718]: {"level":"info","ts":1703044331.5346777,"msg":"serving initial configuration"}
Dec 20 03:52:11 ingress.example.com systemd[1]: Started Caddy.
Dec 20 03:52:12 ingress.example.com caddy[718]: {"level":"info","ts":1703044332.9625828,"logger":"tls","msg":"finished cleaning storage units"}
Dec 20 04:01:51 ingress.example.com caddy[718]: {"level":"error","ts":1703044911.1341724,"logger":"http.log.error","msg":"dial tcp [::1]:30701: connect: connection refused","request":{"remote_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","remote_port":"45754","client_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","proto":"HTTP/2.0","method":"GET","host":"souin.example.com","uri":"/dashboard","headers":{"Cache-Control":["max-age=0"],"Sec-Ch-Ua-Mobile":["?0"],"Accept-Language":["en-IN,en;q=0.8"],"Sec-Fetch-Dest":["document"],"Cookie":[],"Accept-Encoding":["gzip, deflate, br"],"Sec-Ch-Ua-Platform":["\"Linux\""],"Upgrade-Insecure-Requests":["1"],"Sec-Gpc":["1"],"Sec-Fetch-Site":["same-origin"],"Sec-Fetch-User":["?1"],"Referer":["https://souin.example.com/projects/452862/dashboard"],"Sec-Ch-Ua":["\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"],"Sec-Fetch-Mode":["navigate"],"If-None-Match":["W/\"64dcd516918653ed2d1c773c476c87bb\""]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"souin.example.com"}},"duration":3.001623215,"status":502,"err_id":"bak0babmz","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
Dec 20 04:01:54 ingress.example.com caddy[718]: {"level":"error","ts":1703044914.460583,"logger":"http.log.error","msg":"dial tcp [::1]:30701: connect: connection refused","request":{"remote_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","remote_port":"45754","client_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","proto":"HTTP/2.0","method":"GET","host":"souin.example.com","uri":"/dashboard","headers":{"Sec-Fetch-User":["?1"],"Referer":["https://souin.example.com/projects/452862/dashboard"],"Sec-Fetch-Mode":["navigate"],"Sec-Fetch-Dest":["document"],"Cache-Control":["max-age=0"],"Sec-Ch-Ua":["\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""],"Sec-Ch-Ua-Mobile":["?0"],"Sec-Ch-Ua-Platform":["\"Linux\""],"Upgrade-Insecure-Requests":["1"],"Accept-Language":["en-IN,en;q=0.8"],"Accept-Encoding":["gzip, deflate, br"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"],"Sec-Gpc":["1"],"Sec-Fetch-Site":["same-origin"],"Cookie":[]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"souin.example.com"}},"duration":0.002028886,"status":502,"err_id":"9ic3d5q9t","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
Dec 20 04:01:55 ingress.example.com caddy[718]: {"level":"error","ts":1703044915.3554115,"logger":"http.log.error","msg":"dial tcp [::1]:30701: connect: connection refused","request":{"remote_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","remote_port":"45754","client_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","proto":"HTTP/2.0","method":"GET","host":"souin.example.com","uri":"/dashboard","headers":{"Sec-Ch-Ua":["\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"],"Sec-Gpc":["1"],"Accept-Encoding":["gzip, deflate, br"],"Sec-Ch-Ua-Mobile":["?0"],"Cookie":[],"Cache-Control":["max-age=0"],"Sec-Fetch-User":["?1"],"Referer":["https://souin.example.com/projects/452862/dashboard"],"Sec-Fetch-Dest":["document"],"Sec-Ch-Ua-Platform":["\"Linux\""],"Upgrade-Insecure-Requests":["1"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"],"Accept-Language":["en-IN,en;q=0.8"],"Sec-Fetch-Site":["same-origin"],"Sec-Fetch-Mode":["navigate"]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"souin.example.com"}},"duration":0.000899877,"status":502,"err_id":"0xm433tg8","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
Dec 20 04:01:56 ingress.example.com caddy[718]: {"level":"error","ts":1703044916.040958,"logger":"http.log.error","msg":"dial tcp [::1]:30701: connect: connection refused","request":{"remote_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","remote_port":"45754","client_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","proto":"HTTP/2.0","method":"GET","host":"souin.example.com","uri":"/dashboard","headers":{"Cache-Control":["max-age=0"],"Sec-Ch-Ua-Platform":["\"Linux\""],"Upgrade-Insecure-Requests":["1"],"Accept-Language":["en-IN,en;q=0.8"],"Sec-Fetch-Dest":["document"],"Sec-Fetch-User":["?1"],"Accept-Encoding":["gzip, deflate, br"],"Referer":["https://souin.example.com/projects/452862/dashboard"],"Sec-Ch-Ua":["\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""],"Sec-Ch-Ua-Mobile":["?0"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"],"Sec-Fetch-Site":["same-origin"],"Sec-Gpc":["1"],"Sec-Fetch-Mode":["navigate"],"Cookie":[]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"souin.example.com"}},"duration":0.000626723,"status":502,"err_id":"x4uy8ekzi","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
Dec 20 08:39:08 ingress.example.com systemd[1]: Reloading Caddy...

More specifically it starts with this error:

Dec 20 04:01:51 ingress.example.com caddy[718]: {"level":"error","ts":1703044911.1341724,"logger":"http.log.error","msg":"dial tcp [::1]:30701: connect: connection refused","request":{"remote_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","remote_port":"45754","client_ip":"2410:9008:b641:474e:8173:c051:5ba0:6f3","proto":"HTTP/2.0","method":"GET","host":"souin.example.com","uri":"/dashboard","headers":{"Cache-Control":["max-age=0"],"Sec-Ch-Ua-Mobile":["?0"],"Accept-Language":["en-IN,en;q=0.8"],"Sec-Fetch-Dest":["document"],"Cookie":[],"Accept-Encoding":["gzip, deflate, br"],"Sec-Ch-Ua-Platform":["\"Linux\""],"Upgrade-Insecure-Requests":["1"],"Sec-Gpc":["1"],"Sec-Fetch-Site":["same-origin"],"Sec-Fetch-User":["?1"],"Referer":["https://souin.example.com/projects/452862/dashboard"],"Sec-Ch-Ua":["\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"],"Sec-Fetch-Mode":["navigate"],"If-None-Match":["W/\"64dcd516918653ed2d1c773c476c87bb\""]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"souin.example.com"}},"duration":3.001623215,"status":502,"err_id":"bak0babmz","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}

Going deeper into the log, I also found an interesting error before the one above.

{"level":"error","ts":1703014208.7550611,"logger":"http.log.error","msg":"context canceled","request":{"remote_ip":"2410:9008:b641:474e:604e:765e:d8c1:dd26","remote_port":"33296","client_ip":"2410:9008:b641:474e:604e:765e:d8c1:dd26","proto":"HTTP/2.0","method":"GET","host":"souin.example.com","uri":"/__/auth/iframe.js","headers":{"Sec-Gpc":["1"],"Accept-Language":["en-IN,en;q=0.9"],"Referer":["https://souin.example.com/__/auth/iframe?apiKey=AIzaSyDhTIykU2R5ovNZrRmPlN_c_0IS8-YNbrU&appName=%5BDEFAULT%5D&v=10.5.2&eid=p&usegapi=1&jsh=m%3B%2F_%2Fscs%2Fabc-static%2F_%2Fjs%2Fk%3Dgapi.lb.en.y0xCMa4KeeI.O%2Fd%3D1%2Frs%3DAHpOoo8-3MGCaatZB3kdS5TpZdd-gOSBHg%2Fm%3D__features__"],"Accept-Encoding":["gzip, deflate, br"],"Sec-Ch-Ua":["\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""],"Sec-Ch-Ua-Mobile":["?1"],"Accept":["*/*"],"Sec-Fetch-Mode":["no-cors"],"User-Agent":["Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"],"Sec-Ch-Ua-Platform":["\"Android\""],"Sec-Fetch-Dest":["script"],"Sec-Fetch-Site":["same-origin"],"Cookie":[]},"tls":{"resumed":false,"version":772,"cipher_suite":4865,"proto":"h2","server_name":"souin.example.com"}},"duration":0.191336176}

Interestingly, it has "context canceled" message, the same as OP.

@midnight-wonderer
Copy link

To summarize the common factor

msg: "context canceled"
proxying: something from Google frontend. For me, I have to do this: https://firebase.google.com/docs/auth/web/redirect-best-practices#proxy-requests (Option 3).

I deduced that this is rare because most people use Souin for cache HTTP/1 upstream.
Google frontend, on the other hand, utilizes also h2, h3 protocols.

I can share my upstream server with you so you can test
https://slime-systems-ab8f083fce.firebaseapp.com

Please note that the error seems random because I have yet to see the exact pattern.
It might take a lot of tries.

@darkweak
Copy link
Owner

Hello @midnight-wonderer thank you for your feedback.

I think the workflow to reproduce is

-> User sends a request to the proxy
-> The proxy tries to reach the upstream
X-> The user cancel his request
<- The upstream cancel the request too and returns a blank page
<- The proxy gets a blank page
<- The proxy stores the blank page

I will try to make a reproducible example in the E2E tests but that's quite hard to make it reliable.

@midnight-wonderer
Copy link

I forgot to tell crucial information:

In my case, the blank page was triggered by Google upstream with path /__/auth/iframe, but the blank page was cached at the path /projects/452862/dashboard, which has a normal HTTP/1 upstream.

This means that regardless of what triggers context canceled, the blank page can be cached to the adjacent requests.

@midnight-wonderer
Copy link

Reproduced that again.
This time, I noticed that it happened when I clicked on a link in mobile browsers.
But I don't just click a link normally; my finger shudders a bit, causing a rapid double click.

I believe that this time, no Google front-end requests were directly involved in the rapid clicks.

I am not sure if it is the one, but I found someone is able to reproduce the issue here:
golang/go#52183
If we assume it is a bug in the request library, can Souin prevent the problematic response from being cached?

@midnight-wonderer
Copy link

I looked at the HAR file saved from the cached blank page and have a question.

Saved HAR file
{
  "log": {
    "version": "1.2",
    "creator": {
      "name": "WebInspector",
      "version": "537.36"
    },
    "pages": [],
    "entries": [
      {
        "_initiator": {
          "type": "other"
        },
        "_priority": "VeryHigh",
        "_resourceType": "document",
        "cache": {},
        "connection": "6031",
        "request": {
          "method": "GET",
          "url": "https://souin.example.com/projects/452860/dashboard",
          "httpVersion": "http/2.0",
          "headers": [
            {
              "name": ":authority",
              "value": "souin.example.com"
            },
            {
              "name": ":method",
              "value": "GET"
            },
            {
              "name": ":path",
              "value": "/projects/452860/dashboard"
            },
            {
              "name": ":scheme",
              "value": "https"
            },
            {
              "name": "accept",
              "value": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"
            },
            {
              "name": "accept-encoding",
              "value": "gzip, deflate, br"
            },
            {
              "name": "accept-language",
              "value": "en-IN,en;q=0.7"
            },
            {
              "name": "cache-control",
              "value": "max-age=0"
            },
            {
              "name": "cookie",
              "value": "_document_session=IbSV8KRfMWf%2krZ5MYpZww4%2Fy4TSoe2qxshLVwEGyL6BxAyLwysVI4lEAN4nCOjUrsGom7qHrUWXrU5KXavO5c3w0ig5L1WuKcppCx5dXbEomCuw4MGJG4S%2BfwY4Nu3qLOweRMk5IlYH%2FPNubW9RAIKrxqGGksoNiGzmXCLFaj%2FG9940SnBIX%2B4VZB1DOzAdwd23gw3Q%2FSaZTlB%2BMKpbbmBNFwhvaeCbg7KOZb0dISSnCtQ5y0C92mjUFQI1%2Fwoblj1uTLK%2FFQBeKkz2wbRnjUzy3UDfWny1X5e9H7TFcJ2993MHq9OqhA5o7RxiM5HrdFe%2Fc%2F4ikwoB7b7ZKsUVkeGoJTinw2aM81fki86NEhDJp7Lxfs0yYwd8GlZqI36zUV2%2Fhm2fuhQRBSvVrBnu0azCaDw8fVm5vWzfHRyaiJb93YjzrlPBYQxeXMfqaxrkrmO%2BfGVYezXifx--FoCU8E2Lfmi4R%2Bun--DK7FYC5C9DC%2Fhug2bwVP3Q%3D%3D"
            },
            {
              "name": "referer",
              "value": "https://souin.example.com/dashboard"
            },
            {
              "name": "sec-ch-ua",
              "value": "\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Brave\";v=\"120\""
            },
            {
              "name": "sec-ch-ua-mobile",
              "value": "?1"
            },
            {
              "name": "sec-ch-ua-platform",
              "value": "\"Android\""
            },
            {
              "name": "sec-fetch-dest",
              "value": "document"
            },
            {
              "name": "sec-fetch-mode",
              "value": "navigate"
            },
            {
              "name": "sec-fetch-site",
              "value": "same-origin"
            },
            {
              "name": "sec-fetch-user",
              "value": "?1"
            },
            {
              "name": "sec-gpc",
              "value": "1"
            },
            {
              "name": "upgrade-insecure-requests",
              "value": "1"
            },
            {
              "name": "user-agent",
              "value": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"
            }
          ],
          "queryString": [],
          "cookies": [
            {
              "name": "_thai_document_session",
              "value": "IbSV8KRfMWf%2BEZHVoIGz2cx7eSYwkrZ5MYpZww4%2F4TSoe2qxsIzvLVwEGyL6BxAyLwysVI4lEAN4nCOjUrsGom7qX%2FB%2BrU5KXavO5c3w0ig5L1WuKcB5FppCx5dXbEomCuw4MGJG4S%2BfwY4Nu3qLOweRMk5IlYH%2FPNubW9RAIKrxqGGksoNiGzmXCLFaj%2FG9940SnBIX%2B4VZB1DOzAdwd23gw3Q%2FSaZTlB%2BMKpbbmBNFwhvaeCbg7KOZb0dISSnCtQ5y0C92mjUFQI1%2Fwoblj1uTLK%2FFQBeKkz2wbRnjUzy3UDfWny1X5e9H7TFcJ2993MHq9OqhA5o7RxiM5HrdFe%2Fc%2F4ikwoB7b7ZKsUVkeGoJTinw2aM81fki86NEhDJp7Lxfs0yYwd8GlZqI36zUV2%2Fhm2fuhQRBSvVrBnu0azCaDw8fVm5vWzHRyaiJb93YjzrlPBYQxeXMfqaxrwN8D5jI3krmO%2BfGVYezXifx--FoCU8E2Lfmi4R%2Bun--DK7FYC5C9DC%2Fhug2bwVP3Q%3D%3D",
              "path": "/",
              "domain": "souin.example.com",
              "expires": "1969-12-31T23:59:59.000Z",
              "httpOnly": true,
              "secure": false,
              "sameSite": "Lax"
            }
          ],
          "headersSize": -1,
          "bodySize": 0
        },
        "response": {
          "status": 200,
          "statusText": "",
          "httpVersion": "http/2.0",
          "headers": [
            {
              "name": "age",
              "value": "2264"
            },
            {
              "name": "alt-svc",
              "value": "h3=\":443\"; ma=2592000"
            },
            {
              "name": "cache-status",
              "value": "Souin; hit; ttl=84136; key="
            },
            {
              "name": "content-length",
              "value": "0"
            },
            {
              "name": "date",
              "value": "Thu, 21 Dec 2023 15:19:34 GMT"
            },
            {
              "name": "server",
              "value": "Caddy"
            }
          ],
          "cookies": [],
          "content": {
            "size": 0,
            "mimeType": "text/plain"
          },
          "redirectURL": "",
          "headersSize": -1,
          "bodySize": -1,
          "_transferSize": 108,
          "_error": null
        },
        "serverIPAddress": "[2a02:4f8:c3c:1393:5755:95a8:b7cd:21f7]",
        "startedDateTime": "2023-12-21T15:57:15.862Z",
        "time": 922.765999999191,
        "timings": {
          "blocked": 26.87999999921309,
          "dns": 148.13,
          "ssl": 208.81199999999995,
          "connect": 546.8190000000001,
          "send": 1.9639999999999418,
          "wait": 196.54300000042872,
          "receive": 2.4299999995491817,
          "_blocked_queueing": 22.69099999921309
        }
      }
    ]
  }
}

If I set default_cache_control to no-store, how come Souin caches content without a Cache-Control header present?

P.S. Sorry if my comments are too rapid; I typed what is in my mind at the moment. Please let me know if you mind, and I will try to delay them.

@fliespl
Copy link

fliespl commented Dec 26, 2023

@darkweak is this resolved or not? I am curious since issue is closed, yet other people are reporting issues.

@darkweak
Copy link
Owner

@midnight-wonderer @fliespl the possible fix would be to not store blank responses in cache, idk. Anyway, I'm reopening this issue.

@darkweak darkweak reopened this Dec 28, 2023
@darkweak
Copy link
Owner

@midnight-wonderer thank you for pointing the go pkg issue 👍

@rzv-me
Copy link

rzv-me commented Feb 1, 2024

I have been struggling with a similar problem as @frederichoule had: panic: Header called after Handler finished

The problem is that there is not check in CustomWriter.Header() if the request was cancelled, which is allowed in HTTP2.

I worked on a way to reproduce it, and I have this python script

import threading
import socket
from h2.connection import H2Connection
from h2.events import RequestReceived, StreamReset
from h2.config import H2Configuration
import ssl
from h2.errors import ErrorCodes

def root_function(url='localhost'):
    while True:
        try:
            # Create a TCP connection
            sock = socket.create_connection((url, 4433))
            print(f"Create a TCP connection")
            # Wrap the socket for TLS
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            ctx.set_alpn_protocols(['h2'])
            sock = ctx.wrap_socket(sock, server_hostname=url)
            print(f"Wrap socket for TLS")
            # Make sure we're using HTTP/2
            assert sock.selected_alpn_protocol() == 'h2'

            # Create HTTP/2 connection
            config = H2Configuration(client_side=True)
            conn = H2Connection(config=config)
            conn.initiate_connection()
            sock.sendall(conn.data_to_send())

            # # Create a new stream
            # stream_id = conn.get_next_available_stream_id()
            # conn.send_headers(
            #     stream_id,
            #     [(':method', 'GET'), (':authority', url), (':path', '/'), (':scheme', 'https')],
            # )
            # sock.sendall(conn.data_to_send())

            # Read some data
            while True:
                stream_id = conn.get_next_available_stream_id()
                conn.send_headers(
                    stream_id,
                    [(':method', 'GET'), (':authority', url), (':path', '/mycachepath'), (':scheme', 'https')],
                )
                sock.sendall(conn.data_to_send())
                #data = sock.recv(65535)
                #if not data:
                    #break
                conn.reset_stream(stream_id, error_code=ErrorCodes.CANCEL)
                events = conn.receive_data(data)
                for event in events:
                     if isinstance(event, RequestReceived):
                        # Cancel the stream with error code for CANCEL
                         conn.reset_stream(event.stream_id, error_code=ErrorCodes.CANCEL)
                     elif isinstance(event, StreamReset):
                        print(f"Stream {event.stream_id} cancelled.")

                sock.sendall(conn.data_to_send())
        except Exception as e:
            print(f"An error occurred: {e}")


# Create 50 threads running root_function
threads = []
for i in range(1):
    thread = threading.Thread(target=root_function)
    thread.start()
    threads.append(thread)
    print(f"Created thread {i}")

# Keep the main thread alive
for thread in threads:
    thread.join()%

The caddy file is

{
    debug
    order cache before reverse_proxy
    cache {
        log_level debug
    }
}
localhost:4433 {
    tls internal
    encode gzip
    reverse_proxy /mycachepath https://[YOURBACKENDHOST] {
        header_up Host [YOURBACKENDHOST]
        header_up User-Real-IP {remote_host}
        header_up User-Real-Proto {scheme}
        header_up X-Forwarded-For {remote_host}
        header_up X-Forwarded-Port {server_port}
        header_up X-Forwarded-Proto {scheme}
        header_up X-Forwarded-Host {host}
        health_timeout 5s
        # header_down -Cache-Control
    }
    @matchCachePath path /mycachepath
    header /mycachepath >Cache-Control "must-revalidate, no-cache, no-store, private"
    cache @matchCachePath {
        ttl 3s
        stale 3s
        # log_level debug
        mode bypass
        key {
            disable_body
            disable_host
            disable_method
        }
    }
}

The git diff for the fix is this:

diff --git a/pkg/middleware/writer.go b/pkg/middleware/writer.go
index a786c8f..c108c98 100644
--- a/pkg/middleware/writer.go
+++ b/pkg/middleware/writer.go
@@ -43,6 +43,10 @@ type CustomWriter struct {
 func (r *CustomWriter) Header() http.Header {
        r.mutex.Lock()
        defer r.mutex.Unlock()
+
+       if (r.Req.Context().Err() != nil) {
+               return http.Header{}
+       }
        if r.headersSent {
                return http.Header{}
        }

The problem is that now I get another error

fatal error: concurrent map writes

goroutine 216 [running]:
github.com/caddyserver/caddy/v2.(*Replacer).Set(...)
	github.com/caddyserver/caddy/v2@v2.7.6/replacer.go:68
github.com/caddyserver/caddy/v2/modules/caddyhttp.(*HTTPErrorConfig).WithError(0x106887c28, 0x140157a9700, {0x106887c28?, 0x107bdcee0?})
	github.com/caddyserver/caddy/v2@v2.7.6/modules/caddyhttp/server.go:656 +0x1a4
github.com/caddyserver/caddy/v2/modules/caddyhttp.(*Server).ServeHTTP(0x140006d2580, {0x10689a450, 0x1400008a8c0}, 0x14015826500)
	github.com/caddyserver/caddy/v2@v2.7.6/modules/caddyhttp/server.go:368 +0x1038
net/http.serverHandler.ServeHTTP({0x0?}, {0x10689a450?, 0x1400008a8c0?}, 0x104266a40?)
	net/http/server.go:2938 +0xbc
net/http.initALPNRequest.ServeHTTP({{0x1068a0230?, 0x140157a7f80?}, 0x140157d2000?, {0x140004f64b0?}}, {0x10689a450, 0x1400008a8c0}, 0x14015826500)
	net/http/server.go:3546 +0x1b4
golang.org/x/net/http2.(*serverConn).runHandler(0x107c5b4c0?, 0x0?, 0x0?, 0x1068789e8?)
	golang.org/x/net@v0.19.0/http2/server.go:2368 +0xc0
created by golang.org/x/net/http2.(*serverConn).scheduleHandler in goroutine 232
	golang.org/x/net@v0.19.0/http2/server.go:2303 +0x208

And it is always in replacer.go:68

Hope this helps you

@darkweak
Copy link
Owner

darkweak commented Feb 1, 2024

@rzv-me what is your running version? Did you try with the 5a88d84c26128d43e37c614c83c3d3e53da2afd1 commit?

xcaddy build --with github.com/darkweak/souin/plugins/caddy@5a88d84c26128d43e37c614c83c3d3e53da2afd1 --with github.com/darkweak/souin@5a88d84c26128d43e37c614c83c3d3e53da2afd1

@rzv-me
Copy link

rzv-me commented Feb 1, 2024

Hi @darkweak,

I am building with the latest version from master, which seems to be the same as the commit hash you sent

Go version is 1.21.5 and caddy v2.7.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants