Sentry stopped accepting transaction data #2876
Comments
Did you change the port?
Yes, I have the relay port exposed to the host network. How did you manage to fix the problem?
When I reverted the port change, the problem was resolved.
Nope, didn't help. Doesn't work even with the default config. Thanks for the tip though.
Are there any logs in your web container that can help? Are you sure you are receiving the event envelopes? You should be able to see that activity in your nginx container.
Same here: on the browser side a request is sent with event type "transaction", but there is no data displayed under Performance, and the number of transactions in the project is also 0.
Problem solved: the server time did not match the SDK time.
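If you suspect the same clock-skew problem, a quick way to check is to compare the host clock with the clock inside the containers. A minimal sketch, assuming the stock self-hosted docker compose service names:

```sh
# Print UTC time on the host and inside the relay container; SDK events
# timestamped far from server time can be dropped or miscounted.
date -u
docker compose exec relay date -u

# If the host clock drifts, re-enable NTP synchronization (systemd hosts):
sudo timedatectl set-ntp true
```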
Are you on a nightly version of self-hosted? What does your sentry.conf.py look like? We've added some feature flags there to support the new performance features.
I'm using Docker with the latest commit from this repository (the bottom of the page says …). I've updated sentry.conf.py to match the most recent version from this repo; now the only difference is in … After that, the errors have also disappeared.
I can confirm that the ClickHouse errors are due to the Rust workers; reverting the workers part of #2831 and #2861 fixes them. The workers' logs show that the insert is done (is it?):
The error is caused by the connection being prematurely closed. See #2900.
Errors are not logged to …
Okay, so I'm able to replicate this issue on my instance (24.3.0). What happens is that Sentry does accept transaction/error/profile/replay/attachment data, but it doesn't record it in the statistics. So your stats of ingested events might be displayed as if there were no events being recorded, but the events are actually there: they are processed by Snuba and you can view them in the web UI. Can anyone reading this confirm that that's what happened on your instances as well? (I don't want to ping everybody.)

If the answer to that 👆🏻 is "yes", it means that something (a module, container, or whatever) that ingests the events didn't do the data insertion correctly for it to be queried as statistics. I don't know for sure whether it's the responsibility of the Snuba consumers (as we moved to rust-consumer). A few possible solutions (well, not really, but I hope this would get rid of the issue), either:
I didn't see any errors in the Issues tab. I had to rebuild a Server Snapshot to "fix" this problem, so it wasn't just the statistics that were affected.
@DarkByteZero for the broken stats view, try adding one more Snuba consumer from PR #2909.
After using the new Snuba consumer from #2909 and reverting to the Python consumer, everything is working now. My statistics view is complete again, even retroactively.
I have the same problem, and Sentry no longer accepts new issues.
I made some changes to my relay to add more caching.
All of a sudden my server stopped processing issues and started giving the error "envelope buffer capacity exceeded". After restarting a few times it got better, but I'm not sure whether it's related to this issue or not.
Same error; that's how I found #1929 (comment) and applied that setting along with some other ones.
Same error on a freshly installed 24.4.1: just git checkout, run install.sh, docker compose up -d. Flooded with errors as in the UPD in the issue description. Replacing 'rust-consumer' with 'consumer' (as suggested in comments on other issues) "solves" the problem without needing a full version downgrade.
I confirm that the fix mentioned by @combrs works: run %s/rust-consumer/consumer/g on your docker-compose.yaml and the problem goes away.
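For those not using Vim, the same substitution can be done with sed. A sketch, assuming the compose file is named docker-compose.yml as in this repo:

```sh
# Replace every Snuba rust-consumer invocation with the Python consumer,
# then recreate the affected containers.
sed -i 's/rust-consumer/consumer/g' docker-compose.yml
docker compose up -d
```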
For reference, I started using KeyDB (by Snapchat) to replace Redis; it's pretty much an out-of-the-box replacement.
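For anyone wanting to try the same swap without editing the stock compose file, one possible approach is a compose override. A sketch only: the service name redis matches this repo's docker-compose.yml, and eqalpha/keydb is the public KeyDB image; verify both against your checkout.

```sh
# Point the existing "redis" service at KeyDB via a compose override,
# then recreate just that service (KeyDB speaks the Redis protocol).
cat > docker-compose.override.yml <<'EOF'
services:
  redis:
    image: eqalpha/keydb:latest
EOF
docker compose up -d redis
```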
No tweaks for Redis. It is not a Redis issue: transaction events are pushed to Redis, but it looks like the consumers are dead, which is why memory keeps growing. It is an effect of the problem, not its cause.
And for the Rust workers, I am really not sure why this occurs. Edit: this changed with getsentry/snuba#5838.
Here it is:

```yaml
# See: https://github.com/getsentry/sentry-docs/blob/master/docs/product/relay/options.mdx
relay:
  upstream: "http://web:9000/"
  host: 0.0.0.0
  port: 3000
  override_project_ids: false
logging:
  level: WARN
http:
  timeout: 60
  connection_timeout: 60
  max_retry_interval: 60
  host_header: 'sentry.xxxxxx.xx'
processing:
  enabled: true
  kafka_config:
    - {name: "bootstrap.servers", value: "kafka:9092"}
    - {name: "message.max.bytes", value: 50000000} # 50 MB
  redis: redis://redis:6379
  geoip_path: "/geoip/GeoLite2-City.mmdb"
cache:
  envelope_buffer_size: 80000 # queue up to 80k envelopes
  eviction_interval: 120
  project_grace_period: 3600 # one hour
  envelope_expiry: 1200 # 20 minutes
  batch_interval: 300
  file_interval: 30
spool:
  envelopes:
    # path: /var/lib/sentry-envelopes/files
    max_memory_size: 1GB
    max_disk_size: 4GB
    max_connections: 20
    min_connections: 10
```

You will notice the jump from 2-3K lines/s to 13K lines/s. Was Sentry struggling to keep up?
I noticed that restarting all containers does some kind of state reset, but after some time the relay seems unable to handle its own load.

I would be very happy if anyone had a clue where to search. The web container reports 200 OK in its logs. My http settings:

```yaml
http:
  timeout: 60
  connection_timeout: 60
  max_retry_interval: 60
```

In the cron logs I did find: …

Seems that the Docker host needs … Plus … But … persists.

Seems like this was previously reported as #1929.

With the help of this incredible tcpdump command (https://stackoverflow.com/a/16610385/5155484) I managed to see the reply web sent:

```
{"configs":{},"pending":["cbb152173d0b4451b3453b05b58dddee","084e50cc07ad4b9f862a3595260d7aa1"]}
```

Request:

```
POST /api/0/relays/projectconfigs/?version=3 HTTP/1.1

{"publicKeys":["cbb152173d0b4451b3453b05b58dddee","084e50cc07ad4b9f862a3595260d7aa1"],"fullConfig":true,"noCache":false}
```
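The linked Stack Overflow answer is essentially tcpdump with ASCII output and a filter that skips empty ACK-only segments. A sketch of the same idea, adapted here to the relay→web upstream port 9000 from the config above (the interface and port are assumptions):

```sh
# Dump HTTP request/response bodies between relay and web in ASCII,
# filtering out TCP segments that carry no payload.
sudo tcpdump -i any -A -s 10240 \
  'tcp port 9000 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
```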
Self-Hosted Version
24.3.0.dev0
CPU Architecture
x86_64
Docker Version
24.0.4
Docker Compose Version
24.0.4
Steps to Reproduce
Update to the latest master
Expected Result
Everything works fine
Actual Result
The Performance page shows zeros for the time period from the update until now:
The project page shows the correct info about transactions and errors:
The Stats page shows 49k transactions, of which 49k are dropped:
Same for errors:
Event ID
No response
UPD
There are a lot of errors in the clickhouse container: