This repository has been archived by the owner on Aug 23, 2023. It is now read-only.
2018/03/16 10:29:00 [dataprocessor.go:221 func1()] [E] DP getTargetsRemote: error unmarshaling body from mt-read00-12574-medium-ops-b-2445396050-vdcg2/getdata: "msgp: too few bytes left to r>
2018/03/16 10:29:00 [graphite.go:766 executePlan()] [E] HTTP Render msgp: too few bytes left to read object
[Macaron] 2018-03-16 10:29:00: Completed /render 500 Internal Server Error in 129.391075ms
I deployed a “silent node” (carbon in, partition 9999) and added some debug statements
It turns out the buffers are coming back as nil:
2018/05/09 21:05:26 [dataprocessor.go:223 func1()] [E] DEBUG len(buf)=0, is nil:true
It seems like we are getting nil buffers back from the peers when the request gets canceled. After adding more logging, I see:
2018/05/09 21:30:24 [dataprocessor.go:216 func1()] [E] DP getTargetsRemote: error with POST to metrictank-read-046-1/getdata: "500 Internal Server Error"
Looking at the logs of metrictank-read-046-1 for that time, I see:
2018/05/09 21:30:24 [cluster.go:191 getData()] [E] HTTP getData() start must be before end.
I think this is ccache corruption. For this particular repro request, it was always the same instance that was breaking things. I sent a ccache/delete request to it, and now the error is gone for this repro.
On version 0.9.0, this occurred for me during a schema update and was not related to the ccache at all. Once the schemas were the same on all servers, the error went away.
The main issue, "msgp: too few bytes left to read object", is coming from here. It happens when the request to a peer is canceled because another peer has already returned an error, so the buffer is nil and cannot be unmarshaled. The fix is probably to check whether the request was canceled before unmarshaling.
This also means that some other problem is causing that first error to be returned. In my specific case, it is some ccache corruption.
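The cancellation check suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not metrictank's actual code: `series`, `errPeerCanceled`, `decodePeerBody`, `isCanceled` and `simulate` are hypothetical names, and `encoding/json` stands in for the msgp decoding so the example has no external dependencies.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
)

// series is a hypothetical stand-in for the payload a peer returns.
type series struct {
	Target string    `json:"target"`
	Values []float64 `json:"values"`
}

// errPeerCanceled marks a peer response that was abandoned, not malformed.
var errPeerCanceled = errors.New("peer request canceled")

// decodePeerBody checks for cancellation before it touches buf, so a nil
// buffer from an abandoned request never reaches the decoder.
// Assumption: encoding/json is used here in place of msgp.
func decodePeerBody(ctx context.Context, buf []byte) ([]series, error) {
	if err := ctx.Err(); err != nil {
		return nil, fmt.Errorf("%w: %v", errPeerCanceled, err)
	}
	var out []series
	if err := json.Unmarshal(buf, &out); err != nil {
		return nil, fmt.Errorf("error unmarshaling body: %w", err)
	}
	return out, nil
}

// isCanceled reports whether err came from the cancellation check.
func isCanceled(err error) bool {
	return errors.Is(err, errPeerCanceled)
}

// simulate runs decodePeerBody with a context that is optionally canceled
// first, mimicking a peer request abandoned after another peer failed.
func simulate(cancelFirst bool, buf []byte) ([]series, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	return decodePeerBody(ctx, buf)
}
```

Calling `simulate(true, nil)` yields an error for which `isCanceled` is true, so the caller can skip it and report only the error from the peer that actually failed. Calling `simulate(false, nil)` still yields a decode error, because an empty body on a live request is a real problem worth logging.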