Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime error: slice bounds out of range #321

Open
ekbfh opened this issue Jan 25, 2021 · 3 comments
Open

runtime error: slice bounds out of range #321

ekbfh opened this issue Jan 25, 2021 · 3 comments

Comments

@ekbfh
Copy link

ekbfh commented Jan 25, 2021

Hi!
Liftbridge 1.4.1

3liftbridge + 3nats servers, in cluster of 3 with min isr 2.

How i can debug what can produce this error and why liftbridge failed?

After long retention long i have panic:

янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Finished cleaning log /var/lib/liftbridge/streams/events.DGT/0"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Cleaning log /var/lib/liftbridge/streams/events.DGT/1 based on retention policy {Bytes:0 Messages:0 Age:24h0m0s}"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Finished cleaning log /var/lib/liftbridge/streams/events.DGT/1"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Cleaning log /var/lib/liftbridge/streams/events.DGT/2 based on retention policy {Bytes:0 Messages:0 Age:24h0m0s}"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Finished cleaning log /var/lib/liftbridge/streams/events.DGT/2"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Cleaning log /var/lib/liftbridge/streams/events.DGT/3 based on retention policy {Bytes:0 Messages:0 Age:24h0m0s}"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Finished cleaning log /var/lib/liftbridge/streams/events.DGT/3"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Cleaning log /var/lib/liftbridge/streams/dispose.DGT/0 based on retention policy {Bytes:0 Messages:0 Age:24h0m0s}"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Finished cleaning log /var/lib/liftbridge/streams/dispose.DGT/0"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Cleaning log /var/lib/liftbridge/streams/dispose.DGT/1 based on retention policy {Bytes:0 Messages:0 Age:24h0m0s}"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Finished cleaning log /var/lib/liftbridge/streams/dispose.DGT/1"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="api: Publish [stream=events.IVANOVO, partition=1]"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="api: Publish [stream=events.IVANOVO, partition=1]"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="api: Publish [stream=events.IVANOVO, partition=3]"
янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="api: Publish [stream=events.IVANOVO, partition=3]"

янв 25 12:17:04 mevent04 liftbridge[30762]: time="2021-01-25 12:17:04" level=debug msg="Updated log leader epoch. New: {epoch:5, offset:20078}, Previous: {epoch:0, offset:-1} for log [subject=liftbridge-default.cursors, stream=__cursors, partition=0]. Cache now contains 1 entry."
янв 25 12:17:04 mevent04 liftbridge[30762]: panic: runtime error: slice bounds out of range [:1635020668] with capacity 36
янв 25 12:17:04 mevent04 liftbridge[30762]: goroutine 421 [running]:
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.SerializedMessage.Key(0xc002d2615c, 0x18, 0x24, 0x40, 0xc0045f2b10, 0x0)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/message.go:108 +0xc3
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.(*compactCleaner).cleanSegment(0xc000114990, 0xc00019b130, 0xc000e66000, 0xa7521, 0xc003fe89c0, 0xc000d60a50, 0x4, 0x0, 0x0)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/compact_cleaner.go:143 +0x1e3
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.(*compactCleaner).compact(0xc000114990, 0xa7521, 0xc0006bf040, 0x6, 0x8, 0x1, 0x1, 0xc003135d70, 0xb3a06f, 0xc0003f6e70, ...)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/compact_cleaner.go:101 +0x22d
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.(*compactCleaner).Compact(0xc000114990, 0xa7521, 0xc0006bf040, 0x6, 0x8, 0x6, 0x8, 0x0, 0x0, 0xc000151768, ...)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/compact_cleaner.go:53 +0x159
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.(*commitLog).clean(0xc000356a00, 0xc0006bf040, 0x6, 0x8, 0x1c18ad43, 0x3c0094deb277f9, 0xc000151658, 0xb47772, 0xbffbc1041c18ad43, 0x4773d10fca)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/commitlog.go:784 +0x112
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.(*commitLog).Clean(0xc000356a00, 0xc000151700, 0x0)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/commitlog.go:735 +0x95
янв 25 12:17:04 mevent04 liftbridge[30762]: github.com/liftbridge-io/liftbridge/server/commitlog.(*commitLog).cleanerLoop(0xc000356a00)
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/commitlog.go:724 +0x151
янв 25 12:17:04 mevent04 liftbridge[30762]: created by github.com/liftbridge-io/liftbridge/server/commitlog.New
янв 25 12:17:04 mevent04 liftbridge[30762]: /home/circleci/project/server/commitlog/commitlog.go:146 +0x595
янв 25 12:17:04 mevent04 systemd[1]: liftbridge.service: main process exited, code=exited, status=2/INVALIDARGUMENT
@tylertreat
Copy link
Member

Something seems wrong with the message encoding. Specifically:

panic: runtime error: slice bounds out of range [:1635020668] with capacity 36

The log compactor is attempting to load the key for a message but it's reading the key bounds as 1635020668, which overflows the message. If you could post the disk contents for this particular log segment, we could see what the serialized message looks like.

@ekbfh
Copy link
Author

ekbfh commented Jan 25, 2021

Unfortunately, i don't have those files.
I append config:

clustering:
    min.insync.replicas: 2
    raft.bootstrap.peers:
    - mevent02
    - mevent03
    - mevent04
    replica.fetch.timeout: 7s
    server.id: mevent02
cursors:
    stream.auto.pause.time: 0
    stream.partitions: 1
data:
    dir: /var/lib/liftbridge
host: *.84
logging:
    level: debug
    raft: true
nats.servers:
- nats://*.84:4222
- nats://*.85:4222
- nats://*.86:4222
streams:
    compact.enabled: false
    retention.max:
        age: 24h

@tylertreat
Copy link
Member

tylertreat commented Jan 25, 2021

Indeed, it would seem there is something going wrong with the serialization of cursor messages. The cursor message key should be getting set to <cursorId>,<streamName>,<partitionId> (see here). Would need to see the cursor data to confirm what the issue is though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants