
Crash: "double free or corruption (!prev)" #78

Open
antiops opened this issue Aug 22, 2023 · 2 comments


antiops commented Aug 22, 2023

I've been getting consistent crashes on the master server (in a master node + 1 replica setup). Both are on the latest version from the releases page. Uptime is inconsistent: sometimes it's up for a day before crashing, and sometimes it crashes within a few hours.

Both use basic configs, so I might be missing an important setting that I don't know about.

The replica server has been running fine with no crashes.

Configs:

Master (config-main.toml)

db_path="/home/tik/redis/videos-replica.v2.db"
seq_map_path="/tmp/videos-main.cbor"

node_id=1

publish=true
replicate=false

Replica (config-replica.toml)

db_path="/home/rep/tik/videos.v2.db"
seq_map_path="/tmp/videos-replica-1.cbor"

node_id=2

publish=false
replicate=true

Details

Each instance is run from the command line like this:

# Master
./marmot -config config-main.toml -cluster-addr 10.1.0.12:4223 -cluster-peers 'nats://10.1.0.1:14222/'

# Replica
./marmot -config config-replica.toml -cluster-addr 10.1.0.1:14222 -cluster-peers 'nats://10.1.0.12:4223/'

The database is 1.8 GB with 4 tables, of which only one (videos_clean) is updated frequently. The master database is itself a replica, kept separate from the production one; a script pushes changes to it every minute.

Below is the output from the most recent crash.

marmot-v0.8.5-master-crashlog.txt

maxpert (Owner) commented Aug 23, 2023

Reading crash logs:

goroutine 12606 [syscall]:
runtime.cgocall(0xdc4280, 0xc00057ed50)
        /opt/hostedtoolcache/go/1.20.7/x64/src/runtime/cgocall.go:157 +0x5c fp=0xc00057ed28 sp=0xc00057ecf0 pc=0x40601c
github.com/mattn/go-sqlite3._Cfunc_sqlite3_close_v2(0x7f1b701351f8)
        _cgo_gotypes.go:631 +0x4c fp=0xc00057ed50 sp=0xc00057ed28 pc=0x884f0c
github.com/mattn/go-sqlite3.(*SQLiteConn).Close.func1(0x0?)
        /home/runner/go/pkg/mod/github.com/mattn/go-sqlite3@v1.14.17/sqlite3.go:1772 +0x46 fp=0xc00057ed88 sp=0xc00057ed50 pc=0x8958c6
github.com/mattn/go-sqlite3.(*SQLiteConn).Close(0xc000502840)
        /home/runner/go/pkg/mod/github.com/mattn/go-sqlite3@v1.14.17/sqlite3.go:1772 +0x25 fp=0xc00057edb8 sp=0xc00057ed88 pc=0x8957c5
database/sql.(*driverConn).finalClose.func2()
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:644 +0x3c fp=0xc00057ede0 sp=0xc00057edb8 pc=0x7ee3dc
database/sql.withLock({0x12f7620, 0xc00037a6c0}, 0xc00057ee88)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:3405 +0x8c fp=0xc00057ee20 sp=0xc00057ede0 pc=0x7fc86c
database/sql.(*driverConn).finalClose(0xc00037a6c0)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:642 +0x116 fp=0xc00057eec8 sp=0xc00057ee20 pc=0x7ee296
database/sql.finalCloser.finalClose-fm()
        <autogenerated>:1 +0x2b fp=0xc00057eee0 sp=0xc00057eec8 pc=0x7fddcb
database/sql.(*driverConn).Close(0xc00037a6c0)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:623 +0x13f fp=0xc00057ef28 sp=0xc00057eee0 pc=0x7ee15f
database/sql.(*DB).connectionCleaner(0xc00047e340, 0xc00027b000?)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:1078 +0x23d fp=0xc00057efc0 sp=0xc00057ef28 pc=0x7efffd
database/sql.(*DB).startCleanerLocked.func1()
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:1048 +0x2a fp=0xc00057efe0 sp=0xc00057efc0 pc=0x7efd8a
runtime.goexit()
        /opt/hostedtoolcache/go/1.20.7/x64/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00057efe8 sp=0xc00057efe0 pc=0x46e9e1
created by database/sql.(*DB).startCleanerLocked
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:1048 +0x105

It looks like connection cleanup in the SQLite library is causing this: the trace ends in sqlite3_close_v2, called from the database/sql connection cleaner. You see it infrequently because it is probably a race condition (guessing from startCleanerLocked). I might have to dig deeper into github.com/mattn/go-sqlite3.
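For context on the trace above: database/sql only starts its connectionCleaner goroutine when the pool has a positive max lifetime or max idle time. A minimal sketch of switching that cleaner off, as a way to test whether it is the trigger, could look like the following. It is not Marmot's actual code: `stubConn`, `stubConnector`, `stubDriver`, `newStubDB`, and `disableConnCleaner` are hypothetical names, and the stub driver stands in for go-sqlite3 only so the example runs without cgo.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// stubConn is a do-nothing connection (hypothetical stand-in for a real driver).
type stubConn struct{}

func (*stubConn) Prepare(query string) (driver.Stmt, error) {
	return nil, errors.New("stub: Prepare not supported")
}
func (*stubConn) Close() error { return nil }
func (*stubConn) Begin() (driver.Tx, error) {
	return nil, errors.New("stub: Begin not supported")
}

// stubDriver and stubConnector hand out stubConn values.
type stubDriver struct{}

func (stubDriver) Open(name string) (driver.Conn, error) { return &stubConn{}, nil }

type stubConnector struct{}

func (stubConnector) Connect(ctx context.Context) (driver.Conn, error) {
	return &stubConn{}, nil
}
func (stubConnector) Driver() driver.Driver { return stubDriver{} }

// newStubDB builds a pool on top of the stub driver.
func newStubDB() *sql.DB { return sql.OpenDB(stubConnector{}) }

// disableConnCleaner removes both time-based limits, so database/sql has no
// reason to close connections from its background cleaner goroutine.
func disableConnCleaner(db *sql.DB) {
	db.SetConnMaxLifetime(0) // <= 0: connections are not closed due to age
	db.SetConnMaxIdleTime(0) // <= 0: connections are not closed due to idle time
}

func main() {
	db := newStubDB()
	defer db.Close()

	disableConnCleaner(db)
	if err := db.Ping(); err != nil {
		fmt.Println("ping failed:", err)
		return
	}

	s := db.Stats()
	fmt.Println("open:", s.OpenConnections,
		"closed by idle time:", s.MaxIdleTimeClosed,
		"closed by lifetime:", s.MaxLifetimeClosed)
}
```

This would only be a diagnostic step. If the crashes stop with the cleaner disabled, that points at the close path racing with something else; it does not fix a double free inside the driver.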

maxpert (Owner) commented Aug 23, 2023

Would it be OK for you to join the Discord channel and DM me? I am trying to reproduce the issue.
