Skip to content
This repository has been archived by the owner on Jan 21, 2022. It is now read-only.

Is calling “DoesTorrentExist” twice on a torrent necessary? #274

Open
issuefiler opened this issue May 10, 2021 · 1 comment
Open

Is calling “DoesTorrentExist” twice on a torrent necessary? #274

issuefiler opened this issue May 10, 2021 · 1 comment

Comments

@issuefiler
Copy link

issuefiler commented May 10, 2021

FATAL   Could not add new torrent to the database
ERROR: duplicate key value violates unique constraint

exists, err := database.DoesTorrentExist(infoHash[:])
if err != nil {
zap.L().Fatal("Could not check whether torrent exists!", zap.Error(err))
} else if !exists {
metadataSink.Sink(result)
}
case md := <-metadataSink.Drain():
if err := database.AddNewTorrent(md.InfoHash, md.Name, md.Files); err != nil {

What’s the point of having the DoesTorrentExist call there when it allows multiple torrents to be Sinked before the AddNewTorrent call? With that, I still need to call DoesTorrentExist again in AddNewTorrent, otherwise it’ll eventually crash on identical torrents that have been double-Sinked.


You’re aware of this

// Although we check whether the torrent exists in the database before asking MetadataSink to
// fetch its metadata, the torrent can also exists in the Sink before that:
//
// If the torrent is complete (i.e. its metadata) and if its waiting in the channel to be
// received, a race condition arises when we query the database and seeing that it doesn't
// exists there, add it to the sink.
//
// Do NOT try to be clever and attempt to use INSERT OR IGNORE INTO or INSERT OR REPLACE INTO
// without understanding their consequences fully:
//
// https://www.sqlite.org/lang_conflict.html
//
// INSERT OR IGNORE INTO
// INSERT OR IGNORE INTO will ignore:
// 1. CHECK constraint violations
// 2. UNIQUE or PRIMARY KEY constraint violations
// 3. NOT NULL constraint violations
//
// You would NOT want to ignore #1 and #2 as they are likely to indicate programmer errors.
// Instead of silently ignoring them, let the program err and investigate the causes.
//
// INSERT OR REPLACE INTO
// INSERT OR REPLACE INTO will replace on:
// 1. UNIQUE or PRIMARY KEY constraint violations (by "deleting pre-existing rows that are
// causing the constraint violation prior to inserting or updating the current row")
//
// INSERT OR REPLACE INTO will abort on:
// 2. CHECK constraint violations
// 3. NOT NULL constraint violations (if "the column has no default value")
//
// INSERT OR REPLACE INTO is definitely much closer to what you may want, but deleting
// pre-existing rows means that you might cause users loose data (such as seeder and leecher
// information, readme, and so on) at the expense of /your/ own laziness...
if exist, err := db.DoesTorrentExist(infoHash); exist || err != nil {
return err
}


  • It should either prevent the double-Sinking of the same torrent with the DoesTorrentExist call at the AddNewTorrent removed, or let AddNewTorrent solely do the DoesTorrentExist check. DoesTorrentExist calls are expensive, taking about 50 milliseconds.
@issuefiler issuefiler changed the title Double-sink due to race condition Double-sinking identical torrents (race condition) May 10, 2021
@issuefiler
Copy link
Author

issuefiler commented May 10, 2021

Most of incoming info-hashes don’t actually make it to ms.flush (NewLeech’s OnSuccess). Why bother checking duplicate info-hashes with lots of expensive database calls(The query itself is cheap, but the client overhead is not negligible.), when most of them can’t make it to AddNewTorrent and AddNewTorrent also has the dup-checking logic that is more reliable? Is it to save bandwidth?

connect: dial: dial tcp4 ...:7746: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
doExHandshake: metadata too big or its size is less than or equal zero
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
doExHandshake: readExMessage: readMessage: readExactly rLengthB: EOF
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout
connect: dial: dial tcp4 ...:...: i/o timeout

@issuefiler issuefiler changed the title Double-sinking identical torrents (race condition) Is calling “DoesTorrentExist” twice on a torrent necessary? May 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant