Handoff to regular replication does not work? #39

Open
nicozimmermann94 opened this issue Dec 17, 2015 · 16 comments

Comments

@nicozimmermann94

Hey!
I have a remote db with 50,000 docs. When a user starts the webapp for the first time, they have to download all the docs, so I use pouchdb-load together with express-pouchdb-replication-stream to deliver the stream. The import works, but when I enable syncing afterwards, PouchDB makes 2000+ requests, which takes several minutes. I guess it checks each document, so the proxy option does not work as intended. Here is my code:
http://pastebin.com/GciHNH8u
And here is the sync function:
http://pastebin.com/B2s65X9k

@nicozimmermann94
Author

Found the error!
I had to add since: 'now' to my sync options!

@nolanlawson
Member

That doesn't sound right. You shouldn't need to do that. I'm going to reopen this issue because it sounds like we have a regression.

@willholley

I can't easily reproduce this. @nicozimmermann94 is the database that you're generating the dump from the same one you're syncing to (exactly the same URL)?

@nicozimmermann94
Author

@willholley Yes it was the exact same URL.
EDIT:
Well not the exact same URL: on the same server I ran a node instance which used localhost instead of the domain I used on the client side. Here is the server code: http://pastebin.com/NLpnPiuj

@willholley

@nicozimmermann94 I attempted to reproduce this using the code you posted and a local CouchDB 1.6.1 instance (via the proxy you posted). I still don't see a problem with handoff. Are you able to share a HAR / network trace of the issue occurring?

@iamyojimbo

Yes exactly the same setup for me and adding since: 'now' fixed it for me too.

@pocketarc

Just went through this; what seems to happen is that it has to write the remote checkpoint for every change, so when loading 50K documents it has to make tens of thousands of requests. It's all just requests to _local/[random id] and _changes.

It's not actually trying to redo any documents/changes, just writing the remote checkpoint thousands of times. Seems like this could be solved by only putting the latest checkpoint once and stopping there, instead of starting from scratch and adding checkpoints one by one. I figured that's what proxy would do.

The since option fixes it. Instead of "now", I used remote_db.info() to get the remote update_seq and passed that as the value for since (because my dumps are not live and may be slightly out of date by the time a user goes to import them).

That is a perfect workaround for me, but it seems like proxy could handle it by itself. If this is something @nolanlawson is interested in resolving, I'd love a chance to go in, sort it out, and come up with a PR for it.
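
For anyone who wants to try the workaround described above, here is a minimal sketch. It assumes `local` and `remote` are PouchDB instances; `syncSinceRemoteSeq` is a hypothetical helper name, not part of pouchdb-load or PouchDB. Later comments in this thread suggest the right `since` value may differ per replication direction, so treat this as a starting point only.

```javascript
// Hypothetical helper (not a PouchDB API): start sync from the remote's
// current update_seq instead of from the beginning.
// Assumes `local` and `remote` are PouchDB instances.
async function syncSinceRemoteSeq(local, remote) {
  // db.info() resolves to an object that includes `update_seq`
  const info = await remote.info();
  // Forward the sequence as the `since` option of sync
  return local.sync(remote, { since: info.update_seq });
}
```

Usage would look like `syncSinceRemoteSeq(db, remote_db)`, called once the load has finished.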

@willholley

@BrunoDEBARROS a checkpoint for every change sounds wrong - the since value should ensure those existing documents are skipped when the replication starts. If it's re-comparing the documents that pouchdb-load inserted, that implies the checkpoint used to create the dump isn't valid for the remote database. This is certainly possible - sequence values are unique to a database instance so this kind of thing could happen if the dump was created using different parameters / source databases to the one you're ultimately syncing against.

What environment are you syncing against and how are you initiating the load/sync?

@pocketarc

pocketarc commented Jul 26, 2017

If you add the since, the problem goes away; that works as expected. It's just that without it, it does trigger the remote checkpoints, as it did for the two other people in this issue. PouchDB is not re-comparing any documents.

It's just that without the since, with just proxy, it does the remote checkpointing for every document. @nolanlawson said that that seemed wrong, so I figured I'd throw in my 2 cents and offer to help.


Having said that, here's the requested info:

To create the dump, I just pouchdb-dump https://couchdb.example.com/my_db > dump.txt. The server is a CouchDB 2.0.0 instance.

To load it on the browser, I just do:

const db = new PouchDB('my_db');
db.load("https://example.com/dump.txt", {proxy: "https://couchdb.example.com/my_db"});

The CouchDB server itself is hidden behind nginx, following these instructions.

@willholley

Thanks @BrunoDEBARROS. So it's possible that the checkpoint being written to CouchDB is incorrect / not being picked up. The other point to make is that the handoff is only valid for remote -> local replication, so if local -> remote replication is initiated (or db.sync is used) then all the local documents will need to be compared with the remote.

@pocketarc

In my code I do db.sync(remote), not remote.sync(db), so after reading your message, I swapped them around to see if it'd work, and I'm afraid the issue is still there.

If this doesn't seem worth chasing, it doesn't matter; the since option resolves it. But it does seem odd that it's not being picked up correctly. The DB hasn't been wiped since the dump was generated, so there doesn't seem to be any reason for things not to match up. When I have a bit of free time I'm going to try and look into this, just to see if I can pin down the why or if it's something wrong on my end.

@willholley

@BrunoDEBARROS the issue is sync vs replicate. Sync is essentially shorthand for local.replicate.to(remote); and local.replicate.from(remote). Replication is inherently uni-directional, so sync requires independent local -> remote and remote -> local replications in parallel. pouchdb-load only provides handoff for the remote -> local component, so you need to explicitly provide a since value to the local -> remote replication in order to skip comparison of all the local documents with the remote.
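
The split described above could be sketched roughly as follows. This is an illustration under assumptions, not pouchdb-load's own code: `loadThenReplicate` and `pushSince` are hypothetical names, and `local` is assumed to be a PouchDB instance with the pouchdb-load plugin installed. Since replication reads the source's changes feed, `pushSince` here would be a sequence from the local database.

```javascript
// Hypothetical helper: load a dump, then run the two directions separately
// instead of calling sync().
function loadThenReplicate(local, dumpUrl, remoteUrl, pushSince) {
  return local.load(dumpUrl, { proxy: remoteUrl }).then(() => ({
    // remote -> local: covered by the handoff that load() sets up via `proxy`
    pull: local.replicate.from(remoteUrl),
    // local -> remote: no handoff, so `since` is passed explicitly
    push: local.replicate.to(remoteUrl, { since: pushSince }),
  }));
}
```

The returned `pull` and `push` handles can be used to attach `change` or `error` listeners to each direction independently.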

@pocketarc

That makes perfect sense, and also explains why the other posters and I were having problems; we're all doing sync (OP says they're doing sync and even includes source code) and stumbled across this. Regarding local.replicate.from(remote) though, it's important to note that that is explicitly mentioned as being OK in the documentation.

Thanks for helping figure out what's going on though! 👍 Seems like this is just a matter of updating the documentation then, since it doesn't mention anything but proxy, and anyone trying to do sync would be bitten by this.

@nerumo

nerumo commented Sep 20, 2017

If I use the since parameter in local -> remote, none of my changes on local get synced anymore. It only syncs them if the comparison went through. I tried 'now' and the lastUpdateSeq from remote_db.info(). Does it matter whether it is written as local.replicate.to(remote) or remote.replicate.from(local)?

@nerumo

nerumo commented Sep 21, 2017

Ok, I found two issues:

  • the filter of the pouchdb load wasn't exactly the same as in the remote->local operation. After fixing that, the handover worked
  • the local->remote operation wasn't working with the remote last_update_seq; it needs the local last_update_seq. So I have to remember the local last_update_seq after the pouchdb load and apply it as the since param for every local->remote update. Btw, 'now' can't work, since it wouldn't pick up your latest local change that isn't synced to the remote; it just seems to work.

suggestion: the pouchdb load should also write the local->remote checkpoint if the proxy option is specified (or even a separate option)
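
The second point above could be sketched like this. It is a sketch under assumptions, not a tested recipe: `loadAndGetPushSince` and `pushLocalChanges` are hypothetical helper names, `local` is assumed to be a PouchDB instance with pouchdb-load installed, and the sequence is read from the `update_seq` field that `db.info()` returns. How the value is remembered between sessions is left to the application.

```javascript
// Hypothetical helper: load the dump, then read the local update_seq so that
// later local -> remote replications can start after the loaded documents.
async function loadAndGetPushSince(local, dumpUrl, remoteUrl) {
  await local.load(dumpUrl, { proxy: remoteUrl });
  const info = await local.info(); // resolves to { update_seq, ... }
  return info.update_seq;
}

// Hypothetical helper: push local changes made after the remembered sequence.
function pushLocalChanges(local, remoteUrl, since) {
  return local.replicate.to(remoteUrl, { since });
}
```

A caller would store the value returned by `loadAndGetPushSince` and pass it to `pushLocalChanges` each time it starts a local -> remote replication.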

@saadramay

Is there any update regarding this issue?

7 participants