Dogecoin node has stopped syncing #3394

Open
Swornatoya opened this issue Jan 2, 2024 · 24 comments
Labels
triage Issues under investigation

Comments

@Swornatoya

Swornatoya commented Jan 2, 2024

The node has stopped syncing. The log shows the following:

2024-01-02 05:21:41 connect() to 207.180.214.77:22556 failed after select(): Connection refused (111)
2024-01-02 05:21:53 connect() to 185.193.17.199:22556 failed after select(): Connection refused (111)
2024-01-02 05:23:49 version handshake timeout from 17374
2024-01-02 05:24:02 connect() to 24.95.90.120:22556 failed after select(): Connection refused (111)
2024-01-02 05:24:06 version handshake timeout from 17375
2024-01-02 05:24:29 version handshake timeout from 17376
2024-01-02 05:26:27 connect() to 164.92.248.247:22556 failed after select(): Connection refused (111)
2024-01-02 05:26:37 version handshake timeout from 17377
2024-01-02 05:27:14 connect() to 209.97.133.232:22556 failed after select(): Connection refused (111)
2024-01-02 05:27:15 connect() to 139.162.9.238:22556 failed after select(): Connection refused (111)
2024-01-02 05:27:16 version handshake timeout from 17378
2024-01-02 05:27:28 version handshake timeout from 17379
2024-01-02 05:27:29 version handshake timeout from 17380
2024-01-02 05:27:43 connect() to 159.69.55.240:22556 failed after select(): Connection refused (111)
2024-01-02 05:27:58 version handshake timeout from 17381
2024-01-02 05:28:04 version handshake timeout from 17382
2024-01-02 05:28:23 connect() to 87.249.132.174:22556 failed after select(): Connection refused (111)
2024-01-02 05:28:23 connect() to 191.96.37.131:22556 failed after select(): Connection refused (111)
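
For readers following along, the two kinds of messages in this excerpt can be tallied with a short Python sketch. The regular expressions are written against the lines quoted above only; other log formats are not covered, and the function name is hypothetical.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in this comment (assumption:
# other dogecoind versions may word these messages differently).
REFUSED = re.compile(r"connect\(\) to \S+ failed after select\(\): Connection refused")
TIMEOUT = re.compile(r"version handshake timeout from \d+")


def summarize(log_text: str) -> Counter:
    """Count refused outbound connects and handshake timeouts in log text."""
    counts = Counter()
    for line in log_text.splitlines():
        if REFUSED.search(line):
            counts["refused"] += 1
        elif TIMEOUT.search(line):
            counts["handshake_timeout"] += 1
    return counts


sample = """\
2024-01-02 05:21:41 connect() to 207.180.214.77:22556 failed after select(): Connection refused (111)
2024-01-02 05:23:49 version handshake timeout from 17374
2024-01-02 05:24:06 version handshake timeout from 17375
"""
print(summarize(sample))
```

A high share of handshake timeouts relative to refused connects suggests that TCP connections open but no version message arrives, which is a different symptom from a closed port.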

Below is the docker compose file we are using to start the node:

    node-dogecoin:
        image: dogecoin:latest
        container_name: dogecoin_container
        restart: always
        ports:
            - "3332:22555"
            - "22556:22556"
        environment:
            PRINTTOCONSOLE: 1
            RPCUSER: "****"
            RPCPASSWORD: "***"
        volumes:
            - /***/node-dogecoin/data:/dogecoin/.dogecoin
            - /***/node-dogecoin/dogecoin-data:/data
        command:
            - "-onlynet=ipv4"
            - "-reindex"
            - "-txindex"
            - "-rpcworkqueue=128"
            - "-rpcthreads=128"
            - "-dbcache=1024"
            - "-rpcallowip=0.0.0.0/0"
            - "-rpcbind=127.0.0.1"
            - "-rpcbind=dogecoin_container"
            - "-blocksonly=1"

I have verified that the ports are not blocked. Please help resolve this issue.

@DemonRx
Contributor

DemonRx commented Jan 2, 2024

Do you see any connections when you run:

netstat -antp | grep -i doge ?

@Swornatoya
Author

@DemonRx : I tried running the above command but could not see any response.

@DemonRx
Contributor

DemonRx commented Jan 2, 2024

@DemonRx : I tried running the above command but could not see any response.

The daemon isn't running

@Swornatoya
Author

@DemonRx : Could you please let me know how we can start the daemon? Earlier, the node was syncing fine with just the above setup.

@patricklodder
Member

If you pass -reindex in the default startup command, your node will parse and reindex the entire chain every time it starts. So you'll want to drop that.
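
As an illustration of that advice, here is a minimal Python sketch that drops -reindex from a list of startup arguments like the compose `command:` list above. The helper name and the `ONE_SHOT_FLAGS` tuple are hypothetical; in practice, deleting the line from the compose file by hand does the same thing.

```python
# Flags that should be passed once by hand, not on every container start.
# (Hypothetical list for illustration; only -reindex is discussed in this thread.)
ONE_SHOT_FLAGS = ("-reindex",)


def strip_one_shot_flags(command):
    """Return the argument list without any one-shot flags."""
    kept = []
    for arg in command:
        name = arg.split("=", 1)[0]  # "-reindex=1" -> "-reindex"
        if name in ONE_SHOT_FLAGS:
            continue
        kept.append(arg)
    return kept


command = ["-onlynet=ipv4", "-reindex", "-txindex", "-blocksonly=1"]
print(strip_one_shot_flags(command))
```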

@Swornatoya
Author

@patricklodder : I re-ran the node after removing -reindex, but that did not help it sync, and I still see the same error logs as mentioned above.

@patricklodder added the triage (Issues under investigation) label Jan 2, 2024
@patricklodder
Member

Alright, let's triage this step by step. For starters:

  1. What dogecoind version are you using in your docker image and from what source? Can you share a Dockerfile so that we can try to reproduce your image?
  2. What does the head of the log say when you start the node - i.e. the first 100 lines?
  3. What's the output of dogecoin-cli getnetworkinfo?

@Swornatoya
Author

  1. We are using the below Dockerfile

FROM debian:latest

ARG VERSION=latest
ENV USER=dogecoin
ENV DATADIR=/${USER}/.dogecoin

# Update root env, to behave like user when using executables
ENV PATH=$PATH:/${USER}/bin
ENV HOME=/${USER}

RUN apt update

# Install dogecoin from github releases (https://github.com/dogecoin/dogecoin/releases)
RUN apt install -y wget gosu python3 man

RUN wget https://github.com/dogecoin/dogecoin/releases/download/v${VERSION}/dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz && \
    mkdir ${USER} && \
    tar -xvf dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz -C ${USER} --strip-components=1

# User configuration
WORKDIR /${USER}

RUN adduser ${USER} --home /${USER} --disabled-password --gecos "" && \
    chown -R ${USER}:${USER} .

COPY docker-entrypoint.py /entrypoint.py

ENTRYPOINT ["python3", "/entrypoint.py"]

  2. Logs on starting the node

2024-01-02 16:58:48 Dogecoin version v1.14.5.0-31afd1331
2024-01-02 16:58:48 InitParameterInteraction: parameter interaction: -blocksonly=1 -> setting -whitelistrelay=0
2024-01-02 16:58:48 Assuming ancestors of block 77e3f4a4bcb4a2c15e8015525e3d15b466f6c022f6ca82698f329edef7d9777e have valid signatures.
2024-01-02 16:58:48 ParameterInteraction: parameter interaction: -blocksonly=1 -> setting -walletbroadcast=0
2024-01-02 16:58:48 Default data directory /dogecoin/.dogecoin
2024-01-02 16:58:48 Using data directory /dogecoin/.dogecoin
2024-01-02 16:58:48 Using config file /dogecoin/.dogecoin/dogecoin.conf
2024-01-02 16:58:48 Using at most 125 automatic connections (1048576 file descriptors available)
2024-01-02 16:58:48 Using 32 MiB out of 32 requested for signature cache, able to store 1048576 elements
2024-01-02 16:58:48 Using 4 threads for script verification
2024-01-02 16:58:48 scheduler thread start
2024-01-02 16:58:48 HTTP: creating work queue of depth 128
2024-01-02 16:58:48 Config options rpcuser and rpcpassword will soon be deprecated. Locally-run instances may remove rpcuser to use cookie-based auth, or may be replaced with rpcauth. Please see share/rpcuser for rpcauth auth generation.
2024-01-02 16:58:48 HTTP: starting 128 worker threads
2024-01-02 16:58:48 Using BerkeleyDB version Berkeley DB 5.3.28: (September 9, 2013)
2024-01-02 16:58:48 Using wallet wallet.dat
2024-01-02 16:58:48 init message: Verifying wallet...
2024-01-02 16:58:48 CDBEnv::Open: LogDir=/dogecoin/.dogecoin/database ErrorFile=/dogecoin/.dogecoin/db.log
2024-01-02 16:58:48 Bound to [::]:22556
2024-01-02 16:58:48 Bound to 0.0.0.0:22556
2024-01-02 16:58:48 Cache configuration:
2024-01-02 16:58:48 * Using 128.0MiB for block index database
2024-01-02 16:58:48 * Using 8.0MiB for chain state database
2024-01-02 16:58:48 * Using 888.0MiB for in-memory UTXO set (plus up to 286.1MiB of unused mempool space)
2024-01-02 16:58:48 init message: Loading block index...
2024-01-02 16:58:48 Opening LevelDB in /dogecoin/.dogecoin/blocks/index
2024-01-02 16:58:48 Opened LevelDB successfully
2024-01-02 16:58:48 Using obfuscation key for /dogecoin/.dogecoin/blocks/index: 0000000000000000
2024-01-02 16:58:48 Opening LevelDB in /dogecoin/.dogecoin/chainstate
2024-01-02 16:58:48 Opened LevelDB successfully
2024-01-02 16:58:48 Using obfuscation key for /dogecoin/.dogecoin/chainstate: 6528c669df551b38

  3. When I run dogecoin-cli getnetworkinfo from inside the container, I get the below response:
    error: Could not locate RPC credentials. No authentication cookie could be found, and no rpcpassword is set in the configuration file (/dogecoin/.dogecoin/dogecoin.conf)

@patricklodder
Member

Dockerfile

Is the entry point file taken from https://github.com/dogecoin/docker/blob/main/1.14.5/bullseye/entrypoint.py or somewhere else?

No authentication cookie could be found, and no rpcpassword is [..]

you'll probably need to add -rpcuser=<RPCUSER> and -rpcpassword=<RPCPASSWORD> to your cli command (before getnetworkinfo).

@Swornatoya
Author

@patricklodder : Below is the response after running the command dogecoin-cli -rpcuser=** -rpcpassword=** getnetworkinfo from the dogecoin container

{
  "version": 1140500,
  "subversion": "/Shibetoshi:1.14.5/",
  "protocolversion": 70015,
  "localservices": "0000000000000005",
  "localrelay": false,
  "timeoffset": 0,
  "networkactive": true,
  "connections": 1,
  "networks": [
    {
      "name": "ipv4",
      "limited": false,
      "reachable": true,
      "proxy": "",
      "proxy_randomize_credentials": false
    },
    {
      "name": "ipv6",
      "limited": true,
      "reachable": false,
      "proxy": "",
      "proxy_randomize_credentials": false
    },
    {
      "name": "onion",
      "limited": true,
      "reachable": false,
      "proxy": "",
      "proxy_randomize_credentials": false
    }
  ],
  "relayfee": 0.00100000,
  "incrementalfee": 0.00010000,
  "softdustlimit": 0.01000000,
  "harddustlimit": 0.00100000,
  "localaddresses": [
  ],
  "warnings": ""
}

The entrypoint and Dockerfile were taken from https://github.com/dogecoin/dogecoin/pull/2337/files
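
A response like the one above can be checked mechanically. The Python sketch below reads a getnetworkinfo JSON document and returns a list of warnings. The `min_connections` threshold of 2 is an arbitrary assumption made for illustration, the function name is hypothetical, and the sample is abridged from the response in this comment.

```python
import json


def check_network_info(raw: str, min_connections: int = 2) -> list:
    """Return human-readable warnings for a getnetworkinfo JSON response."""
    info = json.loads(raw)
    warnings = []
    if not info.get("networkactive", False):
        warnings.append("network activity is disabled")
    connections = info.get("connections", 0)
    if connections < min_connections:
        warnings.append(f"only {connections} connection(s)")
    reachable = [n["name"] for n in info.get("networks", []) if n.get("reachable")]
    if not reachable:
        warnings.append("no reachable networks")
    return warnings


# Abridged from the response quoted in this comment.
sample = (
    '{"networkactive": true, "connections": 1, "networks": ['
    '{"name": "ipv4", "reachable": true}, '
    '{"name": "ipv6", "reachable": false}]}'
)
print(check_network_info(sample))
```

A connection count alone does not show whether the peer has completed its handshake, so this is only a first-pass check.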

@Swornatoya
Author

@patricklodder : Any update on this?

@DemonRx
Contributor

DemonRx commented Jan 4, 2024

Looks like you're online, since the daemon responded. Check whether your blocks are synced by comparing against an explorer.

@patricklodder
Member

Looks like you're online since the daemon responded

Yes, that's also what the log said, but there is only 1 connection, which can be (and, based on the logs, likely is) pending handshake. So that's no bueno.

I'm still trying to reproduce.

@DemonRx
Contributor

DemonRx commented Jan 6, 2024

@Swornatoya

#!/bin/bash
### Retrieve list of active Dogecoin nodes and append them to dogecoin.conf ###
### Prepends "addnode=" to each IP ###

# Replace the placeholder with the path to your dogecoin.conf
CONF="<add/path/to/dogecoin.conf>"

wget -q https://api.blockchair.com/dogecoin/nodes -O - \
  | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' \
  | sed 's/^/addnode=/' > dogecoinNodeList.txt
tee -a "$CONF" < dogecoinNodeList.txt

Here's a simple script that polls the Blockchair API for the current list of active Doge nodes and adds them to your dogecoin.conf.

Make sure to change the path to wherever your conf file is, or just add the entries manually.

@patricklodder
Member

patricklodder commented Jan 6, 2024

Here's a simple script you can poll the blockchair API for current list of active Doge nodes and add them to your dogecoin.conf

Recommend against this because this gives a third party the power to eclipse attack your node - also see some of the discussion in #3253.

Have you found a problem with the usual peer discovery methods? The posted log doesn't show that there's an issue with that.

Edit: after building the Dockerfile from above with v1.14.5.0-31afd1331 and with parameters from above, omitting -reindex=1 and -rpcbind directives, adding -dnsseed=0, I still get connections because there are, despite being over 2 years old, still multiple valid fixed seeds in 1.14.5. This is definitely not a seed error.

@patricklodder
Member

Any update on this?

@Swornatoya After many attempts, I am unable to reproduce this on any of my (x86_64) systems with 1.14.5.

  1. Have you eliminated Docker's bridge configuration and/or any firewall rules that would prevent bytes from coming in? Are you able to reproduce this issue on another host too?
  2. You could rename peers.dat in the datadir. If this helps, it would mean that you had a corrupted address database (which could have been a malicious effort; note that it would be a lot safer to use 1.14.6 for this, because it has additional protection against eclipse attacks).
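
The peers.dat suggestion can be sketched in Python as follows. This is only an illustration: the helper name and the backup suffix are hypothetical, the datadir path is whatever your container mounts, and the node should be stopped before the file is touched.

```python
import time
from pathlib import Path
from typing import Optional


def set_aside_peers_dat(datadir: str) -> Optional[Path]:
    """Rename peers.dat to a timestamped backup.

    Returns the backup path, or None if no peers.dat exists.
    Run this only while dogecoind is stopped.
    """
    src = Path(datadir) / "peers.dat"
    if not src.exists():
        return None
    dst = src.with_name(f"peers.dat.bak-{int(time.time())}")
    src.rename(dst)
    return dst


# Example call (path taken from the log earlier in this thread):
# set_aside_peers_dat("/dogecoin/.dogecoin")
```

Renaming instead of deleting keeps the old address database around in case it needs to be inspected later.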

@Swornatoya
Author

@patricklodder : I tried renaming peers.dat but that did not help.
I was trying to set up a new node: I placed the Dockerfile and dockerentrypoint.py in a directory, set the version to 1.14.5 in the Dockerfile, and built the image with the command sudo docker build ./ -t dogecoin:v1.14.5. I am getting the below errors:

#0 1.208  28200K .......... .......... .......... .......... .......... 99% 21.4M 0s
#0 1.211  28250K .......... .......... .......... .......... .......... 99%  193M 0s
#0 1.211  28300K .......... .......... .......... .......... .......... 99%  206M 0s
#0 1.211  28350K .......... ......                                     100%  259M=0.2s
#0 1.211
#0 1.211 2024-02-01 09:34:30 (132 MB/s) - 'dogecoin-1.14.5-x86_64-linux-gnu.tar.gz' saved [29046800/29046800]
#0 1.211
#0 1.323 sha256sum: dogecoin-1.14.5-x86_64-linux-gnu.tar.gz: FAILED
#0 1.323 WARNING: 1 computed checksum did NOT match
------
Dockerfile:28
--------------------
  27 |     # see https://github.com/docker-library/official-images#security
  28 | >>> RUN cd /tmp && \
  29 | >>>     wget https://github.com/dogecoin/dogecoin/releases/download/v${VERSION}/dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz && \
  30 | >>>     echo "${VERSION_HASH} dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz" | sha256sum -c && \
  31 | >>>     tar -xvf dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz --strip-components=1 && \
  32 | >>>     cp share/man/man1/*.1 /usr/share/man/man1 && \
  33 | >>>     cp bin/dogecoin* /usr/local/bin && \
  34 | >>>     chown ${USER}:${USER} /usr/local/bin/dogecoin* && \
  35 | >>>     chmod 4555 /usr/local/bin/dogecoin* && \
  36 | >>>     rm -rf /tmp/*
  37 |
--------------------
ERROR: failed to solve: process "/bin/sh -c cd /tmp &&     wget https://github.com/dogecoin/dogecoin/releases/download/v${VERSION}/dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz &&     echo \"${VERSION_HASH} dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz\" | sha256sum -c &&     tar -xvf dogecoin-${VERSION}-x86_64-linux-gnu.tar.gz --strip-components=1 &&     cp share/man/man1/*.1 /usr/share/man/man1 &&     cp bin/dogecoin* /usr/local/bin &&     chown ${USER}:${USER} /usr/local/bin/dogecoin* &&     chmod 4555 /usr/local/bin/dogecoin* &&     rm -rf /tmp/*" did not complete successfully: exit code: 1

Please let me know if I am missing anything here. I copied the Dockerfile and dockerentrypoint.py from https://github.com/dogecoin/dogecoin/blob/a21cc8870c14b752cf75b9f8b356c84d358c8ef1/contrib/docker/Dockerfile
Nothing is being blocked by the firewall, and I get the same error with 1.14.6.
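
The build above stops at the sha256sum -c step, meaning the SHA-256 of the downloaded tarball did not equal ${VERSION_HASH}. One possible explanation, offered only as an assumption, is that VERSION was changed without updating VERSION_HASH to the hash published for that release. The same comparison can be reproduced outside Docker with a Python sketch like this one (the function name is hypothetical):

```python
import hashlib


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA-256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Example call; the expected hash must come from the release's published checksums:
# sha256_matches("dogecoin-1.14.5-x86_64-linux-gnu.tar.gz", "<published sha256>")
```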

@Dogihub

Dogihub commented Feb 22, 2024

Hi all,
I am facing exactly the same issue. My node sync seems to stop at block height 4954439 each time.

I am using a configuration (Dockerfile & docker-compose) that has worked many times on previous VPSes I had (Ubuntu), but on my latest VPS (also Ubuntu) it gets stuck on the same block each time (I have retried the operation many times, and I have tried deleting peers.dat and rerunning the node, but nothing changes).

Here are the logs. I took 2 different screenshots showing the same result (sync stopped at block height 4954439):

image

image

my dockerfile :


FROM debian:latest
ARG version
RUN apt update && apt upgrade -y && apt install -y curl gpg ca-certificates tar dirmngr
RUN curl -o dogecoin.tar.gz -Lk https://github.com/dogecoin/dogecoin/releases/download/v${version}/dogecoin-${version}-x86_64-linux-gnu.tar.gz
RUN tar -xvf dogecoin.tar.gz
RUN rm dogecoin.tar.gz
RUN install -m 0755 -o root -g root -t /usr/local/bin dogecoin-${version}/bin/*
EXPOSE 22555
EXPOSE 22556
CMD ["dogecoind", "-printtoconsole"]
LABEL name="dogecoin-node" version="${version}" description="Dogecoin fullnode container based off Debian"

my docker-compose file :

version: '3.8'

services:
  dogecoin-node:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        version: '1.14.6'
    image: dogecoin-node:latest
    ports:
      - "22555:22555"
      - "22556:22556"
    volumes:
      - /app/dogecoin-node/volume:/root/.dogecoin
      - /app/dogecoin-node/dogecoin.conf:/root/.dogecoin/dogecoin.conf
    command: ["dogecoind", "-conf=/root/.dogecoin/dogecoin.conf"]
    restart: unless-stopped

my dogecoin.conf :

printtoconsole=1
server=1
listen=1
paytxfee=0.01
maxconnections=150
rpcuser=user
rpcpassword=pass
rpcport=22555
maxmempool=8000
rpcworkqueue=100
txindex=1
rpcallowip=127.0.0.1
dbcache=16384

Do you know how to unblock the situation? I have been on it for a few days.
Thanks for your help.

@patricklodder
Member

Do you know how to unlock the situation ?

I just did a full mainnet sync myself with v1.14.6.0-3a29ba6d4 (but using lxc for containerization and not docker) and it syncs up:

2024-02-22 16:42:15 UpdateTip: new best=263730509c5fb3d74a425158e8dc0072cd6f531f158cf3853a9557e891f42fea height=5100780 version=0x00620104 log2_work=75.930246 tx=270882355 date='2024-02-22 16:41:32' progress=1.000000 cache=54.1MiB(121852tx)

But it's odd that 2 different dockerfiles have a similar issue.

If you can specify:

  1. Base OS version. On Ubuntu: sudo lsb_release -r
  2. Docker version: docker --version
  3. The contents of /etc/docker/daemon.json if there is any (if there is none, also fine)

Then I can try to reproduce these errors.

@Dogihub

Dogihub commented Feb 22, 2024

Thanks @patricklodder for your attention.

On my side :

  • OS : Ubuntu Release: 22.04
  • Docker version : Docker version 24.0.5, build 24.0.5-0ubuntu1~22.04.1
  • Daemon.json : don't have this file on the system

One more piece of information: when the sync is stuck with the log I showed, it is impossible to stop the node properly. I connected directly to the container with "docker exec -it ..." and ran the "dogecoin-cli stop" command.

The dogecoind process kept running and had still not stopped by itself after 30 min. After that I had to force the container to stop using "docker stop", and when I ran the node again using the "docker-compose up" command, the sync restarted at 80% (it was at more than 98% when the sync stopped, so it is really frustrating).

And each time the sync stopped at block height 4954439.

I really don't understand what happened.

@patricklodder
Member

I will set up a fresh ubuntu 22.04 and test overnight.

@patricklodder
Member

@Dogihub I cannot reproduce your error; I synched past 4954439 with your Dockerfile:

2024-02-23 11:35:23 UpdateTip: new best=3ac8ddaea330e0ef5872cd659e628ea345bcab516dbc0b8fbc8da9c5b6ea4573 height=5101840 version=0x00620104 log2_work=75.931572 tx=272134307 date='2024-02-23 11:35:03' progress=1.000000 cache=1715.4MiB(7272205tx)

I was able to reproduce an OOM situation due to this line in your config (I read over it initially), but this probably didn't happen for you:

maxmempool=8000

Having a 16GB dbcache + an 8GB mempool means you need at least 28 GB RAM available. (The maximum effective queue size for most of the network is 24 * 60 * 1MB = 1440 MB so an 8GB mempool with default expiry makes no sense as a setting, because it'll just expire any transactions aged 24h after an inflow volume of 1440MB/day, in the best case). For me, because I ran a 20GB RAM host with no swap, dogecoind got OOM killed around block 4.5M when my cache started taking the extra space from mempool.

However: If you have swap space on your host, this is where things may get really slow and dogecoind may become unresponsive.

All in all, I think that:

  1. Your issue is not related to the one from OP. The outcome may look the same, but the cause is probably different.
  2. You need to tune your memory settings a bit: 16GB dbcache can work fine (I run this on large production nodes too) but I'd recommend just keeping the default 300MB mempool, or, if you're looking at retaining unconfirmed historical transactions for a specific use-case, increment mempoolexpiry by 1 for every 60 MB you add to maxmempool over 1440MB. Please make sure you have enough available RAM, at least dbcache + maxmempool + 4GB, and make sure your host doesn't ever swap the cache.
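
The sizing rules in this comment can be written out as a small Python sketch. The constants come from the text above (4 GB headroom, 1440 MB per 24 h, 60 MB per extra hour). Treating 24 hours as the starting mempoolexpiry value and rounding up to whole hours are assumptions, so check the default for your version. The function names are hypothetical.

```python
import math

OVERHEAD_MB = 4096        # the "+ 4GB" headroom from the comment above
BASE_MEMPOOL_MB = 1440    # 24 h * 60 blocks/h * 1 MB per block
BASE_EXPIRY_HOURS = 24    # assumption: expiry implied by the "aged 24h" remark
MB_PER_EXTRA_HOUR = 60    # one more hour of expiry per 60 MB above 1440 MB


def required_ram_mb(dbcache_mb: int, maxmempool_mb: int) -> int:
    """Minimum available RAM: dbcache + maxmempool + 4 GB."""
    return dbcache_mb + maxmempool_mb + OVERHEAD_MB


def suggested_mempool_expiry_hours(maxmempool_mb: int) -> int:
    """Expiry that lets a larger mempool actually fill, rounded up to whole hours."""
    extra_mb = max(0, maxmempool_mb - BASE_MEMPOOL_MB)
    return BASE_EXPIRY_HOURS + math.ceil(extra_mb / MB_PER_EXTRA_HOUR)


# Values from the dogecoin.conf posted earlier: dbcache=16384, maxmempool=8000
print(required_ram_mb(16384, 8000))
print(suggested_mempool_expiry_hours(8000))
```

With the default 300 MB mempool the second function returns the base value unchanged, which matches the recommendation to leave maxmempool alone unless there is a specific use case.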

@Dogihub

Dogihub commented Feb 23, 2024

@patricklodder thanks for your answer.
My VPS has 120 GB of RAM, so I thought that was more than enough, but I am taking your advice and trying right now with the following dogecoin.conf (I have just removed the maxmempool parameter to use the default):

printtoconsole=1
server=1
listen=1
paytxfee=0.01
maxconnections=150
rpcuser=user
rpcpassword=pass
rpcport=22555
rpcworkqueue=100
txindex=1
rpcallowip=127.0.0.1
dbcache=16384

I'll keep you posted.

@Dogihub

Dogihub commented Feb 24, 2024

@patricklodder unfortunately it seems to be the same thing again, at another block height this time:
image

The log has been stuck on this for 3 hours now.

I just tried to stop the process properly with dogecoin-cli stop, and the log is now stuck on this:

dogecoin-node_1  | 2024-02-24 07:07:10 tor: Thread interrupt
dogecoin-node_1  | 2024-02-24 07:07:10 addcon thread exit
dogecoin-node_1  | 2024-02-24 07:07:10 torcontrol thread exit
dogecoin-node_1  | 2024-02-24 07:07:10 scheduler thread interrupt

I forced the container to stop using "docker stop":

root@vmi1632568:/app/dogecoin-node# docker stop 237e42da79f7
Error response from daemon: cannot stop container: 237e42da79f7: tried to kill container, but did not receive an exit event

I am now running another try, without the dbcache parameter this time.
