FD-862 After factomd restart, loading blocks stucks while committing entry #663

Open
ilzheev opened this issue Mar 5, 2019 · 20 comments

ilzheev commented Mar 5, 2019

Have:
fully synced factomd node, v6.2.0, mainnet, default config

Steps to reproduce:
Restart factomd
Wait until the factomd API has started but the node is still in ignore mode (I waited 1-2 minutes after the restart)
Commit an entry

Issue:

  1. DBlockHeight, EntryHeight, and EntryBlockHeight stop progressing,
    e.g. (DBlock=181339/182487, EntryBlock=181339/181339),
    i.e. the leader block (182487) is shown as the latest Factom block, but all other heights get stuck immediately after the commit.
  2. The node can't leave ignore mode.
    After waiting about 40 minutes, the node is still in ignore mode and the heights have not progressed.

Additional information 1:
Tested without factomd.conf file
Tested with factomd.conf file (default from this repo)
Tested with the startdelay=600 and startdelay=0 flags
The issue occurs in every case

Additional information 2:
I used the FactomProject/factom lib to commit the entry.
The code is correct and works as expected on a fully synced factomd node: it creates the entry in the chain.
I don't know if it's important, but here are the factom lib functions I used:

	// commit entry
	commitResult, err := factom.CommitEntry(entry, c.GetEC())
	if err != nil {
		return nil, err
	}
	log.Info("Commit: ", commitResult)

	// reveal entry
	revealResult, err := factom.RevealEntry(entry)
	if err != nil {
		return nil, err
	}
	log.Info("Reveal: ", revealResult)

ilzheev changed the title from "After factomd restart, loading blocks stucks when commit entry" to "After factomd restart, loading blocks stucks while committing entry" on Mar 5, 2019

ilzheev commented Mar 5, 2019

The bug reproduced in 10 out of 10 attempts.

ilzheev commented Mar 5, 2019

Here are the logs from factomd; nothing looks strange:

//////////////////////// Copyright 2017 Factom Foundation
//////////////////////// Use of this source code is governed by the MIT
//////////////////////// license that can be found in the LICENSE file.
Arguments
 &{AckbalanceHash:true EnableNet:true WaitEntries:false ListenTo:0 Cnt:1 Net:tree Fnet: DropRate:0 Journal: Journaling:false Follower:false Leader:true Db: CloneDB: PortOverride:0 Peers: NetworkName: NetworkPortOverride:0 ControlPanelPortOverride:0 LogPort:6060 BlkTime:0 FaultTimeout:0 RuntimeLog:false Exclusive:false ExclusiveIn:false Prefix: Rotate:false TimeOffset:0 KeepMismatch:false StartDelay:0 Deadline:1000 CustomNet:[227 176 196 66] CustomNetName: RpcUser: RpcPassword: FactomdTLS:false FactomdLocations: MemProfileRate:524288 Fast:true FastLocation: Loglvl:none Logjson:false Svm:false PluginPath: TorManage:false TorUpload:false Sim_Stdin:true ExposeProfiling:false UseLogstash:false LogstashURL:localhost:8345 Sync2:-1 DebugConsole: StdoutLog: StderrLog: DebugLogRegEx:faulting|badMsgs ConfigPath:/root/.factom/private/factomd.conf CheckChainHeads:true FixChainHeads:true ControlPanelSetting: WriteProcessedDBStates:true}
Go compiler version: go1.10.8
Using build: bd587f91433b766aa34126006759bd4a846b71a3
Version: 6.2.0
Start time: 2019-03-05 17:13:37.936799403 +0000 UTC m=+0.036482405
factom config: /root/.factom/private/factomd.conf
Reading from '/root/.factom/private/factomd.conf'
Cannot open custom config file,
Starting with default settings.
open /root/.factom/private/factomd.conf: no such file or directory



Network : MAIN
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Net Sim Start!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Listening to Node 0
>>>>>>>>>>>>>>>>
The Instance ID of this node is 83905bcd23c57a39
Database:/root/.factom/m2/main-database/ldb/MAIN/factoid_level.dbtime="2019-03-05T17:13:38Z" level=info msg="Checking Chainheads starting at height: 182480" tool=chainheadtool
time="2019-03-05T17:13:39Z" level=info msg="Currently on 180000 out of 182480 at 1778.086p/s. 3707 Eblocks, 0 done. 204 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:41Z" level=info msg="Currently on 175000 out of 182480 at 2155.636p/s. 15943 Eblocks, 0 done. 1042 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:43Z" level=info msg="Currently on 170000 out of 182480 at 2337.912p/s. 24673 Eblocks, 0 done. 1810 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:45Z" level=info msg="Currently on 165000 out of 182480 at 2472.552p/s. 32424 Eblocks, 0 done. 2696 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:46Z" level=info msg="Currently on 160000 out of 182480 at 2623.805p/s. 40746 Eblocks, 0 done. 3477 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:48Z" level=info msg="Currently on 155000 out of 182480 at 2741.232p/s. 47857 Eblocks, 0 done. 4442 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:49Z" level=info msg="Currently on 150000 out of 182480 at 2814.827p/s. 148967 Eblocks, 0 done. 5442 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:52Z" level=info msg="Currently on 145000 out of 182480 at 2571.979p/s. 1005639 Eblocks, 0 done. 5690 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:54Z" level=info msg="Currently on 140000 out of 182480 at 2550.434p/s. 1496724 Eblocks, 0 done. 5885 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:56Z" level=info msg="Currently on 135000 out of 182480 at 2634.115p/s. 1766299 Eblocks, 0 done. 6054 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:57Z" level=info msg="Currently on 130000 out of 182480 at 2738.531p/s. 2005045 Eblocks, 0 done. 6203 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:58Z" level=info msg="Currently on 125000 out of 182480 at 2836.565p/s. 2235716 Eblocks, 0 done. 6506 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:13:59Z" level=info msg="Currently on 120000 out of 182480 at 2951.970p/s. 2432880 Eblocks, 0 done. 6693 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:00Z" level=info msg="Currently on 115000 out of 182480 at 3063.871p/s. 2603931 Eblocks, 0 done. 7306 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:00Z" level=info msg="Currently on 110000 out of 182480 at 3182.600p/s. 2746262 Eblocks, 0 done. 7310 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:01Z" level=info msg="Currently on 105000 out of 182480 at 3289.689p/s. 2909878 Eblocks, 0 done. 7314 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:02Z" level=info msg="Currently on 100000 out of 182480 at 3386.983p/s. 3086011 Eblocks, 0 done. 7318 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:03Z" level=info msg="Currently on 95000 out of 182480 at 3494.401p/s. 3242869 Eblocks, 0 done. 7542 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:04Z" level=info msg="Currently on 90000 out of 182480 at 3547.424p/s. 3587980 Eblocks, 0 done. 7559 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:05Z" level=info msg="Currently on 85000 out of 182480 at 3543.814p/s. 4166422 Eblocks, 0 done. 7616 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:07Z" level=info msg="Currently on 80000 out of 182480 at 3548.408p/s. 4717214 Eblocks, 0 done. 7713 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:07Z" level=info msg="Currently on 75000 out of 182480 at 3626.052p/s. 4976833 Eblocks, 0 done. 7721 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:08Z" level=info msg="Currently on 70000 out of 182480 at 3741.594p/s. 5053888 Eblocks, 0 done. 7760 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:09Z" level=info msg="Currently on 65000 out of 182480 at 3740.846p/s. 5661173 Eblocks, 0 done. 7768 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:10Z" level=info msg="Currently on 60000 out of 182480 at 3755.707p/s. 6230977 Eblocks, 0 done. 7786 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:11Z" level=info msg="Currently on 55000 out of 182480 at 3841.334p/s. 6454255 Eblocks, 0 done. 7812 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:11Z" level=info msg="Currently on 50000 out of 182480 at 3931.549p/s. 6618841 Eblocks, 0 done. 7827 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:12Z" level=info msg="Currently on 45000 out of 182480 at 4020.902p/s. 6766523 Eblocks, 0 done. 7830 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:12Z" level=info msg="Currently on 40000 out of 182480 at 4104.880p/s. 6938645 Eblocks, 0 done. 7835 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:13Z" level=info msg="Currently on 35000 out of 182480 at 4187.271p/s. 7092400 Eblocks, 0 done. 7841 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:13Z" level=info msg="Currently on 30000 out of 182480 at 4271.561p/s. 7201084 Eblocks, 0 done. 7846 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:14Z" level=info msg="Currently on 25000 out of 182480 at 4354.920p/s. 7328347 Eblocks, 0 done. 7852 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:14Z" level=info msg="Currently on 20000 out of 182480 at 4447.483p/s. 7382601 Eblocks, 0 done. 7860 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:15Z" level=info msg="Currently on 15000 out of 182480 at 4534.830p/s. 7440824 Eblocks, 0 done. 7872 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:15Z" level=info msg="Currently on 10000 out of 182480 at 4621.488p/s. 7490706 Eblocks, 0 done. 7890 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:15Z" level=info msg="Currently on 5000 out of 182480 at 4717.715p/s. 7497230 Eblocks, 0 done. 7895 ChainHeads so far. 0 Are bad" tool=chainheadtool
time="2019-03-05T17:14:16Z" level=info msg="7905 Chains found in 37.923265 seconds" tool=chainheadtool
time="2019-03-05T17:14:16Z" level=info msg="Chainhead Check Complete. 0 Errors corrected while checking for bad heads" tool=chainheadtool
               Build bd587f91433b766aa34126006759bd4a846b71a3
         balancehash true
        FNode 0 Salt 83905bcd23c57a39
           enablenet true
         waitentries false
                node 0
              prefix 
          node count 1
            net spec "tree"
         Msgs droped 0
             journal ""
            database "LDB"
 database for clones ""
               peers ""
           exclusive "false"
        exclusive_in "false"
          block time 6
          runtimeLog false
              rotate false
          timeOffset 0
        keepMismatch false
          startDelay 0
             Network MAIN
           customnet e3b0c442 ()
       deadline (ms) 1000
                 tls false
            selfaddr 
             rpcuser ""
Start 2nd Sync at ht 182000
        faultTimeout 120
             rpcpass is blank
            TCP port "8088"
          pprof port "6060"
  Control Panel port "8090"
Starting API server
              FNode0 Loading Block  181000 / 182480. Blocks per second  9280.48
Starting Control Panel on http://localhost:8090/
Using NetworkName "MAIN"
              FNode0 Loading Block  182000 / 182480. Blocks per second    46.77

ilzheev commented Mar 5, 2019

Experiment No. 2:
Restarted a fully synced node
DBlocks scanned up to the latest Factom block (182495 right now)
EntryBlocks are still being scanned…
Node is in ignore mode.
Commit + reveal entry

Result:
The bug was not reproduced in this case.

So the problem occurs only when a new commit arrives while factomd is still scanning directory blocks after a restart.
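
Based on that observation, a client could hold back its commit until the directory block height has reached the leader height. The following is only a minimal sketch of such a guard, not a fix for the underlying bug. The names `Heights`, `dblocksCaughtUp`, `waitForDBlocks`, and the `fetch` hook are hypothetical and introduced for illustration; the tolerance `maxLag` is an assumption. In a real client `fetch` would query the node's heights API; here it is simulated so the example runs without a node.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Heights mirrors the four values reported by the node's heights API.
// (Hypothetical type, introduced for this sketch.)
type Heights struct {
	DirectoryBlockHeight int64
	LeaderHeight         int64
	EntryBlockHeight     int64
	EntryHeight          int64
}

// dblocksCaughtUp reports whether the directory block height is within
// maxLag blocks of the leader height. The tolerance is an assumption.
func dblocksCaughtUp(h Heights, maxLag int64) bool {
	return h.LeaderHeight-h.DirectoryBlockHeight <= maxLag
}

// waitForDBlocks polls fetch until the directory blocks are caught up or
// the attempts run out. fetch is a hypothetical hook standing in for a
// call to the node's heights API.
func waitForDBlocks(fetch func() (Heights, error), maxLag int64, attempts int, pause time.Duration) error {
	for i := 0; i < attempts; i++ {
		h, err := fetch()
		if err != nil {
			return err
		}
		if dblocksCaughtUp(h, maxLag) {
			return nil
		}
		time.Sleep(pause)
	}
	return errors.New("directory blocks still behind leader height")
}

func main() {
	// Simulated node: starts at the heights reported in this issue and
	// advances 500 directory blocks per poll until it reaches the leader.
	cur := Heights{DirectoryBlockHeight: 181339, LeaderHeight: 182487}
	fetch := func() (Heights, error) {
		h := cur
		cur.DirectoryBlockHeight += 500
		if cur.DirectoryBlockHeight > cur.LeaderHeight {
			cur.DirectoryBlockHeight = cur.LeaderHeight
		}
		return h, nil
	}

	if err := waitForDBlocks(fetch, 1, 10, time.Millisecond); err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Println("directory blocks caught up; the commit can be sent")
}
```

Passing the fetch function in as a parameter keeps the guard independent of any particular client library and makes it easy to exercise with simulated heights.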

@carryforward

Hmm, can you try reproducing this on v6.2.1-rc2, @ilzheev? There have been an incredible number of changes in this area between 6.2.0 and 6.2.1. There was a month-long delay going from 6.1.1 to 6.2.1 while getting the 2nd pass to download the blockchain (this is what you found earlier using the latest docker image). Part of fixing the blockchain download was fixing the 2nd-pass catch-up.

When your factomd node is in ignore mode, it will not ask for things that are missing. This means it will not download new blocks. In ignore mode the node only gets messages that are received without prompting. This is enough to get federated servers booting, but isn't really good for getting follower nodes up and running quickly. Your main problem is likely that you are in ignore mode, which doesn't help much as a follower.

There is another change from 6.2.0 to 6.2.1 which you will notice: factomd no longer shows you how far behind it is when catching up with the blockchain. This was a security thing, and was prompted by some weirdness that we saw on the testnet a while back.

@carryforward

FD-820_release_candidate_butter...FD-824_release_candidate_kraft

GitHub isn't even able to show all the changes between 6.2.0 and 6.2.1.

ilzheev commented Mar 6, 2019

The problem appears on 6.2.1-rc2 as well.
I've made a short screen recording:
https://www.youtube.com/watch?v=_rE95qIQC_g

--

INFO[0009] Commit: 17f103a1290b7bb952909e15c3eacd2f545c4fe65bb4155cde6db087c870f913 
INFO[0010] Reveal: 1733aa9e5340f75ec8125c400f6bccf38db7e2230b5660624488e94bfe42313d 

--
The node is stuck at height 182551.

{"jsonrpc":"2.0","id":0,"result":{"directoryblockheight":182551,"leaderheight":182569,"entryblockheight":182551,"entryheight":182551}}
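
To make the gap in that response explicit, here is a minimal sketch that decodes a heights response and computes how far the directory block height trails the leader height. The JSON field names are taken from the response above; the type and function names (`heightsResult`, `parseHeights`, `dblockLag`) are hypothetical and introduced only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// heightsResult holds the "result" object of a heights response.
type heightsResult struct {
	DirectoryBlockHeight int64 `json:"directoryblockheight"`
	LeaderHeight         int64 `json:"leaderheight"`
	EntryBlockHeight     int64 `json:"entryblockheight"`
	EntryHeight          int64 `json:"entryheight"`
}

// heightsResponse is the JSON-RPC envelope; only "result" is decoded.
type heightsResponse struct {
	Result heightsResult `json:"result"`
}

// parseHeights decodes a JSON-RPC heights response body.
func parseHeights(body []byte) (heightsResult, error) {
	var resp heightsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return heightsResult{}, err
	}
	return resp.Result, nil
}

// dblockLag returns how many blocks the directory block height trails
// the leader height.
func dblockLag(h heightsResult) int64 {
	return h.LeaderHeight - h.DirectoryBlockHeight
}

func main() {
	// The response body quoted in this comment.
	body := []byte(`{"jsonrpc":"2.0","id":0,"result":{"directoryblockheight":182551,"leaderheight":182569,"entryblockheight":182551,"entryheight":182551}}`)

	h, err := parseHeights(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("directory block height %d trails leader height %d by %d blocks\n",
		h.DirectoryBlockHeight, h.LeaderHeight, dblockLag(h))
}
```

For the response above the computed lag is 18 blocks (182569 - 182551).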

@carryforward

OK, this is interesting. This might be a thing. Veena is looking into how to replicate this in a controlled environment. The tracking ticket number is FD-862. Any patches for this bug would go against this branch:

https://github.com/FactomProject/factomd/tree/FD-862_boot_stall_with_API_access

carryforward changed the title from "After factomd restart, loading blocks stucks while committing entry" to "FD-862 After factomd restart, loading blocks stucks while committing entry" on Mar 6, 2019

VeenaGondkar commented Mar 7, 2019

@ilzheev I am trying to replicate the issue in my environment but haven't had any success yet. Meanwhile, it would help us debug further if you could share debug logs with us.

You will have to start factomd with --debuglog=.* to collect these logs, so the command will look like this: factomd --debuglog=.*
This will create around 25 different log files. If you can reproduce the issue and share the log files with us, it would be extremely helpful for fixing the issue.

carryforward commented Mar 7, 2019

There is a set of really extensive logging that can be turned on. It saves hundreds of gigabytes of text files to the hard drive, in the directory you ran the program from. I don't know how well it works in a docker environment, but it shouldn't be too hard to change the docker file around to save these kinds of files.

You can run factomd using the command line flags that Veena mentioned. The parameter takes a regex to determine which logs to save. I run it like this:
factomd -stderrlog=err.txt -stdoutlog=out.txt -debuglog="."

This gets all the log files. Here is a list of the file names created by one of my simple local tests.

-rw-r--r--  1 user user     1489 Mar  7 13:50 err.txt
-rw-r--r--  1 user user      252 Mar  7 13:50 fnode0_ackchange.txt
-rw-r--r--  1 user user      247 Mar  7 13:51 fnode0_apilog.txt
-rw-r--r--  1 user user   108201 Mar  7 14:11 fnode0_balancehash.txt
-rw-r--r--  1 user user  3382589 Mar  7 14:11 fnode0_dbsig-eom.txt
-rw-r--r--  1 user user   392249 Mar  7 14:11 fnode0_dbsig.txt
-rw-r--r--  1 user user  4748516 Mar  7 14:11 fnode0_dbstateprocess.txt
-rw-r--r--  1 user user  4024694 Mar  7 14:11 fnode0_duplicatesend.txt
-rw-r--r--  1 user user   682551 Mar  7 14:11 fnode0_election.txt
-rw-r--r--  1 user user   456586 Mar  7 14:11 fnode0_entrycredits_trans.txt
-rw-r--r--  1 user user 17313013 Mar  7 14:11 fnode0_entrysync.txt
-rw-r--r--  1 user user  1019204 Mar  7 14:11 fnode0_executemsg.txt
-rw-r--r--  1 user user   205648 Mar  7 14:11 fnode0_factoids_trans.txt
-rw-r--r--  1 user user   330341 Mar  7 14:11 fnode0_factoids.txt
-rw-r--r--  1 user user   906703 Mar  7 14:11 fnode0_holding.txt
-rw-r--r--  1 user user    67593 Mar  7 13:50 fnode0_inmsgqueue.txt
-rw-r--r--  1 user user   118089 Mar  7 14:11 fnode0_missing_messages.txt
-rw-r--r--  1 user user   994661 Mar  7 14:11 fnode0_msgqueue.txt
-rw-r--r--  1 user user  9760693 Mar  7 14:11 fnode0_networkoutputs.txt
-rw-r--r--  1 user user  3121283 Mar  7 14:11 fnode0_processlist.txt
-rw-r--r--  1 user user   771304 Mar  7 14:11 fnode0_processstatus.txt
-rw-r--r--  1 user user  1631448 Mar  7 14:11 fnode0_process.txt
-rw-r--r--  1 user user     3713 Mar  7 14:11 graphdata.txt
-rw-r--r--  1 user user      122 Mar  7 13:51 marshalsizes.txt
-rw-r--r--  1 user user     4157 Mar  7 13:51 out.txt

To log only election processing, network inputs, network outputs, and API commands, run factomd like this:

factomd -stderrlog=err.txt -stdoutlog=out.txt -debuglog="faulting|network|api"

It will only save the subset of the files which match that regular expression.
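
As an illustration of how such a pattern narrows the set of files, here is a minimal sketch that applies the regular expression to a few of the file names listed above. How factomd actually applies the pattern (which string it matches against, anchoring, case handling) is not shown in this thread, so the unanchored match on the file name used here is an assumption, and `selectLogs` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// selectLogs returns the names in which pattern matches anywhere.
// Assumption: an unanchored match against the log file name.
func selectLogs(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, n := range names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out, nil
}

func main() {
	// A few of the file names from the listing above.
	names := []string{
		"fnode0_apilog.txt",
		"fnode0_election.txt",
		"fnode0_holding.txt",
		"fnode0_networkoutputs.txt",
		"fnode0_processlist.txt",
	}

	matched, err := selectLogs("faulting|network|api", names)
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	for _, n := range matched {
		fmt.Println(n)
	}
}
```

Under this assumption the pattern selects fnode0_apilog.txt and fnode0_networkoutputs.txt from the sample, while a pattern of "." selects every name.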

@carryforward

Please share how you mounted the docker container to get these log files once you succeed.

ilzheev commented Mar 7, 2019

I rebuilt it and ran it manually:

docker run -d --name "factomd" \
    -v "factom_database:/root/.factom/m2" \
    -v "factom_keys:/root/.factom/private" \
    -p "8088:8088" \
    -p "8090:8090" \
    -p "8108:8108" \
    -l "name=factomd" \
    factominc/factomd:v6.2.1-rc2-alpine \
    -startdelay=0 \
    -faulttimeout=120 \
    -config=/root/.factom/private/factomd.conf \
    -debuglog=.*

This is what I see in the output of docker logs factomd:

Creating fnode0_dbstateprocess.txt
Creating fnode0_factoids.txt
Creating fnode0_entrycredits.txt
State move between non-sequential heights from 0 to 181000
Creating fnode0_missing_messages.txt
Creating fnode0_entrysync.txt
               Build fa68b2d739edd435b7097424879cf6f82efb6bd7
           Node name 
         balancehash true
        FNode 0 Salt 33f30ab3a45b4b8e
           enablenet true
         waitentries false
                node 0
              prefix 
          node count 1
        FastSaveRate 1000
            net spec "tree"
         Msgs droped 0
             journal ""
            database "LDB"
 database for clones ""
               peers ""
           exclusive "false"
        exclusive_in "false"
          block time 6
          runtimeLog false
              rotate false
          timeOffset 0
        keepMismatch false
          startDelay 0
             Network MAIN
           customnet e3b0c442 ()
       deadline (ms) 1000
                 tls false
            selfaddr 
             rpcuser ""
         corsdomains "[]"
Start 2nd Sync at ht 182000
        faultTimeout 120
             rpcpass is blank
            TCP port "8088"
          pprof port "6060"
  Control Panel port "8090"
Starting API server
Creating graphdata.txt
Creating fnode0_election.txt
Starting Control Panel on http://localhost:8090/
Creating fnode0_inmsgqueue.txt
Creating fnode0_msgqueue.txt
Creating fnode0_factoids_trans.txt
Creating fnode0_entrycredits_trans.txt
Using NetworkName "MAIN"
Creating fnode0_balancehash.txt
Creating fnode0_networkinputs.txt
Creating fnode0_apilog.txt
Creating fnode0_inmsgqueue2.txt
Creating fnode0_holding.txt
Creating fnode0_networkoutputs.txt
Creating fnode0_duplicatesend.txt

But I cannot locate these files.
They don't appear on Ubuntu.

ilzheev commented Mar 7, 2019

I also tried the flags -stderrlog=err.txt -stdoutlog=out.txt, but the result is the same.

ilzheev commented Mar 7, 2019

ilzheev commented Mar 7, 2019

fnode0_apilog.txt
commit + reveal at 20:30:39-40 UTC

    42471 20:30:39.819  181350-:-0 request {"jsonrpc":"2.0","id":11,"params":{"message":"00016959d9cc504cbaa7730b9642772fbf45cda41e6f53c12bb0822f32a9c992fbf734c453f41e01d0514c67b997de0bb495ccdc1887bdc867004ffa44f1b269418c8352b623b67579693d553f1a4c17955066dee4924b6836e690f15c3175338a299b7fdb2c35ef33cd5b59d3dd38422b282feb50691baa1f44331f9238ddef0b38b39b1c0ed70c"},"method":"commit-entry"} 
    42472 20:30:39.819  181350-:-0 response {"jsonrpc":"2.0","id":11,"result":{"message":"Entry Commit Success","txid":"9226c21e1d448e613f12d19a60253cde289a998abf4b57b59862b283798d6328","entryhash":"4cbaa7730b9642772fbf45cda41e6f53c12bb0822f32a9c992fbf734c453f41e"}} 
    43035 20:30:40.036  181353-:-0 request {"jsonrpc":"2.0","id":12,"params":{"entry":"00ad7b36e5f295455c1e8ee9b7e2c71f0aea6956b022c22be1ea6464f85a046c5c0004000223365465737420636f6e74656e74"},"method":"reveal-entry"} 
    43060 20:30:40.042  181353-:-0 response {"jsonrpc":"2.0","id":12,"result":{"message":"Entry Reveal Success","entryhash":"4cbaa7730b9642772fbf45cda41e6f53c12bb0822f32a9c992fbf734c453f41e","chainid":"ad7b36e5f295455c1e8ee9b7e2c71f0aea6956b022c22be1ea6464f85a046c5c"}} 

And fnode0_balancehash.txt:
The logs stopped at dblock 181352.
This is the same block at which the Control Panel is stuck:
[image]

@carryforward

It looks like the logging directory can't be changed: the log files are always written to the directory that factomd is launched in.

myfile := getTraceFile(name)

Thank you for the logs though. They will be helpful.
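
To illustrate the general behavior (this is not factomd's code): a bare file name opened by a Go program is resolved against the process's working directory. The sketch below prints where a log file name would end up under that rule; `resolveLogPath` is a hypothetical helper. In a container this may explain why the files did not show up on the host: unless that working directory sits on a mounted volume, files written there stay inside the container's filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveLogPath shows where a bare log file name ends up when it is
// opened relative to the given working directory.
// (Hypothetical helper, introduced for illustration.)
func resolveLogPath(workDir, name string) string {
	return filepath.Join(workDir, name)
}

func main() {
	wd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	// Prints the absolute path a relative log file name would resolve to.
	fmt.Println(resolveLogPath(wd, "fnode0_apilog.txt"))
}
```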

@carryforward

Hmm, Veena couldn't replicate this issue on her setup, neither with her mainnet setup nor on a local node. This sounds like a good opportunity for @ThomasMeier to get experience a) replicating the issue, b) looking at the logs that the issue produced, and c) finding a good way to fix the issue.

The first step is to get it replicated in a debugger. Please see what it takes to recreate it.

Pull requests can go here: https://github.com/FactomProject/factomd/tree/FD-862_boot_stall_with_API_access

I'm looking forward to you showing off your development chops to the community @ThomasMeier.

ilzheev commented Mar 8, 2019

@carryforward @ThomasMeier
I used the CommitEntry() factom lib call to commit the entry, not a factomd API call.
https://github.com/FactomProject/factom/blob/d04eb4befbf1869f00aa5cf860c8bb944e08855f/entry.go#L204
Maybe that's the reason Veena couldn't reproduce the issue.

I use a mainnet follower node with the default config.
And it is not a pokemon bug; I caught it in 100% of tests.

@carryforward

Good news: @stackdump and @VeenaGondkar were able to reproduce this bug reliably.

@carryforward

This is reproducible with this branch: https://github.com/FactomProject/factomd/commits/FD-862_ec_commit_on_restart

@VeenaGondkar

@ilzheev Thanks for the key information. I am able to reproduce the issue when I use the factom lib calls.
