Running Ghost sample crashes with Segmentation fault from time to time #3484

Open
stellarspot opened this issue Apr 26, 2019 · 14 comments

@stellarspot
Contributor

When I run sample-ghost-notebook.scm, it sometimes crashes with a segmentation fault.

Here is the output of running catchsegv guile samples/ghost/sample-ghost-notebook.scm:

[2019-04-26 15:03:52:974] [WARN] [GHOST] Did you forget to link a goal to the rule?
[2019-04-26 15:03:53:008] [WARN] [GHOST] Did you forget to link a goal to the rule?
[2019-04-26 15:03:53:067] [INFO] [say] (softmegacorp is a fantastic company !)
[2019-04-26 15:03:53:110] [INFO] [say] (i work in SoftMegaCorp company.)
[2019-04-26 15:03:54:138] [INFO] [say] (hardmegacorp is a fantastic company !)
*** Segmentation fault
Register dump:

 RAX: 00007f2e6b6134a8   RBX: 00000000012c9200   RCX: 00007f2e61ef3440
 RDX: 0000000000000006   RSI: 0000000000000000   RDI: 000000000111fe80
 RBP: 0000000000000000   R8 : 00000000012c9200   R9 : 00007f2e6b613138
 R10: 00007f2e72e969a0   R11: 00007f2e66425fe0   R12: 00007f2e61ef34e0
 R13: 00007f2e61ef3600   R14: 00007f2e663fde30   R15: 00007f2e61ef35f0
 RSP: 00007f2e61ef3488

 RIP: 00007f2e6b613450   EFLAGS: 00010246

 CS: 0033   FS: 0000   GS: 0000

 Trap: 0000000e   Error: 00000015   OldMask: 00000000   CR2: 6b613450

 FPUCW: 0000037f   FPUSW: 00000420   TAG: 00007f2e
 RIP: 68bb9217   RDP: 00000000

 ST(0) ffff ffffffffb5829040   ST(1) ffff c0a15421076a650b
 ST(2) ffff e000000000000000   ST(3) ffff 8000000000000000
 ST(4) ffff 96b81373055aef41   ST(5) ffff 8000000000000000
 ST(6) ffff 8000000000000000   ST(7) 8000 8000000000000000
 mxcsr: 1fa0
 XMM0:  000000000000000000000000eccccccd XMM1:  000000000000000000000000eccccccd
 XMM2:  000000000000000000000000eccccccd XMM3:  000000000000000000000000eccccccd
 XMM4:  000000000000000000000000eccccccd XMM5:  000000000000000000000000eccccccd
 XMM6:  000000000000000000000000eccccccd XMM7:  000000000000000000000000eccccccd
 XMM8:  000000000000000000000000eccccccd XMM9:  000000000000000000000000eccccccd
 XMM10: 000000000000000000000000eccccccd XMM11: 000000000000000000000000eccccccd
 XMM12: 000000000000000000000000eccccccd XMM13: 000000000000000000000000eccccccd
 XMM14: 000000000000000000000000eccccccd XMM15: 000000000000000000000000eccccccd

Backtrace:
/usr/local/lib/opencog/libsmob.so(_ZTISt23_Sp_counted_ptr_inplaceIN7opencog10FloatValueESaIS1_ELN9__gnu_cxx12_Lock_policyE2EE+0x0)[0x7f2e6b613450]

Memory map:

00400000-00401000 r-xp 00000000 00:4e 1692 /usr/local/bin/guile
00600000-00601000 r--p 00000000 00:4e 1692 /usr/local/bin/guile
00601000-00602000 rw-p 00001000 00:4e 1692 /usr/local/bin/guile
00f77000-01a97000 rw-p 00000000 00:00 0 [heap]
7f2e527fd000-7f2e527fe000 ---p 00000000 00:00 0
7f2e527fe000-7f2e52ffe000 rw-p 00000000 00:00 0
7f2e52ffe000-7f2e52fff000 ---p 00000000 00:00 0
7f2e52fff000-7f2e537ff000 rw-p 00000000 00:00 0
7f2e537ff000-7f2e53800000 ---p 00000000 00:00 0
7f2e53800000-7f2e54000000 rw-p 00000000 00:00 0
7f2e54000000-7f2e54049000 rw-p 00000000 00:00 0
7f2e54049000-7f2e58000000 ---p 00000000 00:00 0
7f2e587f9000-7f2e587fa000 ---p 00000000 00:00 0
7f2e587fa000-7f2e58ffa000 rw-p 00000000 00:00 0
7f2e58ffa000-7f2e58ffb000 ---p 00000000 00:00 0
7f2e58ffb000-7f2e597fb000 rw-p 00000000 00:00 0
7f2e597fb000-7f2e597fc000 ---p 00000000 00:00 0
7f2e597fc000-7f2e59ffc000 rw-p 00000000 00:00 0
7f2e59ffc000-7f2e59ffd000 ---p 00000000 00:00 0
7f2e59ffd000-7f2e5a7fd000 rw-p 00000000 00:00 0
7f2e5a7fd000-7f2e5a7fe000 ---p 00000000 00:00 0
7f2e5a7fe000-7f2e5affe000 rw-p 00000000 00:00 0
7f2e5affe000-7f2e5afff000 ---p 00000000 00:00 0
7f2e5afff000-7f2e5b7ff000 rw-p 00000000 00:00 0
7f2e5b7ff000-7f2e5b800000 ---p 00000000 00:00 0
7f2e5b800000-7f2e5c000000 rw-p 00000000 00:00 0
7f2e5c000000-7f2e5c021000 rw-p 00000000 00:00 0
7f2e5c021000-7f2e60000000 ---p 00000000 00:00 0
7f2e60303000-7f2e606f5000 rw-p 00000000 00:00 0
7f2e606f5000-7f2e606f6000 ---p 00000000 00:00 0
7f2e606f6000-7f2e60ef6000 rw-p 00000000 00:00 0
7f2e60ef6000-7f2e60ef7000 ---p 00000000 00:00 0
7f2e60ef7000-7f2e616f7000 rw-p 00000000 00:00 0
7f2e616f7000-7f2e616f8000 ---p 00000000 00:00 0
7f2e616f8000-7f2e61f28000 rw-p 00000000 00:00 0
7f2e61f28000-7f2e61f38000 r--p 00000000 00:4e 3699 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/stimulation.scm.go
7f2e61f38000-7f2e61f39000 rw-p 00010000 00:4e 3699 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/stimulation.scm.go
7f2e61f39000-7f2e61f79000 r--p 00000000 00:4e 3698 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/cs-parse.scm.go
7f2e61f79000-7f2e61f83000 rw-p 00040000 00:4e 3698 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/cs-parse.scm.go
7f2e61f83000-7f2e61f95000 r--p 0004a000 00:4e 3698 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/cs-parse.scm.go
7f2e61f95000-7f2e61fa5000 r--p 00000000 00:4e 3697 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/matcher.scm.go
7f2e61fa5000-7f2e61fa7000 rw-p 00010000 00:4e 3697 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/matcher.scm.go
7f2e61fa7000-7f2e61fa9000 r--p 00012000 00:4e 3697 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/matcher.scm.go
7f2e61fa9000-7f2e61fb9000 r--p 00000000 00:4e 3696 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/translator.scm.go
7f2e61fb9000-7f2e61fbc000 rw-p 00010000 00:4e 3696 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/translator.scm.go
7f2e61fbc000-7f2e61fc2000 r--p 00013000 00:4e 3696 /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/translator.scm.go
@stellarspot
Contributor Author

Here is the stack trace from LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so guile samples/ghost/sample-ghost-notebook.scm

[2019-04-26 15:28:46:364] [WARN] [GHOST] Did you forget to link a goal to the rule?
[2019-04-26 15:28:46:399] [WARN] [GHOST] Did you forget to link a goal to the rule?
[2019-04-26 15:28:46:454] [INFO] [say] (softmegacorp is a fantastic company !)
[2019-04-26 15:28:46:493] [INFO] [say] (i work in SoftMegaCorp company.)
[2019-04-26 15:28:47:575] [INFO] [say] (hardmegacorp is a fantastic company !)
*** Segmentation fault
Register dump:

 RAX: 00007f36d46144a8   RBX: 00007f36caef17e0   RCX: 0000000000000000
 RDX: 0000000000cbacc0   RSI: 0000000000000000   RDI: 0000000000cbacc0
 RBP: 00007f36caef1c20   R8 : 0000000000000000   R9 : 00007f36caef1f00
 R10: 0000000000000001   R11: 0000000000000283   R12: 00007f36caef1790
 R13: 0000000000cd6580   R14: 00007f36caef1c30   R15: 0000000000000001
 RSP: 00007f36caef1748

 RIP: 00007f36d440acc3   EFLAGS: 00010202

 CS: 0033   FS: 0000   GS: 0000

 Trap: [2019-04-26 15:28:47:617] [INFO] [say] (alice work in HardMegaCorp company.)
0000000e   Error: 00000004   OldMask: 00000000   CR2: 00000008

 FPUCW: 0000037f   FPUSW: 00000420   TAG: 00007f36
 RIP: d1bba217   RDP: 00000000

 ST(0) ffff ffffffffb5829040   ST(1) ffff c0a15421076a650b
 ST(2) ffff e000000000000000   ST(3) ffff 8000000000000000
 ST(4) ffff 96b81373055aef41   ST(5) ffff a000000000000000
 ST(6) ffff a000000000000000   ST(7) b000 b000000000000000
 mxcsr: 1fa0
 XMM0:  0000000000000000000000007f6f6e20 XMM1:  0000000000000000000000007f6f6e20
 XMM2:  0000000000000000000000007f6f6e20 XMM3:  0000000000000000000000007f6f6e20
 XMM4:  0000000000000000000000007f6f6e20 XMM5:  0000000000000000000000007f6f6e20
 XMM6:  0000000000000000000000007f6f6e20 XMM7:  0000000000000000000000007f6f6e20
 XMM8:  0000000000000000000000007f6f6e20 XMM9:  0000000000000000000000007f6f6e20
 XMM10: 0000000000000000000000007f6f6e20 XMM11: 0000000000000000000000007f6f6e20
 XMM12: 0000000000000000000000007f6f6e20 XMM13: 0000000000000000000000007f6f6e20
 XMM14: 0000000000000000000000007f6f6e20 XMM15: 0000000000000000000000007f6f6e20

Backtrace:
/usr/local/lib/opencog/libsmob.so(_ZNSt23_Sp_counted_ptr_inplaceIN7opencog10FloatValueESaIS1_ELN9__gnu_cxx12_Lock_policyE2EE14_M_get_deleterERKSt9type_info+0x3)[0x7f36d440acc3]
/usr/local/lib/opencog/libatombase.so(_ZNK7opencog4Node9to_stringERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x280)[0x7f36d24e6a20]
/usr/local/lib/opencog/libatomspace.so(_ZNK7opencog4Atom9to_stringB5cxx11Ev+0x3f)[0x7f36d41c361f]
/usr/local/lib/opencog/libexecution.so(+0x19d4f)[0x7f36d3893d4f]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink15do_eval_scratchEPNS_9AtomSpaceERKNS_6HandleES2_b+0x214)[0x7f36d3892f74]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink11do_evaluateEPNS_9AtomSpaceERKNS_6HandleEb+0x10)[0x7f36d3895490]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator9walk_treeERKNS_6HandleEb+0xc07)[0x7f36d38a1137]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator11instantiateERKNS_6HandleERKSt3mapIS1_S1_St4lessIS1_ESaISt4pairIS2_S1_EEEb+0x6ab)[0x7f36d38a2f4b]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator7executeERKNS_6HandleEb+0x4b)[0x7f36d38a30cb]
/usr/local/lib/opencog/libexec.so(+0x26f3)[0x7f36cdf9c6f3]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog12FunctionWrap14as_wrapper_v_hENS_6HandleE+0x22)[0x7f36d43f3842]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog15SchemePrimitiveISt10shared_ptrINS_5ValueEENS_12FunctionWrapEJNS_6HandleEEE6invokeEP17scm_unused_struct+0x68)[0x7f36d43f3d88]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog16PrimitiveEnviron7do_callEP17scm_unused_structS2_+0x28)[0x7f36d43f4c88]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(scm_primitive_eval+0x27)[0x7f36dbe568e7]
/usr/local/lib/libguile-2.2.so.1(scm_eval+0x53)[0x7f36dbe56943]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(+0xb85a9)[0x7f36dbec45a9]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval11do_scm_evalEP17scm_unused_structPFS2_PvE+0xc7)[0x7f36d43f0ae7]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval12do_apply_scmERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_6HandleE+0x1ec)[0x7f36d43f13bc]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval7apply_vERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_6HandleE+0x27)[0x7f36d43f1547]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog19ExecutionOutputLink10do_executeEPNS_9AtomSpaceERKNS_6HandleES5_b+0x6d9)[0x7f36d389ac49]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog19ExecutionOutputLink7executeEPNS_9AtomSpaceEb+0x575)[0x7f36d389ce35]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator9walk_treeERKNS_6HandleEb+0x96c)[0x7f36d38a0e9c]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator13walk_sequenceERSt6vectorINS_6HandleESaIS2_EERKS4_b+0x69e)[0x7f36d38a012e]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator9walk_treeERKNS_6HandleEb+0xd5e)[0x7f36d38a128e]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog12Instantiator11instantiateERKNS_6HandleERKSt3mapIS1_S1_St4lessIS1_ESaISt4pairIS2_S1_EEEb+0x6ab)[0x7f36d38a2f4b]
/usr/local/lib/opencog/libopenpsi.so(_ZN7opencog17OpenPsiImplicator5implyERKNS_6HandleERNS_12OpenPsiRulesE+0x1c7)[0x7f36ccb6b407]
/usr/local/lib/opencog/libopenpsi.so(_ZN7opencog10OpenPsiSCM5implyERKNS_6HandleE+0x43)[0x7f36ccb75183]
/usr/local/lib/opencog/libopenpsi.so(_ZN7opencog15SchemePrimitiveINS_6HandleENS_10OpenPsiSCMEIRKS1_EE6invokeEP17scm_unused_struct+0x68)[0x7f36ccb764a8]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog16PrimitiveEnviron7do_callEP17scm_unused_structS2_+0x28)[0x7f36d43f4c88]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(scm_primitive_eval+0x27)[0x7f36dbe568e7]
/usr/local/lib/libguile-2.2.so.1(scm_eval+0x53)[0x7f36dbe56943]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(+0xb85a9)[0x7f36dbec45a9]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval11do_scm_evalEP17scm_unused_structPFS2_PvE+0xc7)[0x7f36d43f0ae7]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval12do_apply_scmERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKNS_6HandleE+0x1ec)[0x7f36d43f13bc]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval14c_wrap_apply_vEPv+0x1b)[0x7f36d43f185b]
/usr/local/lib/libguile-2.2.so.1(+0x4368a)[0x7f36dbe4f68a]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(+0xb85a9)[0x7f36dbec45a9]
/usr/local/lib/libguile-2.2.so.1(+0x43c90)[0x7f36dbe4fc90]
/usr/local/lib/libguile-2.2.so.1(scm_c_with_continuation_barrier+0x45)[0x7f36dbe4fd75]
/usr/local/lib/libgc.so.1(GC_call_with_stack_base+0x22)[0x7f36db5c5f12]
/usr/local/lib/libguile-2.2.so.1(scm_with_guile+0x38)[0x7f36dbec3468]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog10SchemeEval7apply_vERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_6HandleE+0xd3)[0x7f36d43f15f3]
/usr/local/lib/opencog/libexecution.so(_Z17do_eval_with_argsPN7opencog9AtomSpaceERKNS_6HandleERKSt6vectorIS2_SaIS2_EEb+0x2fb)[0x7f36d389579b]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink15do_eval_scratchEPNS_9AtomSpaceERKNS_6HandleES2_b+0x418)[0x7f36d3893178]
/usr/local/lib/opencog/libexecution.so(+0x19e6a)[0x7f36d3893e6a]
/usr/local/lib/opencog/libexecution.so(+0x1a185)[0x7f36d3894185]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink15do_eval_scratchEPNS_9AtomSpaceERKNS_6HandleES2_b+0x214)[0x7f36d3892f74]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink15do_eval_scratchEPNS_9AtomSpaceERKNS_6HandleES2_b+0x38e)[0x7f36d38930ee]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink15do_eval_scratchEPNS_9AtomSpaceERKNS_6HandleES2_b+0x46a)[0x7f36d38931ca]
/usr/local/lib/opencog/libexecution.so(_ZN7opencog14EvaluationLink11do_evaluateEPNS_9AtomSpaceERKNS_6HandleEb+0x10)[0x7f36d3895490]
/usr/local/lib/opencog/libexec.so(+0x249b)[0x7f36cdf9c49b]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog12FunctionWrap14as_wrapper_p_hENS_6HandleE+0x22)[0x7f36d43f3812]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog15SchemePrimitiveISt10shared_ptrIKNS_10TruthValueEENS_12FunctionWrapEJNS_6HandleEEE6invokeEP17scm_unused_struct+0x64)[0x7f36d43f3f34]
/usr/local/lib/opencog/libsmob.so(_ZN7opencog16PrimitiveEnviron7do_callEP17scm_unused_structS2_+0x28)[0x7f36d43f4c88]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(scm_call_with_unblocked_asyncs+0x38)[0x7f36dbe46278]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(+0xb6f56)[0x7f36dbec2f56]
/usr/local/lib/libguile-2.2.so.1(+0x4368a)[0x7f36dbe4f68a]
/usr/local/lib/libguile-2.2.so.1(+0xc663b)[0x7f36dbed263b]
/usr/local/lib/libguile-2.2.so.1(scm_call_n+0x182)[0x7f36dbed56b2]
/usr/local/lib/libguile-2.2.so.1(+0xb85a9)[0x7f36dbec45a9]
/usr/local/lib/libguile-2.2.so.1(+0x43c90)[0x7f36dbe4fc90]
/usr/local/lib/libguile-2.2.so.1(scm_c_with_continuation_barrier+0x45)[0x7f36dbe4fd75]
/usr/local/lib/libguile-2.2.so.1(+0xb707c)[0x7f36dbec307c]
/usr/local/lib/libgc.so.1(GC_call_with_stack_base+0x22)[0x7f36db5c5f12]
/usr/local/lib/libguile-2.2.so.1(+0xb670d)[0x7f36dbec270d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f36dbbf66ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f36db92c41d]

Memory map:

00400000-00401000 r-xp 00000000 00:4e 1692                               /usr/local/bin/guile
00600000-00601000 r--p 00000000 00:4e 1692                               /usr/local/bin/guile
00601000-00602000 rw-p 00001000 00:4e 1692                               /usr/local/bin/guile
00acb000-015de000 rw-p 00000000 00:00 0                                  [heap]
7f36bb7ff000-7f36bb800000 ---p 00000000 00:00 0 
7f36bb800000-7f36bc000000 rw-p 00000000 00:00 0 
7f36bc000000-7f36bc04d000 rw-p 00000000 00:00 0 
7f36bc04d000-7f36c0000000 ---p 00000000 00:00 0 
7f36c07f9000-7f36c07fa000 ---p 00000000 00:00 0 
7f36c07fa000-7f36c0ffa000 rw-p 00000000 00:00 0 
7f36c0ffa000-7f36c0ffb000 ---p 00000000 00:00 0 
7f36c0ffb000-7f36c17fb000 rw-p 00000000 00:00 0 
7f36c17fb000-7f36c17fc000 ---p 00000000 00:00 0 
7f36c17fc000-7f36c1ffc000 rw-p 00000000 00:00 0 
7f36c1ffc000-7f36c1ffd000 ---p 00000000 00:00 0 
7f36c1ffd000-7f36c27fd000 rw-p 00000000 00:00 0 
7f36c27fd000-7f36c27fe000 ---p 00000000 00:00 0 
7f36c27fe000-7f36c2ffe000 rw-p 00000000 00:00 0 
7f36c2ffe000-7f36c2fff000 ---p 00000000 00:00 0 
7f36c2fff000-7f36c37ff000 rw-p 00000000 00:00 0 
7f36c37ff000-7f36c3800000 ---p 00000000 00:00 0 
7f36c3800000-7f36c4000000 rw-p 00000000 00:00 0 
7f36c4000000-7f36c4021000 rw-p 00000000 00:00 0 
7f36c4021000-7f36c8000000 ---p 00000000 00:00 0 
7f36c82ea000-7f36c86f2000 rw-p 00000000 00:00 0 
7f36c86f2000-7f36c86f3000 ---p 00000000 00:00 0 
7f36c86f3000-7f36c8ef3000 rw-p 00000000 00:00 0 
7f36c8ef3000-7f36c8ef4000 ---p 00000000 00:00 0 
7f36c8ef4000-7f36c96f4000 rw-p 00000000 00:00 0 
7f36c96f4000-7f36c96f5000 ---p 00000000 00:00 0 
7f36c96f5000-7f36c9ef5000 rw-p 00000000 00:00 0 
7f36c9ef5000-7f36c9ef6000 ---p 00000000 00:00 0 
7f36c9ef6000-7f36ca6f6000 rw-p 00000000 00:00 0 
7f36ca6f6000-7f36ca6f7000 ---p 00000000 00:00 0 
7f36ca6f7000-7f36caf17000 rw-p 00000000 00:00 0 
7f36caf17000-7f36caf27000 r--p 00000000 00:4e 3699                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/stimulation.scm.go
7f36caf27000-7f36caf28000 rw-p 00010000 00:4e 3699                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/stimulation.scm.go
7f36caf28000-7f36caf68000 r--p 00000000 00:4e 3698                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/cs-parse.scm.go
7f36caf68000-7f36caf72000 rw-p 00040000 00:4e 3698                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/cs-parse.scm.go
7f36caf72000-7f36caf84000 r--p 0004a000 00:4e 3698                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/cs-parse.scm.go
7f36caf84000-7f36caf94000 r--p 00000000 00:4e 3697                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/matcher.scm.go
7f36caf94000-7f36caf96000 rw-p 00010000 00:4e 3697                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/matcher.scm.go
7f36caf96000-7f36caf98000 r--p 00012000 00:4e 3697                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/matcher.scm.go
7f36caf98000-7f36cafa8000 r--p 00000000 00:4e 3696                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/translator.scm.go
7f36cafa8000-7f36cafab000 rw-p 00010000 00:4e 3696                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/translator.scm.go
7f36cafab000-7f36cafb1000 r--p 00013000 00:4e 3696                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/translator.scm.go
7f36cafb1000-7f36cafc1000 r--p 00000000 00:4e 3695                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/terms.scm.go
7f36cafc1000-7f36cafc4000 rw-p 00010000 00:4e 3695                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/terms.scm.go
7f36cafc4000-7f36cafc9000 r--p 00013000 00:4e 3695                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/terms.scm.go
7f36cafc9000-7f36cafd9000 r--p 00000000 00:4e 3694                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/utils.scm.go
7f36cafd9000-7f36cafdc000 rw-p 00010000 00:4e 3694                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/utils.scm.go
7f36cafdc000-7f36cafe0000 r--p 00013000 00:4e 3694                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/utils.scm.go
7f36cafe0000-7f36caff0000 r--p 00000000 00:4e 3693                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/test.scm.go
7f36caff0000-7f36caff3000 rw-p 00010000 00:4e 3693                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/test.scm.go
7f36caff3000-7f36caff6000 r--p 00013000 00:4e 3693                       /home/stellarspot/.cache/guile/ccache/2.2-LE-8-3.A/usr/local/share/guile/site/2.2/opencog/ghost/test.scm.go

@linas
Member

linas commented Apr 26, 2019

The opencog.log should also contain a stack trace; it's formatted a bit more nicely, making it easier to read. The above suggests a null-pointer deref.

@linas
Member

linas commented Apr 26, 2019

BTW, the correct English word is "examples" not "samples".

@linas
Member

linas commented Apr 26, 2019

Hmm. OK, I can reproduce the crash, but somehow you disabled the normal crash-processing code? I'm not able to debug this in the usual way; something is preventing a normal crash dump...

@linas
Member

linas commented Apr 27, 2019

So, in gdb:

Thread 19 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe247a700 (LWP 5495)]
0x00007fffeda86530 in typeinfo for std::_Sp_counted_ptr_inplace<opencog::FloatValue, std::allocator<opencog::FloatValue>, (__gnu_cxx::_Lock_policy)2> ()
   from /usr/local/lib/opencog/libsmob.so
(gdb) bt
#0  0x00007fffeda86530 in typeinfo for std::_Sp_counted_ptr_inplace<opencog::FloatValue, std::allocator<opencog::FloatValue>, (__gnu_cxx::_Lock_policy)2> ()
   from /usr/local/lib/opencog/libsmob.so
Backtrace stopped: Cannot access memory at address 0x7fffe24765d8

which is confusing. I don't understand where the backtrace printing is coming from. Who is printing that?

@linas
Member

linas commented Apr 27, 2019

Hmm, I see this:

#1  0x00007fffeb710989 in opencog::TruthValue::isDefaultTV (
    this=0x555556274630)
    at /home/linas/src/novamente/src/atomspace-git/opencog/atoms/truthvalue/TruthValue.cc:81
(gdb) print dtv
$11 = std::shared_ptr (expired, weak 0) 0x555555f2c530

So DEFAULT_TV() is somehow a pointer to something that has been freed. Since almost all atoms have the DEFAULT_TV, this should be impossible.

@linas
Member

linas commented Apr 27, 2019

Assorted checks in various places show, consistently and reproducibly, that both the TV and AV use-counts become too low, leading to access of deleted memory and other crazy symptoms. How exactly that comes to be remains mysterious to me, but presumably it is because std::shared_ptr is not thread-safe, and some thread race condition causes an increment or decrement to fail.
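
For weighing the race-condition hypothesis, it helps to recall one thing the C++ standard does guarantee: the use count inside a shared_ptr control block is updated atomically, so threads that each make and drop their own copies of a shared_ptr cannot lose an increment or decrement (what is not safe is modifying a single shared_ptr object from several threads at once). The sketch below is a minimal check of that guarantee. The name use_count_after_stress and its parameters are made up for illustration and are not part of the atomspace.

```cpp
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: copy and drop one shared object from several threads,
// then report the use count once every thread has been joined.
long use_count_after_stress(std::size_t n_threads, std::size_t n_iters)
{
    // Stand-in for a widely shared value such as a default truth value.
    const std::shared_ptr<double> value = std::make_shared<double>(1.0);

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([&value, n_iters] {
            for (std::size_t i = 0; i < n_iters; ++i) {
                // Copying increments the count atomically; the copy's
                // destructor decrements it atomically at the end of the scope.
                std::shared_ptr<double> local = value;
                (void)local;
            }
        });
    }
    for (auto& w : workers) w.join();

    // Only `value` itself is left holding a reference.
    return value.use_count();
}
```

A call such as use_count_after_stress(4, 10000) returns 1 once all threads are joined, which suggests that a use count dropping too low is more likely caused by an owner being destroyed outright than by a lost update.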

@linas
Member

linas commented Apr 27, 2019

Found it, but only after 8+ hours of difficult debugging. Someone (not sure who: either this example demo or ghost itself) has issued the string (quit) to the guile command line while the atomspace is still running. First, @stellarspot please verify that it is not your code that is doing this. Next, it would be up to @amebel to verify that ghost itself is not doing this anywhere.

What is happening is that (quit) is calling the exit handlers, and is freeing static memory, including the DEFAULT_TV truth value, even while other threads are still running and doing something or other. Those other threads access the freed memory, and crash in any one of a number of different ways.

Here's how I debugged this, in case you ever want to do the same thing. First, notice that the use-count of DEFAULT_TV() drops to zero. So, who is decrementing it? How can this be printed out?

So edit /usr/include/c++/6/bits/shared_ptr_base.h and near line 148, make it look like this:

      void
      _M_release() noexcept
      {
long cnt = __atomic_load_n(&_M_use_count, __ATOMIC_RELAXED); // <<< linas
grbstk(this, cnt);  // <<< linas
        // Be race-detector-friendly.  For more info see bits/c++config.
        _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(&_M_use_count);
        if (__gnu_cxx::__exchange_and_add_dispatch(&_M_use_count, -1) == 1)
          {
            _GLIBCXX_SYNCHRONIZATION_HAPPENS_AFTER(&_M_use_count);
            _M_dispose();

and the custom routine grbstk() points at the guilty party:

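// Needs <cassert>, <cstdio>, <functional> and <thread>.
namespace std {
// Debugging hook called from _M_release(); p is the control block, cnt its use count.
void grbstk(void* p, long cnt)
{
    // Only trace the object of interest; opencog::dflt is a debugging global set elsewhere (not shown).
    if (p != opencog::dflt) return;
    long tid = std::hash<std::thread::id>{}(std::this_thread::get_id());
    // printf("duuuude ---> %p %p %ld\n", opencog::dflt, p, cnt);
    printf("duuuude ---> dtor cnt=%ld tid=%lx\n", cnt, tid);
    fflush(stdout);
    // Abort (and hand control to the debugger) once the count gets suspiciously low.
    assert(2 < cnt);
}
}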

We now have a stack trace of whodunnit:


#9  0x00007ffff6fec940 in __run_exit_handlers (status=0,
    listp=0x7ffff73505d8 <__exit_funcs>,
    run_list_atexit=run_list_atexit@entry=true,
    run_dtors=run_dtors@entry=true)
    at exit.c:83
#10 0x00007ffff6fec99a in __GI_exit (status=<optimized out>)
    at exit.c:105
#11 0x00007ffff7af1ecf in c_handler (d=0x7fffffffe120,
    tag=0x55555596f260, args=0x304)
    at ../../libguile/continuations.c:433
#12 0x00007ffff7b714d1 in vm_regular_engine (thread=0x2,
    vp=0x55555595df30, registers=0x0, resume=-151085057)
    at ../../libguile/vm-engine.c:786

Why the heck would c_handler() cause finalizers to run? Let's see: libguile/continuations.c:433 looks like this:

static SCM
c_handler (void *d, SCM tag, SCM args)
{
  struct c_data *data;

  /* If TAG is `quit', exit() the process.  */
  if (scm_is_eq (tag, scm_from_latin1_symbol ("quit")))
    exit (scm_exit_status (args));

  data = (struct c_data *)d;
  data->result = NULL;
  return SCM_UNSPECIFIED;
}

OK, right. Someone typed (quit) into guile while ghost was still running.
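
Patching the libstdc++ header works, but it touches a system file and needs the affected code rebuilt. When the question is only who drops the last reference, a custom deleter gives a less invasive trap. The following is a minimal sketch under that assumption, not what was done in this debugging session. DefaultValue, make_traced_default and the on_last_release hook are hypothetical names, and unlike grbstk() the deleter sees only the final release, not every decrement.

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the shared default truth value.
struct DefaultValue { double strength = 1.0; };

// Build a shared_ptr whose deleter reports which thread dropped the last
// reference and then calls `on_last_release` before freeing the object.
std::shared_ptr<DefaultValue>
make_traced_default(std::function<void()> on_last_release)
{
    return std::shared_ptr<DefaultValue>(
        new DefaultValue,
        [hook = std::move(on_last_release)](DefaultValue* p) {
            // Runs exactly once, in the thread that released the last reference.
            std::size_t tid =
                std::hash<std::thread::id>{}(std::this_thread::get_id());
            std::fprintf(stderr, "last reference dropped, tid=%zx\n", tid);
            if (hook) hook();   // e.g. a function to set a breakpoint on
            delete p;
        });
}
```

Setting a breakpoint in the hook, or having it call std::abort(), yields a backtrace of whoever released the last reference.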

@linas
Copy link
Member

linas commented Apr 27, 2019

Anyway .. I'm done debugging. I leave it up to you to figure out who is telling guile to quit and force-run exit handlers (thus shutting down not just truth values, but all of the atomspace) while the atomspace is still in use.

@stellarspot
Contributor Author

I believe that the problem is the following.

(ghost-run) calls (psi-run ghost-component), which uses call-with-new-thread to run its loop in a new thread.
When the interaction with GHOST is finished, guile frees the Atomspace, but the loop in the separate thread is still using it.
Here is a simple snippet that always reproduces the segmentation fault on my side:

(use-modules
 (opencog)
 (opencog nlp)
 (opencog nlp relex2logic)
 (opencog openpsi)
 (opencog ghost)
 (opencog ghost procedures)
)

(ghost-parse "
r: (where [do does] _* work) '_0 work in SomeCompany company.
")

(ghost-run)

There is a method, (ghost-halt), whose aim is to stop the running loop. What it actually does is set a false value on the (Predicate "run-loop") node.

However, ghost-halt has the same flaw. It sets the flag to terminate the loop, but after that guile can free the Atomspace while the run-loop is still running. When the run-loop then tries to read the value from the predicate node, the node may no longer exist.

I can implement a solution where psi-halt sets the false value on the predicate node and then joins the loop thread to wait until it has finished.
Here is a rough sketch that shows the main idea:

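(use-modules (ice-9 threads))  ;; call-with-new-thread, join-thread

;; Flag polled by the loop; setting it to #f asks the loop to stop.
(define run #t)

(define (repl)
 (display "run-loop\n")
 (if run
  (begin
   (sleep 1)
   (repl))
  (display "Exit!\n")))

;; Start the loop in its own thread.
(define repl-thread
 (call-with-new-thread
  (lambda () (repl))))

(sleep 3)

;; Ask the loop to stop, then wait for the thread to finish.
;; The join-thread timeout is an absolute time, so wait up to 5 seconds from now.
(set! run #f)
(join-thread repl-thread (+ (current-time) 5))
(sleep 3)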

This solution requires that (ghost-halt) is always called before exiting guile.
I have not yet found a way for Scheme to close threads automatically before exiting (maybe some hooks or something like that).
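
The same flag-then-join idea carries over to the C++ side. The sketch below is an illustration only: LoopRunner is a hypothetical class, not the psi-run implementation or the code of any pull request. It also joins in its destructor, which is one way to cover the case where the caller forgets to call halt, provided the object itself is destroyed before the data the loop uses.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the psi-run loop: runs until halt() is called.
class LoopRunner {
public:
    LoopRunner() : running_(true), steps_(0), worker_([this] { loop(); }) {}

    // Never let the thread outlive the object that owns it.
    ~LoopRunner() { halt(); }

    LoopRunner(const LoopRunner&) = delete;
    LoopRunner& operator=(const LoopRunner&) = delete;

    // Ask the loop to stop, then wait until it really has stopped.
    void halt()
    {
        running_.store(false);
        if (worker_.joinable()) worker_.join();
    }

    // True once the loop thread has been joined.
    bool finished() const { return !worker_.joinable(); }

    // Number of loop iterations executed so far.
    long steps() const { return steps_.load(); }

private:
    void loop()
    {
        while (running_.load()) {
            steps_.fetch_add(1);  // stand-in for one step of the real loop
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> running_;
    std::atomic<long> steps_;
    std::thread worker_;  // declared last so the flags exist before the thread starts
};
```

halt() is meant to be called from a single controlling thread; calling it twice is harmless because the second call finds nothing left to join.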

@linas
Member

linas commented Apr 29, 2019

I'm guessing that ghost starts one or more threads. So, ghost-halt should call join-thread on every thread that it starts.

The problem with exit-hooks is that they are run in arbitrary order. There is no particular way to make one exit hook run before another; there's no priority ordering. To be clear: even if you did use atexit() to call (ghost-halt), there is no guarantee that ghost-halt will run before the atomspace exit handlers.

@leungmanhin
Contributor

@stellarspot, just want to make sure, are you getting this segfault only when exiting Guile when the loop was still running? Or does it crash even without exiting Guile?

@stellarspot
Contributor Author

> I'm guessing that ghost starts one or more threads. So, ghost-halt should call join-thread on every thread that it starts.

I prepared the suggested fix in #3487.

It also requires the user to always call psi-halt for each thread created by psi-run.

@stellarspot
Contributor Author

> @stellarspot, just want to make sure, are you getting this segfault only when exiting Guile when the loop was still running? Or does it crash even without exiting Guile?

I have not seen the crash while working in Guile, only when it is exiting.
