Go API for YottaDB #205

Open
ksbhaskar opened this issue Apr 20, 2018 · 1 comment

Comments

@ksbhaskar
Member

Final Release Note

Description

Go is a popular language. YottaDB would benefit from a Go API implemented as a wrapper for the C Simple API as well as the call-in interface to M.

Draft Release Note

@shabiel
Contributor

shabiel commented Apr 20, 2018 via email

chathaway-codes pushed a commit that referenced this issue Oct 16, 2018
…IGALRM to specific thread (needed by YottaDB Go API)

The "posix_thread_timer_id" global variable notes down the thread-id that gets the SIGALRM signal
whenever a timer pops. Once the Go API is implemented (in a separate commit), it needs to ensure
this global variable is appropriately initialized before using it in a multi-threaded environment.

Misc changes
-------------
* Fix source server (an example of a process that does a "fork" and calls "timer_create" in both parent
  and child processes) to startup fine without EINVAL errors due to inherited "posix_thread_timer_id".
  This is done by observing that all places which set "process_id" global variable have this issue
  and therefore introducing a SET_PROCESS_ID macro which takes care of setting "process_id" as well
  as clearing "posix_thread_timer_id" (and a related global variable "posix_timer_created").

* Remove CITPNESTED message (nixed in r1.20 as part of #188)

* sr_unix/gt_timers.c : Code changes needed to get v61000/setitimer_fail subtest to pass
  The main change is to change an assert(FALSE) to an assert(WBTEST_ENABLED(WBTEST_SETITIMER_ERROR))
  in case timer_settime() fails or if the white-box test case is defined (which it is by the
  v61000/setitimer_fail subtest). This way we do not get a core file in the test. And save_errno
  was set to EINVAL in the white-box test failure case. This way we verify ENO22 shows up in the
  v61000/setitimer_fail subtest reference file. While at this, a VARLSTCNT usage was found to be
  incorrect so it was fixed.
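
The SET_PROCESS_ID idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual YottaDB macro: the global names come from the commit message, but their types, the macro body, and the helper demo_reset_clears_timer_state() are assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Simplified stand-ins for the globals named in the commit message (types assumed). */
static uint32_t  process_id;
static pthread_t posix_thread_timer_id;
static bool      posix_timer_created;

/* Set process_id and forget any timer state inherited across fork(). */
#define SET_PROCESS_ID                                                    \
do {                                                                      \
	process_id = (uint32_t)getpid();                                  \
	memset(&posix_thread_timer_id, 0, sizeof(posix_thread_timer_id)); \
	posix_timer_created = false;                                      \
} while (0)

/* Hypothetical helper: pretend a timer existed before the fork, then
 * re-initialize the way a child process would.
 */
static bool demo_reset_clears_timer_state(void)
{
	posix_thread_timer_id = pthread_self();
	posix_timer_created = true;
	SET_PROCESS_ID;
	return !posix_timer_created && (process_id == (uint32_t)getpid());
}
```

Routing every assignment of "process_id" through one macro means a child process cannot keep the parent's timer thread-id by accident.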
chathaway-codes pushed a commit that referenced this issue Nov 21, 2018
Module modification/addition details:

sr_aarch64/error.si - Modify assembler ESTABLISH macro so it properly decrements rts_error_level
        global variable on error return or MUM_TSTART.

sr_armv7l/error.si - Modify assembler ESTABLISH macro so it properly decrements rts_error_level
        global variable on error return or MUM_TSTART.

sr_linux/gen_threadgbl_asm.cmake - Fix macro so it writes all entries instead of overwriting
        the last record for each record in gtm_threadgbl_asm_access.txt.

sr_linux/platform.cmake - Added all the new entry points for SimpleThreadAPI (e.g. ydb_get_st())

sr_port/alias_funcs.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/callg.h - Add function pointer type and new routine callg_nc() which, like callg(), passes
        parameters on to the specified function but callg_nc() does not include the count of parms
        as the first parameter.

sr_port/callg_nc.c - As described above, performs same as callg() but does not pass arg count.
        This is needed to be able to call ydb_lock_s() which has a variadic plist but no count.

sr_port/cre_private_code_copy.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/db_csh_ref.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/dse_dmp.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/dse_f_blk.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/dump_lockhist.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/gbldefs.c - Removed 0 initializers since static vars are zero-initialized by default.
        Added thread_mutex_holder_rtn and thread_mutex_holder_line vars for debugging purposes.
        Added several vars for use by SimpleThreadAPI (stmWorkQueue[], stmTPWorkQueue, stmTPToken,
        simpleThreadAPI_active, noThreadAPI_active).

sr_port/gtm_env_init.c - Change trace table code to DEBUG only.

sr_port/gtm_malloc_src.h - Moved DEFERRED_SIGNAL_HANDLING_CHECK macro to after the pthread mutex
        unlock because that's where it should be.

sr_port/gtm_threadgbl_defs.h - Fix a couple formatting issues. Add values used by SimpleThreadAPI
        that ultimately will be thread private (curWorkQHeadIndx, curWorkQHead, rts_error_depth).

sr_port/gtm_threadgbl_init.c - Initialize curWorkQHeadIndx to -1 to signify not-in-use, since
        its in-use values start at 0.

sr_port/gtmmsg.h - Change return type to int from void.

sr_port/gvcst_data.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_get.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_kill.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_order.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_query.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_queryget.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_reservedDB_funcs.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/gvcst_reversequery.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/hashtab_implementation.h - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/io_init.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iorm_wtff.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_connect.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_flush.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_iocontrol.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_listen.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_write.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_wteol.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iosocket_wtff.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iott_wteol.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/iott_wtff.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/jnl_file_close.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/jobinterrupt_init.c - Move thread access macro to more appropriate position.

sr_port/libydberrors2.h - Regenerated.

sr_port/lke_showlock.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/lv_val.h - Remove DEBUG_ALIAS (defines CURRENT_PC macro) logic, making it general
        purpose and putting it in mdef.h.

sr_port/mdb_condition_handler.h - Add gtm_pthread.h, trace_table.h, and caller_id.h includes.

sr_port/mdef.h - Add CURRENT_PC macro removed from lv_val.h.

sr_port/mdq.h - Fix a bunch of formatting issues.

sr_port/mlk_lock.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/mlk_nocrit_unlock.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/mlk_unlock.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/mlk_unpend.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/mtables.c - Fix spelling issue.

sr_port/mu_int_reg.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/mupip_backup.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/mupip_freeze.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/mur_forward.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/mur_multi_rehash.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_gvkill.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_gvput.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_gvzwithdraw.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_lock2.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_tstart.c - Rename TP_MAX_NEST to TP_MAX_LEVEL.

sr_port/op_xnew.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_zshow.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/op_zwritesvn.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/parse_trctbl_groups.c - Make trace table stuff DEBUG-only.

sr_port/performcaslatchcheck - Uniform change of process_id to type uint4 (what most of them are).

sr_port/region_init.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/relqueopi.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/repl_filter.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/setup_error.c - Add comment as to expanded use of this routine.

sr_port/shmpool.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/sockint_stats.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/stp_gcol_src.h - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/tp.h - Rename TP_MAX_NEST to TP_MAX_LEVEL.

sr_port/tp_restart.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/trace_table.h - Add a pthread lock to trace table additions so it works with threads.

sr_port/trace_table_types.h - Add a new SimpleThreadAPI (STAPI) group and trace points.

sr_port/trans_code.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_port/wcs_backoff.c - Uniform change of process_id to type uint4 (what most of them are).

sr_port/ydb_dmp_tracetbl.c - Routine to dump the trace table at process exit if enabled.

sr_port/ydberrors.h - Regenerated.

sr_port/ydberrors.msg - Remove the /ansi option on existing messages, add new messages for STAPI.
        (New messages are UNKNOWNSYSERR, STRUCTNOTALLOCD, PARMOFLOW, NODEEND, INVLNPAIRLIST,
        INVTPTRANS, INVAPIMODE). Note some of these errors are used ONLY in the Golang wrapper
        but are kept here for ease of use so we don't have to have 2 error systems in Golang.

sr_port/ydb_errors_ansi.h - Not needed - removed.

sr_port/ydberrors_ctl.c - Regenerated.

sr_port_cm/gtcm_exi_handler.c - Add threadgbl initialization needed by ESTABLISH* macros now plus
        some minor standards issues fixed.

sr_unix/add_inter.c - Uniform change of process_id to type uint4 (what most of them are).

sr_unix/dse_main.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/errorsp.h - Add trace table entries to ESTABLISH* macros and do decrement of rts_error_depth
        if appropriate.

sr_unix/file_input.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/gds_rundown.c - Add threadgbl initialization needed by ESTABLISH* macros now plus fix a
        broken declaration.

sr_unix/go_load.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/gt_timers.c - Put #ifndef YDB_USE_POSIX_TIMERS around piece of code no longer needed when
        using POSIX timers but is needed otherwise.

sr_unix/gtm_dump_core.c - Change the kill() call to a pthread_kill call so we can send the current
        thread the SIGQUIT.

sr_unix/gtm_exit_handler.c - Add a call to dump the trace table to the console if one exists.

sr_unix/gtm_getmsg.c - Add a non-zero return code if the error message number is not known.

sr_unix/gtm_image_exit.c - Remove unneeded continuation lines normally only used in a macro.

sr_unix/gtm_main.c - Initialize noThreadAPI_active to TRUE since we are about to start up mumps.

sr_unix/gtm_multi_thread.h - Add some debugging to PTHREAD_MUTEX_* macros, removed comments about
        gtm_malloc/free as the macros have much wider usage than just those. Added a test that the
        SimpleThreadAPI was active so even if ASYNCIO was not active, the macros would still operate
        as intended. Removed an assert that thread_mutex_initialized was in use because that's not
        a necessary prereq for using this routine any more.

sr_unix/gtm_startup.c - Initialize the STAPI work queue and mutex early to prevent race conditions.

sr_unix/gtm_threadgbl_asm_access.txt - Add rts_error_depth to list for access by assembler

sr_unix/gtm_trigger.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/gtmci.c - Add checks and logic so ydb_init() can only be run by one thread at a time. Add
        protection against mixing api modes (either all threaded or all non-threaded). Add logic in
        ydb_exit() to take down the pthread resources we created.

sr_unix/gtmci_ch.c - Release mutex lock as part of handling error.

sr_unix/gtmrecv.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/gtmsource_readfiles.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/gtmsource_srv_latch.c - Uniform change of process_id to type uint4 (what most of them are).

sr_unix/gv_trigger.c - Uniform change of process_id to type uint4 (what most of them are) and add
        threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/init_gtm.c - Rename gtm_startup_active to ydb_init_complete.

sr_unix/iorm_close.c - Get rid of some blank lines.

sr_unix/iorm_get.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iorm_open.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iorm_readfl.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iorm_write.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iorm_wteol.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iosocket_tls.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_close.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_edit.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_flush.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_open.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_rdone.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_readfl.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/iott_write.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/libyottadb.h - Add STAPI routine definitions, add additional macros and types for STAPI.

sr_unix/libyottadb_dbg.h - Isolate debugging macros and their enablement defines.

sr_unix/libyottadb_int.h - Define new structures and macros for STAPI support.

sr_unix/libyottadb_rtns.h - Add utilities since they use the LIBYOTTADB_INIT macro now. Add new
        routines ydb_message_s and ydb_call_variadic_plist_func_s().

sr_unix/lke_main.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/mu_extract.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/mu_rndwn_all.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/mu_size_arsample.c - Uniform change of process_id to type uint4 (what most of them are).

sr_unix/mu_size_impsample.c - Uniform change of process_id to type uint4 (what most of them are).

sr_unix/mu_size_scan.c - Uniform change of process_id to type uint4 (what most of them are).

sr_unix/mupip_exit_handler.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/mutex.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/op_fnrandom.c - Uniform change of process_id to type uint4 (what most of them are) and
        remove declarations of library routines defined already in system hdrs.

sr_unix/op_fnzpeek.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/pipeint_stats.c - Uniform change of process_id to type uint4 (what most of them are).

sr_unix/rts_error.c - Add a depth nest limit that prevents an error loop from eating the stack.

sr_unix/sapi_return_subscr_nodes.c - fix error in comment.

sr_unix/send_msg.c - Add same depth nest limit that rts_error has.

sr_unix/trigger_fill_xecute_buffer.c - Add threadgbl initialization needed by ESTABLISH*
        macros now.

sr_unix/trigger_source_read_andor_verify.c - Add threadgbl initialization needed by ESTABLISH*
        macros now.

sr_unix/trigger_trgfile.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/util_exit_handler.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix/ydb_call_variadic_plist_func_s.c - An undocumented part of the SimpleAPI used currently
        by Golang to drive ydb_lock_s() because cgo does not support variadic argument calls.

sr_unix/ydb_call_variadic_plist_func_st.c - SimpleThreadAPI version of call-variadic-plist-func.

sr_unix/ydb_child_init.c - Change in arguments in LIBYOTTADB_INIT() macro.

sr_unix/ydb_data_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add VERIFY_NON_THREADED_API
        macro.

sr_unix/ydb_data_st.c - SimpleThreadAPI version of ydb_data_s().

sr_unix/ydb_delete_excl_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_delete_excl_st.c - SimpleThreadAPI version of ydb_delete_excl_s().

sr_unix/ydb_delete_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_delete_st.c - SimpleThreadAPI version of ydb_delete_s().

sr_unix/ydb_free.c - Make sure YDB runtime is initialized.

sr_unix/ydb_free_t.c - SimpleThreadAPI version of ydb_free().

sr_unix/ydb_get_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_get_st.c - SimpleThreadAPI version of ydb_get_s().

sr_unix/ydb_hiber_start.c - Make sure YDB runtime is initialized.

sr_unix/ydb_hiber_start_wait_any.c - Make sure YDB runtime is initialized.

sr_unix/ydb_incr_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_incr_st.c - SimpleThreadAPI version of ydb_incr_s().

sr_unix/ydb_lock_decr_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_lock_decr_st.c - SimpleThreadAPI version of ydb_lock_decr_s().

sr_unix/ydb_lock_incr_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_lock_incr_st.c - SimpleThreadAPI version of ydb_lock_incr_s().

sr_unix/ydb_lock_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_lock_st.c - SimpleThreadAPI version of ydb_lock_s() (note this is not used in the Golang
        wrapper because cgo cannot call a variadic routine so it is implemented a different
        way but this is the correct call to use with C SimpleThreadAPI and perhaps other
        languages).

sr_unix/ydb_malloc.c - Make sure YDB runtime is initialized.

sr_unix/ydb_malloc_t.c - SimpleThreadAPI version same as ydb_malloc().

sr_unix/ydb_message.c - SimpleAPI fetch an error message given the error number.

sr_unix/ydb_message_t.c - SimpleThreadAPI version same as ydb_message_s().

sr_unix/ydb_node_next_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_node_next_st.c - SimpleThreadAPI version of ydb_node_next_s().

sr_unix/ydb_node_previous_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_node_previous_st.c - SimpleThreadAPI version of ydb_node_previous_s().

sr_unix/ydb_set_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_set_st.c - SimpleThreadAPI version of ydb_set_s().

sr_unix/ydb_simpleapi_ch.c - Add code to release pthread lock if holding it.

sr_unix/ydb_stdout_stderr_adjust.c - Make sure YDB runtime is initialized.

sr_unix/ydb_stdout_stderr_adjust_t.c - SimpleThread api version of ydb_stdout_stderr_adjust().

sr_unix/ydb_stm_args.c - SimpleThreadAPI - Stores arguments in a callblk and queues it on
        the appropriate queue for subsequent execution.

sr_unix/ydb_stm_freecallblk.c - SimpleThreadAPI - Puts a callblk (stm_que_ent) back on the
        free queue for reuse.

sr_unix/ydb_stm_getcallblk.c - SimpleThreadAPI - Either gets a callblk from the free queue
        or allocates a new one.

sr_unix/ydb_stm_init_work_queue.c - SimpleThreadAPI - Routine to initialize a work queue
        which includes allocating work queue, initializing the queue in it, and creating
        both the associated mutex and condition variable used for waiting on the queue.

sr_unix/ydb_stm_thread.c - SimpleThreadAPI - This routine picks up work items from either
        the main work queue, or if a TP transaction is in progress, from the TP work queue.

sr_unix/ydb_stm_tpthread.c - SimpleThreadAPI - Routine that picks up work items (only ever
        a TP level request) from one of the TP queues associated with a given TP level.
        There is one of these for each TP level that has been created.

sr_unix/ydb_str2zwr_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_str2zwr_st.c - SimpleThreadAPI version of ydb_str2zwr_s().

sr_unix/ydb_subscript_next_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_subscript_next_st.c - SimpleThreadAPI version of ydb_subscript_next_s().

sr_unix/ydb_subscript_previous_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_subscript_previous_st.c - SimpleThreadAPI version of ydb_subscript_previous_s().

sr_unix/ydb_thread_is_main.c - Make sure YDB runtime is initialized.

sr_unix/ydb_timer_cancel.c - Make sure YDB runtime is initialized.

sr_unix/ydb_timer_cancel_t.c - SimpleThreadAPI version of ydb_timer_cancel().

sr_unix/ydb_timer_start.c - Make sure YDB runtime is initialized.

sr_unix/ydb_timer_start_t.c - SimpleThreadAPI version of ydb_timer_start().

sr_unix/ydb_tp_s.c - Most of the guts of this routine are moved to ydb_tp_s_common.c (see
        the added modules below).

sr_unix/ydb_tp_s_common.c - The main guts of SimpleAPI TP call.

sr_unix/ydb_tp_sst.c - Internal routine to drive ydb_tp_s_common.

sr_unix/ydb_tp_st.c - SimpleThreadAPI version of ydb_tp_s().

sr_unix/ydb_zwr2str_s.c - Change in arguments in LIBYOTTADB_INIT() macro, Add
        VERIFY_NON_THREADED_API macro.

sr_unix/ydb_zwr2str_st.c - SimpleThreadAPI version of ydb_zwr2str_s().

sr_unix/yottadb_symbols.exp - Add all new STAPI entry points and a couple new SimpleAPI
        entry points too (eliminate a few duplicates).

sr_unix_cm/omi_prc_def.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_get.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_incr.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_kill.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_lock.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_next.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_qry.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_rord.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_set.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_sete.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_setp.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_unla.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_unlc.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/omi_prc_unlk.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/rc_fnd_file.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/rc_prc_getp.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/rc_prc_kill.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/rc_prc_lock.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/rc_prc_set.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_unix_cm/rc_prc_setf.c - Add threadgbl initialization needed by ESTABLISH* macros now.

sr_x86_64/dm_start.c - Update ESTABLISH macro usage to use new 2nd label parm needed.

sr_x86_64/error.si - Modify assembler ESTABLISH macro so it properly decrements rts_error_level
        global variable on error return or MUM_TSTART.
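
The depth nest limit mentioned in the sr_unix/rts_error.c and sr_unix/send_msg.c entries above can be sketched as below. This is only an illustration of the guard pattern: the limit value MAX_ERROR_DEPTH, the flag loop_detected and the routine report_error() are hypothetical, and the real code aborts the image instead of returning.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERROR_DEPTH 10	/* hypothetical limit; the real value is not stated above */

static int  rts_error_depth;	/* name taken from the gtm_threadgbl_defs.h entry */
static bool loop_detected;	/* demo-only flag */

/* Hypothetical error routine whose reporting step fails and re-enters itself. */
static void report_error(int code)
{
	if (++rts_error_depth > MAX_ERROR_DEPTH)
	{	/* The real code would abort with a fatal error; here we only record it. */
		loop_detected = true;
		rts_error_depth--;
		return;
	}
	report_error(code);	/* simulate an error raised while handling an error */
	rts_error_depth--;
}
```

Without the counter, the recursion in report_error() would continue until the stack is exhausted; with it, the loop is cut off after a bounded number of nested calls.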
chathaway-codes pushed a commit that referenced this issue Nov 21, 2018
…CK being called during exit handling

When a C program that spawned off multiple threads using the SimpleThreadAPI (e.g. ydb_tp_st() etc.)
was deadlocked (due to a code issue), pressing Ctrl-C (SIGINT) did nothing. Pressing Ctrl-\ (SIGQUIT)
to terminate the C program then caused a MAXRTSERRDEPTH fatal error and resulted in a core dump.

Below is the actual output.

^C^\%YDB-F-MAXRTSERRDEPTH Error loop detected - aborting image with coreQuit (core dumped)

The corresponding C-stack follows.

(gdb) where
 #0  __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
 #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2  rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52090) at sr_unix/rts_error.c:144
 #3  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #4  rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52270) at sr_unix/rts_error.c:146
 #5  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #6  rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52450) at sr_unix/rts_error.c:146
 #7  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #8  rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52630) at sr_unix/rts_error.c:146
 #9  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #10 rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52810) at sr_unix/rts_error.c:146
 #11 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #12 rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df529f0) at sr_unix/rts_error.c:146
 #13 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #14 rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52bd0) at sr_unix/rts_error.c:146
 #15 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #16 rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52db0) at sr_unix/rts_error.c:146
 #17 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #18 rts_error_va (csa=0x0, argcnt=7, var=0x7fb28df52f90) at sr_unix/rts_error.c:146
 #19 rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #20 send_msg_va (csa=0x0, arg_count=8, var=0x7fb28df53570) at sr_unix/send_msg.c:125
 #21 send_msg_csa (csa=0x0, arg_count=8) at sr_unix/send_msg.c:84
 #22 generic_signal_handler (sig=3, info=0x7fb28df53830, context=0x7fb28df53700) at sr_unix/generic_signal_handler.c:244
 #23 <signal handler called>
 #24 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb2880180a8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 #25 __pthread_cond_wait_common (abstime=0x0, mutex=0x7fb288018040, cond=0x7fb288018080) at pthread_cond_wait.c:502
 #26 __pthread_cond_wait (cond=0x7fb288018080, mutex=0x7fb288018040) at pthread_cond_wait.c:655
 #27 ydb_stm_thread (parm=0x0) at sr_unix/ydb_stm_thread.c:80
 #28 start_thread (arg=0x7fb28df54700) at pthread_create.c:463
 #29 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The primary error was at #20 in send_msg_va() inside the PTHREAD_MUTEX_LOCK_IF_NEEDED macro.
The actual assert that failed inside the macro was the following.

sr_unix/gtm_multi_thread.h
---------------------------
     99                 /* We should never use pthread_* calls inside a signal/timer handler. Assert that */                    \
    100                 assert(!in_nondeferrable_signal_handler);                                                               \

We were in a signal handler handling a non-deferrable signal (Ctrl-\ aka SIGQUIT) and were about to do
a pthread_mutex_lock() library call which is a no-no.

If we are in an exit handler, it is possible for send_msg() to be needed (to log the signal that was received
etc.) but it is safer to not do any pthread activity since we cannot be sure if we are exiting while inside
a signal handler or not. Therefore the fix for this is to check if "process_exiting" global variable is TRUE
and if so, we skip all pthread* calls in the PTHREAD_MUTEX_LOCK_IF_NEEDED and PTHREAD_MUTEX_UNLOCK_IF_NEEDED
macros.
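
A minimal sketch of the fix described above, assuming simplified macros. The SKETCH_* macro names, the took_lock bookkeeping and demo_lock_taken() are hypothetical; the real PTHREAD_MUTEX_LOCK_IF_NEEDED and PTHREAD_MUTEX_UNLOCK_IF_NEEDED macros in sr_unix/gtm_multi_thread.h check more conditions than shown here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t thread_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool process_exiting;	/* set once exit handling has started */

/* Lock unless the process is exiting; TOOK_LOCK records what happened. */
#define SKETCH_MUTEX_LOCK_IF_NEEDED(TOOK_LOCK)                      \
do {                                                                \
	(TOOK_LOCK) = false;                                        \
	if (!process_exiting)                                       \
	{                                                           \
		int lock_rc = pthread_mutex_lock(&thread_mutex);    \
		assert(0 == lock_rc);                               \
		(void)lock_rc;                                      \
		(TOOK_LOCK) = true;                                 \
	}                                                           \
} while (0)

/* Unlock only if this caller took the lock and we are not exiting. */
#define SKETCH_MUTEX_UNLOCK_IF_NEEDED(TOOK_LOCK)                    \
do {                                                                \
	if ((TOOK_LOCK) && !process_exiting)                        \
	{                                                           \
		int unlock_rc = pthread_mutex_unlock(&thread_mutex);\
		assert(0 == unlock_rc);                             \
		(void)unlock_rc;                                    \
		(TOOK_LOCK) = false;                                \
	}                                                           \
} while (0)

/* Hypothetical helper: report whether a send_msg-like path took the lock. */
static bool demo_lock_taken(bool exiting)
{
	bool took_lock, result;

	process_exiting = exiting;
	SKETCH_MUTEX_LOCK_IF_NEEDED(took_lock);
	result = took_lock;
	SKETCH_MUTEX_UNLOCK_IF_NEEDED(took_lock);
	return result;
}
```

With process_exiting set, the message path makes no pthread calls at all, so it is safe to reach from a signal handler during exit.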
chathaway-codes pushed a commit that referenced this issue Nov 21, 2018
A test C program with one or more threads each of which use ydb_tp_st() to do TP transactions
deadlocked once in a while when the database had concurrent updates that caused the TP
transaction to restart.

Below is the C-stack of the hung TP worker thread.

Thread 5 (Thread 0x7f5d1d7ad700 (LWP 94507)):
 #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  __GI___pthread_mutex_lock (mutex=0x7f5d222252c0 <thread_mutex>) at ../nptl/pthread_mutex_lock.c:78
 #2  rts_error_va (csa=0x0, argcnt=1, var=0x7f5d1d7ac5a0) at sr_unix/rts_error.c:146
 #3  rts_error_csa (csa=0x0, argcnt=1) at sr_unix/rts_error.c:101
 #4  ydb_tp_s_common (stapi=1, tptoken=15090, tpfn=0x55c68a782bc0 <gvnset>, tpfnparm=0x7f5d2000fed0, transid=0x0, namecount=0, varnames=0x0) at sr_unix/ydb_tp_s_common.c:223
 #5  ydb_tp_sst (tptoken=15090, tpfn=0x55c68a782bc0 <gvnset>, tpfnparm=0x7f5d2000fed0, transid=0x0, namecount=0, varnames=0x0) at sr_unix/ydb_tp_sst.c:32
 #6  ydb_stm_tpthreadq_process (curTPWorkQHead=0x7f5d1001ba40) at sr_unix/ydb_stm_tpthread.c:125
 #7  ydb_stm_tpthread (parm=0x0) at sr_unix/ydb_stm_tpthread.c:80
 #8  start_thread (arg=0x7f5d1d7ad700) at pthread_create.c:463
 #9  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) f 4
 #4  ydb_tp_s_common (stapi=1, tptoken=15090, tpfn=0x55c68a782bc0 <gvnset>, tpfnparm=0x7f5d2000fed0, transid=0x0, namecount=0, varnames=0x0) at sr_unix/ydb_tp_s_common.c:223
223                     INVOKE_RESTART;

It notices a restartable situation and therefore wants to signal that but in order to do that it
goes through the function rts_error_va() (frame #2) which tries to do a PTHREAD_MUTEX_LOCK_IF_NEEDED
and that is hung because the pthread mutex lock is held by some other thread.

Turns out this lock was obtained by this same thread in a prior restart (in rts_error_va) but
it was never released. Since all SimpleAPI and SimpleThreadAPI calls to rts_error_va() go through
ydb_simpleapi_ch(), that is where we should be releasing the pthread lock. We do that already but
towards the end of the function.  And in case of a TP restart (ERR_TPRETRY error code), we return
before reaching that point.

The main fix is to sr_unix/ydb_simpleapi_ch.c to move the PTHREAD_MUTEX_UNLOCK_IF_NEEDED macro to
the beginning of ydb_simpleapi_ch(). The pthread unlock logic that was already there did not use
the macro (code duplication) and is now removed.
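
The shape of this fix can be sketched as follows, under simplifying assumptions. The routines raise_error_sketch() and condition_handler_sketch(), the lock_held flag and the ERR_TPRETRY_SKETCH value are all hypothetical stand-ins for rts_error_va(), ydb_simpleapi_ch() and their real state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define ERR_TPRETRY_SKETCH 1	/* hypothetical stand-in for the TP restart code */

static pthread_mutex_t thread_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool lock_held;		/* whether the error path currently holds the mutex */

/* Takes the lock the way the error-raising path does. */
static void raise_error_sketch(void)
{
	pthread_mutex_lock(&thread_mutex);
	lock_held = true;
}

/* Releases the lock first, so no return path below can skip the release. */
static void condition_handler_sketch(int errcode)
{
	if (lock_held)
	{
		lock_held = false;
		pthread_mutex_unlock(&thread_mutex);
	}
	if (ERR_TPRETRY_SKETCH == errcode)
		return;		/* early return for a TP restart */
	/* ... remaining error handling would go here ... */
}

/* Two restarts in a row: the second would hang if the first leaked the lock. */
static bool demo_two_restarts(void)
{
	for (int i = 0; i < 2; i++)
	{
		raise_error_sketch();
		condition_handler_sketch(ERR_TPRETRY_SKETCH);
	}
	return !lock_held;
}
```

If the unlock were placed after the ERR_TPRETRY check instead, the second call to raise_error_sketch() would block on a mutex the same thread already holds, which matches the hang in the stack trace.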

While at this, various other issues were noticed and fixed.

* sr_unix/ydb_stm_thread.c
  1) It did not initialize the posix_timer_thread_id to be the worker thread.
  2) It did not initialize the thread_mutex global variable (which is what is used by the
     PTHREAD_MUTEX_LOCK_IF_NEEDED and PTHREAD_MUTEX_UNLOCK_IF_NEEDED macros). This is now done
     using a new INITIALIZE_THREAD_MUTEX_IF_NEEDED macro.

* sr_unix/gtm_multi_thread.c : This is the only place where the thread_mutex global variable was
  previously initialized. This now invokes the new INITIALIZE_THREAD_MUTEX_IF_NEEDED macro.

* sr_port/gtm_malloc_src.h : was_holder was being initialized incorrectly before invoking the
  PTHREAD_MUTEX_UNLOCK_IF_NEEDED macro just before issuing a MEMORY error. The consequence of this
  issue is that gtm_malloc() would not release the pthread mutex lock on a MEMORY error
  when the call is made with multi-threading turned on. Since multi-threading was previously turned on
  only for MUPIP JOURNAL ROLLBACK/RECOVER, this was a non-issue until now, when multi-threading is also
  used to support ydb_malloc_t() in the SimpleThreadAPI.

* sr_unix/gtm_multi_thread.h : Defines the new INITIALIZE_THREAD_MUTEX_IF_NEEDED macro.
chathaway-codes pushed a commit that referenced this issue Nov 26, 2018
These changes:
  1. Add trace table entries to aid in debugging
  2. Fix some formatting errors.
  3. Switch from using naked library calls to using EINTR loop wrapper macros
  4. Fix an issue with an assert in ydb_stm_getcallblk.c
  5. Get rid of SEMWAKE trace table code - merge with new REQCOMPLT code that contains
     additional information.
  6. Nix ydb_tp_sst() entry point allowing ydb_stm_tpthread() to call ydb_tp_s_common()
     directly eliminating a pointless middle-man.
chathaway-codes pushed a commit that referenced this issue Nov 27, 2018
…ops while a V4 block is dirty and SimpleThreadAPI is active

The simplethreadapi/lockst subtest failed as follows when the test framework randomly chose V4 format blocks.

%YDB-F-MAXRTSERRDEPTH Error loop detected - aborting image with core

And produced a core with the following C-stack

 #0  __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
 #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2  send_msg_va (csa=0x0, arg_count=9, var=0x7f25a0f7ff40) at sr_unix/send_msg.c:123
 #3  send_msg (arg_count=9) at sr_unix/send_msg.c:74
 #4  gtm_assert2 (condlen=17, condtext=0x7f25a62dc0d6 "!timer_in_handler", file_name_len=45, file_name=0x7f25a62dc0a8 "sr_unix/send_msg.c", line_no=125) at sr_port/gtm_assert2.c:34
 #5  send_msg_va (csa=0x0, arg_count=9, var=0x7f25a0f80570) at sr_unix/send_msg.c:125
 #6  send_msg (arg_count=9) at sr_unix/send_msg.c:74
 #7  gtm_assert2 (condlen=17, condtext=0x7f25a62dc0d6 "!timer_in_handler", file_name_len=45, file_name=0x7f25a62dc0a8 "sr_unix/send_msg.c", line_no=125) at sr_port/gtm_assert2.c:34
 .
 .
 #32 send_msg_va (csa=0x0, arg_count=9, var=0x7f25a0f83d20) at sr_unix/send_msg.c:125
 #33 send_msg (arg_count=9) at sr_unix/send_msg.c:74
 #34 gtm_assert2 (condlen=17, condtext=0x7f25a6305658 "!timer_in_handler", file_name_len=51, file_name=0x7f25a6305320 "sr_port/gtm_malloc_src.h", line_no=685) at sr_port/gtm_assert2.c:34
 #35 gtm_malloc (size=4096) at sr_port/gtm_malloc_src.h:685
 #36 wcs_wtstart (region=0x5558a88e1170, writes=0, cr_list_ptr=0x0, cr2flush=0x0) at sr_unix/wcs_wtstart.c:538
 #37 wcs_stale (tid=93839273365872, hd_len=8, region=0x5558a8979628) at sr_port/t_end_sysops.c:1387
 #38 timer_handler (why=14) at sr_unix/gt_timers.c:815
 #39 <signal handler called>
 #40 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5558a88e2ea8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 #41 __pthread_cond_wait_common (abstime=0x0, mutex=0x5558a88e2e40, cond=0x5558a88e2e80) at pthread_cond_wait.c:502
 #42 __pthread_cond_wait (cond=0x5558a88e2e80, mutex=0x5558a88e2e40) at pthread_cond_wait.c:655
 #43 ydb_stm_thread (parm=0x0) at sr_unix/ydb_stm_thread.c:92
 #44 start_thread (arg=0x7f25a0f85700) at pthread_create.c:463
 #45 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The primary issue is failure in the assert "assert(!timer_in_handler)" inside the PTHREAD_MUTEX_LOCK_IF_NEEDED
macro at frame #35 (gtm_malloc). This is because we are about to make a pthread_mutex_lock() call which should
not be done while inside a signal handler (according to the man pages). To fix this issue, we skip flushing
that particular cache-record in wcs_wtstart() thereby avoiding the need to call gtm_malloc() (and in turn
the pthread_mutex_lock() call) while inside the timer handler.
chathaway-codes pushed a commit that referenced this issue Nov 27, 2018
…o ydb_lock_incr_s() on 32-bit YottaDB build (ARM); Add hex codes in comments for YDB_TP_RESTART/YDB_TP_ROLLBACK/YDB_NODE_END/YDB_LOCK_TIMEOUT/YDB_NOTOK etc. in libyottadb.h

The 64-bit nanosecond timeout passed as the first parameter to ydb_lock_incr_st() is stored
as the first 2 parameters in the 32-bit callblk.args[] array. So in ydb_stm_threadq_process(),
callblk->args[0] and callblk->args[1] should be used as the least-significant and most-significant
32-bit values respectively (instead of callblk->args[1] and callblk->args[2] respectively).
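
A minimal sketch of the packing described above, assuming a 32-bit args[] array. The demo_* helper names are hypothetical and are not the YottaDB functions.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit nanosecond timeout in two consecutive 32-bit slots:
 * least-significant half first, most-significant half second. */
static void demo_pack_timeout(uint64_t timeout_nsec, uint32_t args[], int lo_index)
{
	args[lo_index] = (uint32_t)(timeout_nsec & 0xFFFFFFFFu);
	args[lo_index + 1] = (uint32_t)(timeout_nsec >> 32);
}

/* Rebuild the 64-bit value from the two slots starting at lo_index. */
static uint64_t demo_unpack_timeout(const uint32_t args[], int lo_index)
{
	return ((uint64_t)args[lo_index + 1] << 32) | (uint64_t)args[lo_index];
}
```

Packing a 1-second timeout at index 0 and reading it back from index 0 returns the original value. Reading it back from index 1 (the off-by-one described above) combines the high half with an unrelated slot and yields a different timeout.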

Using the wrong callblk->args[] array parameters leads to the actual timeout passed to
ydb_lock_incr_s() being completely different from the original timeout passed to ydb_lock_incr_st()
causing the simplethreadapi/lockst subtest to fail on 32-bit ARM platform with the following
signature (timeout is 0 seconds instead of the expected 1 second).

44c44
< Lock timeout test PASSED for ydb_lock_incr_st() : Timeout expected = 1 second. Actual timeout = 1 seconds
---
> Lock timeout test PASSED for ydb_lock_incr_st() : Timeout expected = 1 second. Actual timeout = 0 seconds
chathaway-codes pushed a commit that referenced this issue Nov 27, 2018
…ydb_malloc_t/ydb_free_t instead of ydb_malloc/ydb_free in YDB_MALLOC_BUFFER/YDB_FREE_BUFFER)
chathaway-codes pushed a commit that referenced this issue Nov 29, 2018
To allow build tools to automatically pick up YottaDB and the
correct C and library flags used for compilation, a yottadb.pc file
is generated and placed in /usr/share/pkgconfig.
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…s it can be invoked while another thread is in a signal handler
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…n intrpt_ok_state in local variable before asserting on it

If asserts involving global variable "intrpt_ok_state" fail, it is possible that the actual value of the
global variable at the time of the assert failure changes before a core file is dumped. To avoid this,
we note down the global into a local variable and assert on it. The core file will keep the local
variable value intact.
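
The snapshot-then-assert pattern might look like the sketch below. demo_intrpt_state and DEMO_INTRPT_OK are hypothetical stand-ins for the real global and its expected value.

```c
#include <assert.h>

/* Hypothetical stand-in for a global that another thread or a signal
 * handler may change at any time. */
static volatile int demo_intrpt_state = 0;

#define DEMO_INTRPT_OK 0

static int demo_check_state(void)
{
	/* Copy the global once; the assert then tests the copy, so the value
	 * that triggered a failure stays visible in this stack frame. */
	int local_state = demo_intrpt_state;

	assert(DEMO_INTRPT_OK == local_state);
	return local_state;
}
```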
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…ed if <ydb_trace_groups> env var is defined)
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…hread (fix bug that intended to block previously but did not)

The SIGPROCMASK macro that was previously being used in turn invoked the INSIDE_THREADED_CODE macro to
determine whether sigprocmask() or pthread_sigmask() needed to be called. But that macro was true only
if the "multi_thread_in_use" global variable was TRUE which is not the case for SimpleThreadAPI. So
we were incorrectly invoking sigprocmask() even though we were in a multi-threaded environment.
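
A minimal sketch of picking the signal-mask call by threading mode. The demo_process_is_threaded flag and the demo_* functions are hypothetical, and the real macro is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: true whenever more than one thread may be running,
 * whichever feature created those threads. */
static bool demo_process_is_threaded = false;

/* Use pthread_sigmask() in a threaded process, sigprocmask() otherwise.
 * Both return 0 on success; on failure pthread_sigmask() returns an error
 * number while sigprocmask() returns -1 and sets errno. */
static int demo_sigmask(int how, const sigset_t *set, sigset_t *oldset)
{
	if (demo_process_is_threaded)
		return pthread_sigmask(how, set, oldset);
	return sigprocmask(how, set, oldset);
}

/* Block one signal in the calling thread and confirm it is in the mask. */
static bool demo_block_and_check(int signo)
{
	sigset_t set, cur;

	sigemptyset(&set);
	sigaddset(&set, signo);
	if (0 != demo_sigmask(SIG_BLOCK, &set, NULL))
		return false;
	if (0 != demo_sigmask(SIG_BLOCK, NULL, &cur))	/* query only */
		return false;
	return 1 == sigismember(&cur, signo);
}
```

The point of the fix above is that the threaded check has to cover every way the process can become multi-threaded, not just one feature flag.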
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…e SimpleThreadAPI before calling malloc() to avoid issues with the engine concurrently running in the MAIN or TP worker threads
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…t (or else ydb_tp_st() can return garbage values)
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…mutex; Simplify flow in ydb_init() for handling concurrent invocations from multiple threads
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…NEEDED as no-ops in case of simpleThreadAPI_active (<ydb_engine_threadsafe_mutex> is locked by SimpleThreadAPI)
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…_csa is not usable (condition handler stack is not yet setup)

There are various types of changes in this commit.

1) The main change is to not use "rts_error_csa" when gtmci_ch is not yet established as a condition
   handler in ydb_init() (because the condition handler stack is not yet set up by err_init()). Instead
   we use a FPRINTF of the actual error and return the error status. The FPRINTF code is mostly copied
   over from sr_unix/dlopen_libyottadb.c where we have a similar situation (cannot use rts_error).

2) All error return code paths before the condition handler "gtmci_ch" has been ESTABLISHED in
   ydb_init() are now fixed to release the thread-level lock "ydb_engine_threadsafe_mutex" that
   is obtained at function entry.

3) In ydb_exit(), "ydb_init_complete" is guaranteed to be TRUE for most of the function since we
   return right away if it is FALSE at function entry. So various code paths that relied on this
   variable have been simplified based on this variable being TRUE.

4) In sr_unix/gtmci.c, there are 4 functions where gtmci_ch is established as a condition handler. They
   are ydb_cij(), ydb_ci_exec(), ydb_init() and ydb_exit(). Out of these, only the ydb_init() and
   ydb_exit() invocations do a lock/unlock of "ydb_engine_threadsafe_mutex". But gtmci_ch() unconditionally
   does an unlock. It should do the unlock only if invoked from the last two functions. So a new
   global variable "ydb_engine_threadsafe_mutex_holder" is introduced to identify if we are in
   ydb_init() or ydb_exit() (this variable will be set to a non-zero thread id in that case) and so
   gtmci_ch() does the unlock only if this variable matches the currently running thread id.
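
The holder check in item 4 can be sketched as follows. All demo_* names are hypothetical. The sketch uses a separate validity flag rather than a non-zero thread id, because pthread_t is not guaranteed to be an integer type.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t demo_engine_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_t demo_mutex_holder;		/* meaningful only if demo_holder_valid */
static bool demo_holder_valid = false;

/* Take the engine lock and record which thread holds it. */
static void demo_lock_engine(void)
{
	pthread_mutex_lock(&demo_engine_mutex);
	demo_mutex_holder = pthread_self();
	demo_holder_valid = true;
}

/* Shared condition handler path: unlock only if the current thread is the
 * recorded holder. Returns true if the lock was released.
 * Note: a production version would need the holder fields to be read
 * atomically; this sketch ignores that for brevity. */
static bool demo_unlock_engine_if_holder(void)
{
	if (!demo_holder_valid || !pthread_equal(demo_mutex_holder, pthread_self()))
		return false;
	demo_holder_valid = false;
	pthread_mutex_unlock(&demo_engine_mutex);
	return true;
}
```

A caller that never took the lock gets false back and the mutex is left untouched, which is the behaviour wanted for the two callers that establish the handler without locking.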
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…32-bit YDB builds) for later use by ydb_lock_incr_st()

This fixes the simplethreadapi/lockst subtest failure only on the ARM boxes.
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…loc/free (instead of ydb_malloc/ydb_free) as they are multi-thread safe (less error prone to use with SimpleThreadAPI); Also removes the need for a separate YDB_MALLOC_BUFFER_T/YDB_FREE_BUFFER_T so they are now removed
chathaway-codes pushed a commit that referenced this issue Nov 30, 2018
…_stm_getcallblk(); Removes need for a thread level lock in MAIN SimpleThreadAPI worker thread

Currently malloc() (i.e. ydb_malloc()) invocations from ydb_stm_getcallblk() are the only YottaDB
engine function invocations that can concurrently happen in other threads while the MAIN or TP
worker threads are running.  And since ydb_malloc() is not multi-thread safe, there was a need to
get a thread-level lock in the MAIN and TP worker threads. This lock already exists in the MAIN
worker thread but not in the TP worker thread but it is needed since ydb_tp_s_common (invoked from
the TP worker thread) can invoke malloc().

But instead of adding a thread-level lock in the TP worker thread, this commit changes ydb_stm_getcallblk()
to use the system malloc() which is multi-thread safe. This means the YottaDB engine will always run
only in one thread (the MAIN worker thread or the TP worker thread). And so there is no need for any
thread-level locks in either of the threads. This means the lock/release sequence in the MAIN worker
thread can also be (and is now) removed.  Should improve performance of SimpleThreadAPI.
chathaway-codes pushed a commit that referenced this issue Dec 5, 2018
chathaway-codes pushed a commit that referenced this issue Dec 5, 2018
…in multi-threaded YottaDB process using SimpleThreadAPI; Invoke ydb_child_init() automatically as part of fork in YottaDB process using SimpleAPI
chathaway-codes pushed a commit that referenced this issue Dec 20, 2018
chathaway-codes pushed a commit that referenced this issue Dec 21, 2018
…o issue an error); Nix NOTSUPSTAPI message

The simplethreadapi/tp subtest failed once with the following signature in the
tp5_TPTIMEOUT.c section of the test.

%YDB-F-ASSERT, Assert failed in sr_unix/gt_timers.c line 725 for expression
	(gtm_is_main_thread() || gtm_jvm_process || exit_handler_active && (DUMMY_SIG_NUM == why))

with the following C-stack

(gdb) where
 #0  __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
 #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2  gtm_fork_n_core () at sr_unix/gtm_fork_n_core.c:148
 #3  ch_cond_core () at sr_unix/ch_cond_core.c:64
 #4  rts_error_va (csa=0x0, argcnt=7, var=0x7fe3a13888b0) at sr_unix/rts_error.c:194
 #5  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #6  timer_handler (why=0) at sr_unix/gt_timers.c:725
 #7  check_for_deferred_timers () at sr_unix/gt_timers.c:1179
 #8  deferred_signal_handler () at sr_port/deferred_signal_handler.c:49
 #9  rts_error_va (csa=0x0, argcnt=4, var=0x7fe3a1388c70) at sr_unix/rts_error.c:194
 #10 rts_error_csa (csa=0x0, argcnt=4) at sr_unix/rts_error.c:101
 #11 ydb_hiber_start (sleep_nsec=1000000) at sr_unix/ydb_hiber_start.c:46
 #12 gvnset (tptoken=1) at tp5_TPTIMEOUT.c:77
 #13 ydb_stm_tpthreadq_process (curTPWorkQHead=0x13dac40) at sr_unix/ydb_stm_tpthread.c:197
 #14 ydb_stm_tpthread (parm=0x0) at sr_unix/ydb_stm_tpthread.c:78
 #15 start_thread (arg=0x7fe3a1389700) at pthread_create.c:333
 #16 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

This was a process that had already made SimpleThreadAPI calls but is now making a SimpleAPI call
(ydb_hiber_start()) and so a NOTSUPSTAPI error is about to be issued. But since this call is happening
in the user-defined callback function inside a TP transaction, it is the TP worker thread (not the
MAIN worker thread) that is executing the "ydb_hiber_start". This means the rts_error_csa invocation
is running in the TP worker thread while the YottaDB engine is concurrently being modified by the
MAIN worker thread. A no-no since the YottaDB engine is not multi-threaded.

To fix this issue, the VERIFY_NON_THREADED_API macro is now fixed to do a "return YDB_ERR_INVAPIMODE".
This means ydb_hiber_start() would return a lot sooner thereby not requiring an "rts_error" invocation.

But while doing this change, noticed a few issues. The VERIFY_NON_THREADED_API is used from a few
functions that do not return any value (sr_unix/ydb_free.c and sr_unix/ydb_timer_cancel.c) so a new
macro VERIFY_NON_THREADED_API_NORETVAL is created which is very similar except it does a plain "return".
sr_unix/ydb_malloc.c needed special handling since it was returning a "void *" and so a new
VERIFY_NON_THREADED_API_RETNULL macro is created for that purpose.
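
One way to get three variants that differ only in their return statement is sketched below. The DEMO_* macros, the demo_* functions and the error value are hypothetical and are not the definitions used in YottaDB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_ERR_INVAPIMODE (-1)	/* hypothetical error value */

static bool demo_threaded_api_active = false;

/* Common check; the caller supplies the return statement that fits its
 * own return type. */
#define DEMO_VERIFY_NON_THREADED(RETSTMT)		\
	do {						\
		if (demo_threaded_api_active) {		\
			RETSTMT;			\
		}					\
	} while (0)

#define DEMO_VERIFY_NON_THREADED_API          DEMO_VERIFY_NON_THREADED(return DEMO_ERR_INVAPIMODE)
#define DEMO_VERIFY_NON_THREADED_API_NORETVAL DEMO_VERIFY_NON_THREADED(return)
#define DEMO_VERIFY_NON_THREADED_API_RETNULL  DEMO_VERIFY_NON_THREADED(return NULL)

static int demo_cancel_count = 0;

/* Function returning a status code. */
static int demo_sleep_call(void)
{
	DEMO_VERIFY_NON_THREADED_API;
	return 0;
}

/* Function returning nothing. */
static void demo_cancel_call(void)
{
	DEMO_VERIFY_NON_THREADED_API_NORETVAL;
	demo_cancel_count++;
}

/* Function returning a pointer. */
static void *demo_alloc_call(size_t size)
{
	DEMO_VERIFY_NON_THREADED_API_RETNULL;
	return malloc(size);
}
```

Each function bails out before doing any work when the threaded API is active, so no error-reporting machinery runs in the wrong thread.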

While at this, noticed that the VERIFY_NON_THREADED_API macro was not resetting
TREF(libyottadb_active_rtn) in case of an INVAPIMODE return (since this macro is usually invoked
after a LIBYOTTADB_INIT) so fixed it to do so.

Note that the VERIFY_THREADED_API macro stayed the same in that it did not do this reset since it is
usually invoked before the LIBYOTTADB_INIT macro. But two exceptions to this rule were found,
sr_unix/ydb_cip_helper.c and sr_unix/ydb_tp_s_common.c. They are now fixed so the VERIFY_THREADED_API
macro invocation happens before the LIBYOTTADB_INIT macro.

Another issue that was noticed is that "ydb_ci" and "ydb_cip" were not doing a VERIFY_NON_THREADED_API
check like other SimpleAPI calls do so a new sr_unix/ydb_ci.c and sr_unix/ydb_cip.c were created to
do this before invoking ydb_ci_exec(). And the existing ydb_ci() and ydb_cip() function definitions
in sr_unix/gtmci.c were removed. A new VERIFY_NON_THREADED_API_DO_NOT_SHUTOFF_ACTIVE_RTN macro was
introduced for this purpose since we do not want to do a LIBYOTTADB_INIT in these functions (to avoid
unnecessary SIMPLEAPINEST errors).

With all these changes, the NOTSUPSTAPI error (currently issued in sr_unix/ydb_hiber_start_wait_any.c
and sr_unix/ydb_hiber_start.c) was no longer necessary since an INVAPIMODE error would have been issued
before this error codepath is reached in all callers. So this error message is now removed.
chathaway-codes pushed a commit that referenced this issue Dec 24, 2018
…h multiple threads in user-defined callback function

A test case created multiple threads (or go-routines in the GoWrapper) inside the user-defined
callback function that is invoked for a TP transaction. Each of these threads in turn did a
TP sub-transaction, i.e. invoked ydb_tp_st() in a nested fashion using the same callback
function as before. Thread creation in the callback function was limited once the TP
depth reached a small value greater than 1. With this test case, one noticed failures with
SIG-11, INVTPTRANS errors and hangs all of which exposed a few design issues in the current
SimpleThreadAPI implementation.

Symptoms
--------
1) The SIG-11 showed up in at least two places.

   ydb_stm_args.c:		if (tptoken != stmWorkQueue[TREF(curWorkQHeadIndx)]->tptoken)
   ydb_stm_tpthread.c:	curTPWorkQHead = stmWorkQueue[TREF(curWorkQHeadIndx)];

   In each case, stmWorkQueue[TREF(curWorkQHeadIndx)] was NULL.
   This is possible for example because in ydb_stm_thread.c, we do a TREF(curWorkQHeadIndx)++
   first and only then allocate the queue for that index by invoking ydb_stm_init_work_queue().
   Since TREF(curWorkQHeadIndx) is a global variable, other threads which want to queue
   requests (i.e. coming through ydb_stm_args.c) and which invoke the user-defined callback
   function (i.e. coming through ydb_stm_tpthread.c) could catch this global variable in that
   inconsistent state (the timing window when the global variable is bumped but the queue has
   not yet been allocated) and get a SIG-11.

2) The INVTPTRANS error is another consequence of the way a YottaDB request is scheduled
   in ydb_stm_args(). If two user-defined callback functions are executing inside of a YottaDB
   TP transaction in different threads (say T1 and T2) and invoking ydb_tp_st() concurrently,
   both will be using the same tptoken (that of the callback function invoked as part of the
   outer TP transaction). It is possible that the request from T1 initiates a sub-transaction
   and that gets serviced in the MAIN and TP worker threads (i.e. doing an op_tstart() resulting
   in dollar_tlevel = 2) before the ydb_tp_st() request from T2 is even scheduled. This means
   the ydb_stm_args() call from T2 (for the ydb_tp_st()) will come in with a tptoken corresponding
   to the TP of dollar_tlevel=1 whereas the MAIN worker thread has already shifted its global
   variables to correspond to the TP worker thread corresponding to dollar_tlevel=2 and so the
   tptoken will not match since each sub-transaction currently causes a new tptoken (stmTPToken
   global variable).

3) While I don't have a definitive explanation for this, I suspect the hang symptoms are related
   to the fact that we currently have only ONE TP queue that the MAIN worker thread reads from. And
   since each ydb_tp_st() request in the non-tp queue translates to a sequence of requests in the
   TP queue (TSTART, TCOMMIT, TRESTART etc.) it is possible that the sequence of these requests
   from T1 and T2 get interleaved in the TP queue. This means it is possible for the requests in
   the TP queue to be T1_tstart, T2_tstart, T1_commit, T2_commit. For one, the MAIN worker thread
   is going to execute op_tstart() twice in sequence one on behalf of T1 and one on behalf of T2
   resulting in dollar_tlevel jumping up by two instead of executing the TP for T1 and T2 one after
   the other. Not yet sure what the implications of this out-of-order execution are but clearly
   this is a case of issues waiting to happen. What is desired is the sub-transaction started by
   one thread (T1 or T2) is started and committed before the sub-transaction of the other thread
   is even started. This will avoid all confusion and lead to a serialized execution of the TP
   transaction.

Fixes
------
To address the above failures, the following design changes are made.

a) TREF(curWorkQHeadIndx) is now removed as it is definitely not maintained in a multi-thread-safe
   manner.

b) There is now an array of TP queues (i.e. stmTPWorkQueue[] is an array of pointers to queues,
   one corresponding to each TP depth). The MAIN worker thread switches its current queue to
   the appropriate TP queue based on the current value of dollar_tlevel. So the MAIN worker thread
   starts servicing at the non-TP queue stmWorkQueue[0] and once it starts a TP transaction, it
   switches to the queue stmTPWorkQueue[0] and once it starts a nested TP transaction, it switches
   to the queue stmTPWorkQueue[1], and so on.

c) When scheduling a request in ydb_stm_args(), it is not scheduled in the current queue of the
   MAIN worker thread but is instead against the queue corresponding to the tptoken that the request
   comes in with. That is, if the MAIN worker thread is currently servicing requests from the TP
   queue corresponding to dollar_tlevel=2, it is possible the incoming request is from a callback
   function corresponding to dollar_tlevel=1. In this case, the incoming request should be queued in
   the TP queue stmTPWorkQueue[0] and not stmTPWorkQueue[1] which the MAIN worker thread is currently
   servicing.  Related to this, the tptoken (stmTPToken) is now a counter that is bumped only once
   per outer TP transaction. It stays the same for sub-transactions. But when the user-callback
   function is invoked, the TP depth (dollar_tlevel, which can be a max of 127) is bitwise ORed
   into the most significant 7 bits of the 64-bit tptoken. This way all tptokens corresponding
   to the same TP transaction across different sub-transactions have the same least significant
   57 bits. It is only the most significant 7 bits that could be different. With this design, the incoming request is
   scheduled in the appropriate queue by looking at the incoming tptoken and without relying on
   any multi-thread-unsafe global variables.

d) The call block now has an additional "tptoken" field (which was not maintained previously even
   though it was passed in as a parameter) which is needed for (c).

e) Even when a sub-transaction is finished, the TP worker thread schedules a TPCOMPLT request.
   This lets the MAIN worker thread know to switch to the outer TP queue (corresponding to
   dollar_tlevel - 1). The queue switching across the TP queues always happens in the MAIN worker
   thread (of which there is only one) so global variable updates are multi-thread safe.
   Previously the TPCOMPLT request was done only when the outermost TP finished and this could
   cause interesting issues since the MAIN worker thread has no way of knowing when a specific
   TP callback function (more than one could be concurrently running as part of the current
   transaction at different dollar_tlevels) is done.

f) Although not directly related, while making these changes, it was noticed that "stmTPToken" was
   defined as a "uintptr_t" in sr_unix/ydb_stm_tpthread.c. This is now fixed to instead be "uint64_t".
   While this is not an issue for the 64-bit builds, it makes a difference to 32-bit platforms.
   Coincidentally enough, the 32-bit platforms were previously failing mysteriously in various tests
   (that use the test framework com/simplethreadapi_imptp.c) with SIG-11s but none of those were
   observed in last night's testing with this fix so suspect this type change actually fixed that
   otherwise-hard-to-debug issue.
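
The token layout in (c) and the queue selection in (b) can be sketched as below, assuming the TP depth occupies the top 7 bits and the per-transaction counter the low 57 bits. The demo_* names are hypothetical, and the convention of returning -1 for the non-TP queue is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DEPTH_BITS   7
#define DEMO_COUNTER_BITS 57
#define DEMO_DEPTH_MASK   ((1u << DEMO_DEPTH_BITS) - 1)		/* 0x7F */
#define DEMO_COUNTER_MASK ((UINT64_C(1) << DEMO_COUNTER_BITS) - 1)

/* Combine the outer-transaction counter with the current TP depth. */
static uint64_t demo_make_tptoken(uint64_t counter, unsigned depth)
{
	return ((uint64_t)(depth & DEMO_DEPTH_MASK) << DEMO_COUNTER_BITS)
		| (counter & DEMO_COUNTER_MASK);
}

/* TP depth stored in the most significant 7 bits. */
static unsigned demo_tptoken_depth(uint64_t tptoken)
{
	return (unsigned)(tptoken >> DEMO_COUNTER_BITS);
}

/* Counter shared by all sub-transactions of one outer transaction. */
static uint64_t demo_tptoken_counter(uint64_t tptoken)
{
	return tptoken & DEMO_COUNTER_MASK;
}

/* Queue choice from the token alone: depth 0 means the non-TP queue
 * (reported as -1 here), depth N means TP queue index N - 1. */
static int demo_tp_queue_index(uint64_t tptoken)
{
	return (int)demo_tptoken_depth(tptoken) - 1;
}
```

Because the queue index is derived from the incoming token, a request from a depth-1 callback lands in TP queue 0 even while the worker is servicing TP queue 1, with no shared index variable involved.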

With all these changes, it does not matter how many threads are concurrently running as part of an
active TP transaction at different dollar_tlevels in user-code, YottaDB is going to serialize them
in some order so one sub-transaction starts and finishes before another sub-transaction request is
even scheduled.

Note that this design still does not handle the case where the user-defined callback function does
not wait for all concurrent threads to finish before returning to its caller. This is something
that the application/user needs to ensure if they are spawning multiple threads/go-routines inside
a TP transaction. Will be documented in the MLPG (Multi Language Programmers Guide).
chathaway-codes pushed a commit that referenced this issue Jan 4, 2019
…g is always accurate in a multi-threaded environment

This addresses the issue where a multi-threaded program invokes some SimpleThreadAPI function
(e.g. ydb_get_st) and gets an error returned back. And then invokes ydb_zstatus() to see the error
string corresponding to the returned error but since ydb_zstatus() looks at a process-wide-global
ISV $zstatus, it could correspond to an error from another thread that also got an error in the
YottaDB engine soon after our ydb_get_st() call.

The fix is to pass in a "errstr" parameter to all SimpleThreadAPI calls. This is a ydb_buffer_t
pointer which can be NULL in which case no error string is filled in. If errstr is non-NULL, and
an error is encountered during a SimpleThreadAPI invocation (i.e. ydb_simpleapi_ch is invoked),
errstr->buf_addr is filled with the $zstatus string (up to errstr->len_alloc bytes including a
terminating null byte which means it could be truncated too) by ydb_simpleapi_ch(). This avoids
the concurrency issue since at the time ydb_simpleapi_ch() is invoked, the MAIN worker thread is
running the YottaDB engine and servicing only one request even though the process is multi-threaded.

A new global variable TREF(stapi_errstr) is set to point to the user-passed errstr parameter
in the MAIN worker thread (ydb_stm_thread.c) before it invokes a SimpleAPI function. This is
relied upon by ydb_simpleapi_ch.c as an indication to fill in the error string. TREF(stapi_errstr)
is reset to NULL soon after the request is serviced in the MAIN worker thread.
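
The truncating copy into the caller's buffer can be sketched as follows. demo_buffer_t is a stand-in with the same three fields as a YottaDB buffer, demo_fill_errstr() is a hypothetical helper, and updating len_used is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in buffer type for illustration. */
typedef struct {
	unsigned int len_alloc;	/* bytes available at buf_addr */
	unsigned int len_used;	/* bytes of text stored, excluding the null byte */
	char *buf_addr;
} demo_buffer_t;

/* Copy at most len_alloc bytes, counting the terminating null byte, so a
 * long message is truncated rather than overflowing the buffer. A NULL
 * errstr means the caller does not want the text. */
static void demo_fill_errstr(demo_buffer_t *errstr, const char *zstatus)
{
	size_t copy_len;

	if (NULL == errstr || NULL == errstr->buf_addr || 0 == errstr->len_alloc)
		return;
	copy_len = strlen(zstatus);
	if (copy_len > (size_t)errstr->len_alloc - 1)
		copy_len = (size_t)errstr->len_alloc - 1;
	memcpy(errstr->buf_addr, zstatus, copy_len);
	errstr->buf_addr[copy_len] = '\0';
	errstr->len_used = (unsigned int)copy_len;	/* assumption, see lead-in */
}
```

Since each caller owns its own buffer, two threads that hit errors back to back each see their own message.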
chathaway-codes pushed a commit that referenced this issue Jan 8, 2019
chathaway-codes pushed a commit that referenced this issue Jan 8, 2019
…I and LIBAIO changes

The SimpleThreadAPI changes (#205) apply to sr_x86_64/sr_armv7l/sr_aarch64.

The LIBAIO changes (#358) apply only to sr_armv7l/sr_aarch64.

While at this, fixed the CACHELINE_PAD macro to not generate a semi-colon and changed all
macro callers to instead insert the semi-colon at the end to be more consistent with other
similar macro usages. This is a cosmetic change.
chathaway-codes pushed a commit that referenced this issue Jan 8, 2019
chathaway-codes pushed a commit that referenced this issue Jan 10, 2019
…ck of all threads in core for better debugging (only for DEBUG builds)
chathaway-codes pushed a commit that referenced this issue Jan 10, 2019
… pid etc.) before forwarding signal to another thread

This ensures a SIG-3 sent by a pid shows up as KILLBYSIGUINFO message (which corresponds to an
externally generated signal and has sending pid details) instead of a KILLBYSIGSINFO1 message
(which is an internally generated signal in the process and has no sending pid details).

A new global variable "exi_signal_forwarded" records a non-zero value (the signal number) in case
forwarding happened in generic_signal_handler(). And records the incoming "info" and "context" from
the OS into global variables "exi_siginfo" and "exi_context". When the signal gets forwarded and
generic_signal_handler() gets invoked again, we use this new global variable to detect that this is
a forwarded situation and use the global variables as is instead of using the input "info" and
"context" in that signal handler invocation. This way we preserve the original signal handler
information even in case of a forwarded signal. Only signals in generic_signal_handler() currently
benefit from this scheme as all other signal handlers in YottaDB do not use either "info" or "context".
But this change meant a change to the FORWARD_SIG_TO_MAIN_THREAD_IF_NEEDED macro interface and so
all callers had to change a bit.
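
The save-and-reuse idea can be sketched without real signal delivery. demo_handle_signal() and the demo_* globals are hypothetical, and the is_target_thread flag stands in for the check that decides whether forwarding is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>

static int demo_signal_forwarded = 0;	/* non-zero: number of the forwarded signal */
static siginfo_t demo_saved_info;	/* info captured at first delivery */

/* Returns the sender pid that the handler would report, or 0 when the
 * signal was only recorded for forwarding. */
static pid_t demo_handle_signal(int sig, const siginfo_t *info, int is_target_thread)
{
	if (!is_target_thread) {
		/* First delivery on the wrong thread: remember the original
		 * details; real code would forward the signal at this point. */
		demo_saved_info = *info;
		demo_signal_forwarded = sig;
		return 0;
	}
	if (sig == demo_signal_forwarded) {
		/* Second delivery of a forwarded signal: use the saved details
		 * in place of the ones generated by the forwarding. */
		info = &demo_saved_info;
		demo_signal_forwarded = 0;
	}
	return info->si_pid;
}
```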
chathaway-codes pushed a commit that referenced this issue Jan 10, 2019
…ThreadAPI is active

This issue was exposed by a failure in the dual_fail_extend/dual_fail2_mustop_sigquit subtest.
This test terminates processes by sending them a SIGQUIT/SIG-3 or SIGTERM/SIG-15 signal.
But since one of the threads (the MAIN worker thread) in this multi-threaded process was inside wcs_wtstart() in a
non-interruptible code zone (DEFER_INTERRUPTS had been done), the exit handler invoked in
another concurrently running thread decided to defer the exit until the ENABLE_INTERRUPTS
happened in the worker thread. When the ENABLE_INTERRUPTS did happen, the worker thread invoked
exit handling code while it was already inside a timer handler. And since this particular test
was running with GDSV4 format blocks, wcs_wtstart() could not flush such blocks (since it required
a call to gtm_malloc() which meant a pthread_mutex_lock() call while inside a timer handler which is
a no-no) and so wcs_flu() was not able to flush any blocks as part of exit handling causing it to
fail an assert. Below is the C-stack corresponding to the assert failure.

(gdb) where
 #0  __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
 #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2  gtm_fork_n_core () at sr_unix/gtm_fork_n_core.c:148
 #3  ch_cond_core () at sr_unix/ch_cond_core.c:64
 #4  rts_error_va (csa=0x0, argcnt=7, var=0x7f59dccc02a0) at sr_unix/rts_error.c:194
 #5  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #6  wcs_flu (options=519) at sr_unix/wcs_flu.c:587
 #7  gds_rundown (cleanup_udi=1) at sr_unix/gds_rundown.c:608
 #8  gv_rundown () at sr_port/gv_rundown.c:123
 #9  gtm_exit_handler () at sr_unix/gtm_exit_handler.c:204
 #10 __run_exit_handlers (status=-3, listp=0x7f59e2319718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
 #11 __GI_exit (status=<optimized out>) at exit.c:139
 #12 gtm_image_exit (status=-3) at sr_unix/gtm_image_exit.c:27
 #13 deferred_exit_handler () at sr_unix/deferred_exit_handler.c:111
 #14 deferred_signal_handler () at sr_port/deferred_signal_handler.c:45
 #15 wcs_wtstart (region=0x55b9581d66d8, writes=0, cr_list_ptr=0x0, cr2flush=0x0) at sr_unix/wcs_wtstart.c:829
 #16 wcs_stale (tid=94254535632600, hd_len=8, region=0x55b9581d62a8) at sr_port/t_end_sysops.c:1387
 #17 timer_handler (why=14) at sr_unix/gt_timers.c:821
 #18 <signal handler called>
 #19 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:277
 #20 gtm_memcpy_validate_and_execute (target=0x7f59dccc25c0, src=0x7f59e32fd6c6, len=0) at sr_port/gtm_memcpy_validate_and_execute.c:42
 #21 gvcst_put2 (val=0x7f59e30c7440 <increment_delta_mval>, parms=0x7f59dccc4be0) at sr_port/gvcst_put.c:626
 #22 gvcst_put (val=0x7f59e30c7440 <increment_delta_mval>) at sr_port/gvcst_put.c:299
 #23 gvcst_incr (increment=0x55b9581a05a0, result=0x7f59d8009410) at sr_port/gvcst_incr.c:56
 #24 op_gvincr (increment=0x55b9581a05a0, result=0x7f59d8009410) at sr_port/op_gvincr.c:58

The fix for this issue is to not invoke exit handling while inside the timer handler if we know
SimpleThreadAPI is active. In that case, finish the timer handler first and invoke exit handling
a little later in mainline code where it is safe to invoke exit handling.
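
A minimal sketch of postponing exit handling until the timer handler has returned. All demo_* names are hypothetical and the sketch is single-threaded for clarity.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_in_timer_handler = false;
static bool demo_threaded_api_active = true;
static bool demo_exit_pending = false;
static int demo_exit_runs = 0;

static void demo_run_exit_handling(void)
{
	demo_exit_runs++;
	demo_exit_pending = false;
}

/* Called wherever interrupts are re-enabled. */
static void demo_deferred_exit_check(void)
{
	if (!demo_exit_pending)
		return;
	if (demo_in_timer_handler && demo_threaded_api_active)
		return;		/* not safe here; leave the exit pending */
	demo_run_exit_handling();
}

static void demo_timer_handler(void)
{
	demo_in_timer_handler = true;
	/* ... timer work; re-enabling interrupts at the end runs the check ... */
	demo_deferred_exit_check();
	demo_in_timer_handler = false;
}

/* Mainline code reaches the same check later, outside the handler. */
static void demo_mainline_step(void)
{
	demo_deferred_exit_check();
}
```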
chathaway-codes pushed a commit that referenced this issue Jan 15, 2019
…SimpleThreadAPI mode

* In gtm_exit_handler(), which is the function guaranteed to be invoked when a YottaDB process needs
  to exit, if SimpleThreadAPI is in effect and we are not the MAIN worker thread, call ydb_exit() so
  worker/tp threads are signaled to exit, exit handler is driven from the MAIN worker thread, and we
  wait for all those threads to terminate before ydb_exit() returns.

* In ydb_exit(), at function entry, if simpleThreadAPI_active is TRUE and we are not the MAIN
  worker thread, set a global variable "forced_thread_exit" to TRUE and send a signal through
  pthread_cond_signal() to indicate to the MAIN and TP worker threads that they need to exit at
  a logical point. Then wait for those threads to die and return to the caller. Also fix
  various edge cases in ydb_exit() so it is multi-thread safe always.  A lot of the SimpleThreadAPI
  related cleanup code has now been moved to ydb_stm_thread.c where the MAIN worker thread runs. It
  does all this cleanup when it exits. Some "SEE TODO" items are also taken care of now and so removed.

* ydb_stm_args*() functions now do not wait for the call block to be serviced by the MAIN worker thread
  in case "forced_thread_exit" is TRUE. They return with a CALLINAFTERXIT error in this case. This ensures
  that new calls to SimpleThreadAPI functions after a ydb_exit() is done no longer queue a request to
  the MAIN worker thread (which is non-existent or in the process of concurrently exiting).

* In the MAIN worker thread (ydb_stm_thread.c), check if forced_thread_exit is TRUE. If so, go through the
  queue and service each waiting request with a CALLINAFTERXIT error as the return value. And then
  invoke the exit handler gtm_exit_handler() and then wait for TP worker threads to terminate and do
  various SimpleThreadAPI data structure cleanup before exiting from the worker thread.

* In the TP worker thread (ydb_stm_tpthread.c), check if forced_thread_exit is TRUE. If so, go through the
  queue and service each waiting request with a CALLINAFTERXIT error as the return value. And then exit.
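
A minimal sketch of the queue draining described in the last two bullets, assuming a simple singly linked request queue. The struct layout and the function names (request, work_queue, fail_pending_requests, worker_pass) are hypothetical; only the numeric value of the error code is taken from the libydberrors.h line quoted later in this thread. Locking of the queue is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value as quoted from libydberrors.h elsewhere in this thread. */
#define ERR_CALLINAFTERXIT (-150381530)

/* Hypothetical request record; the real call block layout is not shown here. */
struct request {
    struct request *next;
    int             retval;
    bool            serviced;
};

struct work_queue {
    struct request *head;
};

static bool forced_thread_exit_flag;    /* stands in for "forced_thread_exit" */

/* Complete every queued request with the error instead of executing it. */
static size_t fail_pending_requests(struct work_queue *q)
{
    size_t n = 0;

    while (NULL != q->head) {
        struct request *r = q->head;

        q->head = r->next;
        r->next = NULL;
        r->retval = ERR_CALLINAFTERXIT;
        r->serviced = true;             /* the waiting caller can now return */
        n++;
    }
    return n;
}

/* One pass of the worker loop; returns true when the worker should exit. */
static bool worker_pass(struct work_queue *q)
{
    if (forced_thread_exit_flag) {
        fail_pending_requests(q);
        return true;
    }
    /* Normal servicing of q->head would go here. */
    return false;
}
```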
chathaway-codes pushed a commit that referenced this issue Jan 15, 2019
…ids YDB_ERR_INVTPTRANS errors while queueing the LYDB_RTN_TP_ROLLBACK_TLVL0 request (issue exposed by r124/ydb383 subtest in SimpleThreadAPI mode)
chathaway-codes pushed a commit that referenced this issue Jan 17, 2019
…reads in core for better debugging (only for DEBUG builds)

A prior commit enabled a similar change in sr_unix/ch_cond_core.c but that was done only in case of
a fatal error in the YottaDB engine. It is possible for ydb_fork_n_core/gtm_fork_n_core to be called
without the engine encountering a fatal error (e.g. a test C program that uses the SimpleThreadAPI
could encounter a YDB_ASSERT macro failure which will invoke ydb_fork_n_core) and we want the
C-stack of all threads even in that case for better debugging.

Note that because of this change, the code flow for PRO vs DBG builds is different. In PRO, one
would invoke a ydb_fork_n_core/gtm_fork_n_core, generate a core (to create a snapshot of the process
state for later debugging) but the process would continue. Whereas in DBG, the process would create
the core and terminate right then. Given this is done only in DBG builds, it is considered okay.
chathaway-codes pushed a commit that referenced this issue Jan 17, 2019
…ad_join during exit handling) by instead doing non-blocking join of MAIN/TP worker threads in a sleep-loop and sending multiple wake-ups

* The main changes are in sr_unix/gtmci.c and sr_unix/ydb_stm_thread.c.
  These are necessary to fix a deadlock that happens when the thread invoking ydb_exit() does
  a "pthread_cond_signal" to wake up a MAIN/TP worker thread but the receiving thread is not yet
  in a "pthread_cond_wait". The wake up signal sent is therefore lost. And this implies that the
  "pthread_join" that the ydb_exit() thread runs will hang forever in case the receiving worker
  thread soon afterwards goes to do a "pthread_cond_wait". This is now fixed to do a non-blocking
  join (using the Linux-specific pthread_tryjoin_np() function) in a sleep-loop and do a
  "pthread_cond_signal" in each iteration of the loop. Additionally, the cond/mutex variables
  across the various structures (stmWorkQueue, stmTPWorkQueue) are now destroyed only after they
  have been used for waiting/signaling a wake-up and once they are definitely no longer needed.
  This meant moving the destroy logic for those cond/mutex variables used by the MAIN worker thread
  to the ydb_exit()-invoking thread and moving the destroy logic for those cond/mutex variables
  used by the TP worker thread to the MAIN worker thread.

* Also fixed cosmetic tab vs space issues in sr_unix/libyottadb.h
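
The non-blocking join loop from the first bullet might look roughly like the sketch below. The worker_ctl structure, worker_main and stop_worker are hypothetical names, and the 1 millisecond nap is an arbitrary choice; pthread_tryjoin_np() is a GNU extension, hence the _GNU_SOURCE define. The sketch also guards the wait with a predicate, which by itself keeps a wake-up from being lost; the loop is shown because it is the approach the commit describes.

```c
#define _GNU_SOURCE             /* needed for pthread_tryjoin_np() */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical control block shared between the worker and the exiting thread. */
struct worker_ctl {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            exit_requested;
};

static void *worker_main(void *arg)
{
    struct worker_ctl *ctl = arg;

    pthread_mutex_lock(&ctl->mutex);
    while (!ctl->exit_requested)
        pthread_cond_wait(&ctl->cond, &ctl->mutex);
    pthread_mutex_unlock(&ctl->mutex);
    return NULL;
}

/* Ask the worker to stop and join it without blocking. Returns wake-ups sent. */
static int stop_worker(pthread_t tid, struct worker_ctl *ctl)
{
    const struct timespec nap = { 0, 1000000 };   /* 1 millisecond */
    int wakeups = 0;
    int rc;

    pthread_mutex_lock(&ctl->mutex);
    ctl->exit_requested = true;
    pthread_mutex_unlock(&ctl->mutex);
    do {
        pthread_cond_signal(&ctl->cond);    /* resend the wake-up on every iteration */
        wakeups++;
        nanosleep(&nap, NULL);
        rc = pthread_tryjoin_np(tid, NULL); /* EBUSY while the worker is still running */
    } while (EBUSY == rc);
    assert(0 == rc);
    /* Destroy the cond/mutex only once the worker can no longer use them. */
    pthread_cond_destroy(&ctl->cond);
    pthread_mutex_destroy(&ctl->mutex);
    return wakeups;
}
```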
chathaway-codes pushed a commit that referenced this issue Jan 17, 2019
…ll onto stderr/syslog; Just return YDB_ERR_CALLINAFTERXIT

Writing to stderr or syslog (through gtm_putmsg_csa or send_msg_csa) is user-unfriendly for what is
a programming error. Best to return the error through the invoked YottaDB function so caller can
then handle that as appropriate.
chathaway-codes pushed a commit that referenced this issue Jan 17, 2019
… CALLINAFTERXIT errors; Ensure ydb_zstatus() returns error string after a CALLINAFTERXIT return from ydb_init

Ensure we get the YottaDB engine pthread mutex before checking for STAPIFORKEXEC or CALLINAFTERXIT
errors. This is needed since SETUP_GENERIC_ERROR macro (which is invoked in both these error scenarios)
operates on global variables (e.g. dollar_zstatus) and therefore has to be multi-thread safe since
ydb_init() is supposed to be multi-thread safe.

Invoke SETUP_GENERIC_ERROR macro in case of a CALLINAFTERXIT error in ydb_init(). This error was being
issued in two places in the same function. The duplication is now removed too.
chathaway-codes pushed a commit that referenced this issue Jan 17, 2019
…k function in TP worker thread

The v60000/gtm4525b subtest failed (1 out of 100 runs or so) with the following assert failure.
The TP worker thread asked for a op_trollback to be done by the MAIN worker thread (using the
LYDB_RTN_TP_ROLLBACK_TLVL0 opcode) but instead got a YDB_ERR_CALLINAFTERXIT status returned.
This is possible because the test does a MUPIP STOP (i.e. kill -15) of processes which would
cause the MAIN/TP worker threads to be signaled to terminate at which point, any more requests
from them will return with YDB_ERR_CALLINAFTERXIT. The assert is now modified to take this into
account. Although this did not show up in the test failure, a similar issue exists with the opcode
LYDB_RTN_TP_ROLLBACK_TLVL0 and so even in that case we now handle the possibility of
YDB_ERR_CALLINAFTERXIT.

Below is a gdb session of the core failure for the record.

 #0 pthread_kill () from /usr/lib64/libpthread.so.0
 #1 gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2 ch_cond_core () at sr_unix/ch_cond_core.c:76
 #3 rts_error_va () at sr_unix/rts_error.c:194
 #4 rts_error_csa () at sr_unix/rts_error.c:101
 #5 ydb_stm_tpthreadq_process () at sr_unix/ydb_stm_tpthread.c:259
 #6 ydb_stm_tpthread (parm=0x0) at sr_unix/ydb_stm_tpthread.c:83
 #7 start_thread () from /usr/lib64/libpthread.so.0
 #8 clone () from /usr/lib64/libc.so.6

(gdb) f 5
 #5  ydb_stm_tpthreadq_process () at sr_unix/ydb_stm_tpthread.c:259
259                                                     assert(YDB_TP_ROLLBACK == rlbk_retval);

(gdb) p rlbk_retval
$1 = -150381530

(gdb) p int_retval
$2 = -150381530

libydberrors.h:#define YDB_ERR_CALLINAFTERXIT -150381530
chathaway-codes pushed a commit that referenced this issue Jan 18, 2019
…TP transaction in final retry when SIGTERM/SIG-15 is received), MAIN/TP worker threads should service requests without YDB_ERR_CALLINAFTERXIT errors until the TP transaction commits

If SIGTERM is sent and generic_signal_handler() gets invoked, if we find that DEFER_EXIT_PROCESSING
is TRUE, we do not invoke the exit handler right away but instead defer it until it is safe to
start exit handler processing. But we do invoke SET_FORCED_EXIT_STATE which would set the global
variable "forced_thread_exit" to TRUE. And since this is the variable currently relied upon by the
SimpleThreadAPI worker threads (ydb_stm_thread.c and ydb_stm_tpthread.c), they would return
YDB_ERR_CALLINAFTERXIT on all pending requests in their work queues. But this means that if a TP
transaction is active and in the final retry (which also means we are holding crit on the database)
at the time the SIGTERM was received and exit handling was deferred, this TP transaction will never
commit fine because any ydb_*_s() requests done inside this callback function after the SIGTERM signal
got sent would return with a YDB_ERR_CALLINAFTERXIT error. This is not desirable as we want the
crit-holding transaction to be done as soon as possible and in a clean fashion.

Therefore, the worker threads design is reworked a bit to now rely on "forced_simplethreadapi_exit",
a new global variable. On seeing this, they will exit right away (what they previously used to do
when they saw "forced_thread_exit" to be TRUE).

ydb_exit() and generic_signal_handler() will invoke SET_FORCED_EXIT_STATE to set "forced_thread_exit"
to TRUE whenever they want the SimpleThreadAPI process to terminate at the next logical point.
The MAIN worker thread, before it attempts to service any request, will check "forced_thread_exit"
and if it finds this to be TRUE, but finds "forced_simplethreadapi_exit" to be FALSE and OK_TO_INTERRUPT
is TRUE, it will set "forced_simplethreadapi_exit" to be TRUE to indicate the logical point has been
reached and that the worker thread should exit right away.
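
The check the MAIN worker thread makes before servicing a request can be sketched as below. The two flag names are the ones used in this commit message; ok_to_interrupt is a plain variable standing in for the OK_TO_INTERRUPT macro, and check_exit_before_request is a hypothetical function name. The real code is more involved than this.

```c
#include <assert.h>
#include <stdbool.h>

static bool forced_thread_exit;             /* set by ydb_exit() or the signal handler */
static bool forced_simplethreadapi_exit;    /* set once the logical point is reached */
static bool ok_to_interrupt;                /* stand-in for the OK_TO_INTERRUPT macro */

/* Run before servicing each request; returns true if the worker must exit right away. */
static bool check_exit_before_request(void)
{
    if (forced_thread_exit && !forced_simplethreadapi_exit && ok_to_interrupt)
        forced_simplethreadapi_exit = true; /* logical point reached */
    return forced_simplethreadapi_exit;
}
```

While ok_to_interrupt is false (for example, a TP transaction in its final retry), requests keep being serviced even though forced_thread_exit is already TRUE.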
chathaway-codes pushed a commit that referenced this issue Jan 18, 2019
…eads to reach logical point before starting exit handler processing

We had a test failure (in the dual_fail_extend/dual_fail2_mustop_sigquit subtest) where a SimpleThreadAPI
process was sent a SIG-15 by the test and the signal got delivered to the MAIN worker thread but it
went ahead with exit handler processing (including rolling back an active TP transaction) while a
TP worker thread was concurrently running the TP callback function without realizing all of this going on.
The TP worker thread effectively got an INVTPTRANS error since it was using a non-zero tptoken in a
ydb_set_st() call when there was no active TP transaction (due to the exit handler doing an op_trollback()).

The fix is to defer exit processing in generic_signal_handler.c if we find out that we are the
MAIN worker thread. This way the MAIN worker thread will invoke the exit handler gtm_exit_handler()
inside ydb_stm_thread() when it knows it is a logical/safe point to do so.

In addition, deferred_signal_handler() is now fixed to skip invoking the exit handler in case we
are the MAIN worker thread. This is because ydb_stm_thread() has an already established scheme
(using "forced_simplethreadapi_exit" global variable) to determine the logical point and then invoke
gtm_exit_handler().

Below is the C-stack of all threads at the time of the core for the record.

(gdb) thread apply all bt

Thread 3 (Thread 0x7fde4cb67700 (LWP 14698)):
 #0  fsync () from /usr/lib64/libc.so.6
 #1  jnl_fsync (reg=0x55af6c90e7b8, fsync_addr=38517184) at sr_unix/jnl_fsync.c:134
 #2  wcs_flu (options=519) at sr_unix/wcs_flu.c:413
 #3  gds_rundown (cleanup_udi=1) at sr_unix/gds_rundown.c:608
 #4  gv_rundown () at sr_port/gv_rundown.c:123
 #5  gtm_exit_handler () at sr_unix/gtm_exit_handler.c:216
 #6  __run_exit_handlers () from /usr/lib64/libc.so.6
 #7  exit () from /usr/lib64/libc.so.6
 #8  gtm_image_exit (status=-15) at sr_unix/gtm_image_exit.c:27
 #9  generic_signal_handler (sig=15, info=0x7fde4cb66830, context=0x7fde4cb66700) at sr_unix/generic_signal_handler.c:380
 #10 <signal handler called>
 #11 pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
 #12 ydb_stm_thread (parm=0x0) at sr_unix/ydb_stm_thread.c:123
 #13 start_thread () from /usr/lib64/libpthread.so.0
 #14 clone () from /usr/lib64/libc.so.6

Thread 2 (Thread 0x7fde510c6dc0 (LWP 14695)):
 #0  do_futex_wait.constprop () from /usr/lib64/libpthread.so.0
 #1  __new_sem_wait_slow.constprop.0 () from /usr/lib64/libpthread.so.0
 #2  ydb_stm_args (callblk=0x55af6c96b550) at sr_unix/ydb_stm_args.c:183
 #3  ydb_stm_args5 (tptoken=0, errstr=0x0, calltyp=16, p1=94211928230125, p2=140733677288928, p3=94211928265280, p4=1, p5=140733677288912) at sr_unix/ydb_stm_args.c:320
 #4  ydb_tp_st (tptoken=0, errstr=0x0, tpfn=0x55af6c8408ed <tpfn_stage1>, tpfnparm=0x7fff1cd7bde0, transid=0x55af6c849240 <tptypebuff> "BATCH", namecount=1, varnames=0x7fff1cd7bdd0) at sr_unix/ydb_tp_st.c:33
 #5  impjob (childnum=2) at simplethreadapi_imptp.c:1148
 #6  main (argc=1, argv=0x7fff1cd7c198) at simplethreadapi_imptp.c:602

Thread 1 (Thread 0x7fde47fff700 (LWP 14705)):
 #0  pthread_kill () from /usr/lib64/libpthread.so.0
 #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:72
 #2  ch_cond_core () at sr_unix/ch_cond_core.c:76
 #3  rts_error_va (csa=0x0, argcnt=7, var=0x7fde47ffeaa0) at sr_unix/rts_error.c:194
 #4  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:101
 #5  ydb_stm_args (callblk=0x7fde40000b20) at sr_unix/ydb_stm_args.c:126
 #6  ydb_stm_args4 (tptoken=7085, errstr=0x0, calltyp=12, p1=94211928265184, p2=2, p3=94211928261632, p4=94211928263568) at sr_unix/ydb_stm_args.c:298
 #7  ydb_set_st (tptoken=7085, errstr=0x0, varname=0x55af6c8491e0 <ygbl_arandom>, subs_used=2, subsarray=0x55af6c848400 <subscr>, value=0x55af6c848b90 <ybuff_val>) at sr_unix/ydb_set_st.c:33
 #8  tpfn_stage1 (tptoken=7085, errstr=0x0, parm_array=0x7fff1cd7bde0) at simplethreadapi_imptp.c:1384
 #9  ydb_stm_tpthreadq_process (curTPWorkQHead=0x7fde48024c40, forced_simplethreadapi_exit_seen=0x7fde47ffeea8) at sr_unix/ydb_stm_tpthread.c:225
 #10 ydb_stm_tpthread (parm=0x0) at sr_unix/ydb_stm_tpthread.c:84
 #11 start_thread () from /usr/lib64/libpthread.so.0
 #12 clone () from /usr/lib64/libc.so.6
chathaway-codes pushed a commit that referenced this issue Jan 18, 2019
…ay without waiting for a logical point to be reached (i.e. forced_simplethreadapi_exit == FALSE)

A new function ydb_stm_thread_exit() is added to sr_unix/ydb_stm_thread.c.
This takes care of exit handling related activities for the MAIN worker thread.

It is invoked from ydb_stm_thread() when a logical point has been reached where it is safe to exit.
This is the case where "forced_simplethreadapi_exit" is TRUE.

ydb_stm_thread_exit() is also invoked (from gtm_exit_handler()) in case "forced_simplethreadapi_exit"
is FALSE. In this case, it is possible we will never reach a logical point for safe exit (i.e. the TP
worker thread can never terminate as it is waiting for the MAIN worker thread to service a request which
has been interrupted to handle the exit handler request). To avoid deadlock in these cases, the MAIN
worker thread does not wait indefinitely for the TP worker thread to terminate. Instead it waits
for 1000 iterations of 1 microsecond each, for a total of around 1 millisecond per TP worker thread,
before moving on with YottaDB exit handling.

Additionally, changes to sr_port/deferred_signal_handler.c and sr_unix/generic_signal_handler.c
that deferred exit processing in case we are the MAIN worker thread (done in a prior commit
305fe69) are now reverted. This is because it is possible the MAIN
worker thread is servicing a long running command (e.g. ydb_ci_t of a call-in M program that runs
forever until stopped using a kill -15 as is done in the dual_fail_extend/dual_fail2_mustop_sigquit
subtest). In that case if we defer the signal, the MAIN worker thread that is in ydb_ci_t() will
never come back to ydb_stm_threadq_dispatch() which means the process will never terminate if we
defer exit handling.  With the introduction of ydb_stm_thread_exit() in the current commit, the
MAIN worker thread will wait for TP worker threads to terminate and time out (instead of waiting
indefinitely) and continue with exit processing. This wait should address the original issue raised
by the prior commit and so it is okay to revert these two module changes from that commit.
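
The bounded wait (1000 iterations of 1 microsecond) can be sketched with a generic predicate, as below. wait_bounded and countdown_done are hypothetical names; in the real code the condition would be "the TP worker thread has terminated", and how that is detected is not shown in this commit message. Note that nanosleep() may sleep longer than requested, so the total can exceed the nominal 1 millisecond.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define MAX_WAIT_ITERS 1000     /* iteration count quoted in the commit message */

/* Poll done(arg) up to MAX_WAIT_ITERS times, sleeping 1 microsecond between polls.
 * Returns true if the condition was met, false if the wait timed out.
 */
static bool wait_bounded(bool (*done)(void *), void *arg)
{
    const struct timespec one_usec = { 0, 1000 };   /* 1000 nanoseconds */
    int i;

    for (i = 0; i < MAX_WAIT_ITERS; i++) {
        if (done(arg))
            return true;
        nanosleep(&one_usec, NULL);
    }
    return done(arg);           /* one last look before giving up */
}

/* Example predicate for illustration: "done" once the counter has run down to zero. */
static bool countdown_done(void *arg)
{
    int *remaining = arg;

    if (0 < *remaining) {
        (*remaining)--;
        return false;
    }
    return true;
}
```

A false return is the timeout case, where the caller moves on with exit handling instead of waiting indefinitely.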
chathaway-codes pushed a commit that referenced this issue Jan 18, 2019
…EGV/SIG-11) in multi-thread processes

Problem statement
-----------------
In a multi-thread process, if a SIG-11 happens in say the TP worker thread, the signal handler
generic_signal_handler() is invoked. That in turn notices the current thread is not the MAIN
worker thread and so forwards the SIG-11 from the TP worker thread to the MAIN worker thread.
And generic_signal_handler() is invoked again in the MAIN worker thread. It is this invocation
that will call gtm_fork_n_core() for the fatal SIGSEGV signal. And since gtm_fork_n_core() does
a fork and then dumps the core file, it is the MAIN worker thread's C-stack that will be captured
in the core file (a fork only inherits the current thread's C-stack) making the core file unusable
since we are interested in the SIG-11 that happened in the TP worker thread.

Fix
---
As part of the FORWARD_SIG_TO_MAIN_THREAD_IF_NEEDED macro invoked by the TP worker thread (in the
above example on a SIG-11), after it forwards the signal to the MAIN worker thread, it does not
immediately return but waits for a signal (through a new global variable "safe_to_fork_n_core")
from the MAIN worker thread to indicate it is safe to do a "gtm_fork_n_core" call from the TP
worker thread even though it is not the MAIN worker thread. The MAIN worker thread pauses execution
while this core dump happens in the TP worker thread and then continues with its cleanup.

A new macro MULTI_THREAD_AWARE_FORK_N_CORE is invoked by the MAIN worker thread in generic_signal_handler()
wherever it needs to do a gtm_fork_n_core(). This macro checks if this is a RAISED signal (e.g. SIGSEGV,
SIGILL etc.) and if so sets safe_to_fork_n_core to TRUE and waits for this to be reset to FALSE (will be
reset by the TP worker thread or whichever thread got the SIGSEGV and is in the
FORWARD_SIG_TO_MAIN_THREAD_IF_NEEDED macro). If it is not a RAISED signal, this macro does a gtm_fork_n_core()
in the MAIN worker thread itself.
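
The handshake through "safe_to_fork_n_core" can be sketched with two ordinary threads, as below. This is an assumption-laden illustration: the real code runs inside signal handlers and macros, whereas the sketch uses C11 atomics and plain functions, and core_dumps_taken is a hypothetical counter standing in for the gtm_fork_n_core() call.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool safe_to_fork_n_core;  /* flag name taken from the commit message */
static atomic_int  core_dumps_taken;     /* hypothetical stand-in for gtm_fork_n_core() */

/* Thread that received the fatal signal (e.g. the TP worker thread). */
static void *faulting_thread(void *unused)
{
    (void)unused;
    /* The signal has already been forwarded; wait for MAIN to grant permission. */
    while (!atomic_load(&safe_to_fork_n_core))
        sched_yield();
    atomic_fetch_add(&core_dumps_taken, 1);     /* dump core from THIS thread's stack */
    atomic_store(&safe_to_fork_n_core, false);  /* tell MAIN it may continue */
    return NULL;
}

/* MAIN worker thread side, for a raised signal such as SIGSEGV. */
static void main_thread_allow_core(void)
{
    atomic_store(&safe_to_fork_n_core, true);   /* grant permission */
    while (atomic_load(&safe_to_fork_n_core))   /* pause until the dump is done */
        sched_yield();
}
```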
chathaway-codes pushed a commit that referenced this issue Jan 24, 2019
…ODE errors; Ensure errstr is filled in case of ydb_*_st() or ydb_*_t() calls which return these two errors; Nix INVAPIMODE error and instead add SIMPLEAPINOTALLOWED and THREADEDAPINOTALLOWED errors

Note that whenever a SimpleThreadAPI function is mentioned below, it means a function of the form
ydb_*_st() or ydb_*_t().

* The primary issue is that SimpleThreadAPI functions could return with a YDB_ERR_CALLINAFTERXIT
  error in case the YottaDB engine has been shutdown (e.g. ydb_exit()).  But in that case, the "errstr"
  parameter was not filled in. It would contain garbage strings which is not user-friendly. It
  would be desirable to also fill "errstr" with the actual error string corresponding to
  CALLINAFTERXIT error. Towards this, a new macro SET_STAPI_ERRSTR_MULTI_THREAD_SAFE has been
  introduced which sets the errstr to the $zstatus corresponding to any passed in valid error code
  (e.g. YDB_ERR_CALLINAFTERXIT). This is a multi-thread safe macro (i.e. does not use the YottaDB
  engine other than to read the error message string table which is a read-only structure anyways)
  and hence can be safely invoked from SimpleThreadAPI function calls.

* While analyzing the above primary issue, I realized that YDB_ERR_CALLINAFTERXIT can also be issued
  from ydb_init() which could be implicitly called by any of the SimpleThreadAPI functions through
  the LIBYOTTADB_RUNTIME_CHECK* macros. Therefore those macros were redesigned to pass an "errstr"
  if the caller has access to it (i.e. if the caller is a SimpleThreadAPI function). Callers which
  do not have access to an "errstr" (e.g. ydb_*_s() functions) will pass a NULL parameter instead.
  The LIBYOTTADB_RUNTIME_CHECK* macros now invoke SET_STAPI_ERRSTR_MULTI_THREAD_SAFE to set "errstr"
  in case ydb_init() returns a non-zero status.

* By a similar logic, the VERIFY_THREADED_API* macros are also enhanced to pass an additional
  "errstr" parameter. This is because this macro is called by all SimpleThreadAPI functions
  (in addition to the LIBYOTTADB_RUNTIME_CHECK* macro) before invoking ydb_stm_args*().

* Since the VERIFY_THREADED_API* macros can issue an INVAPIMODE error, this error also needs to fill
  in errstr (like the CALLINAFTERXIT error is fixed above). But this message has parameters
  that need to be substituted. Since this error is issued in SimpleThreadAPI functions when
  they do not yet run in the MAIN worker thread, the INVAPIMODE error issuing logic (which used
  SETUP_GENERIC_ERROR_4PARMS, a routine that is not multi-thread safe) had to be reworked to
  instead use SET_STAPI_ERRSTR_MULTI_THREAD_SAFE. But in order to use that, we needed an error
  message with no parameters.  Since the INVAPIMODE message has only two possibilities, we instead
  create two new messages with the appropriate text. That way the two new messages do not need any
  parameters and hence can be used with SET_STAPI_ERRSTR_MULTI_THREAD_SAFE. SIMPLEAPINOTALLOWED
  and THREADEDAPINOTALLOWED are the new messages and INVAPIMODE is nixed.
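
Why parameterless messages help can be seen in a small sketch: filling the caller's buffer then needs nothing but a lookup in a read-only table. Everything here is hypothetical (the codes, the placeholder texts, the err_table layout and the set_errstr function); it is not the SET_STAPI_ERRSTR_MULTI_THREAD_SAFE macro itself, only an illustration of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical codes and placeholder texts; not the real YottaDB values or messages. */
enum { DEMO_ERR_SIMPLEAPI = 1, DEMO_ERR_THREADEDAPI = 2 };

struct err_entry {
    int         code;
    const char *text;
};

/* Read-only table: safe to read from any thread without locking. */
static const struct err_entry err_table[] = {
    { DEMO_ERR_SIMPLEAPI,   "SIMPLEAPINOTALLOWED (placeholder text)" },
    { DEMO_ERR_THREADEDAPI, "THREADEDAPINOTALLOWED (placeholder text)" },
};

/* Copy the message for "code" into the caller's buffer, truncating if needed.
 * Touches only the read-only table and caller-owned memory.
 * Returns the number of characters copied (0 if the code is unknown).
 */
static size_t set_errstr(int code, char *buf, size_t buflen)
{
    size_t i;

    if (NULL == buf || 0 == buflen)
        return 0;
    for (i = 0; i < sizeof(err_table) / sizeof(err_table[0]); i++) {
        if (err_table[i].code == code) {
            size_t len = strlen(err_table[i].text);

            if (len >= buflen)
                len = buflen - 1;
            memcpy(buf, err_table[i].text, len);
            buf[len] = '\0';
            return len;
        }
    }
    buf[0] = '\0';
    return 0;
}
```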
chathaway-codes pushed a commit that referenced this issue Jan 28, 2019
…ndlers have been invoked

The v62002/gtm6638 subtest failed once in a while on some systems with the following diff.

39c39,40
< Pass
---
> Alarm clock
> Fail: expected=559 actual=

This test runs simplethreadapi 3n+1 for a while. As part of recent changes for YottaDB signal
handlers to co-exist with non-YottaDB signal handlers, there is now code in ydb_exit() that
resets the signal handlers to their non-YottaDB versions. But this is done BEFORE invoking
gtm_exit_handler() in the MAIN worker thread in case ydb_exit() is invoked in some other thread in
a SimpleThreadAPI process. This means that it is possible that a SIGALRM timer is still active
at the time we reset the SIGALRM signal handler in ydb_exit() but before gtm_exit_handler()
(which does a cancel_timer()) has been invoked. If due to timing scenarios, this timer actually
pops before the cancel_timer() is done, the non-YottaDB SIGALRM handler will kick in. And I think
the system default handler for SIGALRM prints the "Alarm clock" message and just exits the process
(without also invoking YottaDB exit handler). This is most likely what caused the test failure.

The flow in ydb_exit() has been reworked so the signal handler reset happens AFTER the MAIN
worker thread has exited (i.e. after it has invoked gtm_exit_handler()).

Also noticed there was a pre-existing race condition in ydb_exit() (with multiple concurrent
invocations from different threads) which is now addressed by ensuring we hold the ydb engine
thread lock for the entire duration of the ydb_exit() even in the SimpleThreadAPI case
(when the now-nixed "wait_for_main_worker_thread_to_die" variable was TRUE).

Additionally, noticed the "struct sigaction" structure is not memset() to 0 before setting
the SIGALRM handler in init_timers() in gt_timers.c. Fixed that too just in case it can cause
other issues.
chathaway-codes pushed a commit that referenced this issue Jan 30, 2019
…o pulling in ydb_stm_thread_exit() unnecessarily (in generic_signal_handler.c)
chathaway-codes pushed a commit that referenced this issue Mar 25, 2019
…trlc_handler.c or SIGCONT in continue_handler.c etc.) to main thread (a change that was missed out as part of #205)
chathaway-codes pushed a commit that referenced this issue Mar 25, 2019
…signal handler in MAIN worker thread or in YottaDB engine lock holding thread just before releasing thread lock

continue_handler()/jobexam_process()/jobexam_signal_handler rely on info and context being filled in
but in SimpleThreadAPI environment they will not be due to signal forwarding to main worker thread.
It is a pre-existing issue in SimpleThreadAPI in r1.24 (#205) that is now fixed. Not considered
likely to be encountered in practice so no user-visible issue created for this separately.
chathaway-codes pushed a commit that referenced this issue Apr 10, 2019
The following assert failed in the simplethreadapi/externalcall subtest when a ydb_ci_t()
call invoked an M routine that did an external call which in turn invoked a ydb_hiber_start().

%YDB-F-ASSERT, Assert failed in sr_unix/ydb_hiber_start.c line 35
	for expression (LYDB_RTN_NONE == TREF(libyottadb_active_rtn))

This is now fixed to take into account that TREF(libyottadb_active_rtn) can be LYDB_RTN_YDB_CI.