Skip to content

Commit

Permalink
(core) don't force adopt handler ack on ep-create
Browse files Browse the repository at this point in the history
Fun one. The flow for appl switching with adoption as well as with crash
recovery first runs the application entry point then the adopt handler.

This is deliberate so that window/resource management code doesn't
suffer from initialization order problems. The side effect from that is
that if a frameserver connection is established in the entry point, it
too will be exposed in the adoption handler. This has mostly been
relevant for durden, but it has had a deferred / reopen on termination
scheme that masked the issue.

When testing a12- based hot network reloading with the @stdin bit, the
net_open call does create a frameserver for the monitoring parent to
connect through in order for shmif event mapping to be less painful.
This had the side effect of getting pruned immediately through the
missing _adopt handler, uncovering the issue.

The solution as it stands is to track which reset- counter a frameserver
was created and registered in. If they are the same it means that it was
actually created in the main entry point and the script can be assumed
having proper tracking for it.
  • Loading branch information
letoram committed Nov 2, 2023
1 parent 9371c2e commit e026285
Show file tree
Hide file tree
Showing 5 changed files with 31 additions and 0 deletions.
9 changes: 9 additions & 0 deletions src/engine/arcan_conductor.c
Expand Up @@ -14,6 +14,7 @@
/* defined in platform.h, used in psep open, shared memory */
_Atomic uint64_t* volatile arcan_watchdog_ping = NULL;
static size_t gpu_lock_bitmap;
static int reset_counter;

void arcan_conductor_enable_watchdog()
{
Expand Down Expand Up @@ -251,6 +252,7 @@ void arcan_conductor_register_frameserver(struct arcan_frameserver* fsrv)
{
size_t dst_i = 0;
alloc_frameserver_struct();
fsrv->desc.recovery_tick = reset_counter;

/* safeguard */
ssize_t src_i = find_frameserver(fsrv);
Expand Down Expand Up @@ -591,6 +593,13 @@ bool arcan_conductor_valid_cycle()
return valid_cycle;
}

int arcan_conductor_reset_count(bool step)
{
if (step)
reset_counter++;
return reset_counter;
}

static int trigger_video_synch(float frag)
{
conductor.set_deadline = -1;
Expand Down
2 changes: 2 additions & 0 deletions src/engine/arcan_conductor.h
Expand Up @@ -117,6 +117,8 @@ void arcan_conductor_setsynch(const char* arg);
void arcan_conductor_fakesynch(uint8_t left_ms);

#ifndef VIDEO_PLATFORM_IMPL
int arcan_conductor_reset_count(bool step);

/* Update the priority target to match the specified frameserver. This
* means that heuristics driving synchronization will be biased towards
* letting the specific fsrv align synchronization - if the synchronization
Expand Down
10 changes: 10 additions & 0 deletions src/engine/arcan_frameserver.h
Expand Up @@ -93,6 +93,16 @@ struct arcan_frameserver_meta {
unsigned long long framecount;
unsigned long long dropcount;
unsigned long long lastpts;

/* This one was added to have a way to mark objects that were created in the
* main entrypoint of the lua scripts (that run before _adopt) but in recovery
* would still be exposed to adopt. If the developer would reject it there
* (from not implementing the handler or doing it in a contrived way) the
* frameserver would be immediately destroyed, causing a fs- race condition.
*
* This counter indicates which recovery- window the object was created in,
* so that the automatic deletion facility does not kill it off prematurely. */
unsigned recovery_tick;
};

struct frameserver_audsrc {
Expand Down
7 changes: 7 additions & 0 deletions src/engine/arcan_lua.c
Expand Up @@ -711,6 +711,7 @@ void arcan_lua_adopt(struct arcan_luactx* ctx)

arcan_vobj_id delids[n_fsrv];
size_t delcount = 0;
int tc = arcan_conductor_reset_count(false);

/* three: forward to adopt function (or delete) */
for (count = 0; count < n_fsrv; count++){
Expand All @@ -722,6 +723,12 @@ void arcan_lua_adopt(struct arcan_luactx* ctx)
fsrv->tag = LUA_NOREF;

bool delete = true;

/* use this to determine if the frameserver was created in the appl entry
* and thus should not be exposed as an adopt handler */
if (fsrv->desc.recovery_tick == tc)
continue;

if (alt_lookup_entry(ctx, "adopt", sizeof("adopt") - 1) &&
arcan_video_getobject(ids[count]) != NULL){
lua_pushvid(ctx, vobj->cellid);
Expand Down
3 changes: 3 additions & 0 deletions src/engine/arcan_main.c
Expand Up @@ -590,6 +590,7 @@ int MAIN_REDIR(int argc, char* argv[])
* invoke adopt() in the script
*/
bool adopt = false, in_recover = false;

int jumpcode = setjmp(arcanmain_recover_state);
int saved, truncated;

Expand All @@ -599,6 +600,7 @@ int MAIN_REDIR(int argc, char* argv[])
/* close down the database and reinitialize with the name of the new appl */
arcan_db_close(&dbhandle);
arcan_db_set_shared(NULL);
arcan_conductor_reset_count(true);

dbhandle = arcan_db_open(dbfname, arcan_appl_id());
if (!dbhandle)
Expand Down Expand Up @@ -669,6 +671,7 @@ int MAIN_REDIR(int argc, char* argv[])
goto error;
}

arcan_conductor_reset_count(true);
arcan_event_maskall(evctx);
arcan_video_recoverexternal(true, &saved, &truncated, NULL, NULL);
arcan_event_clearmask(evctx);
Expand Down

0 comments on commit e026285

Please sign in to comment.