Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple workers slows study #270

Open
benjybarnett opened this issue Dec 7, 2022 · 6 comments
Open

Multiple workers slows study #270

benjybarnett opened this issue Dec 7, 2022 · 6 comments

Comments

@benjybarnett
Copy link

Hi,

I'm running a fairly simple study that lasts around 20 minutes using jsPsych 6.1.0 and JATOS, hosted on mindprobe.

When I post to > 4 or so subjects on prolific, I get messages from some subjects saying that, towards the end of the experiment, there are huge delays in between stimulus presentations and response screens, or between trials etc. They never complete the study because the delays last so long (i think minutes, not seconds).

My study doesn't involve the use of images or videos, it simply uses the p5 library to display text stimuli very quickly and get responses with button presses and a simple graphical display.

I'm currently using the 'General Multiple' link for subjects. Perhaps I should use the 'General Single', but having subjects do the subject more than once has not been an issue for me.

Is this a known issue? Can you think of anything that may be wrong with my set up to cause this?

Thanks

@kristian-lange
Copy link
Member

Hi,
Can you please tell me when you run those studies on Mindprobe. I'll have a look in the logs and monitoring if I can find something. There is an issue with suboptimal database queries that can lead to JATOS answering delayed, but this shouldn't happen with only 4 subjects.
Best,
Kristian

@benjybarnett
Copy link
Author

benjybarnett commented Dec 7, 2022

Hi! Thanks for the quick response! I've attached the mindprobe information for two recent participants who returned the study towards the end because of the delays. Thanks for looking into this.
Screenshot 2022-12-07 at 2 12 22 PM
Screenshot 2022-12-07 at 2 16 25 PM

@kristian-lange
Copy link
Member

I'm looking into the logs right now. And just so I understand the issue correctly, the two study results you posted last are the one with the long delays, aren't they? And what is with the 6 you posted before? Do they have theses delays too?

@kristian-lange
Copy link
Member

So, I first looked at the study run with the result ID 190116. I post the important lines of the log further down. It all looks normal up the a certain point and then it somehow stops going on. Your study seems to have 3 components (10310, 10311, 10312). The first two components get started and each send one result data and then starts the next component. This happens quite fast (probably some intro and consent form). But somehow after the 3rd component got started nothing more happens, no more result data or files are sent, no sessions updated, and no study finished. I can see a couple of heartbeats from the study run, so I know the 3rd component is still running in some browser. But then at 12:09:25 we have the last heartbeat and we don't see this study run ever again in the logs. At the same time the server wasn't overloaded or anything that could explain a delay in the response. Actually the server had nothing to response to because there were no requests :/ . So here I'd conclude so far that it is not an JATOS or Mindprobe server issue - more an issue with the client-side. But let me look into the second study run that you posted.

2022-12-07 11:36:20,565 [INFO] - publix_access - GET /publix/3fe26b23-5baf-4f57-b8f1-f512d6eaf73c/2d1b0c46-692c-468c-82d7-18dca9863bf4/start
2022-12-07 11:36:21,126 [INFO] - c.p.PublixInterceptor - .startComponent: studyResultId 190116, componentId 10310
2022-12-07 11:36:49,870 [INFO] - publix_access - PUT /publix/3fe26b23-5baf-4f57-b8f1-f512d6eaf73c/2d1b0c46-692c-468c-82d7-18dca9863bf4/resultData
2022-12-07 11:36:51,397 [INFO] - publix_access - GET /publix/3fe26b23-5baf-4f57-b8f1-f512d6eaf73c/cb1370d2-3893-4560-9b79-adff2802f19b/start
2022-12-07 11:36:51,925 [INFO] - c.p.PublixInterceptor - .startComponent: studyResultId 190116, componentId 10311
2022-12-07 11:37:11,072 [INFO] - publix_access - PUT /publix/3fe26b23-5baf-4f57-b8f1-f512d6eaf73c/cb1370d2-3893-4560-9b79-adff2802f19b/resultData
2022-12-07 11:37:12,217 [INFO] - publix_access - GET /publix/3fe26b23-5baf-4f57-b8f1-f512d6eaf73c/1e5cb2f0-a5e9-4ba2-af59-5e53d18eb8d2/start
2022-12-07 11:37:12,731 [INFO] - c.p.PublixInterceptor - .startComponent: studyResultId 190116, componentId 10312

@kristian-lange
Copy link
Member

And I looked into the study run with the result ID 185001. It behaved mostly like the previous one. Again I post the important log lines further down. The first and second component start and send result data in the end. But then after the 3rd component got started nothing further happens. This time the last heartbeat is at 16:57:57 (heartbeats are send every 2 min from the browser to the server as long as the tab is open). I can't say if the JATOS server was under heavy load at this point because I have no monitoring data from the Nov 23rd but there weren't even any requests from the client-side that JATOS could respond to. So I think the cause for those issues your participants experienced is on the client-side, in your experiment's code.

2022-11-23 16:03:28,735 [INFO] - c.p.w.GeneralMultiplePublix - .startStudy: studyCode FpMDmu1WzCi, studyResultId185001, studyId 7561, batchId 8742, workerId 237139
2022-11-23 16:03:30,166 [INFO] - publix_access - GET /publix/04548b96-f1e9-4ae5-8176-5e2043a21f71/2d1b0c46-692c-468c-82d7-18dca9863bf4/start
2022-11-23 16:03:30,615 [INFO] - c.p.PublixInterceptor - .startComponent: studyResultId 185001, componentId 10310
2022-11-23 16:03:38,684 [INFO] - publix_access - PUT /publix/04548b96-f1e9-4ae5-8176-5e2043a21f71/2d1b0c46-692c-468c-82d7-18dca9863bf4/resultData
2022-11-23 16:03:39,257 [INFO] - c.p.PublixInterceptor - .submitOrAppendResultData: studyResultId 185001, componentId 10310
2022-11-23 16:03:41,140 [INFO] - c.p.PublixInterceptor - .startComponent: studyResultId 185001, componentId 10311
2022-11-23 16:04:39,143 [INFO] - publix_access - PUT /publix/04548b96-f1e9-4ae5-8176-5e2043a21f71/cb1370d2-3893-4560-9b79-adff2802f19b/resultData
2022-11-23 16:04:39,666 [INFO] - c.p.PublixInterceptor - .submitOrAppendResultData: studyResultId 185001, componentId 10311
2022-11-23 16:04:41,486 [INFO] - c.p.PublixInterceptor - .startComponent: studyResultId 185001, componentId 10312

@benjybarnett
Copy link
Author

Thanks for digging into this. Yes you're right the first two components are just consent forms. I only save my data at the end of the study, so perhaps that is why you see nothing from the third component - because nothing is saved while they are doing the study and as they dont finish it it is not saved at all. If there was no overwhelming of the server then I agree it's probably not a jatos issue.

I'll think about this further.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants