Cannot parse/unpack Stroop Task Data Embedded Into Qualtrics #209

Open
meyb9 opened this issue Jul 11, 2023 · 15 comments

@meyb9

meyb9 commented Jul 11, 2023

Hi, I have used the lab.js Stroop task to collect pre-test and post-test data.
I have successfully collected the data and stored it using the instructions provided.
However, I now cannot parse/unpack the two columns of data in R, and each contains hundreds of lines.
I have tried using the code provided in the general lab.js documentation, which was this:

# This code relies on the pacman, tidyverse and jsonlite packages
require(pacman)
p_load('tidyverse', 'jsonlite')

# We're going to assume that the data coming from
# the third-party tool has been loaded into R,
# for example from a CSV file.
data_raw <- read_csv('raw_data_from_external_tool.csv')

# Please also check that any extraneous data that
# an external tool might introduce are stripped
# before the following steps. For example, Qualtrics
# introduces two extra rows of metadata after the
# header. Un-commenting the following command removes
# this line and re-checks all column data types.
#data_raw <- data_raw[-c(1, 2),] %>% type_convert()

# One of the columns in this file contains the
# JSON-encoded data from lab.js
labjs_column <- 'labjs-data'

# Unpack the JSON data and discard the compressed version
data_raw %>%
  # Provide a fallback for missing data
  mutate(
    !!labjs_column := recode(.[[labjs_column]], .missing='[{}]')
  ) %>%
  # Expand JSON-encoded data per participant
  group_by_all() %>%
  do(
    fromJSON(.[[labjs_column]], flatten=T)
  ) %>%
  ungroup() %>%
  # Remove column containing raw JSON
  select(-matches(labjs_column)) -> data

# The resulting dataset, available via the 'data'
# variable, now contains both the experimental
# data collected by lab.js, as well as any other
# columns introduced by the software that collected
# the data. Values from the latter are repeated
# to fill added rows.

# As a final step, you might want to save the
# resulting long-form dataset
#write_csv(data, 'labjs_data_output.csv')

However, the code does not work at all; even though I tried to make modifications, I cannot parse the data.
Could someone please provide me with assistance?

@mcfarla9

Could you provide more detail about what you mean by "the code does not work at all"? Do you get any error messages? What specific actions did you do, and what behavior did you observe?

Also, could you provide your data file as an example for troubleshooting? I have never handled any lab.js data from Qualtrics, but I do have some experience unpacking lab.js data using R and I might like to take a crack at this for my own amusement. Could you also provide a link to the instructions that you were following. Thanks.

@meyb9
Author

meyb9 commented Jul 11, 2023

Hi, what I mean is that after following this code snippet (and many, many more modifications of it using ChatGPT):

# This code relies on the pacman, tidyverse and jsonlite packages
require(pacman)
p_load('tidyverse', 'jsonlite')

# We're going to assume that the data coming from
# the third-party tool has been loaded into R,
# for example from a CSV file.
data_raw <- read_csv('raw_data_from_external_tool.csv')

# Please also check that any extraneous data that
# an external tool might introduce are stripped
# before the following steps. For example, Qualtrics
# introduces two extra rows of metadata after the
# header. Un-commenting the following command removes
# this line and re-checks all column data types.
#data_raw <- data_raw[-c(1, 2),] %>% type_convert()

# One of the columns in this file contains the
# JSON-encoded data from lab.js
labjs_column <- 'labjs-data'

# Unpack the JSON data and discard the compressed version
data_raw %>%
  # Provide a fallback for missing data
  mutate(
    !!labjs_column := recode(.[[labjs_column]], .missing='[{}]')
  ) %>%
  # Expand JSON-encoded data per participant
  group_by_all() %>%
  do(
    fromJSON(.[[labjs_column]], flatten=T)
  ) %>%
  ungroup() %>%
  # Remove column containing raw JSON
  select(-matches(labjs_column)) -> data

# The resulting dataset, available via the 'data'
# variable, now contains both the experimental
# data collected by lab.js, as well as any other
# columns introduced by the software that collected
# the data. Values from the latter are repeated
# to fill added rows.

# As a final step, you might want to save the
# resulting long-form dataset
#write_csv(data, 'labjs_data_output.csv')

R still does not recognise the 2 columns of lab.js data I have which are labjs-data and labjs2-data. The main error it gives is that the columns do not exist.
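
One quick way to see how R has named the columns after import is to list them directly. The following is only a minimal sketch: the tiny `data_raw` below is a made-up stand-in for the real export, and `ResponseId` is an assumed column name used purely for illustration.

```r
# Hypothetical stand-in for the imported Qualtrics export
data_raw <- data.frame(
  ResponseId = c("R_1", "R_2"),
  `labjs-data` = c("[{}]", "[{}]"),
  `labjs2-data` = c("[{}]", "[{}]"),
  check.names = FALSE,       # keep the hyphens in the column names
  stringsAsFactors = FALSE
)

# List every column whose name mentions "labjs"
labjs_columns <- grep("labjs", names(data_raw), value = TRUE)
print(labjs_columns)

# Names containing hyphens need [[ ]] (or backticks), not a bare $name
first_cell <- data_raw[["labjs-data"]][1]
```

If the expected names are missing from the printed list, the import step has probably altered them. Base R's `read.csv()`, for example, rewrites hyphens to dots unless `check.names = FALSE` is passed.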

This was the last version I tried (this time I downloaded my Qualtrics data as Excel because the columns labjs-data and labjs2-data were clearly defined). Still no luck:

library(readxl)
library(dplyr)
library(jsonlite)
library(tidyr)
library(readr)

# Read the Excel file
data_raw <- read_excel("mey_experiment.xlsx")
data_raw <- data_raw[-c(1, 2),] %>% type_convert()

# One of the columns in this file contains the
# JSON-encoded data from lab.js
labjs_column <- c('labjs-data','labjs2-data')

# Unpack the JSON data and discard the compressed version
data_raw %>%
  # Provide a fallback for missing data
  mutate(
    !!labjs_column := recode(.[[labjs_column]], .missing='[{}]')
  ) %>%
  # Expand JSON-encoded data per participant
  group_by_all() %>%
  do(
    fromJSON(.[[labjs_column]], flatten=T)
  ) %>%
  ungroup() %>%
  # Remove column containing raw JSON
  select(-matches(labjs_column)) -> data

# The resulting dataset, available via the 'data'
# variable, now contains both the experimental
# data collected by lab.js, as well as any other
# columns introduced by the software that collected
# the data. Values from the latter are repeated
# to fill added rows.

# Save the resulting dataset
write_csv(data, 'study1_data_output.csv')

This was the error:

Error in quos():
! The LHS of := must be a string, not a character vector.
Backtrace:
  1. ... %>% select(-matches(labjs_column))
  2. rlang::abort(message = message)
@mcfarla9

OK. In addition to providing the code snips, could you provide a link to the page where you get the code so that I can get a broader view?

Also, could you upload the problem data file, or maybe the partially unpacked file at the point where you find that R does not recognize the 2 columns of lab.js data? I need something specific to test & work on. Even better if you can upload a reduced set of data that exhibits the problem. Thanks.

@meyb9
Author

meyb9 commented Jul 11, 2023

Hi, of course, here is the webpage: https://labjs.readthedocs.io/en/latest/learn/deploy/3-third-party.html#tutorial-deploy-third-party-postprocessing

I will have to ask my supervisor to share the data and get back to you!
It is hundreds of lines of JSON-type data per person, stored as truncated strings in each cell (that's how it appears in Excel). I suspect that some may also be missing the closing brackets at the end, which is weird for downloaded data.

@mcfarla9

Thanks for the link, that reads much better and provides some needed context.

As I said, I have never unpacked lab.js data from Qualtrics. We run lab.js studies from our own server using the PHP exported versions, and the data end up in a data.sqlite file which we then export to .csv using a utility that I wrote (originally used the utility hosted at https://felixhenninger.shinyapps.io/labjs-export/). What sort of files do you get from your Qualtrics-based study -- .sqlite, .csv, etc.?

How much experience do you have with R? Basically, when I used R to build my own exporting utility I went in to the R command line and explored sample data files (that I created) step by step in order to understand the data structure, and then used the knowledge gained to help design my utility.

@mcfarla9

Also, instead of sending a large existing body of data from your study, could you simply run a session to create a smaller example data file that I could examine?

@meyb9
Author

meyb9 commented Jul 11, 2023

Hi, you can get a lot of different file types depending on what you want. I usually get either CSV or, better, Excel, since it's easier for me to deal with manually first and then in R. As for experience, I'm afraid I only started learning the language this year, so I have limited experience with data cleaning in it. I mostly use it for statistics. If you could guide me, maybe I could create the example data file?

@mcfarla9

How much control do you have over the administration of your study? For our situation here, I would simply make a development copy of the study and then run a session starting from an empty data file. Could you do anything like that? Again, I have no experience with Qualtrics, we do things differently here and have much finer-grained control over our processes.

@mcfarla9

Actually, if it were me I would start by making a simplified "Hello World" style of study with a couple of columns with distinctive names and some distinctive data values that I could easily search for. I would do a couple runs to generate some dummy data, and then I would use that to figure out how to export my data. I would then build up from the simple to the more complex. I would never start with full-blown production data.

@meyb9
Author

meyb9 commented Jul 13, 2023

Hello! Sorry for the late response. After people on our team who are much better at R than me gave it a go, we identified the problem: there are too many characters per participant in the embedded data that we get from lab.js. Apparently, Qualtrics now has a character limit and simply truncates the data of sessions with longer trials, which means they cannot be analyzed automatically. It looks like others also had to trash their data (see https://community.qualtrics.com/custom-code-12/restriction-on-embedded-variable-length-as-of-march-11-output-getting-truncated-14320). This is extremely sad. I would appreciate any tips on another Stroop task to embed in the study that might produce fewer characters.
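
To find out which participants are affected by the truncation, each cell can be checked for complete JSON before any unpacking is attempted. This is a minimal sketch under assumptions: the `cells` vector is invented sample data standing in for one lab.js column, and the check relies on `jsonlite::validate()`.

```r
library(jsonlite)

# Hypothetical cells: the second one is cut off mid-record,
# the third one is missing entirely
cells <- c(
  '[{"trial":1,"rt":512},{"trial":2,"rt":634}]',
  '[{"trial":1,"rt":498},{"trial":2,"r',
  NA
)

# TRUE only where the cell holds complete, parseable JSON
is_complete <- vapply(cells, function(cell) {
  !is.na(cell) && as.logical(jsonlite::validate(cell))
}, logical(1), USE.NAMES = FALSE)

# Character counts help spot a common cut-off length
report <- data.frame(
  n_chars = nchar(cells),
  complete = is_complete
)
print(report)
```

Rows flagged as incomplete cannot be parsed as they stand. If many of them share the same character count, that count is likely the limit being applied.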

@mcfarla9

Wow that is a tough discovery. Thanks for reporting back with that info. Now that I look, I notice that this problem was also reported in issue #119 of this GitHub project.

Someone should now close this issue. As for finding another lab.js Stroop task, you might try asking over in the lab.js Slack channel, nmbrcrnchrs.slack.com.

@meyb9
Author

meyb9 commented Jul 13, 2023

Thank you very much for providing the Slack link! It looks like I need an invitation to join, though.

@mcfarla9

Well that is a sticky wicket, isn't it? You do need to register in order to get an invitation, but the application link at https://lab.js.org/resources/support/ is broken and has been for quite some time -- see issues #124, #200, & #208.

@meyb9
Author

meyb9 commented Jul 13, 2023

Amazing... I really hope someone cares enough to fix this. I would never have gone with lab.js or Qualtrics if I had known all our funds would be wasted over this.

@FelixHenninger
Owner

Hej you two, I'm sincerely sorry for the loss of data, and massively appreciate your discussions here. Heartfelt thanks @mcfarla9 for jumping in!

I'd like to quickly point you to the lab.js documentation page on Qualtrics, which warns about this very issue, but also points to some alternatives. This is a Qualtrics issue, and there's nothing we can do about it but warn folks away. I'm sorry this warning didn't reach you in time. Again, though, please let me stress that there are great alternatives around.

I'm going to respond to the other points in their respective issues. Again, thanks for the discussion here -- please be kind! Best,

-Felix
