This is a sample implementation of our automated verbal reports analysis framework. The framework allows you to conduct your jsPsych study whilst recording the participants. The analyses can then be replicated by executing our data pipeline. We also provide a more detailed guide here.
- Automated Analysis of Verbal Reports
You can choose between a setup for development and a setup for production. We recommend testing and developing your study locally on your own computer. Once everything works, you can run the production setup on the server.
You can obtain the code in one of the following ways:
First, check whether Git is already installed. Open your terminal and type:
git --version
If Git is not installed, install it first. Then navigate to a folder of your choice and run:
git clone https://github.com/tehillamo/AutoV-LLM.git
Alternatively, navigate in your browser to our GitHub repository https://github.com/tehillamo/AutoV-LLM, click the green `<> Code` button, and click `Download ZIP`. After that, unzip the file into a folder of your choice.
- npm (>= 9.5.1), Node.js (>= 18.16.1), Python (>= 3.10)
- Open your terminal and navigate to our framework folder
- Run `cd AUTOV-LLM/webserver` to step into the webserver folder
- Run `npm install` to install all dependencies
- Run `npm run dev` to start the webserver
- The webserver is then available in your browser under `http://localhost:8000`
Please install the necessary CUDA backend if you plan to use CUDA. The code is based on CUDA 12.6.
1. Open your terminal and navigate to our framework folder
2. Run `cd AUTOV-LLM/data_pipeline` to step into the data pipeline folder
3. We recommend creating a virtual environment with `python -m venv venv`. If you do not want to do that, skip to step 5
4. Start the virtual environment with `source venv/bin/activate`
5. Run `pip install -r requirements.txt` to install the Python dependencies
6. After that, run `pip install whisperx==3.4.2` and then `pip install numpy==1.26.4`, in this order, to avoid version conflicts
7. Install ffmpeg (required version >= 4.1 and <= 4.4)
   - macOS: `brew install ffmpeg@4` and `brew link ffmpeg@4`
8. Start the corresponding script using `python -u scripts.py`
- Install Docker and Docker Compose
- Open your terminal and navigate to our framework folder
- Run `cd AUTOV-LLM/webserver` to step into the webserver folder
- Run `docker compose -f docker-compose-prod.yaml up --build -d` to start the Docker container
- The webserver is then available in your browser under `http://localhost:8080`
- To stop the container, use `docker compose stop`
- Open your terminal and navigate to our framework folder
- Run `cd AUTOV-LLM/data_pipeline` to step into the data pipeline folder
- Run `docker compose up` to start the Docker container and run the Whisper model on the CPU
  - Use `docker compose -f docker-compose-gpu.yaml up` for a speedup if you have a CUDA-compatible GPU
- To stop the container, use `docker compose stop`
We use the jsPsych framework to run our study. You can specify the different trials in `/webserver/public/src/index.js`. To make use of the recordings, you need to specify the first and last trial as well as some callback functions. To track the trials, jsPsych must be initialized with trigger functions: `onTrialStartRecording(trial);` starts the recording of a trial, while `onTrialFinishRecording(trial);` ends it and sends the recording to the server. When the study has finished, we send the data to the server using `sendData();`.
var jsPsych = initJsPsych({
  on_trial_start: function(trial) {
    onTrialStartRecording(trial);
  },
  on_trial_finish: function(trial) {
    onTrialFinishRecording(trial);
  },
  on_finish: function() {
    sendData(jsPsych.data.allData.trials);
  }
});
Then we specify our trials. The first and last trial should both be of type `jsPsychSpeechRecording`. The first trial indicates the start (`start: true`), and the last trial specifies the end (`start: false`). The actual study trials can be declared between the two recording trials; just wrap the `jsPsychSpeechRecording` trials around your study trials. If you need other plugins, import them in `/webserver/public/index.html`.
const trials = [
  {
    type: jsPsychSpeechRecording,
    start: true
  },
  // add your study trials here
  {
    type: jsPsychSpeechRecording,
    start: false
  }
];
After that, we can run the study.
jsPsych.run(trials);
The implementation of the `jsPsychSpeechRecording` plugin can be found in `/webserver/public/jspsych/dist/plugin-recording.js`. Note that you must include this script in the HTML file. Our memory task study can be found in `/webserver/public/index.html`, and a template for integrating your own study is provided in `/webserver/public/index_template.html`.
All recordings and the trial data are saved by default to the `ressources` folder. Each participant has a unique, random, and anonymous ID, and for each participant a new folder is created inside `ressources`. In this folder you can find the recordings and the behavioral data from the study.
We offer various scripts to automatically assess verbal report recordings. In the following sections, we describe what each script does and how to use it. Our pipeline is built in a modular fashion so that additional scripts can be integrated easily; the machine learning models used can also be swapped out. If you add scripts, feel free to open a pull request so that we can include them in our repository.
This script transcribes the recordings into text. We use OpenAI's Whisper model [1] for this, but we also implemented other options (see config options). Since automated speech recognition (ASR) models differ in their initialization and execution, custom initialization and execution functions are required for each model. The code contains examples for Whisper, WhisperX, and NVIDIA Parakeet, which should give a good overview of how to incorporate new ASR models.
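To illustrate what such an execution function boils down to, here is a minimal sketch using the openai-whisper package; the model size and file name are placeholders, not the pipeline's actual defaults.

```python
# Minimal transcription sketch with the openai-whisper package.
# "base" and "recording.wav" are illustrative placeholders, not the
# defaults used by our data pipeline.
import whisper

model = whisper.load_model("base")           # load a pretrained Whisper checkpoint
result = model.transcribe("recording.wav")   # run ASR on a single recording
print(result["text"])                        # the transcribed text
```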
We use Sentence-BERT [2] to obtain embeddings for each transcribed verbal report.
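For reference, such embeddings can be computed with the sentence-transformers library roughly as sketched below; the model name is only an example, since the actual model can be configured via `bert_finetuned_model` (see the config table).

```python
# Sketch of computing sentence embeddings with sentence-transformers.
# "all-MiniLM-L6-v2" is an example model; the pipeline lets you configure
# the model via bert_finetuned_model in the config file.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
reports = [
    "I grouped the words into categories.",
    "I repeated the list in my head.",
]
embeddings = model.encode(reports)   # one vector per transcribed report
print(embeddings.shape)              # (number_of_reports, embedding_dimension)
```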
We implemented three techniques to reduce the dimensionality (see the sketch after this list):
- PCA
- t-SNE
- combination of PCA and t-SNE
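The sketch below shows what the three options amount to with scikit-learn; the target dimension and the intermediate 50 PCA components for the combined variant mirror the output description further down and are read from the config in the actual pipeline.

```python
# Sketch of the three reduction options using scikit-learn.
# The target dimension (2) and the intermediate 50 PCA components for the
# combined variant are illustrative; the pipeline reads them from the config.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

embeddings = np.random.rand(100, 384)   # stand-in for Sentence-BERT embeddings

reduced_pca = PCA(n_components=2).fit_transform(embeddings)
reduced_tsne = TSNE(n_components=2).fit_transform(embeddings)
# Combined variant: PCA down to 50 dimensions first, then t-SNE.
reduced_both = TSNE(n_components=2).fit_transform(
    PCA(n_components=50).fit_transform(embeddings)
)
```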
We use a zero-shot classification algorithm to find the most probable text label for a given text.
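As an illustration, zero-shot labelling of this kind can be done with the Hugging Face transformers pipeline as sketched below; the NLI model named here is a common choice and only an example, since the model used by our pipeline can be overridden via `zero_shot_text_finetuned_model`.

```python
# Sketch of zero-shot text labelling with the transformers pipeline.
# The model name is an example; the pipeline's model can be overridden via
# zero_shot_text_finetuned_model in the config.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
text = "I tried to remember the words by building a story around them."
labels = ["rehearsal", "visual imagery", "storytelling"]   # your text_classes
result = classifier(text, candidate_labels=labels)
print(result["labels"][0])   # most probable label for the text
```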
In the last step, we merge the data obtained from the study with the previously computed data (transcribed text, embeddings, lower-dimensional embeddings).
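Conceptually, this merge is a join of the behavioral trial data with the computed text features. The hedged sketch below assumes the join is keyed on `uuid` and `trial_number` (the identifiers in the output table); the actual script may differ, and the file names are placeholders.

```python
# Hedged sketch of the merge step: join behavioral data with computed
# text features. Keying on uuid and trial_number is an assumption based
# on the output columns; file names are placeholders.
import pandas as pd

behavior = pd.read_csv("behavioral_data.csv")         # from the jsPsych study
features = pd.read_csv("computed_text_features.csv")  # transcripts, embeddings, ...

merged = behavior.merge(features, on=["uuid", "trial_number"], how="left")
merged.to_csv("output.csv", index=False)
```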
In the config file you can specify all parameters. Note that if you change `input_path`, `output_path`, or `cache_path`, you must also change them in the docker-compose file. An illustrative example of how the parameters fit together follows the table.
Parameter | Explanation |
---|---|
input_path | Path to the CSV file |
output_path | Path where the output file should be saved to |
output_name | Name of the resulting CSV file in the output path |
transcription_model | Which ASR model to use. Possible options are Whisper, WhisperX and nvidia/parakeet-tdt-0.6b-v2. For Whisper and WhisperX use whisper-<model> and whisperx-<model> respectively. For the Nvidia model just use the model name from huggingface (nvidia/parakeet-tdt-0.6b-v2). |
behavioral_columns | A list of the behavioral columns from jsPsych which should be merged into the output file |
reduction_algorithm | Algorithm for dimensionality reduction, possible values: ("PCA", "TSNE", "both") |
dimension | Dimension to which the embedding dimension should be reduced |
transcribe_text | Perform speech to text (must provide input_path) |
post_asr_correction | Perform post-ASR correction (OpenAI API key required because it uses GPT-4) |
word_cloud | Create word cloud |
ignore_words_in_word_cloud | List of words to ignore in the word cloud |
text_classification | Perform the text labelling algorithm |
calculate_text_embeddings | Calculate the text embeddings |
dimensionality_reduction | Apply dimensionality reduction to the embeddings |
text_classes | List of text classes for the text classification algorithm |
keywords | Compute keywords |
top_n_keywords | Select top n keywords, ordered by probability (top = highest probability) |
summarize | Compute summary |
max_length_summary | Maximum length of the summary |
min_length_summary | Minimum length of the summary |
zero_shot_text_finetuned_model | Path to fine-tuned text-classification model. Leave null if you do not have a fine-tuned model. |
bert_finetuned_model | Path to fine-tuned embedding model. Leave null if you do not have a fine-tuned model. You can also specify any SentenceBERT model. |
openai_api_key | API key for OpenAI |
use_openai_prompting | Flag to use OpenAI prompting. The prompt will be the transcribed text. Note that you must provide an OpenAI API key! |
use_openai_embeddings | Flag to use OpenAI text embeddings. Note that you must provide an OpenAI API key and that calculate_text_embeddings must also be set to true |
developer_prompt | Developer prompt for the prompting script, i.e. instructions for the model |
openai_model | Specific model for prompting |
openai_embeddings_model | Specific model for embeddings |
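To show how the parameters fit together, here is an illustrative configuration expressed as a Python dict; the keys mirror the table above, while the values and the dict representation are examples only and do not reflect the actual config file's on-disk format or defaults.

```python
# Illustrative configuration values. Keys mirror the parameter table above;
# the values and the representation as a Python dict are examples only and
# do not reflect the actual config file's format or defaults.
example_config = {
    "input_path": "path/to/input.csv",
    "output_path": "output/",
    "output_name": "output.csv",
    "transcription_model": "whisper-base",
    "behavioral_columns": ["trial_type", "rt", "response"],
    "transcribe_text": True,
    "calculate_text_embeddings": True,
    "dimensionality_reduction": True,
    "reduction_algorithm": "both",
    "dimension": 2,
    "text_classification": True,
    "text_classes": ["rehearsal", "visual imagery", "storytelling"],
    "keywords": True,
    "top_n_keywords": 5,
    "summarize": False,
}
```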
You can find the output of the evaluation script in `/output/`. The script produces a CSV file; a short example of loading it follows the table below.
CSV column | Explanation |
---|---|
uuid | UUID of each participant (unique) |
trial_number | Trial number of the study |
transcribed_text | Transcribed text from the recording |
embedding | Embedding obtained from SentenceBERT |
embedding_reduced_pca | Lower dimensional embedding (PCA) |
embedding_reduced_tsne | Lower dimensional embedding (t-SNE) |
embedding_reduced_both | Lower dimensional embedding (first PCA to 50 dimensions, then t-SNE) |
keywords | Keywords of transcribed text |
summary | Summary of transcribed text |
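A quick way to inspect the resulting CSV in Python (the path assumes the default `/output/` location and an example `output_name`; if the embedding columns are stored as stringified lists, they may need additional parsing):

```python
# Load and inspect the pipeline output. The file name comes from output_name
# in the config; "output.csv" is just an example.
import pandas as pd

df = pd.read_csv("output/output.csv")
print(df.columns.tolist())   # uuid, trial_number, transcribed_text, ...
print(df[["uuid", "trial_number", "transcribed_text"]].head())
```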
If you use our framework in your studies or research, feel free to cite our work.
...
[1] Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2023). Robust Speech Recognition via Large-Scale Weak Supervision. Proceedings of the 40th International Conference on Machine Learning, PMLR 202:28492-28518. Available from https://proceedings.mlr.press/v202/radford23a.html.
[2] Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv preprint arXiv:1908.10084.