Berkeley AI Research project on RL modeling of exploration and exploitation trade-offs between children and adults.

explore-vs-exploit-gopniklab

Project: Computational Modeling of Approach-Avoid Task with Reinforcement Learning Frameworks

Overview

Repository for Kai Hung's research project under the UC Berkeley NSF SUPERB REU program in Summer 2022. Kai was fortunate to be mentored by Eunice Yiu and Dr. Alison Gopnik. Credit to Fei Dai for contributing her code and ideas for capturing the 1-D and 2-D learning-preference trade-offs.

Abstract: State-of-the-art deep and reinforcement learning algorithms have made remarkable progress on pattern recognition and decision-making problems, at the cost of large amounts of compute and/or curated data, yet their ability to generalize quickly and reliably remains poor relative to that of an average human child. To understand how children are able to gather information and learn so much from so little, we focus on computationally modeling children's decision-making in an approach-avoid paradigm: children can opt to approach a stimulus, which may be rewarding or punishing, or they can opt to avoid it and learn nothing about whether the stimulus is rewarding or punishing. Specifically, we perform parameter estimation by fitting experimental data with variants of a standard reinforcement learning model whose parameters include a learning rate and an inverse temperature. Contrasting children's best-fit model parameters with adults', we find that children are more exploratory (lower inverse temperature) and less affected by negative external rewards (smaller negative learning rate), yet more capable of inferring the correct two-dimensional decision rule for maximizing net external reward.
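
For readers unfamiliar with this model family, the textbook form combines a delta-rule value update with a softmax choice rule; the learning rate α and inverse temperature β below are the parameters referred to above, though the exact variants fitted in the notebook may include additional parameters.

```latex
% Generic delta-rule + softmax model (a reference form, not necessarily the exact variant fitted here).
% Value update after approaching the stimulus and observing reward r_t:
Q_{t+1}(\text{approach}) = Q_t(\text{approach}) + \alpha \,\bigl(r_t - Q_t(\text{approach})\bigr)
% Softmax choice rule with inverse temperature \beta:
P_t(\text{approach}) = \frac{1}{1 + e^{-\beta \, Q_t(\text{approach})}}
```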

The experimental data for this project originated from a study conducted by Dr. Emily Liquin and Dr. Alison Gopnik: https://www.sciencedirect.com/science/article/pii/S0010027721003632

Instructions

This repository is organized around the core pipeline illustrated in the poster and technical-talk files, which lives in reinforcement_learning.ipynb. Model-related functions are stored in the models folder, split into generating_models.py and likelihood_models.py: the former holds the functions used to generate data given a model and its parameters, and the latter holds the functions used to estimate the parameters given a model and data.
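
As a rough illustration of that split (a minimal sketch with invented function names, not the repository's actual code), a generating function and its matching likelihood function for a simple delta-rule + softmax model might look like:

```python
# Illustrative generating/likelihood pair for a delta-rule + softmax model.
# The real functions live in models/generating_models.py and
# models/likelihood_models.py and likely differ in signature and detail.
import numpy as np

def sim_rw_softmax(alpha, beta, rewards, q0=0.0):
    """Generate approach/avoid choices from a Rescorla-Wagner + softmax model."""
    q = q0                      # value estimate of approaching the stimulus
    choices = []
    for r in rewards:           # one pre-determined reward outcome per trial
        p_approach = 1.0 / (1.0 + np.exp(-beta * q))
        c = np.random.rand() < p_approach
        choices.append(int(c))
        if c:                   # value only updates when the stimulus is approached
            q += alpha * (r - q)
    return np.array(choices)

def nll_rw_softmax(params, choices, rewards, q0=0.0):
    """Negative log-likelihood of observed choices under the same model."""
    alpha, beta = params
    q, nll = q0, 0.0
    for c, r in zip(choices, rewards):
        p_approach = 1.0 / (1.0 + np.exp(-beta * q))
        p_approach = np.clip(p_approach, 1e-12, 1 - 1e-12)
        nll -= np.log(p_approach if c else 1.0 - p_approach)
        if c:
            q += alpha * (r - q)
    return nll
```

In practice such a likelihood would be minimized numerically (e.g., with scipy.optimize.minimize over several random restarts), which is the role of the parameter-estimation code driven by reinforcement_learning.ipynb.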

Contributors are advised to review the contents of reinforcement_learning.ipynb to understand the overall pipeline before adding or modifying generating and parameter-estimation (likelihood) functions. To avoid confusion, at the start one should largely ignore all scripts other than reinforcement_learning.ipynb, the models folder, and helpers.py.

Here is a detailed breakdown of the files...

  • models/
    • generating_models.py - a script storing functions to generate data
    • likelihood_models.py - a script storing functions to estimate parameters
  • Computational Modeling for Approach-Avoid Task with Reinforcement Learning Frameworks .pptx - final poster for this project
  • Study3_AAData_Adults.csv - the dataset for adults
  • Study3_AAData_Kids.csv - the dataset for kids
  • Technical Talk - Kai Hung.pptx - final slide presentation for this project
  • Variable_Key.xlsx - a key for the variable labels in the above two datasets
  • additional.py - a script containing commented-out code for additional analysis; it should be copied and pasted into reinforcement_learning.ipynb cells to be run
  • code_optimization.ipynb - a script used to debug inefficient code
  • data_exploration.ipynb - a script used to perform exploratory data analysis
  • helpers.py - a script containing non-model helper methods for reinforcement_learning.ipynb
  • modeling_tutorial.ipynb - a script modeled after Dr. Anne Collins' computational modeling workflow
  • reinforcement_learning.ipynb - the main script of this project, where the entire project workflow is conducted
  • rl_model.ipynb - a script from Fei Dai, containing attempts to model the data with a Bayesian framework; largely incomplete

The overall workflow within reinforcement_learning.ipynb is as follows:

  1. Scroll to the third code cell, whose first line reads "Initialize a vector to store the bestllh...", and confirm the number of models you want to use.
  2. Scroll to the "Experiments: Parameter Recovery" section and perform parameter recovery for all of the models. Their generating and likelihood functions should already be in the corresponding models folder.
  3. Scroll to the "Model-Fitting on Experimental Data" section and perform model fitting using the fit_model() function.
  4. Scroll to the "Model Comparisons" section and follow its workflow to ensure, via the confusion matrix, that each model is individually identifiable. Make sure that save=True in the fit_model() calls of the previous section; otherwise the global model_info variable will not hold the correct values for this section.
  5. Scroll to the "Model Simulation" section and manually enter the specs associated with the best-fit models for each age group (hint: search "TODO"). WARNING: all of the plot functions in this section take optional save and save_path parameters, which must both be removed from the function call if you do not wish to save the resulting plots. You may also need to create an "outputs" folder in this directory for the section to run properly with saving enabled.
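
To make step 2 concrete, parameter recovery amounts to simulating data from known parameters, re-fitting the model, and checking agreement. The sketch below reuses the illustrative sim_rw_softmax/nll_rw_softmax functions from earlier and is only an approximation of what fit_model() does in the notebook:

```python
# Illustrative parameter-recovery loop (assumes the sim_rw_softmax and
# nll_rw_softmax sketches above; the notebook's fit_model() is more general).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
true_params, recovered = [], []

for _ in range(50):                                   # 50 simulated "participants"
    alpha, beta = rng.uniform(0.05, 0.95), rng.uniform(0.5, 10.0)
    rewards = rng.choice([-1.0, 1.0], size=100)       # random reward schedule
    choices = sim_rw_softmax(alpha, beta, rewards)

    # Fit with a few random restarts to avoid local minima.
    best = None
    for _ in range(10):
        x0 = [rng.uniform(0.05, 0.95), rng.uniform(0.5, 10.0)]
        res = minimize(nll_rw_softmax, x0, args=(choices, rewards),
                       bounds=[(1e-3, 1.0), (1e-3, 20.0)])
        if best is None or res.fun < best.fun:
            best = res

    true_params.append((alpha, beta))
    recovered.append(tuple(best.x))

# A model "recovers" well if true and recovered parameters correlate strongly.
```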

Future Directions and Ideas

  • The "discount" factor (which really is a tuning parameter on reward perception, and not how the phrase "discount" is typically used in RL and economics) showed promising fit. So, it is very plausible that kids are not treating initial exposure to negative stimuli as purely negative. In fact, it is likely that they may be curious (hence there is an intrinsic reward to better understanding the reward distribution). I imagine that this could both be modeled in a flat intrinsic reward as a function of observation, or through much complicated procedure.
  • It may also be interesting to examine the extent of conforming to one-dimensional rule vs. two-dimensional rule between children and adults. We could potentially draw inspiration from the concept of "interaction" in classical linear regression to construct a model with similar components accounting for both the Q function input space of patterns, colors, and the object identities themselves.
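
As a loose sketch of the first idea, a flat intrinsic reward could enter the value update as a fixed curiosity bonus added to the external reward on every observed trial (the parameter name r_intr below is hypothetical and not part of the current codebase):

```python
# Hypothetical "flat intrinsic reward" variant of the delta-rule update:
# every observation of the stimulus yields a fixed curiosity bonus r_intr
# on top of the external reward r_external. (Illustrative only.)
def rw_update_with_curiosity(q, r_external, alpha, r_intr):
    return q + alpha * ((r_external + r_intr) - q)
```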
