Modeling Grocery Prices In Seven Countries Using Bayesian Statistics

Motivation

One of the advantages of being a Minerva Student is the opportunity to have quick access to fellow students from all around the world. This gives us a chance to broaden our horizons and apply our knowledge in various contexts and using a great variety of data.

My model would help answer questions like:

To what extent do grocery prices vary by country and store brand?
Are grocery prices and the geographical distribution correlated with other cost-of-living measures?

I attempt to address these questions by utilizing Bayesian Statistics and PyStan functionality available in Python. Model setup, as well as conclusions, are described below.

About dataset

total of 64 observations were collected during November 2020. The raw data included data from 7 countries, namely Germany, the United Kingdom, the United States of America, Morocco, Vietnam, South Korea, and Guatemala. Each country has from 2 (Guatemala, South Korea) to 22 (Germany) observations. There are 61 unique stores and 1 store with duplicate stores (located in San Francisco) in our dataset.
for the following products: apples, bananas, tomatoes, potatoes, flour, rice, milk, butter, eggs, chicken breasts
pre-processing:
- a total of 34 observations had some missing values. The values were ‘missing at random’ (MAR), which implies that the propensity for a missing data point is related to some of the observed data, such as the type of the store (e.g. certain store did not have a lot of banana brands). It is often suggested to simply drop NAs; however, doing so would result in the deletion of 34 rows and is not suggested for such a small dataset and MAR. Given the above, I decided to replace some values with the mean value of the product. For computing the mean, I differentiated between the country, product, and store type.

Modeling

the basic idea of the model is that each type of product has a base price, with multipliers depending on store brand and geographical location.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
Bayesian Modeling of the World Grocery Prices.ipynb		Bayesian Modeling of the World Grocery Prices.ipynb
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bayesian Modeling of the World Grocery Prices.ipynb

Bayesian Modeling of the World Grocery Prices.ipynb

readme.md

readme.md

Repository files navigation

Modeling Grocery Prices In Seven Countries Using Bayesian Statistics

Motivation

About dataset

Modeling

Results (for more, check ipynb)

What is bayesian stats?

About

Releases

Packages

Languages

svitlanamidianko/what-factors-influence-grocery-prices-bayesian-stat-approach

Folders and files

Latest commit

History

Bayesian Modeling of the World Grocery Prices.ipynb

Bayesian Modeling of the World Grocery Prices.ipynb

readme.md

readme.md

Repository files navigation

Modeling Grocery Prices In Seven Countries Using Bayesian Statistics

Motivation

About dataset

Modeling

Results (for more, check ipynb)

What is bayesian stats?

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages