ztree2python

ztree2python imports a data file created by z-Tree (Fischbacher, 2007) into Python as pandas dataframes. The function takes the filename of a z-Tree data file and returns a dictionary of dataframes. The keys are the names of the tables ("globals", "subjects", and so on), and the value for each key is the pandas dataframe for that table.

Installation

Use the package manager pip to install.

pip install ztree2python

Alternatively, place ztree2python.py and a z-Tree data file (e.g., 221215_1449.xls) in the working directory.

Usage

ztree2python is a single function that takes the filename of a z-Tree data file as its argument and returns a dictionary containing all of the tables in that file.

from ztree2python import ztree2python as z2p

# Pass the file name; the function returns a dictionary.
tables = z2p('221215_1449.xls')

Each table is stored in the dictionary as a dataframe. Get the data of a table as follows:

# Extract a table by name, for example, the "subjects" table.
my_table = tables['subjects']
my_table.head()

See all of the tables in tables as follows:

# The dictionary also contains a series of table names. See the list.
tables['list_tables']

# Display all of the tables.
from IPython.display import display
for name, tbl in tables.items():
  display(tbl)
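
For a quick overview before displaying everything, the table shapes can be collected in one small dataframe. The following is a minimal sketch: `summarize_tables` is a hypothetical helper, not part of ztree2python, and the dictionary is a toy stand-in with made-up values so that the block runs without a data file. It assumes that non-dataframe entries such as `list_tables` (a series) should be skipped.

```python
import pandas as pd


def summarize_tables(tables):
    """Return one row per dataframe in `tables` with its row and column counts."""
    rows = []
    for name, tbl in tables.items():
        # Skip entries that are not dataframes, e.g. the series of table names.
        if not isinstance(tbl, pd.DataFrame):
            continue
        rows.append({"table": name, "n_rows": tbl.shape[0], "n_cols": tbl.shape[1]})
    return pd.DataFrame(rows, columns=["table", "n_rows", "n_cols"])


# Toy stand-in for the dictionary returned by ztree2python (made-up values).
tables = {
    "globals": pd.DataFrame({"Period": [1, 2], "NumPeriods": [2, 2]}),
    "subjects": pd.DataFrame(
        {"Period": [1, 1, 2, 2], "Subject": [1, 2, 1, 2], "Profit": [1.0, 2.0, 3.0, 4.0]}
    ),
    "list_tables": pd.Series(["globals", "subjects"]),
}

summary = summarize_tables(tables)
print(summary)
```

With a real data file, replace the toy dictionary with the result of `z2p('221215_1449.xls')`.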

Technical notes

The function reads the data file and repeats the following process for each table name. It filters the rows of the main dataframe to keep only the rows that belong to the current table. It then processes the data treatment by treatment and creates a dataframe for each treatment. When a treatment has repeated periods, its data contain a header row of variable names for each period. The function assumes that these header rows are identical within the treatment: it reads the top header row as the variable names and then removes all of the header rows. All data are converted to numeric where possible. Finally, the dataframe for the current treatment is appended to the dataframe for the current table.

After all the tables have been processed, the function returns the dictionary of dataframes.
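
The header handling described above can be sketched roughly as follows. This is not the package's actual code: the function name `clean_treatment_block` and the toy rows are hypothetical, and the sketch assumes that the raw rows arrive as strings, that the header row is repeated verbatim, and that numeric conversion is done column by column.

```python
import pandas as pd


def clean_treatment_block(raw):
    """Turn one treatment's raw rows (all strings) into a typed dataframe.

    Assumes the first row is a header row and that every later header row
    is identical to it.
    """
    header = raw.iloc[0].tolist()
    # Mark every row that repeats the header exactly.
    is_header = raw.apply(lambda row: row.tolist() == header, axis=1)
    body = raw.loc[~is_header].copy()
    body.columns = header
    # Convert a column to numeric only if all of its values parse.
    for col in body.columns:
        try:
            body[col] = pd.to_numeric(body[col])
        except (ValueError, TypeError):
            pass  # leave non-numeric columns as strings
    return body.reset_index(drop=True)


# Toy rows: a header row appears at the start of each period.
raw = pd.DataFrame(
    [
        ["Period", "Subject", "Contribution", "Label"],
        ["1", "1", "7", "a"],
        ["1", "2", "10", "b"],
        ["Period", "Subject", "Contribution", "Label"],
        ["2", "1", "11", "a"],
        ["2", "2", "9", "b"],
    ]
)

clean = clean_treatment_block(raw)
print(clean)
print(clean.dtypes)
```

The sketch omits the "session", "treatment", and "table" columns that appear in the real output.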

License

MIT. ztree2python is provided "as is", without warranty of any kind.

Reference

Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10(2), 171-178. https://doi.org/10.1007/s10683-006-9159-4.

Takeuchi, K. (2022). ztree2python.py. http://github.com/takekan/ztree2python. (https://sites.google.com/view/takekan/research/ztree2python)

Takeuchi, K. (forthcoming). ztree2stata: A data converter for z-Tree and Stata users. Journal of the Economic Science Association.

I would appreciate it if you would kindly mention this code in a footnote or elsewhere.

Acknowledgment

Takeuchi thanks Zhewei Song for the helpful feedback.

Sample data

from ztree2python import ztree2python as ztree2python
tables = ztree2python('221215_1449.xls')
# Extract a table by name, for example the "globals" table.
df = tables['globals']
df
session treatment table Period NumPeriods TreatmentNumber TreatmentID RepeatTreatment Repetition
0 221215_1449 1 globals 1 5 1 1 0 1
1 221215_1449 1 globals 2 5 1 1 0 1
2 221215_1449 1 globals 3 5 1 1 0 1
3 221215_1449 1 globals 4 5 1 1 0 1
4 221215_1449 1 globals 5 5 1 1 0 1
# Display all the tables, followed by the series of table names.
from IPython.display import display
for name, tbl in tables.items():
    display(tbl)
session treatment table Period NumPeriods TreatmentNumber TreatmentID RepeatTreatment Repetition
0 221215_1449 1 globals 1 5 1 1 0 1
1 221215_1449 1 globals 2 5 1 1 0 1
2 221215_1449 1 globals 3 5 1 1 0 1
3 221215_1449 1 globals 4 5 1 1 0 1
4 221215_1449 1 globals 5 5 1 1 0 1
session treatment table Period Subject ClientNumber LastClientNumber Group Profit TotalProfit Participate EfficiencyFactor Endowment Contribution TimeOKContributionEntryOK SumC N TimeContinueProfitDisplayOK LeaveStage
0 221215_1449 1 subjects 1 1 1 1 1 26.6 26.6 1 1.6 20 7 -41 17 2 30 0
1 221215_1449 1 subjects 1 2 2 2 1 23.6 23.6 1 1.6 20 10 -36 17 2 28 0
2 221215_1449 1 subjects 1 3 3 3 2 31.4 31.4 1 1.6 20 15 -30 33 2 27 0
3 221215_1449 1 subjects 1 4 4 4 2 28.4 28.4 1 1.6 20 18 11 33 2 25 0
4 221215_1449 1 subjects 2 1 1 1 1 25.0 51.6 1 1.6 20 11 9 20 2 99999 0
5 221215_1449 1 subjects 2 2 2 2 1 27.0 50.6 1 1.6 20 9 13 20 2 29 0
6 221215_1449 1 subjects 2 3 3 3 2 30.6 62.0 1 1.6 20 15 22 32 2 26 0
7 221215_1449 1 subjects 2 4 4 4 2 28.6 57.0 1 1.6 20 17 17 32 2 28 0
8 221215_1449 1 subjects 3 1 1 1 1 20.4 72.0 1 1.6 20 18 9 23 2 30 0
9 221215_1449 1 subjects 3 2 2 2 1 33.4 84.0 1 1.6 20 5 13 23 2 28 0
10 221215_1449 1 subjects 3 3 3 3 2 31.4 93.4 1 1.6 20 11 16 28 2 27 0
11 221215_1449 1 subjects 3 4 4 4 2 25.4 82.4 1 1.6 20 17 21 28 2 25 0
12 221215_1449 1 subjects 4 1 1 1 1 27.4 99.4 1 1.6 20 3 7 13 2 99999 0
13 221215_1449 1 subjects 4 2 2 2 1 20.4 104.4 1 1.6 20 10 13 13 2 25 0
14 221215_1449 1 subjects 4 3 3 3 2 21.2 114.6 1 1.6 20 10 19 14 2 27 0
15 221215_1449 1 subjects 4 4 4 4 2 27.2 109.6 1 1.6 20 4 10 14 2 29 0
16 221215_1449 1 subjects 5 1 1 1 1 22.0 121.4 1 1.6 20 10 0 15 2 28 0
17 221215_1449 1 subjects 5 2 2 2 1 27.0 131.4 1 1.6 20 5 -3 15 2 30 0
18 221215_1449 1 subjects 5 3 3 3 2 19.4 134.0 1 1.6 20 3 9 3 2 25 0
19 221215_1449 1 subjects 5 4 4 4 2 22.4 132.0 1 1.6 20 0 15 3 2 26 0
session treatment table Period
session treatment table Period
0 221215_1449 1 summary 1
1 221215_1449 1 summary 2
2 221215_1449 1 summary 3
3 221215_1449 1 summary 4
4 221215_1449 1 summary 5
session treatment table ClientNumber ClientName TreatmentNumber Subject StateString RemainingSeconds FinalProfit MoneyAdded ShowUpFee ShowUpFeeInvested MoneyToPay MoneyEarned LastClientNumber
0 221215_1449 1 clients 1 4 1 1 - Profit Display - 25 121.4 0 0 0 121.4 121.4 0
1 221215_1449 1 clients 2 1 1 2 - Profit Display - 25 131.4 0 0 0 131.4 131.4 0
2 221215_1449 1 clients 3 2 1 3 - Profit Display - 25 134.0 0 0 0 134.0 134.0 0
3 221215_1449 1 clients 4 3 1 4 - Profit Display - 25 132.0 0 0 0 132.0 132.0 0
4 221215_1449 -1 clients 1 4 -1 -1 Ready 25 121.4 0 0 0 121.4 121.4 0
5 221215_1449 -1 clients 2 1 -1 -1 Ready 25 131.4 0 0 0 131.4 131.4 0
6 221215_1449 -1 clients 3 2 -1 -1 Ready 25 134.0 0 0 0 134.0 134.0 0
7 221215_1449 -1 clients 4 3 -1 -1 Ready 25 132.0 0 0 0 132.0 132.0 0
session treatment table TreatmentID BaseTreatment StartMethod FileName BackupFileName NumberOfSubjects TreatmentNumber StartTimeString EndTimeString ErrorString MessageString EmptyString
0 221215_1449 1 treatments 1 1 menu pg.ztt C:\\ztree-5_1_8\\@221215_1449_pg.ztt 4 1 2022-12-15T14:52:17.368-05:00 2022-12-15T14:55:46.382-05:00
1 221215_1449 -1 treatments 1 1 menu pg.ztt C:\\ztree-5_1_8\\@221215_1449_pg.ztt 4 1 2022-12-15T14:52:17.368-05:00 2022-12-15T14:55:46.382-05:00
session treatment table
0 221215_1449 1 sessionglobals
1 221215_1449 1 sessionglobals
2 221215_1449 -1 sessionglobals
3 221215_1449 -1 sessionglobals
session treatment table ParticipationID ClientNumber Subject TreatmentNumber StartInPeriod FinishedPeriod
0 221215_1449 1 participations 1 1 1 1 1 5
1 221215_1449 1 participations 2 2 2 1 1 5
2 221215_1449 1 participations 3 3 3 1 1 5
3 221215_1449 1 participations 4 4 4 1 1 5
4 221215_1449 -1 participations 1 1 1 1 1 5
5 221215_1449 -1 participations 2 2 2 1 1 5
6 221215_1449 -1 participations 3 3 3 1 1 5
7 221215_1449 -1 participations 4 4 4 1 1 5
session treatment table
0 221215_1449 1 << end
0           globals
1          subjects
2         contracts
3           summary
4           clients
5        treatments
6    sessionglobals
7    participations
8           <<  end
Name: 2, dtype: object
#### 1. Compute the mean contribution to draw a simple graph

# read the data of the subjects table
from ztree2python import ztree2python as ztree2python
tables = ztree2python('221215_1449.xls')
df = tables['subjects']

# extract the unique values from the period column. This is going to be the x.
period_values = df['Period'].unique()
print(period_values)

# get the mean of 'Contribution' by 'Period'. This is going to be the y.
mean_contribution_by_period = df.groupby('Period')['Contribution'].mean()
print(mean_contribution_by_period)
[1 2 3 4 5]
Period
1    12.50
2    13.00
3    12.75
4     6.75
5     4.50
Name: Contribution, dtype: float64
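
One caveat about the computation above: `unique()` returns the periods in order of appearance, while `groupby` sorts them, so x and y line up only when the rows are already ordered by period. As a hedged alternative, the sketch below takes both x and y from the grouped result itself. The small dataframe is a stand-in that copies the period 1 and period 2 contributions from the sample "subjects" table, so the block runs without the data file.

```python
import pandas as pd

# Contributions from periods 1 and 2 of the sample "subjects" table.
df = pd.DataFrame(
    {
        "Period": [1, 1, 1, 1, 2, 2, 2, 2],
        "Contribution": [7, 10, 15, 18, 11, 9, 15, 17],
    }
)

# The grouped mean is indexed by Period, so its index can serve as x.
mean_by_period = df.groupby("Period")["Contribution"].mean()
x = mean_by_period.index.to_numpy()
y = mean_by_period.to_numpy()
print(x, y)
```

Both arrays can be passed to `plt.plot(x, y)` exactly as in the cell that follows.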
#### 1. Draw a simple graph (continued from the above)

# set the values.
x = period_values
y = mean_contribution_by_period

# draw a simple graph (requires matplotlib).
import matplotlib.pyplot as plt
plt.plot(x, y)

# Add a title and labels to the axes
plt.title("Mean contribution")
plt.xlabel("Period")
plt.ylabel("Mean contribution")

# Show the plot
plt.show()

[Figure: line plot of mean contribution by period]

#### 2. Draw a graph for each Group and add error bars.

# read the data of the subjects table
from ztree2python import ztree2python as ztree2python
tables = ztree2python('221215_1449.xls')
df = tables['subjects']

# extract the unique values from the period column 
period_values = df['Period'].unique()
x = period_values

# Group the data by period and compute the mean and standard error of y for each group
mean_by_period_1 = df[df['Group'] == 1].groupby('Period')['Contribution'].mean()
se_by_period_1   = df[df['Group'] == 1].groupby('Period')['Contribution'].sem()
mean_by_period_2 = df[df['Group'] == 2].groupby('Period')['Contribution'].mean()
se_by_period_2   = df[df['Group'] == 2].groupby('Period')['Contribution'].sem()


# Use the plot function to create a line chart of the mean values
import matplotlib.pyplot as plt
plt.plot(x, mean_by_period_1, linestyle='solid' , marker='o', color='b', label='Group 1')
plt.plot(x, mean_by_period_2, linestyle='dashed', marker='o', color='r', label='Group 2')

# Use the errorbar function to add error bars to the line chart
plt.errorbar(x, mean_by_period_1, yerr=se_by_period_1, fmt='none', ecolor='b')
plt.errorbar(x, mean_by_period_2, yerr=se_by_period_2, fmt='none', ecolor='r')

# Add a title and labels to the axes
plt.title("Mean contribution by Group")
plt.xlabel("Period")
plt.ylabel("Mean contribution")

# Add a legend to the plot
plt.legend(loc='upper right')

# Ticks at the data points
plt.xticks(period_values)

# Show the plot
plt.show()

[Figure: mean contribution by period for Group 1 and Group 2, with standard-error bars]
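
When there are more than two groups, repeating the filter for each group becomes tedious. A possible alternative, sketched below, computes the mean and standard error for every group in one `groupby` and reshapes the result so that each group becomes a column. The dataframe is a stand-in that copies the Period, Group, and Contribution values from the sample "subjects" table, and the names `stats`, `means`, and `sems` are illustrative only.

```python
import pandas as pd

# Period, Group, and Contribution copied from the sample "subjects" table.
df = pd.DataFrame(
    {
        "Period": [p for p in range(1, 6) for _ in range(4)],
        "Group": [1, 1, 2, 2] * 5,
        "Contribution": [
            7, 10, 15, 18,
            11, 9, 15, 17,
            18, 5, 11, 17,
            3, 10, 10, 4,
            10, 5, 3, 0,
        ],
    }
)

# Mean and standard error of Contribution for every (Group, Period) pair.
stats = df.groupby(["Group", "Period"])["Contribution"].agg(["mean", "sem"])

# Reshape so that rows are periods and columns are groups.
means = stats["mean"].unstack("Group")
sems = stats["sem"].unstack("Group")
print(means)
print(sems)
```

Each column of `means` and `sems` can then be passed to `plt.errorbar(means.index, means[g], yerr=sems[g])` in a loop over `means.columns`.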

Enjoy!