Nomics

Nomics contains general utilities and functions that are often useful for economic analyses. More detailed documentation is in the comments of the code itself.

To install it, run:

git clone https://github.com/codyfcook/nomics.git && cd nomics && pip install -e . && cd ..

Utilities

The utilities module collects simple functions that may be of use. Often these functions just simplify notation to reduce Googling, typing, and typos.

Utilities also imports from basil_utils.py, so running from nomics.utilities import * will get you the functions below plus a bunch of Basil's.

Sample of a few of the utility functions:

groups_apply_parallel(groups, func, cpu_count=4): runs applies in parallel. Make sure not to overwhelm the CPUs.
format_list_for_qr(list, quote=True): formats a list of items so that it can be used in a query (e.g. as 'name in ({list})').
dfs_in_mem(): shows the dataframes currently held in memory.
mem_usage(obj): prints how much memory the object is using.
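
As an illustration of the kind of convenience these helpers provide, here is a minimal sketch of formatting a list for a SQL `IN` clause. The name `format_list_sketch` and its exact behavior are assumptions made for this example; the package's own `format_list_for_qr` may behave differently, so check the comments in its source.

```python
def format_list_sketch(items, quote=True):
    """Hypothetical stand-in for format_list_for_qr; not the package's code."""
    if quote:
        # Double any single quotes so the values stay valid SQL string literals.
        parts = ["'{}'".format(str(item).replace("'", "''")) for item in items]
    else:
        parts = [str(item) for item in items]
    return ", ".join(parts)


names = ["alice", "bob"]
query = "SELECT * FROM users WHERE name IN ({})".format(format_list_sketch(names))
print(query)
```

String formatting like this is handy for ad hoc analysis queries; for untrusted input, parameterized queries are the safer choice.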

Statistics

Absorber

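The absorber allows you to run absorbed fixed-effects (FE) regressions, where the group effects are controlled for without being estimated as explicit coefficients. See Stata's areg command for more details on what this means.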

To run it, do:

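import numpy as np
from nomics.stats.absorber import OLSAbsorb

# d is a pandas DataFrame holding the outcome, the regressors, and the group codes
y = d['y_col']
dense = d['dense_cols']
absorb = np.array(d['absorb_cols'])
model = OLSAbsorb(y, dense, absorb)
results = model.fit()
print(results.summary())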

The absorbed cols should be integers from 1 to N denoting the group for which the effect size (coefficient) will be absorbed. You can turn any column into a series of integer group codes (note that pandas category codes start at 0 and use -1 for missing values) by running:

d[new_col] = d[old_col].astype('category').cat.codes.astype(int)
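
To make "absorbing" concrete, the sketch below demeans the outcome and the regressor within each group and checks that the resulting slope equals the one from a regression with explicit group dummies. It uses only NumPy on simulated data and is an illustration of the general technique, not the code inside `OLSAbsorb`; the helper `demean_by_group` and all variable names are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_groups = 200, 5
groups = np.arange(n) % n_groups                 # integer group codes 0..4
x = rng.normal(size=n)
group_effects = rng.normal(size=n_groups)
y = 2.0 * x + group_effects[groups] + rng.normal(scale=0.1, size=n)


def demean_by_group(values, codes, n_codes):
    """Subtract each group's mean from its members (the 'within' transform)."""
    sums = np.bincount(codes, weights=values, minlength=n_codes)
    counts = np.bincount(codes, minlength=n_codes)
    return values - (sums / counts)[codes]


# Absorbed regression: demean y and x within groups, then run OLS with no dummies.
y_w = demean_by_group(y, groups, n_groups)
x_w = demean_by_group(x, groups, n_groups)
beta_absorbed = (x_w @ y_w) / (x_w @ x_w)

# Reference: OLS with one explicit dummy column per group.
dummies = (groups[:, None] == np.arange(n_groups)).astype(float)
design = np.column_stack([x, dummies])
beta_dummies = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(beta_absorbed, beta_dummies)
```

The two slopes agree up to floating-point error, while the absorbed version never builds the dummy matrix, which is what makes it practical when there are many groups.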

ZIP

Code to run a zero-inflated Poisson (ZIP) regression.

from nomics.stats.zip import ZIPoisson
y = d['y_cols']
x = d['x_cols']
# The columns you want included in the logit part of a ZIP to predict 0/1 in the y col
inflateX = ['inflate_cols']

mod = ZIPoisson(y,x,inflateX).fit()
print(mod.summary())
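
For readers unfamiliar with the model, here is a minimal sketch of the standard ZIP log-likelihood: a zero can come either from the inflation (logit) part or from the Poisson part, while a positive count can only come from the Poisson part. The function `zip_loglik` and its arguments are assumptions for illustration and are not taken from `ZIPoisson`; in a fitted model, `lam` and `pi` would typically be functions of the x and inflate columns.

```python
import math
import numpy as np


def zip_loglik(y, lam, pi):
    """Per-observation log-likelihood of a zero-inflated Poisson.

    y: counts; lam: Poisson mean (> 0); pi: probability of a structural zero.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    lam = np.broadcast_to(np.asarray(lam, dtype=float), y.shape)
    pi = np.broadcast_to(np.asarray(pi, dtype=float), y.shape)
    log_fact = np.array([math.lgamma(k + 1.0) for k in y])
    # Plain Poisson log-pmf: y*log(lam) - lam - log(y!)
    log_pois = y * np.log(lam) - lam - log_fact
    # A zero can come from the inflation part or from the Poisson part.
    ll_zero = np.log(pi + (1.0 - pi) * np.exp(-lam))
    # A positive count can only come from the Poisson part.
    with np.errstate(divide="ignore"):
        ll_pos = np.log1p(-pi) + log_pois
    return np.where(y == 0, ll_zero, ll_pos)


counts = [0, 0, 3, 1]
print(zip_loglik(counts, lam=1.5, pi=0.3).sum())
```

Setting `pi=0` recovers the ordinary Poisson log-likelihood, which is a quick sanity check on any ZIP implementation.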

The module also has code for a Vuong test.
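
A Vuong test compares two non-nested models (for example, a ZIP against a plain Poisson) using their per-observation log-likelihoods. The sketch below computes the uncorrected statistic; the function name `vuong_statistic` is hypothetical, and the package's version may differ, for instance by applying a correction for the number of parameters.

```python
import math
import numpy as np


def vuong_statistic(ll_model_1, ll_model_2):
    """Uncorrected Vuong z-statistic from per-observation log-likelihoods."""
    m = np.asarray(ll_model_1, dtype=float) - np.asarray(ll_model_2, dtype=float)
    n = m.size
    # sqrt(n) * mean difference / standard deviation of the differences
    return math.sqrt(n) * m.mean() / m.std(ddof=0)


# Made-up log-likelihood contributions, for illustration only.
ll_a = np.array([-1.0, -1.2, -0.8, -1.1])
ll_b = np.array([-1.5, -1.3, -1.4, -1.6])
stat = vuong_statistic(ll_a, ll_b)
print(stat)
```

Under the null that the two models fit equally well, the statistic is asymptotically standard normal, so large positive values favor the first model and large negative values favor the second.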

Experiments

This module has two functions to check whether treatment groups are balanced.

check_for_balance_with_regs(cat_vars, cont_vars, group_var, data): For continuous variables, gets the p-value from an F-test of the variable regressed on treatment, to see whether treatment predicts the variable. For categorical variables, uses a chi-squared test to see whether treatment is more likely to appear in some categories than others.
pairwise_t_test(data, cont_vars, group_var): For each variable, compares groups pairwise and runs a t-test to see whether they differ.
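
The pairwise comparison can be sketched as follows for a single continuous covariate. This is an illustrative version built on `scipy.stats.ttest_ind`, not the package's `pairwise_t_test`; the helper name `pairwise_welch_t`, the dict-based input, and the choice of Welch's unequal-variance test are all assumptions.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def pairwise_welch_t(samples_by_group):
    """Welch t-test for every pair of groups.

    samples_by_group: dict mapping group label -> 1-D array of one covariate.
    Returns {(group_a, group_b): (t_statistic, p_value)}.
    """
    results = {}
    for a, b in combinations(sorted(samples_by_group), 2):
        t_stat, p_value = stats.ttest_ind(
            samples_by_group[a], samples_by_group[b], equal_var=False
        )
        results[(a, b)] = (float(t_stat), float(p_value))
    return results


rng = np.random.default_rng(1)
age_by_arm = {
    "control": rng.normal(40, 10, size=300),
    "treat_a": rng.normal(40, 10, size=300),
    "treat_b": rng.normal(45, 10, size=300),  # deliberately imbalanced arm
}
balance = pairwise_welch_t(age_by_arm)
for pair, (t_stat, p_value) in balance.items():
    print(pair, round(t_stat, 2), round(p_value, 4))
```

With many covariates and many arms the number of tests grows quickly, so some small p-values are expected by chance alone.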

Matching

Code to run (1) propensity score matching, (2) distance matching (KNN, greedy matching, optimal matching), and (3) basic analytics on matched pairs (calculating the ATT and some measures of balance).
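
As a rough picture of what greedy matching and the ATT calculation involve, the sketch below matches each treated unit to the closest remaining control on a one-dimensional score (such as an estimated propensity score) and averages the outcome differences. The functions `greedy_match` and `att_from_pairs` and the toy numbers are assumptions for illustration, not the package's API.

```python
import numpy as np


def greedy_match(treated_scores, control_scores):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns a list of (treated_index, control_index) pairs.
    """
    available = list(range(len(control_scores)))
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break  # more treated units than controls
        # Closest remaining control on the matching score.
        best = min(available, key=lambda c: abs(control_scores[c] - t_score))
        pairs.append((t_idx, best))
        available.remove(best)
    return pairs


def att_from_pairs(pairs, treated_outcomes, control_outcomes):
    """Average treated-minus-matched-control outcome difference."""
    diffs = [treated_outcomes[t] - control_outcomes[c] for t, c in pairs]
    return float(np.mean(diffs))


treated_scores = np.array([0.80, 0.55])
control_scores = np.array([0.50, 0.82, 0.10])
treated_y = np.array([10.0, 7.0])
control_y = np.array([6.0, 8.0, 3.0])

pairs = greedy_match(treated_scores, control_scores)
att = att_from_pairs(pairs, treated_y, control_y)
print(pairs, att)
```

Greedy matching depends on the order in which treated units are processed; optimal matching instead minimizes the total distance across all pairs, at a higher computational cost.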
