Rename dropq directory/files as tbi and refactor run_nth_year_*_model functions #1577

martinholmer · 2017-10-02T22:37:04Z

The goal of this pull request is to provide TaxBrain with a simple interface to Tax-Calculator that can handle cps.csv input as well as puf.csv input. It is an attempt to resolve PolicyBrain issue 668. While this pull request does not attempt to deal with CPS benefits information, it should provide a foundation for doing that, so pending pull request #1500 needs to be coordinated with the changes in this pull request.

Note that this pull request changes the public API of Tax-Calculator (from the point of view of TaxBrain only).
In addition to the changes in the two run_nth_year_*_model functions, the create_json_table function has been renamed create_dict_table because the function converts a dataframe table into a dictionary. It has never created a JSON table, so the old name was highly misleading.
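To illustrate the naming point, here is a minimal sketch of the difference between a dictionary and JSON. The function `table_to_dict` and the sample labels are hypothetical stand-ins, not the actual create_dict_table code; the sketch only assumes that the table is a pandas DataFrame.

```python
import json

import pandas as pd


def table_to_dict(table):
    """Hypothetical stand-in: return a dataframe's contents as a nested dict."""
    # Default orientation is {column label: {row label: value}}.
    return table.to_dict()


# Made-up table with one column and two rows (labels are illustrative only).
df = pd.DataFrame({'2017': [1.5, 2.5]}, index=['row_a', 'row_b'])

as_dict = table_to_dict(df)    # a Python dict
as_json = json.dumps(as_dict)  # a str; JSON exists only after this extra step
```

The result of the conversion is a Python object that the caller can serialize however it likes, which is why a name containing "dict" describes the function more accurately than one containing "json".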

Despite the changes in the public API, there are no changes in tax-calculating logic or in tax results.

@MattHJensen @feenberg @Amy-Xu @andersonfrailey @hdoupe @GoFroggyRun @brittainhard

codecov-io · 2017-10-02T22:41:57Z

Codecov Report

Merging #1577 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #1577   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files          37      37           
  Lines        2731    2731           
======================================
  Hits         2731    2731

Impacted Files	Coverage Δ
taxcalc/taxcalcio.py	`100% <ø> (ø)`	⬆️
taxcalc/records.py	`100% <ø> (ø)`	⬆️
taxcalc/macro_elasticity.py	`100% <100%> (ø)`	⬆️
taxcalc/__init__.py	`100% <100%> (ø)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 408f82d...8561084.

hdoupe · 2017-10-03T00:14:02Z

@martinholmer I just briefly looked through this PR. This looks great. Thanks for knocking this out quickly. I'll take a closer look tomorrow.

martinholmer · 2017-10-03T15:31:52Z

The refactored run_nth_year_tax_calc_model function handles the reading of the specified input file, but does not maintain a cache of input file contents to speed execution. On a middle-aged iMac, the puf-read-time is about 2.5 seconds, which is a relatively small part of the overall run_nth_year_tax_calc_model function run time of 66 to 67 seconds.

So, the speed-up from maintaining a cache (at most about 2.5 of 66 to 67 seconds, or roughly 4 percent of the run time) might not justify the added complexity. This issue can be reconsidered after pull request #1577 has been tested with TaxBrain.

@MattHJensen @feenberg @Amy-Xu @andersonfrailey @hdoupe @GoFroggyRun @brittainhard

P.S. The timing statistics above were generated for the second call to the run_nth_year_tax_calc_model function, with the puf input data read from the puf.csv.gz file located in the current working directory. My understanding is that this is where the file would be located when running computational servers on multiple AWS instances.

hdoupe · 2017-10-03T15:48:29Z

taxcalc/tbi/tbi_utils.py

+        input_path = 'puf.csv.gz'
+        if not os.path.isfile(input_path):
+            # otherwise try local Tax-Calculator deployment path
+            input_path = os.path.join(tbi_path, '..', '..', 'puf.csv')


@martinholmer Currently, the PUF is sent to taxcalc as a pandas dataframe. I'm not sure how to generally specify the PUF path relative to this file. It may be easier for use_puf_not_cps to be a keyword argument that holds the PUF dataframe if the user chooses the PUF file option and is None or False otherwise.
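A minimal sketch of the keyword-argument idea, for discussion only. The function name `choose_input` and its return convention are hypothetical and do not appear in the pull request; only the argument name use_puf_not_cps comes from the comment above.

```python
def choose_input(use_puf_not_cps=None):
    """Return ('puf', data) when PUF data is passed in, else ('cps', None)."""
    # Compare with `is` rather than truthiness: calling bool() on a pandas
    # DataFrame raises ValueError, so `if use_puf_not_cps:` would fail.
    if use_puf_not_cps is None or use_puf_not_cps is False:
        return 'cps', None
    return 'puf', use_puf_not_cps
```

With this shape the caller keeps responsibility for reading the PUF file, which is the trade-off under discussion in this thread.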

@hdoupe said:

Currently, the PUF is sent to taxcalc as a pandas dataframe. I'm not sure how to generally specify the PUF path relative to this file.

The newest tbi_utils.py code tries to anticipate where the input files will be located.
The following code fragment from PolicyBrain/webapp/apps/taxbrain/tasks.py suggests to me that puf.csv.gz will be in the current working directory:

def get_tax_results_async(mods, inputs_pk):
    print "mods is ", mods
    user_mods = package_up_vars(mods)
    print "user_mods is ", user_mods
    print "begin work"
    cur_path = os.path.abspath(os.path.dirname(__file__))
    tax_dta = pd.read_csv("puf.csv.gz", compression='gzip')
    mY_dec, mX_dec, df_dec, mY_bin, mX_bin, df_bin, fiscal_tots = dropq.run_models(
        tax_dta, num_years=NUM_BUDGET_YEARS, user_mods={START_YEAR: user_mods})

And that is where the tbi_utils.py code looks for it.
Does this make sense?
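The lookup order described above can be sketched as a small standalone function. The name `find_puf_path`, the `cwd` parameter, and the error raised when neither file exists are assumptions made for illustration; the diff fragment under review only shows the two candidate locations.

```python
import os


def find_puf_path(tbi_path, cwd=None):
    """Return the first existing PUF file among the two candidate locations."""
    cwd = os.getcwd() if cwd is None else cwd
    candidates = [
        # 1) compressed file in the current working directory
        os.path.join(cwd, 'puf.csv.gz'),
        # 2) otherwise the local Tax-Calculator deployment path,
        #    two directory levels above the tbi directory
        os.path.join(tbi_path, '..', '..', 'puf.csv'),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return os.path.normpath(path)
    # Assumption: fail loudly rather than return a path that does not exist.
    raise FileNotFoundError('no PUF input file among: ' + ', '.join(candidates))
```

Checking the working directory first matches the tasks.py fragment, which reads "puf.csv.gz" without any directory prefix.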

martinholmer · 2017-10-03T22:30:42Z

The philosophy guiding the refactoring in pull request #1577 is that TaxBrain should be able to just tell Tax-Calculator what it wants to do and then have Tax-Calculator do it and return the results. Expecting TaxBrain to read input files and draw quick-calculation subsamples does not seem like a sensible strategy. TaxBrain has been drawing those subsamples incorrectly for many months (see unresolved PolicyBrain issue 574, which has been open since June 28th).

@MattHJensen @hdoupe @GoFroggyRun

martinholmer added 4 commits October 2, 2017 11:40

Rename dropq as tbi (taxbrain interface)

643649a

Revise Read-the-Docs public_api.rst for dropq-to-tbi rename

9575e2c

Eliminate embedded gdp_elasticity dict in user_mods dict

60897ab

Refactor run_nth_year*model functions to use puf or cps

8c6a71a

martinholmer mentioned this pull request Oct 2, 2017

Review of pending merges #1575

Closed

martinholmer changed the title ~~Rename dropq directory/files as tbi and refactor run_nth_year*model functions~~ Rename dropq directory/files as tbi and refactor run_nth_year_*_model functions Oct 2, 2017

martinholmer added 2 commits October 2, 2017 18:46

Update RELEASES.md info

a16bb18

Change dropq to tbi in .coveragerc file

6433b1f

martinholmer added 3 commits October 3, 2017 08:00

Rename tbi_utils.py functions to indicate their function

cfbf8fa

Fix puf input_path and time input read

16a79df

Rename create_json_table as create_dict_table

8abb6ae

hdoupe reviewed Oct 3, 2017

Fix cps input logic

5c60bc1

Merge branch 'master' into dropq-to-tbi

70a8ece

martinholmer mentioned this pull request Oct 6, 2017

Fix macro-elasticity logic so that GDP change in year t depends on tax change in year t-1 #1579

Merged

Strengthen test in test_macro_elasticity.py

8561084

martinholmer merged commit 621cdd9 into PSLmodels:master Oct 10, 2017

martinholmer mentioned this pull request Oct 10, 2017

Update TaxBrain and Tax-Calculator to handle UBI and CPS specific inputs ospc-org/ospc.org#668

Open

9 tasks

martinholmer deleted the dropq-to-tbi branch October 10, 2017 14:27

This was referenced Oct 13, 2017

Better quick sample settings ospc-org/ospc.org#574

Closed

Update README.md ospc-org/ospc.org#702

Merged
