Fix PUF SOI estimates #411

Open
wants to merge 2 commits into
base: master

Conversation

andersonfrailey
Collaborator

This PR addresses issue #399.

The updatesoi.py file is used to automatically update our SOI estimates, but the range of indices used to add up total wages for those with AGI greater than $1 million was off and excluded some of the data, as @donboyd5 figured out. This PR fixes that and adds an additional check to updatesoi.py to prevent an issue like this from going undetected in the future.

The bug affected our SOI estimates for 2015-2017. They have all been fixed in this PR.
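
The check added in this PR is not shown in the thread. As a minimal sketch of one form such a guard could take, the snippet below compares the sum over a slice of AGI bins against a separately published total and fails loudly when they disagree. The function name `check_bin_sum` and all numbers are hypothetical, not taken from updatesoi.py or from SOI data.

```python
import math


def check_bin_sum(bin_values, reported_total, rel_tol=1e-6):
    """Return the sum of bin_values, raising if it differs from reported_total."""
    summed = sum(bin_values)
    if not math.isclose(summed, reported_total, rel_tol=rel_tol):
        raise ValueError(
            f"bins sum to {summed}, but published total is {reported_total}"
        )
    return summed


# Hypothetical wage amounts by AGI bin (made-up numbers, not SOI data).
wages_by_bin = [10.0, 20.0, 30.0, 40.0]
top_bins_total = 70.0  # hypothetical published total for the last two bins

# Correct slice: covers both top bins, so the check passes.
check_bin_sum(wages_by_bin[2:], top_bins_total)

# A slice that stops one bin early drops data and is caught by the check.
try:
    check_bin_sum(wages_by_bin[2:3], top_bins_total)
except ValueError as err:
    caught = str(err)
```

A check like this only helps when the source table publishes the total independently of the bins being summed.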

@andersonfrailey
Collaborator Author

This PR is just about done, but with the changes there's a big increase in tax liability for 2030 and 2031 that I can't explain. I've attached a table comparing the tax liabilities that were found with taxcalc 3.2.1.
Screen Shot 2022-01-10 at 2 58 31 PM

@donboyd5

donboyd5 commented Jan 10, 2022 via email

@andersonfrailey
Collaborator Author

Deleted my last comment. I realized there was an issue with how I was looking at the data when I was making those charts.

@jdebacker
Member

@andersonfrailey Is this PR ready for review? The last comment made that unclear. Also, it could be helpful to produce the tables Don suggested to aid in review.

@andersonfrailey
Collaborator Author

@jdebacker, I haven't been able to fix the issue with tax revenues jumping up in the last few years yet, but I'd definitely be open to others seeing if they get the same result when they run the changes in this PR. Spring break starts tomorrow so I should be able to work on those tables Don suggested this week!

@jdebacker
Member

Great - thanks for the update!

@MattHJensen
Contributor

MattHJensen commented Mar 30, 2022

@andersonfrailey, I ran make puf-files with this PR and hit several ITERATION_LIMIT and INFEASIBLE terminations during stage 2. Is this to be expected? Terminal output in this gist.

I'm going to push forward to replicate your revenue table and then create Don's suggested tables in the next few days, but thought I should check in on this. Thanks!

@donboyd5

donboyd5 commented Mar 31, 2022

A few thoughts:

2016:
image

  • it hit the iteration limit of 100
  • obj function is NaN -- not sure what to make of this, but note that the dual objective is huge; perhaps that is what it always reports when the limit is hit; worth knowing
  • the targets were satisfied (the percentile values for ratio of calculated to desired targets are all approximately 1)
  • the ratios of new weights to old weights ranged from 0.69 to 1.86
  • but the computer code says the tolerance around the weights was 0.30 in 2016, meaning we'd want the ratio to range from 0.70 to 1.30; thus, it was only able to hit the targets by making some weights larger than it was told to make them
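
The bounds comparison in the bullets above can be sketched as follows. This is an illustrative helper, not code from the repository: `ratio_bounds_report` is a hypothetical name, and the weights are made up so that the ratios reproduce the 0.69 and 1.86 extremes mentioned for 2016.

```python
def ratio_bounds_report(old_weights, new_weights, tol):
    """Summarize new/old weight ratios against the band [1 - tol, 1 + tol]."""
    ratios = [new / old for old, new in zip(old_weights, new_weights)]
    lower, upper = 1.0 - tol, 1.0 + tol
    n_outside = sum(1 for r in ratios if r < lower or r > upper)
    return {
        "min": min(ratios),
        "max": max(ratios),
        "lower": lower,
        "upper": upper,
        "n_outside": n_outside,
    }


# Made-up weights chosen to give ratios of 0.69, 1.00 and 1.86.
old = [100.0, 100.0, 100.0]
new = [69.0, 100.0, 186.0]

report = ratio_bounds_report(old, new, tol=0.30)
print(report)
```

With a tolerance of 0.30, two of the three illustrative ratios fall outside the band, which is the situation described for 2016.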

image

  • I doubt that raising the iteration limit would solve this
  • you might find this solution acceptable and might consider increasing the tolerance around the weights (e.g., +/- 0.50 or something like that, or make it asymmetric (with a coding change))
  • or you might examine the targets - they may be very hard to hit (and perhaps unreasonable - or unreasonable in relation to the data, meaning the data might be unreasonable in some fashion)
  • it might be possible to investigate by looking at the target values you get using the 2016 growfactors and the starting weights (I think these would be the 2015 weights, but they might be 2011, I don't remember which the code uses - base year or prior year) and seeing if some are just way different from the 2016 targets - checking to see which variables are way off; the ones that are way off might have a bad 2016 growfactor or a bad 2016 target, or it may just be that the world changed in a way that is hard for a grown 2011 file to hit
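
The investigation suggested in the last bullet could look roughly like the sketch below: compute the weighted total of each grown variable under the starting weights, divide by its target, and rank variables by how far that ratio is from 1. The function name, variable names, and numbers are all hypothetical, and the sketch assumes the record values have already been grown to the target year.

```python
def target_gaps(records, weights, targets):
    """Return {variable: implied weighted total / target} for each target."""
    gaps = {}
    for var, target in targets.items():
        implied = sum(w * v for w, v in zip(weights, records[var]))
        gaps[var] = implied / target
    return gaps


# Made-up grown record values, starting weights and targets.
records = {"wages": [50.0, 100.0], "interest": [1.0, 3.0]}
weights = [2.0, 1.0]
targets = {"wages": 210.0, "interest": 10.0}

gaps = target_gaps(records, weights, targets)

# Variables whose ratio is furthest from 1 are the first to inspect.
ranked = sorted(gaps, key=lambda var: abs(gaps[var] - 1.0), reverse=True)
worst = ranked[0]
print(worst, gaps[worst])
```

A ratio far from 1 does not by itself say whether the growfactor, the target, or the underlying data is at fault; it only narrows down where to look.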

2029
image

  • iteration limit of 100 was hit
  • objective is NaN -- not sure what to make of this, but note that the dual objective is huge
  • targets were hit
  • weight ratios were almost in range -- 0.54 to 1.47 -- didn't quite make +/- 0.45
  • but notice the strange distribution -- almost all of them were driven down by ~0.45 or up by 0.45, so it is clear that the constraints can only be hit by jerking the weights around
  • again I would investigate the constraints, looking for one (or more) of three possibilities: (1) bad (implausible) targets, (2) bad data resulting from bad growfactors, or (3) plausible targets and plausible growfactors, but the world (targets) changed in ways that our very old data has a hard time mimicking
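
One way to quantify the "strange distribution" noted above is to measure what share of the weight ratios sit at or near the edges of the allowed band. The sketch below is a hypothetical diagnostic, not part of the stage 2 code; the name `share_at_bounds`, the `eps` closeness threshold, and the ratios are illustrative assumptions.

```python
def share_at_bounds(ratios, tol, eps=0.01):
    """Fraction of ratios within eps of either edge of [1 - tol, 1 + tol]."""
    lower, upper = 1.0 - tol, 1.0 + tol
    near_edge = sum(
        1 for r in ratios if abs(r - lower) <= eps or abs(r - upper) <= eps
    )
    return near_edge / len(ratios)


# Made-up ratios: four of five are pinned at the +/- 0.45 limits.
ratios = [0.55, 0.55, 1.45, 1.45, 1.0]
share = share_at_bounds(ratios, tol=0.45)
print(share)
```

A share close to 1 suggests the solver is only meeting the targets by pushing nearly every weight to a limit, which fits the reading that the constraints themselves deserve scrutiny.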

2030 (shown) and 2031
image

  • Things are really out of hand at this point and the constraints are not feasible.

I haven't looked at the targets but my intuition would be to look at the way the correction to the wage targets was implemented to see if the full set of new targets is internally inconsistent. This might be the underlying issue, and perhaps also causing undesirable results in other years, too, even though no flags were raised.

@donboyd5

donboyd5 commented Mar 31, 2022

@MattHJensen Inconsistent targets could arise if, for example:

  • You had a wage target for each agi range, PLUS a total wage target (which would not be a great idea, because it would be redundant), AND the sum of the wage targets by range was not consistent with the total wage target, or
  • You had a wage target for each agi range, plus total targets for other income variables, plus a target for total income (e.g., agi), AND the sum of all the income targets was not consistent with the total income target

There are other ways to be inconsistent, too, but these are the obvious ones.
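
The first kind of inconsistency listed above can be tested directly. The sketch below is a minimal illustration with made-up numbers and a hypothetical function name: it compares the sum of per-range targets with a separately supplied total and reports the gap.

```python
import math


def range_total_gap(range_targets, total_target, rel_tol=1e-6):
    """Return (consistent, gap), where gap = sum(range_targets) - total_target."""
    gap = sum(range_targets) - total_target
    consistent = math.isclose(sum(range_targets), total_target, rel_tol=rel_tol)
    return consistent, gap


# Made-up wage targets by AGI range and two candidate totals.
wage_targets_by_range = [100.0, 250.0, 400.0]

ok, gap_ok = range_total_gap(wage_targets_by_range, 750.0)
bad, gap_bad = range_total_gap(wage_targets_by_range, 800.0)
print(ok, gap_ok, bad, gap_bad)
```

The second case in the list (income components versus total income) has the same shape: sum the component targets and compare against the aggregate target.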

@jdebacker
Member

@MattHJensen Can you test this branch again with the latest changes and see if you still get that error with iteration limits?

@jdebacker
Member

An update on this branch. I've tested this with the latest versions of Julia and Tulip and still get an iteration limit hit in 2029 and "infeasible" after that.

Will look into targets in more detail next, but was hoping a new solver would do the trick...

cc @andersonfrailey @donboyd5 @MattHJensen


Successfully merging this pull request may close these issues.

$121b of wages of millionaires dropped from puf stage 2 targets??
4 participants