bd CPS: CPSID and matching before 1995 break #175

Open
bdecon opened this issue Mar 5, 2019 · 4 comments

Comments

bdecon commented Mar 5, 2019

Create a matched CPSID for households from 1989 up to the 1995 break, skipping over the breaks in between.

bdecon self-assigned this on Mar 5, 2019
bdecon added this to To do in bd CPS: Version 1.0 via automation on Mar 5, 2019
bdecon commented Apr 26, 2019

bd_CPS_id.ipynb skips months prior to May 1995. I could try to generate IDs for the 1989-93 data with the current code. I could also try to customize the code to handle Jan 1994-April 1995 without trying to match to May 1995 (or later).

bdecon commented Apr 28, 2019

I'm pretty confident that I've figured out the 1993-1994 match process (it needs special handling because the variables differ between the two years). If I match on HHID and STATE, and match the last digit of HHID2 (1994 onward) to H-HHNUM (1993), the results look correct.

UPDATE: There's more to this. Several cases have a second-to-last digit of HHID2 that is not 0, so that digit likely has a counterpart variable in the 1989-93 data. Perhaps these are cases where one unit gets split into two, or something similar. I'm not sure I'll find a counterpart variable in the pre-1994 data.

For the example that I picked, all of the cases work out if I append HHID2 % 100 to the end of HHID and compare it to the pre-1994 equivalent: HHID | 0 | HHNUM. I still think the 0 is a variable somewhere within the pre-1994 data, but I haven't found it yet.
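A rough sketch of that comparison in plain Python, not the actual notebook code. Everything here is an assumption for illustration: the function names, the dict-per-record layout, HHID stored as a string, H-HHNUM being a single digit, and the made-up values. STATE is kept in the key because the match above also uses it.

```python
def key_1994_onward(state, hhid, hhid2):
    """1994-onward key: HHID followed by the last two digits of HHID2."""
    return (state, f"{hhid}{hhid2 % 100:02d}")


def key_pre_1994(state, hhid, hhnum):
    """Pre-1994 key: HHID, then a literal 0, then H-HHNUM (assumed one digit)."""
    return (state, f"{hhid}0{hhnum}")


def match_across_1994(pre_rows, post_rows):
    """Return (pre, post) record pairs whose keys agree."""
    pre_index = {
        key_pre_1994(r["STATE"], r["HHID"], r["HHNUM"]): r for r in pre_rows
    }
    matches = []
    for row in post_rows:
        key = key_1994_onward(row["STATE"], row["HHID"], row["HHID2"])
        if key in pre_index:
            matches.append((pre_index[key], row))
    return matches


# Made-up records, only to show the shape of the comparison.
pre_rows = [{"STATE": 36, "HHID": "123456789012", "HHNUM": 1}]
post_rows = [
    {"STATE": 36, "HHID": "123456789012", "HHID2": 91001},  # ends in 01
    {"STATE": 36, "HHID": "123456789012", "HHID2": 91011},  # ends in 11
    {"STATE": 6, "HHID": "123456789012", "HHID2": 91001},   # other state
]
matches = match_across_1994(pre_rows, post_rows)
```

With the literal 0 hard-coded, a record whose HHID2 has a nonzero second-to-last digit can never find a pre-1994 partner, which is why locating the counterpart variable matters.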

bdecon commented Apr 28, 2019

Because the pre-1995-break data are not going to change (whereas the CPS_id will keep being generated for new months), I could create a new cell in bd_CPS_id.ipynb that checks whether the pre-1995 IDs are available and, if not, generates them, then adds them to the dictionary containing the 1995-onward IDs.
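Such a cell might look roughly like the sketch below. It is only an outline under assumptions: the pickle cache file, the function names, the builder callback, and the month-keyed dictionary layout are all hypothetical, since the real structure of the ID dictionary in bd_CPS_id.ipynb isn't described here.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_build_pre1995_ids(cache_path, build_ids):
    """Load cached pre-1995 IDs, or build and cache them if missing."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    ids = build_ids()  # the expensive step, run once
    with cache_path.open("wb") as f:
        pickle.dump(ids, f)
    return ids


def merge_ids(ids_1995_onward, pre1995_ids):
    """Combine both periods into one dict without mutating either input."""
    merged = dict(pre1995_ids)
    merged.update(ids_1995_onward)
    return merged


# Usage with a stand-in builder and a temporary cache file.
calls = []


def fake_builder():
    calls.append(1)  # count how often the IDs are regenerated
    return {"1993-12": ["id_a", "id_b"]}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "pre1995_ids.pkl"
    first = load_or_build_pre1995_ids(cache, fake_builder)
    second = load_or_build_pre1995_ids(cache, fake_builder)

all_ids = merge_ids({"1995-05": ["id_c"]}, first)
```

The second call reads from the cache instead of rebuilding, so rerunning the notebook for a new month would not redo the pre-1995 work.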

bdecon commented Apr 28, 2019

Part of the problem here (which I run into often) is that I have not outlined, in words, what bd_CPS_id.ipynb does, that is, how the IDs are generated. This is covered in issue #171 and should be given some consideration as I make updates.
