bd_CPS_id.ipynb skips months prior to May 1995. I could try to generate IDs for the 1989-93 data with the current code. I could also try to customize the code to handle Jan 1994-April 1995 without trying to match to May 1995 (or later).
I'm fairly confident that I've figured out the 1993-1994 match process (which needs special handling because the variables differ between the two years). If I match on HHID and STATE, and match the last digit of HHID2 (1994 onward) to H-HHNUM (1993), the results look correct.
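A minimal sketch of that match, in plain Python rather than whatever bd_CPS_id.ipynb actually uses. The function name and the dict-per-record layout are assumptions made for illustration; only the field names HHID, STATE, HHID2, and H-HHNUM come from the note above.

```python
def match_1993_to_1994(recs_1993, recs_1994):
    """Pair 1993 and 1994 records that agree on HHID, STATE, and household number.

    Assumes each record is a dict and that HHID2 and H-HHNUM hold
    integers or digit strings (hypothetical layout).
    """
    # Index the 1993 records by (HHID, STATE, H-HHNUM).
    index = {}
    for rec in recs_1993:
        key = (rec["HHID"], rec["STATE"], int(rec["H-HHNUM"]))
        index.setdefault(key, []).append(rec)

    matches = []
    for rec in recs_1994:
        # The last digit of HHID2 plays the role of H-HHNUM.
        key = (rec["HHID"], rec["STATE"], int(rec["HHID2"]) % 10)
        for prev in index.get(key, []):
            matches.append((prev, rec))
    return matches
```

Keeping a list per key means duplicate keys on either side show up as multiple pairs instead of being silently overwritten, which makes bad matches easier to spot.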
UPDATE: There's more to this--several cases have a second to last digit of HHID2 that is not 0 and therefore likely has a counterpart variable in the 1989-93 data. Perhaps these are cases where one unit gets split into two or similar. I'm not sure I'll find a counterpart variable in the pre-1994 data.
For the example that I picked, all of the cases work out if I append HHID2 % 100 to the end of HHID and compare the result to the pre-1994 concatenation HHID | 0 | HHNUM. I still think the 0 corresponds to a variable somewhere in the pre-1994 data, but I haven't found it yet.
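The comparison described above can be sketched as two small key builders. The function names are hypothetical, and the sketch assumes HHID is stored as an identically formatted string in both periods and that HHNUM is a single digit; the literal "0" stands in for the variable that has not been located yet.

```python
def key_1994_onward(hhid, hhid2):
    """HHID followed by the last two digits of HHID2, zero-padded."""
    return f"{hhid}{int(hhid2) % 100:02d}"


def key_pre_1994(hhid, hhnum):
    """HHID followed by a literal 0 and the single-digit HHNUM.

    The hard-coded "0" is a placeholder for a pre-1994 variable
    that has not been identified.
    """
    return f"{hhid}0{int(hhnum)}"
```

With these, a 1994-onward record and a pre-1994 record are candidates for the same household when `key_1994_onward(...) == key_pre_1994(...)`. A record whose second-to-last HHID2 digit is nonzero will not match any pre-1994 key until the placeholder is replaced by the real variable.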
Because the pre-1995-break data are not going to change (whereas the CPS_id is going to be generated for new months), I could create a new cell in bd_CPS_id.ipynb that checks whether the pre-1995 IDs are available and, if not, generates them, then adds them to the dictionary containing the 1995-onward IDs.
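One way such a cell could look, as a sketch only. The JSON cache file, the function names, and the `build_ids` callable are all assumptions; the notebook's actual storage format and ID-generation code are not described here.

```python
import json
from pathlib import Path


def load_or_build_pre1995_ids(cache_path, build_ids):
    """Return the cached pre-1995 IDs, generating and saving them if missing.

    cache_path and build_ids are hypothetical: build_ids is any
    zero-argument callable returning a JSON-serializable dict.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text())
    ids = build_ids()
    cache_path.write_text(json.dumps(ids))
    return ids


def add_pre1995_ids(ids_1995_onward, cache_path, build_ids):
    """Merge the pre-1995 IDs into a copy of the 1995-onward dictionary."""
    merged = dict(load_or_build_pre1995_ids(cache_path, build_ids))
    # If a key appears in both, the 1995-onward entry wins.
    merged.update(ids_1995_onward)
    return merged
```

Since the pre-1995 data do not change, the expensive generation step runs only the first time; later runs just read the cached file.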
Part of the problem here (which I run into often) is that I have not outlined, in words, what bd_CPS_id.ipynb does, that is, how the IDs are generated. This is covered in issue #171 and should be given some consideration as I make updates.
Create matched CPSID for households from 1989 to near the 1995 break, skipping the breaks in between.