ipumsr 0.6.0
ipumsr 0.6.0 introduces significant updates to the client tools for the IPUMS API and revamps several IPUMS readers, particularly for IPUMS NHGIS.
Breaking Changes + Deprecations
IPUMS API
- ipumsr now supports IPUMS API version 2 and no longer supports either the
  beta version or version 1 of the IPUMS API.

  This means that extract definitions saved in JSON format under previous API
  versions can no longer be loaded into ipumsr via define_extract_from_json().
  To load extract definitions created under previous API versions, there are
  two options:

  - Rewrite the extract definition represented in the JSON file using the
    define_extract_*() function for the relevant IPUMS collection, then update
    the saved file with save_extract_as_json().
  - Update the JSON file by converting all snake_case fields to camelCase.
    For instance, "data_format" would become "dataFormat". The "api_version"
    field will also need to be changed to "version" and set equal to 2.

  See the IPUMS developer documentation for more details on API versioning
  and breaking changes introduced in version 2.

- The ipums_extract object structure has been updated. For IPUMS microdata
  projects, variables and samples are no longer stored as character vectors,
  but as lists. This accommodates new API version 2 features (see below). Use
  names(x$variables) instead of x$variables to access variable names as a
  character vector, and likewise names(x$samples) for sample names.

- The get_recent_extracts_info_*() functions have been deprecated.
  Additionally, tabular-formatted extract history is no longer supported, and
  the conversion functions extract_tbl_to_list() and extract_list_to_tbl()
  have therefore been deprecated as well.

  Use get_extract_history() to browse previous extract definitions in list
  format.
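The field renaming described in the second migration option above can be sketched in base R. This is a minimal illustration, not part of ipumsr: the helper names snake_to_camel() and rename_fields() and the contents of old_def are hypothetical, and the sketch assumes the JSON file has already been parsed into a nested list with a JSON package of your choice. Only field names are changed; values such as "fixed_width" are left alone.

```r
# Convert snake_case names to camelCase (hypothetical helper, base R only)
snake_to_camel <- function(x) {
  gsub("_([a-z])", "\\U\\1", x, perl = TRUE)
}

# Recursively rename the fields of a nested list, leaving values untouched
rename_fields <- function(x) {
  if (!is.list(x)) {
    return(x)
  }
  x <- lapply(x, rename_fields)
  if (!is.null(names(x))) {
    new_names <- snake_to_camel(names(x))
    # "api_version" is the one field that is renamed outright
    new_names[names(x) == "api_version"] <- "version"
    names(x) <- new_names
  }
  x
}

# Hypothetical old-style definition, already parsed from JSON into a list
old_def <- list(
  description = "Example extract",
  data_format = "fixed_width",
  api_version = "v1"
)

new_def <- rename_fields(old_def)
new_def$version <- 2  # the version field must be set equal to 2

names(new_def)
```

Note that this only renames lowercase snake_case keys, so any key that legitimately contains an underscore followed by a lowercase letter would need to be handled separately before writing the file back out.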
IPUMS Readers
- read_nhgis_sf() and read_nhgis_sp() have been deprecated. Use
  read_ipums_sf() and read_nhgis() to load spatial and tabular data,
  respectively. Join data with an ipums_shape_*_join() function.

- The data_layer and shape_layer arguments have been deprecated in favor of
  file_select throughout ipumsr. This provides clarity on the intended purpose
  of this argument. Deprecated functions that use the original argument names
  remain unchanged.

- Support for objects from the sp package has been deprecated because of the
  upcoming retirement of rgdal. Use read_ipums_sf() to load spatial data as an
  sf object. To convert to a Spatial* object, use sf::as_Spatial().

  For more, see r-spatial's post covering the evolution of several spatial
  packages.

- read_ipums_sf() no longer defaults to bind_multiple = TRUE.

- Individual ipums_list_*() functions have been moved to ipums_list_files().

- Example files in ipums_example() have been updated and include new file
  names.
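As a rough sketch of the replacement workflow for read_nhgis_sf(), the function below loads tabular and spatial data separately and then joins them. The wrapper name load_nhgis_with_shapes() is hypothetical, the file paths are supplied by the caller, and the join key "GISJOIN" is an assumption about the contents of the extract. If either archive contains more than one file, a file_select argument would likely be needed as well.

```r
# Sketch only: not part of ipumsr. Paths come from the caller, and the
# default join key ("GISJOIN") is an assumption about the extract.
load_nhgis_with_shapes <- function(csv_zip, shape_zip, key = "GISJOIN") {
  # Tabular data, previously handled by read_nhgis_sf()
  tabular <- ipumsr::read_nhgis(csv_zip, verbose = FALSE)

  # Spatial data, loaded as an sf object
  shapes <- ipumsr::read_ipums_sf(shape_zip)

  # Keep every tabular record and attach geometry where the key matches
  ipumsr::ipums_shape_left_join(tabular, shapes, by = key)
}

# Usage with hypothetical file names:
# nhgis_sf <- load_nhgis_with_shapes("nhgis0001_csv.zip", "nhgis0001_shape.zip")
```

A left join is used here so that no tabular records are dropped; the other ipums_shape_*_join() variants may suit other cases better.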
IPUMS Terra
- Support for IPUMS Terra has been discontinued. This includes the deprecation
  of all read_terra_*() functions, the types = "raster" option in
  ipums_list_files(), and read_ipums_codebook().

  NHGIS codebook reading was previously supported by read_ipums_codebook().
  This functionality has been moved to read_nhgis_codebook().

  For more on the IPUMS Terra decommissioning, see the decommissioning notice
  from IPUMS.
New Features
IPUMS API
- Adds API support for IPUMS NHGIS and IPUMS International!

  - Use define_extract_nhgis() to create an NHGIS extract definition.
  - Use define_extract_ipumsi() to create an IPUMS International extract
    definition.

- Adds support for IPUMS API version 2 features! This includes:

  - Detailed variable specifications for IPUMS microdata extract
    definitions, including case selections, attached characteristics,
    and data quality flags. Use var_spec() to add these specifications
    to variables in your extract definition.
  - Additional definition-wide parameters for IPUMS microdata extracts,
    including data_quality_flags and case_select_who.
  - Hierarchical data structure for IPUMS microdata extracts. Set
    data_structure = "hierarchical" to create a hierarchical extract
    definition.
  - Year selection for time series tables in IPUMS NHGIS extract definitions.
    Use tst_spec() to add year selections for time series tables.

- Adds API support for IPUMS NHGIS metadata.

  - Use get_metadata_nhgis() to browse NHGIS data sources. Metadata
    is available in summary form for all datasets, data tables, time series
    tables, and shapefiles, as well as for individual datasets, data tables,
    and time series tables.

- Allows users to set a default IPUMS collection using
  set_ipums_default_collection(). Users with a default collection do not
  need to specify the IPUMS collection in functions that require it; instead,
  the default collection is used. This is a convenience for users who rely
  primarily on a single IPUMS collection.

- wait_for_extract() wait intervals no longer double after each status
  check. Instead, intervals increase by 10 seconds with each subsequent check.
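The new definition features listed above might be combined roughly as follows. This is a sketch rather than a recommended extract: the sample ("us2017b"), variable names and codes (AGE, RACE, "1", "2"), and time series table code ("CW3") are illustrative assumptions, and the layout and format values are written out explicitly instead of relying on defaults. Building a definition happens locally; submitting it would additionally require an API key.

```r
library(ipumsr)

# Microdata definition using var_spec() and a hierarchical structure.
# Sample and variable codes are illustrative assumptions.
usa_extract <- define_extract_usa(
  description = "Hierarchical extract with detailed variable specifications",
  samples = "us2017b",
  variables = list(
    var_spec("AGE", data_quality_flags = TRUE),
    var_spec(
      "RACE",
      case_selections = c("1", "2"),
      case_selection_type = "general"
    )
  ),
  data_structure = "hierarchical"
)

# Variables are now stored as a list; names() recovers a character vector
names(usa_extract$variables)

# NHGIS definition with a year selection on a time series table.
# The table code and years are illustrative assumptions.
nhgis_extract <- define_extract_nhgis(
  description = "Time series table limited to selected years",
  time_series_tables = tst_spec(
    "CW3",
    geog_levels = "state",
    years = c("1990", "2000")
  ),
  tst_layout = "time_by_column_layout",
  data_format = "csv_no_header"
)
```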
IPUMS Readers
- Adds handling for fixed-width NHGIS extracts in read_nhgis().

- read_nhgis_codebook() allows reading of raw codebook lines (as opposed to
  extracting codebook information into an ipums_ddi object) by setting
  raw = TRUE. Furthermore, var_info generated from NHGIS codebook files has
  been updated to include more contextual information about the data
  variables.

- read_nhgis() now supports additional arguments to refine the data loading
  process. Users can now specify col_types manually and read a subset of
  a data file using vars and n_max.

- read_nhgis() now allows users to retain the extra header row included in
  some NHGIS files. Set remove_extra_header = FALSE to do so. In general,
  the information contained in the extra header is attached to the data from
  the NHGIS codebook file, but in some cases the extra header may differ
  slightly from the information found in the codebook.
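The new reader arguments could be used along these lines. The wrapper functions below are hypothetical and exist only to keep the sketch self-contained; the column names "GISJOIN" and "STATE" are assumptions about what a given NHGIS file contains, and the file path is supplied by the caller.

```r
# Sketch only: these wrappers are not part of ipumsr.

# Read a small preview: a subset of columns and a limited number of rows
preview_nhgis <- function(csv_zip, n_rows = 100) {
  ipumsr::read_nhgis(
    csv_zip,
    vars = c("GISJOIN", "STATE"),  # assumed column names
    n_max = n_rows,
    verbose = FALSE
  )
}

# Read the file while keeping the extra header row, where one is present
read_nhgis_with_header <- function(csv_zip) {
  ipumsr::read_nhgis(csv_zip, remove_extra_header = FALSE, verbose = FALSE)
}

# Return the codebook as raw lines instead of an ipums_ddi object
raw_codebook_lines <- function(csv_zip) {
  ipumsr::read_nhgis_codebook(csv_zip, raw = TRUE)
}

# Usage with a hypothetical file name:
# head(raw_codebook_lines("nhgis0001_csv.zip"))
```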
Miscellaneous
- Various bug fixes

- Updates to documentation and vignettes for clarity
Thanks to @dtburk and @renae-r for their work on these updates!