ipumsr 0.6.0

@robe2037 released this 21 Jul 19:27

ipumsr 0.6.0 introduces significant updates to the client tools for the IPUMS API and revamps several IPUMS readers, particularly for IPUMS NHGIS.

Breaking Changes + Deprecations

IPUMS API

  • ipumsr now supports IPUMS API version 2, and no longer supports
    either the beta version or version 1 of the IPUMS API.

    This means that extract definitions saved in JSON format under previous
    API versions can no longer be loaded into ipumsr with
    define_extract_from_json(). To load such definitions, there are two
    options:

    • Rewrite the extract definition represented in the JSON file using the
      define_extract_*() function for the relevant IPUMS collection, then update
      the saved file with save_extract_to_json().

    • Update the JSON file by converting all snake_case fields to camelCase.
      For instance, "data_format" would become "dataFormat".
      The "api_version" field will also need to be changed to "version" and
      set equal to 2.

    See the IPUMS developer documentation for more details on API versioning
    and breaking changes introduced in version 2.

  • The ipums_extract object structure has been updated. For IPUMS microdata
    projects, variables and samples are no longer stored as character vectors,
    but as lists. This accommodates new API version 2 features (see below). Use
    names(x$variables) instead of x$variables to access variable (and sample)
    names as a character vector.

  • get_recent_extracts_info_*() functions have been deprecated. Additionally,
    tabular-formatted extract history is no longer supported, and conversion
    functions extract_tbl_to_list() and extract_list_to_tbl() have
    therefore been deprecated as well.

    Use get_extract_history() to browse previous extract definitions in list
    format.
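
The second JSON option above (renaming snake_case fields and replacing "api_version") can be sketched in base R. This is a minimal illustration and not part of ipumsr: the helpers snake_to_camel(), rename_fields(), and upgrade_definition() are hypothetical names, the example definition is made up, and a real file may need further changes described in the IPUMS developer documentation.

```r
# Convert a snake_case name to camelCase, e.g. "data_format" -> "dataFormat".
snake_to_camel <- function(x) {
  gsub("_([a-z])", "\\U\\1", x, perl = TRUE)
}

# Recursively rename fields in a parsed JSON definition (a nested list).
# Only all-lowercase snake_case names are touched, so identifiers such as
# "1990_STF1" or "us2017b" are left as they are.
rename_fields <- function(x) {
  if (!is.list(x)) {
    return(x)
  }
  if (!is.null(names(x))) {
    is_snake <- grepl("^[a-z]+(_[a-z]+)+$", names(x))
    names(x)[is_snake] <- snake_to_camel(names(x)[is_snake])
  }
  x[] <- lapply(x, rename_fields)
  x
}

# Rename all fields, then replace the old version field with "version" = 2.
upgrade_definition <- function(def) {
  def <- rename_fields(def)
  def$apiVersion <- NULL # "api_version" was renamed to "apiVersion" above
  def$version <- 2
  def
}

# A made-up definition in the older snake_case layout
old_def <- list(
  description = "Example extract",
  data_format = "fixed_width",
  data_structure = list(rectangular = list(on = "P")),
  samples = list(us2017b = list()),
  variables = list(YEAR = list()),
  collection = "usa",
  api_version = "v1"
)

new_def <- upgrade_definition(old_def)
names(new_def)
```

To apply this to a file, one could parse it with jsonlite::read_json() and write the result back with jsonlite::write_json(); whether define_extract_from_json() then accepts the file should be checked for each definition.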

IPUMS Readers

  • read_nhgis_sf() and read_nhgis_sp() have been deprecated. Use
    read_ipums_sf() and read_nhgis() to load spatial and tabular
    data, respectively. Join data with an ipums_shape_*_join() function.

  • data_layer and shape_layer arguments have been deprecated in favor of
    file_select throughout ipumsr. This provides clarity on the intended purpose
    of this argument. Deprecated functions that use the original argument names
    remain unchanged.

  • Support for objects from the sp package has been deprecated because of the
    upcoming retirement of rgdal. Use read_ipums_sf() to load spatial data
    as an sf object. To convert to a Spatial* object, use sf::as_Spatial().

    For more, see r-spatial's post covering the evolution of several spatial packages.

  • read_ipums_sf() no longer defaults to bind_multiple = TRUE.

  • Individual ipums_list_*() functions have been consolidated into ipums_list_files().

  • Example files in ipums_example() have been updated and include new file
    names.
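
The replacement workflow for the deprecated NHGIS spatial readers might look like the sketch below. It assumes the sf package is installed and that the example file names "nhgis0972_csv.zip" and "nhgis0972_shape_small.zip" are among the updated files returned by ipums_example(); the join column "GISJOIN" is likewise an assumption about those files.

```r
library(ipumsr)

# Example extract files (file names assumed to ship with ipumsr 0.6.0)
csv_file <- ipums_example("nhgis0972_csv.zip")
shape_file <- ipums_example("nhgis0972_shape_small.zip")

# List the files contained in an extract
ipums_list_files(csv_file)

# Load tabular and spatial data separately
nhgis_data <- read_nhgis(csv_file, verbose = FALSE)
nhgis_shape <- read_ipums_sf(shape_file)

# Join the tabular data to the sf object
joined <- ipums_shape_inner_join(nhgis_data, nhgis_shape, by = "GISJOIN")

# Convert to a Spatial* object only if sp objects are still needed
if (requireNamespace("sp", quietly = TRUE)) {
  shape_sp <- sf::as_Spatial(nhgis_shape)
}
```

Because read_ipums_sf() no longer binds multiple files by default, extracts containing several shapefiles may need bind_multiple = TRUE or a file_select value.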

IPUMS Terra

  • Support for IPUMS Terra has been discontinued. This includes deprecation
    of all read_terra_*() functions, the types = "raster" option in
    ipums_list_files(), and read_ipums_codebook().

    NHGIS codebook reading was previously supported by read_ipums_codebook().
    This functionality has been moved to read_nhgis_codebook().

    For more on IPUMS Terra decommissioning, click here.

New Features

IPUMS API

  • Adds API support for IPUMS NHGIS and IPUMS International!

    • Use define_extract_nhgis() to create an NHGIS extract definition.
    • Use define_extract_ipumsi() to create an IPUMS International extract
      definition.
  • Adds support for IPUMS API version 2 features! This includes:

    • Detailed variable specifications for IPUMS microdata extract
      definitions, including case selections, attached characteristics,
      and data quality flags. Use var_spec() to add these specifications
      to variables in your extract definition.
    • Additional definition-wide parameters for IPUMS microdata extracts,
      including data_quality_flags and case_select_who.
    • Hierarchical extracts for IPUMS microdata extracts. Set
      data_structure = "hierarchical" to create a hierarchical extract definition.
    • Year selection for time series tables in IPUMS NHGIS extract definitions.
      Use tst_spec() to add year selections for time series tables.
  • Adds API support for IPUMS NHGIS metadata:

    • Use get_metadata_nhgis() to browse NHGIS data sources. Metadata
      is available in summary form for all datasets, data tables, time series
      tables, and shapefiles as well as for individual datasets, data tables, and
      time series tables.
  • Allows users to set a default IPUMS collection using
    set_ipums_default_collection(). Users with a default collection do not
    need to specify the IPUMS collection in functions that require it; instead,
    the default collection is used. This is a convenience for users who rely
    primarily on a single IPUMS collection.

  • wait_for_extract() wait intervals no longer double after each status
    check. Instead, intervals increase by 10 seconds with each subsequent check.
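
As a hedged illustration of the version 2 features listed above, the sketch below builds two extract definitions locally. The sample, variable, and table names ("us2017b", "AGE", "SEX", "RACE", "CW3") and the specific selections are illustrative assumptions, not recommendations; nothing here contacts the API, so no API key is needed until a definition is submitted.

```r
library(ipumsr)

# Hierarchical microdata extract with detailed variable specifications
usa_extract <- define_extract_usa(
  description = "Hierarchical extract with detailed variable specifications",
  samples = "us2017b",
  variables = list(
    var_spec("AGE", data_quality_flags = TRUE),
    var_spec("SEX", case_selections = "2"),
    var_spec("RACE", attached_characteristics = c("mother", "father"))
  ),
  data_structure = "hierarchical",
  case_select_who = "households"
)

# Variables and samples are stored as lists, so use names() to get
# their names as a character vector
names(usa_extract$variables)
names(usa_extract$samples)

# NHGIS extract with a year selection for a time series table
nhgis_extract <- define_extract_nhgis(
  description = "Time series table with a year selection",
  time_series_tables = list(
    tst_spec("CW3", geog_levels = "state", years = c("1990", "2000"))
  )
)
```

Wrapping the specifications in list() keeps the call valid whether one or several variables or tables are requested.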

IPUMS Readers

  • Adds handling for fixed-width NHGIS extracts in read_nhgis().

  • read_nhgis_codebook() can now return raw codebook lines (as opposed to
    extracting codebook information into an ipums_ddi object) when
    raw = TRUE. Furthermore, var_info generated from NHGIS codebook files
    has been updated to include more contextual information about the data
    variables.

  • read_nhgis() now supports additional arguments to refine the data loading
    process. Users can now specify col_types manually and read a subset of
    a data file using vars and n_max.

  • read_nhgis() now allows users to retain the extra header row included in
    some NHGIS files. Set remove_extra_header = FALSE to do so. In general,
    the information contained in the extra header is attached to the data from
    the NHGIS codebook file, but in some cases the extra header may differ
    slightly from the information found in the codebook.
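
The new reader arguments described above could be used roughly as follows. This is a sketch that assumes the example file "nhgis0972_csv.zip" is available through ipums_example() and that it contains columns named GISJOIN and YEAR; adjust both for a real extract.

```r
library(ipumsr)

# Example NHGIS extract (file name assumed to ship with ipumsr 0.6.0)
nhgis_file <- ipums_example("nhgis0972_csv.zip")

# Read only selected columns and the first few rows
subset_data <- read_nhgis(
  nhgis_file,
  vars = c(GISJOIN, YEAR),
  n_max = 10,
  verbose = FALSE
)

# Keep the extra header row, if the file has one, instead of dropping it
with_header <- read_nhgis(
  nhgis_file,
  remove_extra_header = FALSE,
  verbose = FALSE
)

# Codebook as an ipums_ddi object, and as raw text lines
codebook <- read_nhgis_codebook(nhgis_file)
raw_lines <- read_nhgis_codebook(nhgis_file, raw = TRUE)
head(raw_lines)
```

Reading a small subset first with vars and n_max is a cheap way to check column names and types before loading a large file with a manual col_types specification.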

Miscellaneous

  • Various bug fixes

  • Updates to documentation and vignettes for clarity

Thanks to @dtburk and @renae-r for their work on these updates!