VERPAT output structure violates VisionEval architecture #142

Open
jrawbits opened this issue Apr 5, 2021 · 3 comments


@jrawbits
Collaborator

jrawbits commented Apr 5, 2021

The data-handling functions in VisionEval used for query and extraction expect the Datastore Group/Table/Name structure to have all the "Name" vectors (Datasets) the same length (corresponding to the Table's key field, such as HhId or VehId).

The VERPAT model, however, does not respect that requirement and puts a number of differently-sized vectors into the same tables. That leads to two failures when we try to read through the model results in "tabular" or "data.frame" structure.

  1. The Vehicles table in the future year contains two series with different lengths (one keyed on HhId and VehId, and one keyed on HhIdFuture and VehIdFuture). Consequently, we cannot query or extract that table in a single operation. Instead, the "Table" needs to be split in two depending on the length of the datasets (the "extract" tool function in VE 3.0 has been adjusted to handle that).
  2. The Global Model table generally has single values (derived from model_parameters.json), except that CostsPolicy and CostsIdPolicy are 5-element vectors (not single values) that are generated in the CalculatePolicyVmt module. When the Model table is extracted, the single values are recycled 5 times. That's harmless, but misleading.
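
Both failure modes can be reproduced in base R without VisionEval. The sketch below uses made-up toy vectors (the name SingleValue and all values are hypothetical, chosen only for illustration); it shows how R behaves when differently sized vectors are forced into one data frame, not how the VisionEval functions are implemented.

```r
# Toy stand-in for the Model table: one single value plus a 5-element vector.
# "SingleValue" and the numbers are hypothetical, for illustration only.
model_tbl <- list(
  SingleValue = 16,
  CostsPolicy = c(0.1, 0.2, 0.3, 0.4, 0.5)
)

# Length-1 vectors are silently recycled to match the longest column.
model_df <- as.data.frame(model_tbl)
nrow(model_df)

# Toy stand-in for the Vehicle table: two series keyed on different ids.
vehicle_tbl <- list(
  HhId = c("H1", "H1", "H2"),
  HhIdFuture = c("H1", "H2")
)

# Lengths 3 and 2 cannot be recycled, so building the data frame fails.
vehicle_result <- tryCatch(
  as.data.frame(vehicle_tbl),
  error = function(e) e
)
inherits(vehicle_result, "error")
```

The first case yields a five-row data frame in which SingleValue is repeated on every row, which matches the "harmless, but misleading" recycling described above. The second case raises an error because the row counts differ.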

The correct solution is to ensure that within VERPAT and its model-specific modules, each Dataset is written into a Table whose Datasets all have the same number of rows, corresponding to the Table's key field(s).

Two fixes are required in the module code:

  1. The Vehicle...Future datasets should be in their own Table, and
  2. The CostsPolicy vector and its key (CostsIdPolicy) should be in their own table (not in "Model"). There is precedent for creating new tables in Global.
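
Until the module code is changed, the idea of splitting a table by dataset length can be sketched in base R as follows. This is an illustration only, not the VisionEval extract implementation; split_by_length is a hypothetical helper and the toy values are made up.

```r
# Hypothetical helper: group the datasets of one table by their length and
# turn each group into its own data frame.
split_by_length <- function(datasets) {
  groups <- split(datasets, lengths(datasets))
  lapply(groups, as.data.frame)
}

# Toy Vehicle table mixing two series of different lengths.
vehicle_tbl <- list(
  HhId = c("H1", "H1", "H2"),
  VehId = c("V1", "V2", "V3"),
  HhIdFuture = c("H1", "H2"),
  VehIdFuture = c("V1", "V2")
)

parts <- split_by_length(vehicle_tbl)
names(parts)        # one entry per distinct dataset length
lapply(parts, dim)  # rows and columns of each resulting table
```

A table that respects the architecture produces exactly one group, so length(parts) > 1 is a quick way to flag a table that mixes row counts.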
@m-mcqueen

Looking forward to seeing this issue resolved, as I'm planning to use VERPAT soon. Just did a run-through with the default data and encountered this error. What is a workaround for the time being? Thanks!

@dflynn-volpe
Collaborator

There are some workarounds. First of all, you can use the extract() functionality for Azones (counties), Bzones (place types), Households, and the Marea (metropolitan area). Then you can manually query the Global and Vehicles groups as needed.

Let's run the default VERPAT model:

rpat <- openModel('VERPAT')
rpat$run() 

# Select Azone, Bzone, Household, and Marea geographies.
rpat$tablesSelected <- c('Azone', 'Bzone', 'Household', 'Marea')

# Extract all outputs to csv files
rpat$extract()

To get the Vehicles outputs, you can use readDatastoreTables to produce just the outputs that you need. The extract() functionality above is convenient because it writes all possible outputs to csv (or returns them as internal R objects), but readDatastoreTables may actually be more useful to you because it extracts only the outputs of interest. However, you do have to define which outputs you want.

First, you can query the entire datastore to find out all available variables for outputting. Assuming you have just run the demo VERPAT model, do the following (thanks to @gregorbj for building this functionality and sharing example code!):

setwd('models/VERPAT')

QPrep_ls <- prepareForDatastoreQuery(
  DstoreLocs_ = "Datastore", 
  DstoreType = "RD")

Then, make an inventory of the datastore.

This creates a zip archive which documents all the datasets in the datastore.
The archive is organized by group. Within each group folder is a set of CSV files, one for each table in the group. Each CSV file lists the datasets included in the table giving the dataset name, data type, units, and description.

documentDatastoreTables(
  SaveArchiveName = "DatastoreDocumentation", 
  QueryPrep_ls = QPrep_ls)

Now look at the outputs in the documentation:

image

The Vehicle.csv file will tell you what variables are available. We can choose variables from this to extract.

First, write a list named TablesRequest_ls:

The named components are tables; each component is a named vector where the names are the names of datasets and the values are the units in which the data is to be retrieved. An empty string ("") means retrieve the data in the units used in the datastore.

TablesRequest_ls <- list(
  Vehicle = c(
    Azone = "",
    HhId = "",
    Mileage = "",
    Dvmt = "",
    Powertrain = ""))
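
If the list of datasets gets long, the same request can be built from a character vector of names instead of being typed out by hand. The following is a minimal base-R sketch; make_request is a hypothetical helper (not part of VisionEval), and it assumes every dataset should be retrieved in its stored units.

```r
# Hypothetical helper: build a one-table request in the shape shown above,
# i.e. a list whose single component is a named character vector of units.
make_request <- function(table, datasets, units = "") {
  request <- list(setNames(rep(units, length.out = length(datasets)), datasets))
  names(request) <- table
  request
}

# Dataset names chosen from Vehicle.csv in the documentation archive.
wanted <- c("Azone", "HhId", "Mileage", "Dvmt", "Powertrain")

TablesRequest_ls <- make_request("Vehicle", wanted)
str(TablesRequest_ls)
```

The result has the same structure as the hand-written list, so it can be used in its place.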

Then call the readDatastoreTables function using the list of requested tables and datasets:

TableResults_ls <-
  readDatastoreTables(
    Tables_ls = TablesRequest_ls,
    Group = "2035",
    QueryPrep_ls = QPrep_ls
  )

The readDatastoreTables function returns a list having two named components: "Data" and "Missing".
The "Data" component is a named list where each named component corresponds to a requested table and the value is a data frame containing the requested datasets in the table.

lapply(TableResults_ls$Data, function(x) head(x))

The first six out of 599,198 rows:

image
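
To end up with csv files similar to what extract() produces, each data frame in the "Data" component can be written out with base R. This is a sketch under assumptions: write_tables is a hypothetical helper, and the toy list below only mimics the shape of the "Data" component described above with made-up values.

```r
# Hypothetical helper: write every data frame in a named list to
# <out_dir>/<table name>.csv and return the file paths.
write_tables <- function(data_ls, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(out_dir, paste0(names(data_ls), ".csv"))
  for (i in seq_along(data_ls)) {
    write.csv(data_ls[[i]], paths[i], row.names = FALSE)
  }
  invisible(paths)
}

# Toy stand-in for TableResults_ls (values are made up).
toy_results <- list(
  Data = list(
    Vehicle = data.frame(HhId = c("H1", "H2"), Dvmt = c(10.5, 3.2))
  )
)

out_dir <- file.path(tempdir(), "verpat_tables")
paths <- write_tables(toy_results$Data, out_dir)
basename(paths)
```

With real results, pass TableResults_ls$Data instead of the toy list and choose an output directory inside the model folder.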

@jrawbits
Collaborator Author

Note that the VE 3.0 version of extract for model results will successfully create two Vehicle tables for VERPAT (based on the different number of rows in the alternate futures). So currently, everything "works" for extraction. I expect the integrated query system will fail for anything that might get pulled out of both "halves" of the vehicle table.

The deeper fix of restructuring VERPAT's vehicle outputs is very much still on the table.
