dSpaceX Configuration

Preparing datasets

The first step in using dSpaceX for exploration is to preprocess datasets for loading.

Preprocessing

Preprocessing script usage is described in Data Preparation.

Morse-Smale Decomposition

At this point the dataset can be loaded into the client and various partitionings can be explored.

The data must be decomposed sensibly before models are generated. Models learned from poorly partitioned data inherit that poor structure, following the "garbage in, garbage out" rule of machine learning.

To explore Morse-Smale decompositions, load the preliminary preprocessed model in the dSpaceX GUI, select a dataset and field, and open the Partitioning menu as shown here. The options include the number of nearest neighbors in the dataset to consider when evaluating proximity. Adjust the parameters and click Recompute to explore the results, which are shown below as a graph of selectable persistence levels. Each persistence level consists of a number of "crystals," shown on the right side of the window. Selecting a crystal shows the samples it contains in a drawer below.
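To make the nearest-neighbors option concrete, the following is a minimal sketch of selecting the k closest samples from a precomputed distance matrix. It is not the dSpaceX implementation; the function name `k_nearest_neighbors` and the toy data are hypothetical and only illustrate what the parameter controls.

```python
def k_nearest_neighbors(distances, k):
    """Return, for each sample, the indices of its k closest other samples.

    `distances` is a square list of lists where distances[i][j] is the
    distance between samples i and j (hypothetical in-memory layout).
    """
    n = len(distances)
    if not 0 < k < n:
        raise ValueError("k must be between 1 and n - 1")
    neighbors = []
    for i, row in enumerate(distances):
        # Sort all other samples by distance to sample i (ties broken by index).
        others = sorted((j for j in range(n) if j != i), key=lambda j: (row[j], j))
        neighbors.append(others[:k])
    return neighbors


# Four samples on a line at positions 0, 1, 3, and 7.
positions = [0.0, 1.0, 3.0, 7.0]
distances = [[abs(a - b) for b in positions] for a in positions]
print(k_nearest_neighbors(distances, k=2))  # [[1, 2], [0, 2], [1, 0], [2, 1]]
```

A larger k connects each sample to more of its surroundings, which generally changes how the resulting partitions merge across persistence levels.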

Once you have found the desired partitioning, click Export to write the partitions file used for learning models.

Model learning

Once a dataset has been partitioned, models can be learned using the data associated with each partition. These models can then be interpolated to produce new elements of a dataset. Examples of scripts used to learn models are described in Data Preparation.

Model interpolation

After a model has been generated, it can map new latent space coordinates to new data members. This can be performed dynamically by the server, or the data can be pre-generated by the processing scripts. Interpolation of models such as PCA or ShapeOdds has already been incorporated into the server. Other models can be interpolated using external Python scripts callable by the server, and still others can simply be interpolated offline and the results loaded for exploration as shown in the following image.

Dynamic use of Python scripts by the server is now supported. Please see External Python Model Modules for details. Custom module names can be specified in the config.yaml as described below.

Offline model interpolation is desired. See GitHub issue #187.
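As an illustration of generating a new data member from latent coordinates, here is a minimal sketch of linear PCA reconstruction, x = mean + sum of z_i * w_i. It is not the server's code; the function names `pca_reconstruct` and `interpolate_latent` and the toy model values are assumptions made for this example.

```python
def pca_reconstruct(mean, components, latent):
    """Map latent coordinates back to data space: x = mean + sum_i z_i * w_i."""
    if len(latent) != len(components):
        raise ValueError("one latent coordinate is needed per component")
    sample = list(mean)
    for z, component in zip(latent, components):
        for d, w in enumerate(component):
            sample[d] += z * w
    return sample


def interpolate_latent(z_start, z_end, t):
    """Linearly blend two latent points; t=0 gives z_start, t=1 gives z_end."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)]


# Hypothetical model of 3-dimensional data with two principal directions.
mean = [1.0, 2.0, 3.0]
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
z_mid = interpolate_latent([0.0, 0.0], [2.0, 4.0], t=0.5)
print(pca_reconstruct(mean, components, z_mid))  # [2.0, 4.0, 3.0]
```

Other model types replace the reconstruction step with their own mapping from latent space to data space, which is where an external evaluator would come in.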

Configuring Datasets

The dSpaceX server reads datasets consisting of images (samples), design parameters (parameters), and quantities of interest (QoIs). These must be organized into a single directory with a config.yaml that specifies the name of the dataset, its number of samples, and the names, locations, and formats of its images, parameters, QoIs, distance matrices, embeddings (e.g., a tsne layout), and probabilistic models. The currently supported formats are csv (comma-separated values), json (JavaScript Object Notation), and yaml ("YAML Ain't Markup Language"), plus png for images. Here is an example of the yaml configuration:

name: CantileverBeam

samples:
  count: 1000

parameters:
  format: csv
  file: CantileverBeam_design_parameters.csv

qois:
  format: csv
  file: CantileverBeam_QoIs.csv

thumbnails:
  format: png
  files: images/?.png
  offset: 1                     # base-1 image names (0th name is 1; if offset by 1000, names would start at 1000)
  padZeroes: false              # padded image names (min chars needed must represent offset + num_files)
  channels: 3                    # num channels in each shape (e.g., 1-greyscale, 3-RGB, 4-RGBA)

distances:
  format: csv
  file: CantileverBeam_distance_matrix.csv
  metric: euclidean

embeddings:
  - name: tsne
    format: csv
    file: CantileverBeam_tsne_layout.csv
  - name: ShapeOdds
    format: csv
    file: shapeodds_global_embedding.csv
  - name: Shared GP
    format: csv
    file: shared_gp_global_embedding.csv

models:
  - fieldname: Max Stress
    type: shapeodds                                            # shapeodds, pca, sharedgp, etc
    root: shapeodds_models_maxStress                           # directory of models for this field
    persistences: persistence-?                                # persistence files
    crystals: crystal-?                                        # in each persistence dir are its crystals
    padZeroes: false                                           # for both persistence and crystal dirs/files
    partitions: CantileverBeam_CrystalPartitions_maxStress.csv # has 20 lines of varying length and 20 persistence levels
    first_partition: 0                                         # if depth != -1 && num_persistences > 20, this is the first directory #
    mesh: false                                                 # This is a mesh model generating corresponding sets of points (each set of points has the same triangle associations)
    rotate: false                                            # the shape produced by this model needs to be rotated 90 degrees clockwise to match samples (old ShapeWorks models need this)
    ms:                                                        # Morse-Smale parameters used to compute partitions
      knn: 15                                                  # k-nearest neighbors
      sigma: 0.25                                              # 
      smooth: 15.0                                             # 
      depth: 20                                                # num persistence levels; -1 means compute them all
      noise: true                                              # add mild noise to the field to ensure inequality
      curvepoints: 50                                          # vis only? Not sure if this matters for crystal partitions 
      normalize: false                                         # vis only? Not sure if this matters for crystal partitions
    interpolations:                                            # precomputed interps
       - i1:
         params:                                               # model interpolation parameters used
           sigma: 0.15                                         # Gaussian width
           num_interps: 50                                     # precomputed interps per crystal
       - i2:
         params:
           sigma: 0.01
           num_interps: 500

  - fieldname: Max Stress
    type: pca
    root: ms_partitions/test_max_stress_pca_model
    persistences: persistence-?
    crystals: crystal-?
    padZeroes: false
    partitions: ms_partitions.csv
    first_partition: 6
    rowmajor: true
    ms:
      knn: 15
      sigma: 0.25
      smooth: 15
      depth: 20
      noise: true
      curvepoints: 50
      normalize: true

  - fieldname: Angle
    type: pca
    root: pca_models/pca_model_param_Angle
    persistences: persistence-?
    crystals: crystal-?
    padZeroes: false
    partitions: crystal_partitions/cantilever_crystal_partitions_Angle.csv
    rowmajor: true
    ms:
      knn: 15
      sigma: 0.25
      smooth: 15.0
      depth: -1
      noise: true
      curvepoints: 50
      normalize: false
    interpolations:
       - i1:
         params:
           sigma: 0.15
           num_interps: 50
       - i2:
         params:
           sigma: 0.01
           num_interps: 500

  - fieldname: avg_field
    type: pca
    mesh: true
    python_evaluator: None
    python_renderer: data.thumbnails          # module must have MeshRenderer class
    root: ms_partitions/avg_field_pca_model
    persistences: persistence-?
    crystals: crystal-?
    padZeroes: false
    partitions: ms_partitions.csv
    first_partition: 21
    ms:
      knn: 15
      sigma: 0.25
      smooth: 15
      depth: 20
      noise: true
      curvepoints: 50
      normalize: true

  - fieldname: Angle
    type: custom                                        # a new model type (dynamic interpolation requires external evaluators and renderers; precomputed interpolations can be provided instead or in addition)
    root: custom_models/custom_model_param_Angle
    persistences: persistence-?
    crystals: crystal-?
    padZeroes: false
    partitions: crystal_partitions/cantilever_crystal_partitions_Angle.csv
    ms:
      knn: 15
      sigma: 0.25
      smooth: 15.0
      depth: -1
      noise: true
      curvepoints: 50
      normalize: false
    interpolations:
       - i1:
         params:
           sigma: 0.15
           num_interps: 50
       - i2:
         params:
           sigma: 0.01
           num_interps: 500
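The `files`, `offset`, and `padZeroes` keys in the thumbnails entry above describe how image names are generated from a `?` pattern. The following minimal sketch shows one reading of those comments. It is not the server's loader; the function `expand_file_pattern` is hypothetical, and the padding width (the number of digits in offset + count) is an assumption based on the comment next to `padZeroes`.

```python
def expand_file_pattern(pattern, count, offset=0, pad_zeroes=False):
    """Expand a pattern such as 'images/?.png' into `count` file names.

    The i-th name (starting at i = 0) uses the number offset + i.
    Assumption: when padding is on, numbers are zero-padded to the
    number of digits in offset + count.
    """
    if pattern.count("?") != 1:
        raise ValueError("pattern must contain exactly one '?' placeholder")
    width = len(str(offset + count)) if pad_zeroes else 0
    return [pattern.replace("?", str(offset + i).zfill(width)) for i in range(count)]


# Values taken from the thumbnails entry of the example configuration.
names = expand_file_pattern("images/?.png", count=1000, offset=1)
print(names[0], names[-1])  # images/1.png images/1000.png

# The same pattern with zero padding enabled.
padded = expand_file_pattern("images/?.png", count=1000, offset=1, pad_zeroes=True)
print(padded[0])  # images/0001.png
```

Checking the expanded names against the files actually present in the dataset directory is a quick way to catch a wrong `offset` or `padZeroes` setting before starting the server.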

Starting the server

See Running the Server for instructions on starting the server.

Connecting the client

See Running the Client for details on starting the web client.

Exploration

It's time to explore. See Using dSpaceX for guidance using the application.