
metatron-app/discovery-prep-tool

discovery-prep-tool

A command line program for data preparation. Useful for scripting, scheduling, etc.

Table of APIs

| No | Action | URL | Method | Content-Type |
|----|--------|-----|--------|--------------|
| 1 | Get auth token | /oauth/token | POST | application/json |
| 2 | Get upload policy | /api/preparationdatasets/file_upload | GET | application/json |
| 3 | Upload file chunks | /api/preparationdatasets/file_upload | POST | multipart/form-data |
| 4 | Search data connection | /api/connections/filter | POST | application/json |
| 5 | Create imported dataset | /api/preparationdatasets | POST | application/json |
| 6 | Search dataset | /api/preparationdatasets/search | GET | application/json |
| 7 | Get dataset details | /api/preparationdatasets/{dsId} | GET | application/json |
| 8 | Get upstream map | /api/preparationdataflows/{dfId}/upstreammap | GET | application/json |
| 9 | Swap upstream dataset | /api/preparationdataflows/{dfId}/swap_upstream | POST | application/json |
| 10 | Generate snapshot | /api/preparationdatasets/{dsId}/transform/snapshot | POST | application/json |
| 11 | Get snapshot details | /api/preparationsnapshots/{ssId} | GET | application/json |
| 12 | Download snapshot | /api/preparationsnapshots/{ssId}/download | GET | application/json |
| 13 | Search datasource | /api/datasources/filter | POST | application/json |
| 14 | Append to datasource | /api/datasources/{id}/append | PUT/PATCH | application/json |
| 15 | Create dataflow | /api/preparationdataflows | POST | application/json |
| 16 | Add dataset to dataflow | /api/preparationdataflows/{dfId}/update_datasets | PUT | application/json |

1. Get auth token

(POST) /oauth/token?grant_type=password&username=admin&password=admin

Request body

| Name | Type | Required | Description | Example |
|------|------|----------|-------------|---------|
| access_token | string | o | | |
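As a rough illustration, the request above could be built with Python's standard library as follows. The host `http://discovery.example.com` and the helper names are placeholders, and reading `access_token` from the JSON response is an assumption: the table names the field but does not say where it appears, nor whether extra headers (for example client credentials) are needed.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(base_url, username, password):
    """Build the POST request for /oauth/token; credentials go in the query string."""
    query = urllib.parse.urlencode(
        {"grant_type": "password", "username": username, "password": password}
    )
    url = f"{base_url.rstrip('/')}/oauth/token?{query}"
    # The API table lists application/json as the content type for this call.
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"Content-Type": "application/json"}
    )


def fetch_token(base_url, username, password):
    """Send the request and return access_token (assumed to be in the JSON response)."""
    request = build_token_request(base_url, username, password)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

A script would call `fetch_token("http://discovery.example.com", "admin", "admin")` once and reuse the returned token for the remaining calls.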

2. Get upload policy

(GET) /api/preparationdatasets/file_upload

Response body

| Name | Type | Description | Note |
|------|------|-------------|------|
| upload_id | string | Pre-generated UUID | Submit upload request with this ID. |
| limit_size | int | Maximum chunk size | >= 350M |

3. Upload file chunks

(POST) /api/preparationdatasets/file_upload

Request parameters

| Name | Type | Required | Description | Example |
|------|------|----------|-------------|---------|
| name | string | o | File basename | sample.csv |
| chunk | int | o | Current chunk index | 0 |
| chunks | int | o | Total chunk count | 1 |
| storage_type | string | o | LOCAL / HDFS / S3 | LOCAL |
| chunk_size | int | o | Chunk size of this request | 135678 |
| total_size | int | o | Total upload size in bytes | 135678 |

Response body

| Name | Type | Description | Note |
|------|------|-------------|------|
| storedUri | string | Where the uploaded file is stored | There is no separate file key; this URI is the only identifier for an uploaded file. |
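The chunk parameters in step 3 can be derived from the file size and the `limit_size` returned in step 2. The sketch below only computes the per-request parameters; it assumes that `chunk` is zero-based (as the example value `0` suggests) and that `limit_size` is expressed in bytes. How `upload_id` and the file bytes are attached to the multipart request is not described here, so that part is left out. `plan_chunks` is a hypothetical helper name.

```python
def plan_chunks(name, total_size, limit_size, storage_type="LOCAL"):
    """Return one parameter dict per upload request, using the names from the table."""
    if limit_size <= 0:
        raise ValueError("limit_size must be positive")
    # Ceiling division; an empty file is still sent as a single chunk.
    chunks = max(1, (total_size + limit_size - 1) // limit_size)
    plan = []
    for chunk in range(chunks):
        offset = chunk * limit_size
        plan.append({
            "name": name,
            "chunk": chunk,            # assumed zero-based index
            "chunks": chunks,
            "storage_type": storage_type,
            "chunk_size": min(limit_size, total_size - offset),
            "total_size": total_size,
        })
    return plan
```

With a file smaller than `limit_size`, the plan contains a single entry that matches the example column above (`chunk` 0, `chunks` 1, `chunk_size` equal to `total_size`).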

4. Search data connection

(POST) /api/connections/filter

Request parameters

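| Name | Type | Required | Description | Example |
|------|------|----------|-------------|---------|
| projection | string | o | Projection list | |
| page | int | o | Page number | 0 |
| size | int | o | Page size | 20 |
| sort | string | | Sort order | createdTime,desc |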

Request body

| Name | Type | Required | Description | Example |
|------|------|----------|-------------|---------|
| containsText | string | o | Text to search | ostgre |

Response body

| Name | Type | Description | Note |
|------|------|-------------|------|
| id | string | Database connection ID | In _embedded[connections] |
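A minimal sketch of how the query parameters, request body, and response lookup for this call fit together. The host, the helper names, and the `projection` value `list` used in the usage note are placeholders, since the table gives no example for `projection`. Treating `_embedded[connections]` as a JSON array of objects that each carry an `id` is also an assumption.

```python
import json
import urllib.parse


def build_connection_search(base_url, text, projection, page=0, size=20,
                            sort="createdTime,desc"):
    """Return (url, body_bytes) for POST /api/connections/filter."""
    params = {"projection": projection, "page": page, "size": size, "sort": sort}
    query = urllib.parse.urlencode(params)
    url = f"{base_url.rstrip('/')}/api/connections/filter?{query}"
    body = json.dumps({"containsText": text}).encode("utf-8")
    return url, body


def connection_ids(response):
    """Collect id values from _embedded.connections in a parsed JSON response."""
    connections = response.get("_embedded", {}).get("connections", [])
    return [connection["id"] for connection in connections]
```

For example, `build_connection_search("http://discovery.example.com", "ostgre", "list")` yields a URL and body that can be sent with the token from step 1, and `connection_ids` pulls the IDs needed as `dcId` in step 5.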

5. Create imported dataset

(POST) /api/preparationdatasets

Request body

| Name | Type | Required | Description | Example | Note |
|------|------|----------|-------------|---------|------|
| dsName | string | o | Dataset name | sales dataset | |
| dsDesc | string | | Dataset description | | |
| dsType | string | o | Dataset type | IMPORTED | |
| importType | string | o | UPLOAD / URL / DATABASE / ... | DATABASE | |
| delimiter | string | o | (file) Column delimiter | `,` | File dataset only |
| quoteChar | string | o | (file) Quote character for delimiters | `"` | File dataset only |
| filenameBeforeUpload | string | o | (file) Filename before upload for information | sample.csv | File dataset only |
| storageType | string | o | (file) LOCAL / HDFS / S3 | LOCAL | File dataset only |
| sheetName | string | o | (file) Sheet name when EXCEL | sheet1 | File dataset only |
| fileFormat | string | o | (file) CSV / JSON / EXCEL | CSV | File dataset only |
| manualColcnt | int | | Manually set the column count | 100 | File dataset only |
| dcId | string | o | (database) Data connection ID | d6647451-9b92-47cb-9214-91c2a024e202 | Database dataset only |
| rsType | string | o | (database) Result set type - TABLE / QUERY | QUERY | Database dataset only |
| queryStmt | string | o | (database) SQL statement without ; | SELECT * FROM EMP | Database dataset only |

Response body

Name Type Description Note
dsId string Imported dataset ID
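The two dataset flavors share a few fields and differ in the rest, which the following sketch makes explicit by building the JSON body for each. The helper names are hypothetical. The file variant contains only the fields listed in the table; how the `storedUri` from step 3 is passed along is not shown in this section, so it is omitted rather than guessed.

```python
def database_dataset(ds_name, dc_id, query_stmt, ds_desc=None):
    """Request body for an imported dataset backed by a SQL query."""
    body = {
        "dsName": ds_name,
        "dsType": "IMPORTED",
        "importType": "DATABASE",
        "dcId": dc_id,
        "rsType": "QUERY",
        # The table asks for the statement without a trailing semicolon.
        "queryStmt": query_stmt.rstrip().rstrip(";"),
    }
    if ds_desc:
        body["dsDesc"] = ds_desc
    return body


def file_dataset(ds_name, filename, file_format="CSV", delimiter=",",
                 quote_char='"', storage_type="LOCAL", sheet_name=None):
    """Request body for an imported dataset backed by an uploaded file."""
    body = {
        "dsName": ds_name,
        "dsType": "IMPORTED",
        "importType": "UPLOAD",
        "fileFormat": file_format,
        "delimiter": delimiter,
        "quoteChar": quote_char,
        "filenameBeforeUpload": filename,
        "storageType": storage_type,
    }
    if file_format == "EXCEL":
        if not sheet_name:
            raise ValueError("sheetName is needed for EXCEL files")
        body["sheetName"] = sheet_name
    return body
```

Either dictionary can be serialized with `json.dumps` and posted to `/api/preparationdatasets`; the `dsId` in the response identifies the new dataset in later calls.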
