Datasets and Datastore clarification #9

aminsaied · 2020-12-07T18:53:16Z

The current page on datasets and datastores needs clarifying:

There are too many ways to do the same thing. Make the recommended approach (datasets) clear.
When adding a DataReference to ScriptRunConfig, we mention secret environment variables. Make it clear that users do not need to know about these to use a DataReference in their script.

aminsaied · 2021-03-24T16:54:42Z

It would also be good to have an OutputDatasetConsumptionConfig example, as well as an explanation of how backing a FileDataset with a Datastore works for input versus output data, and why the two differ (i.e. ReadOnly vs ReadWrite).

aminsaied · 2021-04-22T20:49:32Z

Include an example of using dataset.as_mount() with command, like this:

command=["python train.py --training-data", dataset.as_mount()]

If you want to use an environment variable instead, pass in dataset.as_named_input('env_varname').as_mount().
azureml.data.abstract_dataset.AbstractDataset class - Azure Machine Learning Python | Microsoft Docs
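
For the script side of this, here is a minimal sketch of how train.py might pick up the mounted path. It uses only the standard library. It assumes, as the comment above suggests, that the name given to as_named_input() shows up in the run as an environment variable holding the mount path. The helper resolve_data_path and the constant ENV_VAR_NAME are hypothetical names used for illustration, not part of the azureml SDK.

```python
import argparse
import os

# Hypothetical name: it must match the string passed to as_named_input() in the job.
ENV_VAR_NAME = "env_varname"


def resolve_data_path(argv=None, environ=None):
    """Return the dataset path from --training-data, else from the environment."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser()
    parser.add_argument("--training-data", default=None)
    # parse_known_args ignores any other arguments the job may pass to the script.
    args, _unknown = parser.parse_known_args(argv)

    # Prefer the explicit command-line argument when it is present.
    if args.training_data:
        return args.training_data

    # Otherwise fall back to the (assumed) environment variable.
    path = environ.get(ENV_VAR_NAME)
    if path is None:
        raise SystemExit(
            "no data path: pass --training-data or set " + ENV_VAR_NAME
        )
    return path
```

Inside train.py this would be called as data_path = resolve_data_path(), after which the files under data_path can be read like any local directory. Preferring the command-line argument keeps the script runnable locally without any Azure-specific setup.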

arnabbiswas1 · 2021-10-18T13:15:42Z

AML Documentation no longer recommends usage of DataReference Class (source):

It is no longer the recommended approach for data access and delivery in Azure Machine Learning. Dataset supports accessing data from Azure Blob storage, Azure Files, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure SQL Database, and Azure Database for PostgreSQL through unified interface with added data management capabilities. It is recommended to use dataset for reading data in your machine learning projects.

Maybe it makes sense to remove the DataReference (link) section from the cheat sheet. Otherwise, it's easy to confuse Dataset and DataReference.

I personally was not aware of DataReference earlier and thought it was now the recommended way (after the API enhancement).

aminsaied added aml-ds cheatsheet labels Dec 7, 2020
