Creating local dataset versions not working as expected #2717

Open
Aid91 opened this issue Nov 18, 2021 · 3 comments

Aid91 commented Nov 18, 2021

Hi,

I am currently using the open-source version of ModelDB, with the latest Docker images for all components:

  • modeldb-backend:2.0.8.1
  • modeldb-proxy:2.0.8.1
  • modeldb-frontend:2.0.8.1
  • modeldb-graphql:2.0.8.1
  • Verta Python client: verta>=0.16.0 (I tried every available version in that range)

When I try basic local dataset versioning, no metadata about the files/directories is shown in the frontend and, probably for the same reason, the data version never increments (version 1 is always returned).

Code example:

from verta import Client
from verta.dataset import Path

# Connect to the ModelDB instance running locally
client = Client("http://localhost:3000")
proj = client.set_project("Test project", desc="Test project")
expt = client.set_experiment("Test experiment", desc="Test experiment")
run = client.set_experiment_run(desc="Test experiment run", attrs={})

# Create (or fetch) the dataset, then create a version from a local file
dataset = client.set_dataset(name="Test dataset")
dataset_version = dataset.create_version(Path("data.csv"))

Result:

connection successfully established
got existing Project: Test project
got existing Experiment: Test experiment
created new ExperimentRun: Run 551637130906217477
created new Dataset: Test dataset in workspace: personal
created new Dataset Version: 1 for Test dataset

When I change the data.csv file and run the same code again, I get dataset version 1 again (no version increment):

created new Dataset Version: 1 for Test dataset
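
To rule out the possibility that the file on disk did not actually change between the two runs, its content digest can be compared before and after the edit. The following is a minimal sketch using only the Python standard library; `file_digest` is a hypothetical helper, not part of the verta client, and it makes no claim about how verta itself decides whether a new version is needed. A temporary file stands in for data.csv.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a temporary file standing in for data.csv
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "w") as f:
        f.write("a,b\n1,2\n")
    before = file_digest(path)

    # Simulate editing the file between two runs of the script
    with open(path, "a") as f:
        f.write("3,4\n")
    after = file_digest(path)

print("content changed:", before != after)
```

If the two digests differ and the reported dataset version still stays at 1, the file content can be excluded as the cause.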

If I downgrade the Python client to verta==0.15.*, dataset versioning works again, but some methods, such as dataset.get_latest_version(), throw an exception: HTTPError: 501 Server Error: Method ai.verta.modeldb.DatasetVersionService/getDatasetVersionById is unimplemented for url: ...

This leads to my final question: does the latest open-source version of ModelDB support local dataset versioning? If so, which component versions (modeldb-backend, modeldb-proxy, etc.) and which Python client version are compatible?

Thanks in advance!

Aid91 changed the title from "Logging dataset versions not working as expected" to "Creating local dataset versions not working as expected" on Nov 18, 2021
@convoliution
Contributor

Hi @Aid91, thank you for your continued interest in ModelDB!

verta==0.16.0 did involve an overhaul in how dataset versions are captured, and our OSS platform may not fully support its operations. A client version below 0.16.0 would be the best bet for core functionality, though a few methods (such as get_latest_version()) may also be absent from OSS.

In the meantime, I shall file a ticket for us at Verta to follow up on.
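
The advice to stay below 0.16.0 can be checked at the top of a script. The sketch below is an illustration only: `parse_version` and `predates_dataset_overhaul` are hypothetical helpers, the parsing is deliberately simplified (it reads only the leading numeric part and ignores pre-release suffixes), and the 0.16.0 cutoff is taken from the comment above rather than from any official compatibility matrix.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn the leading numeric part of a version string into a tuple of ints.

    Simplified on purpose: '0.15.10' -> (0, 15, 10); suffixes such as 'rc1' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def predates_dataset_overhaul(version_text):
    """Return True if the version is below 0.16.0 (cutoff assumed from this thread)."""
    return parse_version(version_text) < (0, 16)


# Look up the installed client, if any, without importing it
try:
    installed = metadata.version("verta")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("verta is not installed in this environment")
else:
    print(f"verta {installed} predates 0.16.0: {predates_dataset_overhaul(installed)}")
```

An older client can be pinned with a standard pip version specifier such as pip install "verta<0.16.0".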

Contributor

Atharex commented Jan 7, 2022

Hi @convoliution

I am seeing a similar error and would like to know whether there will be any new OSS releases past 2.0.8.1.

I've tried building the server-side components from the master branch several times, but the builds never succeeded.

Contributor

Atharex commented Jan 7, 2022

I am also asking because the 2.0.8.1 release contains a vulnerable log4j version and really needs an update:
https://github.com/VertaAI/modeldb/blob/v2.0.8.1/backend/pom.xml#L22
