Error in updating dataset description #1192

Open
ihsaan-ullah opened this issue Dec 26, 2022 · 4 comments
Labels
Data OpenML concept

Comments

@ihsaan-ullah

Description

When a dataset description is updated, it gives the following error:

Screenshot 2022-12-26 at 12 46 00 PM

Steps/Code to Reproduce

# Import OpenML
import openml 

# Configure API Key
openml.config.apikey = 'API_KEY'

# Description file : loading from .md file
micro_markdown = "PLK.md"

# Dataset ID
dataset_id = 44238

# Download dataset without data
openml.datasets.get_dataset(dataset_id, False)


# Read MD file
f_micro = open(micro_markdown, "r")
micro_md = f_micro.readlines()
f_micro.close()

# update description on OpenML
openml.datasets.edit_dataset(dataset_id, description=micro_md)
@ihsaan-ullah ihsaan-ullah changed the title Update dataset description fails Error in updating dataset description Dec 26, 2022
@PGijsbers
Collaborator

PGijsbers commented Jan 2, 2023

This is because the description field expects a string, but the provided value is a list of strings. Try using the read function instead:

- micro_md = f_micro.readlines()
+ micro_md = f_micro.read()

With this change the text is uploaded; you can see a preview here: https://test.openml.org/d/20
If you are not satisfied with that result and want to experiment, you can also use the test server:

import openml
openml.config.start_using_configuration_for_example()
openml.datasets.edit_dataset(20, description=...)
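
For reference, a minimal sketch of the corrected reproduction, assuming the markdown file is UTF-8 encoded. The helper names `load_description` and `update_description` are hypothetical and not part of openml-python; only `openml.datasets.edit_dataset` is the call already used in this thread.

```python
from pathlib import Path


def load_description(path):
    """Read the whole markdown file into one string (not a list of lines)."""
    # Assumption: the file is UTF-8 encoded.
    return Path(path).read_text(encoding="utf-8")


def update_description(dataset_id, path):
    """Hypothetical wrapper: upload the file content as the dataset description."""
    import openml  # imported lazily so load_description works without openml installed

    description = load_description(path)
    # edit_dataset expects description to be a str.
    return openml.datasets.edit_dataset(dataset_id, description=description)
```

With an API key configured, `update_description(44238, "PLK.md")` would mirror the original script with `read()` semantics instead of `readlines()`.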

@ihsaan-ullah
Author

Thank you for the quick fix. It now leads to another issue:

Screenshot 2023-01-14 at 3 06 23 PM

Between these two screenshots the content of the ".md" file is displayed.

Screenshot 2023-01-14 at 3 06 52 PM

It looks like an encoding issue, but I am not sure.

The description was updated for one dataset, "PLK", but the update does not work for the rest.
https://www.openml.org/search?type=data&sort=runs&id=44238&status=active
https://www.openml.org/search?type=data&sort=runs&id=44282&status=active
https://www.openml.org/search?type=data&sort=runs&id=44317&status=active

@PGijsbers
Collaborator

Between these two screenshots the content of the ".md" file is displayed.

Not sure what you mean by that. Can you provide the problematic markdown file(s), either by e-mail as files or here in code block format?

@PGijsbers
Collaborator

Related to openml/OpenML#911.
In my opinion, we should take the following steps:

  1. change which characters are allowed in a dataset description
  2. improve the error message (and update local verification)
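
As a rough illustration of what local verification with a clearer error message could look like, here is a sketch. The function `check_description` is hypothetical, and the rule it applies (flagging every non-ASCII character) is only an assumption for demonstration; the thread does not state which characters the server actually accepts.

```python
def check_description(description, allowed_extra=""):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    # The first error in this thread came from passing a list instead of a str.
    if not isinstance(description, str):
        problems.append(
            f"description must be a str, got {type(description).__name__}"
        )
        return problems
    # Assumption: non-ASCII characters are the ones worth reporting.
    for line_no, line in enumerate(description.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127 and ch not in allowed_extra:
                problems.append(
                    f"line {line_no}, column {col}: "
                    f"non-ASCII character {ch!r} (U+{ord(ch):04X})"
                )
    return problems
```

Running this on the content of a markdown file before uploading would point at the exact positions of suspicious characters, which may help narrow down whether the remaining failures are encoding related.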

@mfeurer mfeurer added the Data OpenML concept label Feb 20, 2023