
[Improvement suggestion] Use pydantic to load models from YAML #3285

Open
LoveIsGrief opened this issue Feb 9, 2024 · 2 comments

@LoveIsGrief

The custom YAML loader solves a problem that has already been solved many times, and it introduces extra work such as #2438.
Such work could be avoided by using https://github.com/NowanIlfideme/pydantic-yaml, which is built on pydantic.

The models would be defined in code, and the JSON schema could be exported from that code.
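
As a rough illustration of that idea, here is a minimal sketch assuming pydantic v2. The `Word` and `Skill` models and their field names are hypothetical and are not the project's real schema. To keep the example small it validates a plain dict, standing in for whatever a YAML parser (or pydantic-yaml) would produce from a skill file.

```python
import json
from typing import List

from pydantic import BaseModel, ValidationError


# Hypothetical models: field names are illustrative, not the project's real schema.
class Word(BaseModel):
    word: str
    translation: str


class Skill(BaseModel):
    name: str
    words: List[Word] = []


# Stand-in for the dict a YAML parser would return for one skill file.
raw_skill = {
    "name": "Animals",
    "words": [{"word": "el perro", "translation": "the dog"}],
}

skill = Skill.model_validate(raw_skill)  # validate and build the model (pydantic v2)
schema = Skill.model_json_schema()       # JSON schema derived from the model code

print(skill.words[0].translation)
print(json.dumps(schema, indent=2))

# Invalid input is rejected with a structured error instead of failing later.
try:
    Skill.model_validate({"words": []})  # the required "name" field is missing
except ValidationError as error:
    print(error.error_count(), "validation error(s)")
```

With this approach the schema never has to be maintained by hand, since it is regenerated from the models whenever they change.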

@LoveIsGrief
Author

@kantord I started working on this task and noticed that the courses are stored as a file tree:

courses/spanish-from-english/
├── activities
│   ├── module.yaml
│   └── skills
│       ├── continuous.yaml
│       └── ser_estar.yaml
├── basics
│   ├── module.yaml
│   └── skills
│       ├── animals.yaml
│       ├── clothes.yaml
│       ├── food.yaml
│       ├── nature.yaml
│       ├── plurals.yaml
│       ├── professions.yaml
│       ├── verb_plurals.yaml
│       └── verbs.yaml
├── course.yaml
└── introduction
    ├── module.yaml
    └── skills
        ├── adjectives.yaml
        ├── phrases.yaml
        └── preferences.yaml

Is there a reason for this as opposed to using a single YAML file? Do you have some feature in mind that requires this structure?
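
For context, a loader for this layout would first have to discover the files. The sketch below uses only the standard library and assumes the layout shown above (`course.yaml` at the top, one `module.yaml` per module directory, skills under `skills/`). The function name `discover_course_files` is hypothetical, and the alphabetical ordering is an assumption, since the real module order may be defined elsewhere.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def discover_course_files(course_dir: Path) -> Dict[str, List[str]]:
    """Map each module directory name to its sorted skill file names."""
    if not (course_dir / "course.yaml").is_file():
        raise FileNotFoundError(f"no course.yaml in {course_dir}")
    modules: Dict[str, List[str]] = {}
    # A directory counts as a module only if it contains a module.yaml.
    for module_file in sorted(course_dir.glob("*/module.yaml")):
        module_dir = module_file.parent
        skills = sorted(p.name for p in (module_dir / "skills").glob("*.yaml"))
        modules[module_dir.name] = skills
    return modules


# Build a small throwaway tree that mirrors part of the layout above.
with tempfile.TemporaryDirectory() as tmp:
    course = Path(tmp) / "spanish-from-english"
    for rel in [
        "course.yaml",
        "basics/module.yaml",
        "basics/skills/animals.yaml",
        "basics/skills/food.yaml",
        "introduction/module.yaml",
        "introduction/skills/phrases.yaml",
    ]:
        target = course / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = discover_course_files(course)

print(found)
```

Each discovered file could then be parsed and validated on its own, so the per-file layout and model-based validation are not in conflict.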

@kantord
Owner

kantord commented Apr 28, 2024

Hi @LoveIsGrief, one reason for keeping it in separate YAML files is that a single file would get very big very quickly. A longer sentence could have hundreds of valid translations, and a course could have thousands of sentences or more. A single YAML file would be harder to manage and navigate, and it would also make it harder to develop a GUI for editing the courses, since the GUI would have to handle those large files too. (I think even some widely used text editors and IDEs could struggle with a 10 MB file.)

It is worth mentioning that course collaboration is, at least for the moment, hosted in git repos, and git stores full versions of entire files for every version in its history rather than storing only the differences. This means that if you have a 50 MB YAML file and you change a single character, you add 50 MB to the size of the repository. I don't think that would be very practical.

Another thing to mention is that, at the moment, course authors and contributors edit the course files manually, as there is no GUI for editing courses yet. I think a lot of computer users are better at navigating folder structures than navigating within a large file, as the latter requires deep knowledge of a text editor.
