Consider moving zim::Metadata into libzim #785

veloman-yunkan · 2023-04-27T11:52:51Z

openzim/zim-tools#339 introduced zim::Metadata in the zim-tools source tree so that it could be used immediately as a shared utility between zimwriterfs and zimcheck. However, libzim is a more logical home for zim::Metadata. Then (after an easy enhancement of zim::Metadata) simple constraints (involving a single metadata item) can be checked in zim::writer::Creator::addMetadata() and full checks can be run in zim::writer::Creator::finishZimCreation().
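
To make the proposed split concrete, here is a minimal sketch of per-item checks at insertion time and a full check at finalization. Everything in it is an assumption for illustration: `sketch::MetadataChecker` and `sketch::Creator` are hypothetical stand-ins, not the actual zim::Metadata or zim::writer::Creator APIs, and the limits and mandatory names are placeholders rather than the rules of the ZIM specification.

```cpp
#include <initializer_list>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical validator; NOT the real zim::Metadata interface.
class MetadataChecker {
public:
  // Single-item check: returns an error message, or an empty string if OK.
  static std::string checkItem(const std::string& name, const std::string& value) {
    if (value.empty()) return name + " must not be empty";
    // Placeholder limit (counts bytes, not characters), for illustration only.
    if (name == "Title" && value.size() > 30) return "Title is too long";
    return "";
  }

  // Whole-set check: reports every mandatory item that is missing.
  static std::vector<std::string> checkAll(const std::map<std::string, std::string>& items) {
    std::vector<std::string> errors;
    for (const char* required : {"Title", "Language"}) {  // placeholder list
      if (items.count(required) == 0) errors.push_back(std::string("Missing ") + required);
    }
    return errors;
  }
};

// Hypothetical stand-in for the writer; NOT zim::writer::Creator.
class Creator {
public:
  // Simple constraints are checked as soon as an item is added.
  void addMetadata(const std::string& name, const std::string& value) {
    const std::string error = MetadataChecker::checkItem(name, value);
    if (!error.empty()) throw std::invalid_argument(error);
    items_[name] = value;
  }

  // Constraints that need the full set are checked at the end.
  void finishZimCreation() {
    const std::vector<std::string> errors = MetadataChecker::checkAll(items_);
    if (!errors.empty()) throw std::runtime_error(errors.front());
  }

private:
  std::map<std::string, std::string> items_;
};

}  // namespace sketch
```

With this shape, an over-long title would be rejected by `addMetadata()` immediately, while a missing mandatory item would only surface in `finishZimCreation()`.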

kelson42 · 2023-05-02T14:07:03Z

@mgautierfr Any opinion? I have nothing against it.

mgautierfr · 2023-05-03T13:08:37Z

I'm against that.
libzim doesn't know about metadata. The whole API is agnostic to the metadata (you have a getMetadata(std::string name) but no getTitle()).

This has already been discussed, but libzim is content agnostic. I consider the metadata described in the libzim format to be more of a Kiwix specification than a ZIM one, simply because ZIM files without this metadata can be read correctly by libzim (but maybe not by Kiwix readers). Checking the metadata is at the same level as checking the URLs inside the articles.
It may break how we use ZIM files but not how we parse them.

kelson42 · 2023-05-03T13:40:00Z

> I consider the metadata described in the libzim format to be more of a Kiwix specification than a ZIM one.

@mgautierfr I understand how you draw the distinction, but this is pretty much a personal POV. The metadata specs are part of the ZIM specification like the rest.

Besides the question of principle, do you see any bad effects that such a move would have?

mgautierfr · 2023-05-03T14:09:49Z

We already had this discussion in openzim/zim-tools#336, which is exactly the root issue that led to zim::Metadata.
I'll let you reread my comment (openzim/zim-tools#336 (comment)), which you agreed with (to the point that you moved the issue from libzim to zim-tools).

kelson42 · 2023-05-19T11:52:54Z

@mgautierfr I understand your opinion (about the separation of duties between libzim and zim-tools), which is the one we have pursued for a long time. At this stage, I neither fully disagree nor fully agree with it. Actually, so far I have been supportive of your POV, as you have mentioned (but I disagree - and always have AFAIK - with the assertion that metadata size limits are not part of the ZIM specification).

But I see that this metadata checking is currently being implemented at multiple levels: in scrapers, in bindings and in zim-tools. I'm concerned about this redundancy. This new reality is why I asked "Besides the question of principle, do you see any bad effects that such a move would have?" (which is not a question you have already answered in the past AFAIK).

rgaudin · 2023-05-19T12:46:53Z

Just to be clear, redundancy-wise (I already wrote this somewhere else), scrapers need to validate Metadata early to provide direct feedback, as most of this important metadata is user-provided.

This means having direct access to checks (could be possible via libzim) but also complete checks.

We are quite happy with scraperlib and I think that in this case it's more practical to have redundancy in scraperlib and the JS binding rather than bringing an ISO-639-3 library and a PNG library into libzim.

The point is sort of the same: if the spec allows setting invalid metadata, it's hard to justify such requirements just to check them. Similar to the separate writer/reader libraries topic.

kelson42 · 2023-05-19T12:54:38Z

> Just to be clear, redundancy-wise (I already wrote this somewhere else), scrapers need to validate Metadata early to provide direct feedback, as most of this important metadata is user-provided.

This is a good point that I forgot to mention. We might be more interested in having the definition of the Metadata constraints in libzim than in the checks themselves. This is at least one of the alternatives I am considering.
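
One way to picture "definitions rather than checks" is a declarative constraint table that bindings and scrapers could read and enforce themselves. The sketch below is purely illustrative: `MetadataConstraint`, `metadataConstraints()` and `findConstraint()` are hypothetical names that do not exist in libzim, and the values are placeholders rather than the authoritative rules of the ZIM metadata specification.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace constraints_sketch {

// Hypothetical declarative description of one metadata item's constraints.
struct MetadataConstraint {
  std::string name;
  bool mandatory;
  std::size_t maxLength;  // 0 means "no limit" in this sketch
};

// Placeholder table for illustration only.
inline const std::vector<MetadataConstraint>& metadataConstraints() {
  static const std::vector<MetadataConstraint> table = {
    {"Title", true, 30},
    {"Description", true, 80},
    {"Tags", false, 0},
  };
  return table;
}

// Lookup helper a binding or scraper could use to run its own checks.
// Returns nullptr when no constraint is defined for the given name.
inline const MetadataConstraint* findConstraint(const std::string& name) {
  for (const MetadataConstraint& c : metadataConstraints()) {
    if (c.name == name) return &c;
  }
  return nullptr;
}

}  // namespace constraints_sketch
```

Under this assumption the library would only publish the data, and heavier validation (language codes, image decoding) would stay in the callers that already carry those dependencies.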

kelson42 added enhancement question labels Apr 27, 2023

kelson42 added this to the 8.3.0 milestone Apr 27, 2023

kelson42 mentioned this issue May 3, 2023

Enforce mandatory metadata openzim/python-scraperlib#95

Closed

kelson42 modified the milestones: 9.0.0, 9.1.0 Sep 26, 2023

kelson42 modified the milestones: 9.1.0, 10.0.0 Nov 12, 2023
