RDF/XMP generated by pikepdf is incorrect? #556

Open
jkorinth opened this issue Jan 19, 2024 · 0 comments

I use pikepdf to set dc:title, dc:created and dc:contributor metadata on a PDF file. The resulting RDF looks like this (reformatted for better readability):

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="pikepdf">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="">
    <dc:contributor xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Bag>
        <rdf:li>Test Contributor</rdf:li>
      </rdf:Bag>
    </dc:contributor>
  </rdf:Description>

  <rdf:Description rdf:about="">
    <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Alt><rdf:li xml:lang="x-default">Title</rdf:li></rdf:Alt>
    </dc:title>
  </rdf:Description>

  <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="" dc:created="D:20231225000000"/>

  <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="" xmp:MetadataDate="2024-01-19T07:45:50.786293+00:00"/>
  <rdf:Description xmlns:pdf="http://ns.adobe.com/pdf/1.3/" rdf:about="" pdf:Producer="pikepdf 8.10.1"/>
</rdf:RDF>
</x:xmpmeta>

RDF is an overengineered mess, but from what I gather from the specification, this output is wrong:

  1. There can be multiple rdf:Description elements in an RDF document, but I think they need to refer to different things using rdf:about; setting rdf:about to "" may be conformant, but I don't think it is correct. UPDATE: "" is apparently correct for referring to the containing file; nevertheless, all examples I could find use a single rdf:Description.
  2. I'm not sure whether dc:created may appear as an attribute on rdf:Description; I think it should be an element instead.
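
Both points can be checked mechanically. The sketch below is only an illustration under stated assumptions, not how pikepdf works: it relies on RDF/XML allowing a plain string property to be written as an attribute on rdf:Description (the "property attribute" abbreviation), and on several rdf:Description elements with the same rdf:about describing the same resource. The function name `collect_properties` and the trimmed `SPLIT` and `SINGLE` samples are made up for this example; only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML = "http://www.w3.org/XML/1998/namespace"

def collect_properties(xmp_text):
    """Map each rdf:about value to the set of property names attached to it."""
    root = ET.fromstring(xmp_text)
    merged = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        # A missing rdf:about is kept as None rather than treated as "".
        about = desc.get(f"{{{RDF}}}about")
        props = merged.setdefault(about, set())
        # Attribute form: namespaced attributes outside the rdf: and xml: namespaces.
        for name in desc.attrib:
            if name.startswith("{") and not name.startswith((f"{{{RDF}}}", f"{{{XML}}}")):
                props.add(name)
        # Element form: each child element is one property.
        for child in desc:
            props.add(child.tag)
    return merged

# Trimmed version of the generated output: two descriptions, one attribute property.
SPLIT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="">
    <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Alt><rdf:li xml:lang="x-default">Title</rdf:li></rdf:Alt>
    </dc:title>
  </rdf:Description>
  <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="" dc:created="D:20231225000000"/>
</rdf:RDF>"""

# Same properties written as elements inside a single description.
SINGLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:title>
      <rdf:Alt><rdf:li xml:lang="x-default">Title</rdf:li></rdf:Alt>
    </dc:title>
    <dc:created>D:20231225000000</dc:created>
  </rdf:Description>
</rdf:RDF>"""

print(collect_properties(SPLIT) == collect_properties(SINGLE))
```

If the comparison holds, the two layouts carry the same set of properties for the same resource, which would make the split form unusual rather than invalid. The sketch compares property names only, not values or container types.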

I think this would be the correct RDF for the example above:

<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="pikepdf">
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:title>
      <rdf:Alt><rdf:li xml:lang="x-default">Title</rdf:li></rdf:Alt>
    </dc:title>

    <dc:contributor>
      <rdf:Bag>
        <rdf:li>Test Contributor</rdf:li>
      </rdf:Bag>
    </dc:contributor>

    <dc:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">
      2023-12-25T10:00:00Z
    </dc:created>
  </rdf:Description>

  <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="" xmp:MetadataDate="2024-01-19T07:45:50.786293+00:00"/>
  <rdf:Description xmlns:pdf="http://ns.adobe.com/pdf/1.3/" rdf:about="" pdf:Producer="pikepdf 8.10.1"/>
</rdf:RDF>
</x:xmpmeta>
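
The generated output stores dc:created as the PDF-style string D:20231225000000, whereas the listing above uses an XML Schema dateTime. As a minimal sketch of converting one into the other, assuming a full 14-digit value with no timezone suffix (the helper name `pdf_date_to_iso` is hypothetical and not part of pikepdf):

```python
from datetime import datetime

def pdf_date_to_iso(value):
    """Convert 'D:YYYYMMDDHHmmSS' (no timezone suffix) to ISO 8601 text."""
    # Drop the optional 'D:' prefix used by PDF date strings.
    digits = value[2:] if value.startswith("D:") else value
    # strptime raises ValueError if the remainder is not exactly this layout.
    return datetime.strptime(digits, "%Y%m%d%H%M%S").isoformat()

print(pdf_date_to_iso("D:20231225000000"))  # prints 2023-12-25T00:00:00
```

The result is a naive timestamp; a timezone designator such as Z would have to be added separately if the original value is known to be UTC.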