Parsing of the unique identifier's scheme #149

Open
mickael-menu-mantano opened this issue Jun 5, 2019 · 5 comments

@mickael-menu-mantano
Contributor

In this document, it's mentioned that we should parse the scheme of the unique identifier.

But there's no place to put it in the RWPM model, so I guess we have to expand the scheme into the URI (e.g. <dc:identifier opf:scheme="ISBN">123456789X</dc:identifier> -> urn:isbn:123456789X).

However, the doc doesn't explain how to do that. Any clues?

@HadrienGardeur
Member

It goes into https://readium.org/webpub-manifest/contexts/default/#identifier

Since it's a URI, you have to convert the ISBN/UUID/DOI into a URI.
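For illustration only, here is a minimal sketch of that conversion. The function name `expand_identifier` and the scheme-to-namespace table are hypothetical and not taken from any Readium document; the `urn:doi:` form simply mirrors the EPUB specification example quoted in this thread.

```python
import re

# Hypothetical mapping from an opf:scheme value to a URN namespace.
# This table is an assumption, not something defined by Readium or EPUB.
URN_NAMESPACES = {
    "isbn": "isbn",
    "uuid": "uuid",
    "doi": "doi",  # mirrors the urn:doi: form used in the EPUB spec example
}

# Matches a leading URI scheme such as "urn:" or "https:".
URI_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def expand_identifier(value, scheme=None):
    """Return the identifier as a URI when the scheme is known.

    Identifiers that already start with a URI scheme, and identifiers
    whose scheme is unknown, are returned trimmed but otherwise unchanged.
    """
    value = value.strip()
    if URI_SCHEME.match(value):
        return value
    namespace = URN_NAMESPACES.get((scheme or "").strip().lower())
    if namespace is None:
        return value
    if namespace == "isbn":
        # Drop the hyphens and spaces that are often used to group ISBN digits.
        value = value.replace("-", "").replace(" ", "")
    return f"urn:{namespace}:{value}"


print(expand_identifier("123456789X", "ISBN"))
```

Returning unknown identifiers unchanged is one possible fallback; whether a non-URI value is acceptable in the manifest is the open question here.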

@danielweck
Member

And just for the sake of completeness:

https://w3c.github.io/publ-epub-revision/epub32/spec/epub-packages.html#sec-opf-dcidentifier

This specification imposes no additional restrictions or requirements on the identifier except that it MUST be at least one character in length after white space has been trimmed. It is strongly encouraged that the identifier be a fully qualified URI, however.

@mickael-menu-mantano
Contributor Author

Sorry, maybe my question was ambiguous.

I'm looking for instructions on how to convert an identifier into a URI, when it's not already one.

Taking this example from the specification:

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:doi:10.1016/j.iheduc.2008.03.001</dc:identifier>
    <meta refines="#pub-id" property="identifier-type" scheme="onix:codelist5">06</meta>
</metadata>

and let's assume that the identifier is missing the scheme, so: 10.1016/j.iheduc.2008.03.001. How do I know how to convert it to a URI?

I could look at the scheme onix:codelist5 and hard-code the list of codes (https://ns.editeur.org/onix/en/5). But then:

  • How do I know the correct URI scheme for each ONIX code (e.g. "BNF Control number")?
  • What if another scheme type is declared? How do I know the actual scheme to use for the URI?
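To make the hard-coding option concrete, here is a rough sketch. The table covers only three codes from ONIX code list 5, the URN namespaces chosen for them are my own assumption, and the helper names (`urn_namespace_for`, `to_uri`) are hypothetical. Anything not in the table, or declared with another scheme, falls through unchanged, which is exactly the gap raised in the two questions above.

```python
# Partial, hand-picked subset of ONIX code list 5 (product identifier type).
# The URN namespace assigned to each code is an assumption for illustration.
ONIX_CODELIST5_TO_URN = {
    "02": "isbn",  # ISBN-10
    "06": "doi",   # DOI
    "15": "isbn",  # ISBN-13
}


def urn_namespace_for(identifier_type, scheme):
    """Look up a URN namespace for an identifier-type refinement.

    Returns None when the scheme is not onix:codelist5 or the code is
    not in the table above.
    """
    if scheme.strip().lower() != "onix:codelist5":
        return None
    return ONIX_CODELIST5_TO_URN.get(identifier_type.strip())


def to_uri(value, identifier_type, scheme):
    """Prefix the value with a URN namespace when one is known."""
    value = value.strip()
    namespace = urn_namespace_for(identifier_type, scheme)
    return f"urn:{namespace}:{value}" if namespace else value


print(to_uri("10.1016/j.iheduc.2008.03.001", "06", "onix:codelist5"))
```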

@JayPanoz
Contributor

JayPanoz commented Jun 6, 2019

@mmenu-mantano’s questions kinda rang a bell for me, as I had a quick convo on Twitter about identifiers a few months ago.

Looks like at some point in time epubcheck didn’t report faulty URIs, so if, say, you used InDesign’s panel export for metadata and replaced the UUID with an ISBN, InDesign would output the following:

urn:uuid:xxxxxxxxxxxxx

With xxxxxxxxxxxxx being an ISBN. So I guess we can’t even trust the identifier when it is already a URI.

And you would have that in EPUB files going unnoticed.
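A parser could defend against this by checking what actually follows the `urn:uuid:` prefix. The sketch below is only one possible heuristic, with hypothetical function names: it keeps the identifier when the payload parses as a UUID, rewrites it to `urn:isbn:` when the payload passes the ISBN-13 checksum, and otherwise strips the misleading prefix.

```python
import uuid


def is_isbn13(text):
    """Check the ISBN-13 checksum (weights alternate 1 and 3)."""
    digits = text.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def repair_urn_uuid(identifier):
    """Return a corrected identifier when urn:uuid: wraps a non-UUID value."""
    prefix = "urn:uuid:"
    if not identifier.lower().startswith(prefix):
        return identifier
    payload = identifier[len(prefix):]
    try:
        # uuid.UUID is lenient: it also accepts 32 hex digits without hyphens.
        uuid.UUID(payload)
        return identifier
    except ValueError:
        pass
    if is_isbn13(payload):
        return "urn:isbn:" + payload.replace("-", "").replace(" ", "")
    # Unknown payload: drop the misleading prefix rather than guess a scheme.
    return payload


print(repair_urn_uuid("urn:uuid:9783161484100"))
```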

¯\_(ツ)_/¯

Which also leads to this issue in EPUB revision: w3c/epub-specs#1216

@HadrienGardeur
Member

This feels like an issue that we should address in the architecture repo before we tackle it in various implementations.

We'll encounter this issue with at least a few formats where the identifier is not required or is not necessarily a URI:

  • EPUB
  • PDF
  • CBZ

@mickael-menu mickael-menu transferred this issue from readium/r2-streamer-swift Feb 8, 2021
4 participants