Philippe Ombredanne edited this page Nov 29, 2017 · 14 revisions

Why create yet another standard with a purl?

  • You were about to "XKCD" me with a link to https://xkcd.com/927/: this is not entirely correct, as there is no existing standard or URL standard like a purl, only some attempts to define something similar. Therefore a purl is not another standard but can instead be the standard. Furthermore, it is a grassroots effort to define conventions to reference and locate packages. It builds on, embraces and clarifies the conventions already used by several existing tools that deal with many software packages.

Can I use an existing URL parsing library to parse a purl?

  • Yes, and this is highly encouraged! A purl is a valid URL and should be parseable by any conforming URL/URI parser (provided it accepts any scheme). You can then focus on the specifics of purl: component encoding and decoding, component normalization, additional parsing (e.g. for qualifiers) and component value validation.
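
    As a rough illustration of the answer above, here is a minimal sketch that leans on Python's standard urllib.parse for the generic URL splitting and then handles the purl-specific parts by hand. The function name parse_purl is hypothetical, and the layout type:namespace/name@version?qualifiers#subpath is an assumption pieced together from the components this FAQ mentions; it performs no validation or normalization.

```python
from urllib.parse import urlsplit, parse_qsl, unquote


def parse_purl(purl):
    """Split a purl into its components (sketch only, no validation)."""
    # Generic URL parsing: scheme, path, query and fragment.
    parts = urlsplit(purl)

    # purl-specific parsing: the version follows the last "@" of the path.
    path = parts.path.strip("/")
    path, sep, version = path.rpartition("@")
    if not sep:
        # No "@" found: the whole path is namespace/name, version is absent.
        path, version = version, None

    # Assumption: the last path segment is the name, the rest the namespace.
    namespace, _, name = path.rpartition("/")

    return {
        "type": parts.scheme,
        "namespace": unquote(namespace) or None,
        "name": unquote(name),
        "version": unquote(version) if version else None,
        "qualifiers": dict(parse_qsl(parts.query)),
        "subpath": unquote(parts.fragment) or None,
    }


print(parse_purl("pypi:django@1.11.1"))
```

    The URL library takes care of the scheme, query and fragment boundaries, which leaves only the "@" and "/" handling as purl-specific code.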

Can I use a subpath with multiple subpaths, globs or regexes in a purl?

  • No. Use multiple purls, or other attributes outside of a purl.

Why is the purl version optional?

  • This is to support pointers to any version of a package or when you do not know the version (yet). This should not be abused but is useful and used in practice: for instance a package version may depend on another package with no version specified.

Can I use the Authority (i.e. user:pass@host:port) of a URI/URL in a purl?

  • No, for several reasons. The rules for parsing user, password, host and port are complex, and more so when you add IDNA, punycode and IPv4/IPv6. These would introduce subtle quirks if, for instance, the user/password was used as a purl version.

    Also while a host may be important to locate a package it is not required to identify it: the exact same package may exist in multiple repositories, local, remote or private mirrors. This is still the same package.

    And the Authority components come before the Path in a URL: this would break the hierarchical nature of the purl components, and they would no longer be nicely sortable as plain strings. This is a good property when dealing with many purls in a database, or even with small sorted lists in a UI.

    To reference an alternative public or private package repository beyond the default public repository for a purl type, you can use an extra attribute outside of the purl, or use the repository_url qualifier key/value pair to specify another repository URL.
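
    The repository_url qualifier mentioned above could be attached and read back roughly as follows. This is only a sketch: the helper names and the mirror URL are hypothetical, and it assumes generic URL query-string encoding (via urllib.parse) is acceptable for qualifier values, which this FAQ does not spell out.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl


def with_repository_url(purl, repository_url):
    """Append a repository_url qualifier to a purl that has no qualifiers yet."""
    # Assumption: plain query-string percent-encoding is used for the value.
    return purl + "?" + urlencode({"repository_url": repository_url})


def repository_url_of(purl):
    """Return the repository_url qualifier of a purl, or None if absent."""
    qualifiers = dict(parse_qsl(urlsplit(purl).query))
    return qualifiers.get("repository_url")


mirror = "https://pypi.example.org/simple"  # hypothetical private mirror
purl = with_repository_url("pypi:django@1.11.1", mirror)
print(purl)
print(repository_url_of(purl))
```

    The type, name and version stay identical whichever repository is named, which matches the point that a host locates a package but does not identify it.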

All familiar URLs contain ://. Why is there no such thing in a purl?

  • Actually, not all URLs contain a ://. Consider for instance mailto:jane@example.com. This is a valid URL/URI but does not use a ://. In fact the URL spec says that if you are not using a host or Authority (see the FAQ entry on Authority) you must not use :// but just a plain colon : after a URL scheme (i.e. a purl type). Yes, :// looks much nicer! So you may use one, but parsers and builders must ignore these slashes entirely, and a "canonical", normalized purl must be devoid of :// after its type and use only a : separator.
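
    A tolerant parser could drop the optional slashes before doing anything else. The sketch below uses a hypothetical helper name, and it assumes that ignoring the slashes means stripping every slash that directly follows the first colon.

```python
def normalize_type_separator(purl):
    """Return the purl with a plain ":" after its type, dropping any "//"."""
    type_, sep, rest = purl.partition(":")
    if not sep or not type_:
        raise ValueError("not a purl: missing type or ':' separator")
    # Assumption: all slashes right after the colon are ignored, not just two.
    return type_ + ":" + rest.lstrip("/")


print(normalize_type_separator("pypi://django@1.11.1"))
print(normalize_type_separator("pypi:django@1.11.1"))
```

    Both calls produce the same string, so the two spellings compare equal once normalized.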

Can I use a CPE instead of a purl?

  • Not really... no and yes! CPEs (https://en.wikipedia.org/wiki/Common_Platform_Enumeration) are URIs and fairly close to purl concepts, but they are rather complex and there are subtle differences. Here is an example: cpe:2.3:a:artifex:ghostscript:8_64:*:*:*:*:*:*:*

    CPEs started from the world of proprietary software security and require a 'vendor' attribute before the 'name' attribute, somewhat similar to a purl namespace but not exactly. These names are assigned centrally and arbitrarily by NIST and Mitre. For instance, the vendor for zlib is GNU: this does not make any sense.

    In contrast, purl names are not centrally or arbitrarily assigned or created: they are naturally and directly derived from whatever name a package author picked. Also, CPEs specify rather complex version semantics and can be hard to parse and build. Overall, they often mesh poorly with the world of software packages and are often rather hard to map to the actual software packages used in software development.

    Yet, when they exist, they are a great additional reference to relate a package to known NVD vulnerabilities (CVE). A valuable side project could create and maintain mappings of purls to known CPEs.
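
    To make the comparison concrete, the CPE shown earlier can be taken apart as in the sketch below. The field names follow the CPE 2.3 formatted-string layout; the function name is hypothetical, and the naive split on ":" ignores escaped colons, so treat it as an illustration rather than a real CPE parser.

```python
# Field order of a CPE 2.3 formatted string, after the "cpe:2.3:" prefix.
CPE_23_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def split_cpe23(cpe):
    """Naively split a CPE 2.3 formatted string into named fields."""
    prefix, spec_version, *values = cpe.split(":")
    if prefix != "cpe" or spec_version != "2.3":
        raise ValueError("not a CPE 2.3 formatted string")
    if len(values) != len(CPE_23_FIELDS):
        raise ValueError("unexpected number of CPE fields")
    return dict(zip(CPE_23_FIELDS, values))


fields = split_cpe23("cpe:2.3:a:artifex:ghostscript:8_64:*:*:*:*:*:*:*")
print(fields["vendor"], fields["product"], fields["version"])
```

    The vendor field comes before the product name, which is the assigned attribute that a purl does not require.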

Why not use the ISO 19770-2 spec for SWID tags instead of a purl?

  • Avoid this... This is a proprietary and opaque specification with a centrally managed pay-for-play registry (tagvault). Its purpose is primarily to help inventory installed proprietary software when managing IT assets by assigning arbitrary tags to a software binary.

    In contrast, a purl is an open way to identify and locate a software package as used in modern software development, with no arbitrary central name assignment needed.

I tried to use this pypi:django@1.11.1 in my web browser and it is unable to connect. What's happening?

  • Nice try. Are you sure you are using the latest version of Firefox, Chrome or Edge? Just kidding! A purl is unlikely to ever be implemented in a web browser: that's not its purpose. The purpose of a purl is to reference a software package in a consistent way across platforms when using tools, APIs and databases, not to browse the web... though you are welcome to work on browser extensions to support this. That can be fun!

I tried to run this command pip install pypi:django@1.11.1 in a shell and I got an error. Wat?

  • Nice try. Are you sure you are using the latest version of pip, bundler, or npm? Just kidding! While purl could be implemented by package management tools in the future, that's not its primary purpose. The purpose of a purl is to reference a software package in a consistent way across platforms when using tools, APIs and databases, not to install packages directly... though you are welcome to submit PRs to your favorite tools to support this once the spec is firmed up. That would be fun!