Path Manifests

Kyle Beckles edited this page Oct 16, 2019 · 2 revisions

Overview

Path manifests are a simple, optional extension for Arweave gateways. They allow users to upload a small metadata transaction (the path manifest) that maps user-definable subpaths to other Arweave transaction IDs. This lets users create logical groupings of content, for example a directory of related files, or the files and assets that make up a web application.

Motivation

Data uploaded to Arweave is wrapped in a transaction. The identifier (and thus the URL) for each transaction is a SHA-256 hash of the transaction signature, so for all intents and purposes it is random and not something the user can control.
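To make the "effectively random" point concrete, here is a minimal Python sketch of how such an identifier could be derived. It assumes the ID is the unpadded base64url encoding of the SHA-256 digest, which is consistent with the 43-character IDs shown in this document. The signature bytes are placeholders, not real signatures, and derive_tx_id is a hypothetical helper name.

```python
import base64
import hashlib


def derive_tx_id(signature: bytes) -> str:
    """Return the unpadded base64url SHA-256 digest of a signature.

    Assumption: IDs are base64url-encoded without padding.
    """
    digest = hashlib.sha256(signature).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# Placeholder bytes standing in for real transaction signatures.
id_a = derive_tx_id(b"placeholder signature A")
id_b = derive_tx_id(b"placeholder signature B")

# A 32-byte digest always encodes to 43 characters, and any change to the
# signature yields an unrelated ID.
print(len(id_a), id_a != id_b)  # prints: 43 True
```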

Here's an example of a typical directory structure for a small web application.

index.html
about.html
js/app.js
css/style.css
assets/fonts/font.woff
assets/fonts/font.ttf
assets/img/logo.png
assets/img/icon.png

Once uploaded, css/style.css will be addressed using an ID like XEfZIr3DOFXiKZ2I3XujAsPpvryTts2dVyr6dqrrmUm, and accessed at https://arweave.net/XEfZIr3DOFXiKZ2I3XujAsPpvryTts2dVyr6dqrrmUm (using arweave.net as an example gateway). Now the developer has a problem: their code is written with references to other files, and once uploaded to Arweave those paths will break and need updating in any code that references them.

In our index.html we might have a reference to our CSS, like this

<link href="css/style.css" rel="stylesheet" />

This needs to be translated to this

<link href="XEfZIr3DOFXiKZ2I3XujAsPpvryTts2dVyr6dqrrmUm" rel="stylesheet" />

Or inlined, like this

<style type="text/css">
  /* Contents of css/style.css */
</style>

This works up to a point: tools can automatically generate and update these references, or inline the content into a single file. It gets complicated, however, when we consider runtime references to files that aren't apparent from the original application source code (e.g. JavaScript creating dynamic references to files). It is also not very useful for people, as nothing in the URL logically groups files together, and there are no human-readable paths or URLs.

There is also the possibility of creating circular dependencies that cannot be resolved. If index.html is wrapped in an Arweave transaction and given an ID, about.html needs updating to point to the transaction ID for index.html. If index.html also references about.html, we now have a circular dependency: changing even a single byte in either index.html or about.html to reference the other's transaction ID means re-signing it, which changes its ID.

Schema

Path manifests are JSON objects with the following keys.

Field Required? Type Description
manifest string The manifest type identifier. This MUST be arweave/paths.
version string The manifest specification version, currently "0.1.0". This will change with future revisions, following semver.
index object The behavior gateways SHOULD follow when the manifest is accessed directly. When defined, index MUST contain a member describing the behavior to adopt. Currently, the only supported behavior is path. index MAY be omitted, in which case gateways SHOULD serve a listing of all paths.
index.path string The default path to load. If defined, the field MUST reference a key in the paths object (it MUST NOT reference a transaction ID directly).
paths object The path mapping between subpaths and the content they resolve to. The object keys represent the subpaths, and the values tell us which content to resolve to.
paths[path].id string The transaction ID to resolve to for the given path.

A path manifest transaction MUST NOT contain any data other than this JSON object.

The Content-Type tag for manifest files MUST be application/x.arweave-manifest+json. Users MAY add other arbitrary user-defined tags in addition to this.
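As an illustration of the schema, the following Python sketch assembles a manifest from a mapping of subpaths to transaction IDs. The function name build_manifest and the tags dictionary are hypothetical; how the JSON and the tag are actually attached to a transaction depends on the client library and is not shown.

```python
import json

MANIFEST_CONTENT_TYPE = "application/x.arweave-manifest+json"


def build_manifest(paths, index_path=None):
    """Build a path manifest dict from a {subpath: transaction ID} mapping."""
    manifest = {
        "manifest": "arweave/paths",
        "version": "0.1.0",
    }
    if index_path is not None:
        # index.path must name a key in paths, never a transaction ID.
        if index_path not in paths:
            raise ValueError(f"index path {index_path!r} is not a key in paths")
        manifest["index"] = {"path": index_path}
    manifest["paths"] = {subpath: {"id": tx_id} for subpath, tx_id in paths.items()}
    return manifest


manifest = build_manifest(
    {
        "index.html": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI",
        "assets/img/logo.png": "QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU",
    },
    index_path="index.html",
)

# The serialized JSON is the only transaction data; the tag travels separately.
body = json.dumps(manifest, indent=2)
tags = {"Content-Type": MANIFEST_CONTENT_TYPE}
print(body)
```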

Example manifest

{
  "manifest": "arweave/paths",
  "version": "0.1.0",
  "index": {
    "path": "index.html"
  },
  "paths": {
    "index.html": {
      "id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"
    },
    "js/app.js": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/style.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/mobile.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "assets/img/logo.png": {
      "id": "QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU"
    },
    "assets/img/icon.png": {
      "id": "0543SMRGYuGKTaqLzmpOyK4AxAB96Fra2guHzYxjRGo"
    }
  }
}
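A gateway or upload tool might check a manifest like the one above before using it. The sketch below is one possible validator, not a reference implementation. It assumes that manifest, version and paths are required, since this document only states explicitly that index MAY be omitted. The name validate_manifest is hypothetical.

```python
def validate_manifest(manifest):
    """Return a list of schema problems; an empty list means none were found."""
    if not isinstance(manifest, dict):
        return ["manifest must be a JSON object"]
    errors = []
    if manifest.get("manifest") != "arweave/paths":
        errors.append("manifest must be 'arweave/paths'")
    # Assumption: version is required and only needs to be a string.
    if not isinstance(manifest.get("version"), str):
        errors.append("version must be a string")
    paths = manifest.get("paths")
    if not isinstance(paths, dict):
        errors.append("paths must be an object")
        paths = {}
    for subpath, entry in paths.items():
        if not isinstance(entry, dict) or not isinstance(entry.get("id"), str):
            errors.append(f"paths[{subpath!r}].id must be a string")
    if "index" in manifest:
        index = manifest["index"]
        if not isinstance(index, dict) or not isinstance(index.get("path"), str):
            errors.append("index must contain a string 'path' member")
        elif index["path"] not in paths:
            errors.append("index.path must reference a key in paths")
    return errors


example = {
    "manifest": "arweave/paths",
    "version": "0.1.0",
    "index": {"path": "index.html"},
    "paths": {
        "index.html": {"id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"},
    },
}
print(validate_manifest(example))  # prints: []

# An index.path that is not a key in paths is reported as a problem.
broken = dict(example, index={"path": "missing.html"})
print(validate_manifest(broken))
```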

Resolving paths

The following behaviour applies ONLY to the /[:txid] endpoint on Arweave gateways; it DOES NOT apply to the /tx/[:txid] endpoint.

The transaction ID for the manifest file itself is considered the entry point and base URL, from which subpaths can be defined by a user.

To resolve a URL, a gateway MUST parse the requested path to extract the manifest transaction ID and the subpath being requested (ignoring the slash that follows the transaction ID), i.e. [:protocol]://[:host]/[:manifestId]/[:subpath]

The components of an example URL

URL: https://arweave.net/dtOrr7JHEI6MTy9fqUJ48inydg4EiunfSzWRmCJ0KgS/css/style.css

Protocol: https

Host: arweave.net

Path: /dtOrr7JHEI6MTy9fqUJ48inydg4EiunfSzWRmCJ0KgS/css/style.css

Manifest ID: dtOrr7JHEI6MTy9fqUJ48inydg4EiunfSzWRmCJ0KgS

Subpath: css/style.css
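The breakdown above can be reproduced with a few lines of Python. This is a minimal sketch using the standard library; split_manifest_url is a hypothetical helper name, and percent-decoding of the subpath is deliberately left out.

```python
from urllib.parse import urlsplit


def split_manifest_url(url):
    """Split a gateway URL into (manifest_id, subpath)."""
    path = urlsplit(url).path
    # Drop the leading slash, then split once on the slash after the ID.
    manifest_id, _, subpath = path.lstrip("/").partition("/")
    return manifest_id, subpath


manifest_id, subpath = split_manifest_url(
    "https://arweave.net/dtOrr7JHEI6MTy9fqUJ48inydg4EiunfSzWRmCJ0KgS/css/style.css"
)
print(manifest_id)  # prints: dtOrr7JHEI6MTy9fqUJ48inydg4EiunfSzWRmCJ0KgS
print(subpath)      # prints: css/style.css
```

With no subpath the second element is an empty string, whether or not the URL ends in a slash.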

To resolve this URL the gateway MUST implement the following steps

  1. The gateway MUST check the Content-Type tag for any page served over the /[:txid] endpoint so that it can intercept valid path manifests and apply additional processing before responding to the request.
  2. If the Content-Type is application/x.arweave-manifest+json, the gateway continues with these steps; otherwise it simply serves the data normally.
  3. The gateway MUST parse the data held in the manifest transaction and SHOULD validate that its contents conform to the schema laid out in this document.
  4. The gateway MUST search the manifest paths for a key that matches the requested subpath. If a match is found, the gateway MUST respond to the request with the contents of the transaction whose ID is found at manifest.paths[subpath].id, with appropriate Content-Type headers (and ETag headers, if used), and MUST use an HTTP 200 status code.
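The four steps can be sketched as a single lookup function. This is an illustration under simplifying assumptions, not gateway code: store is a hypothetical in-memory mapping of transaction ID to (content type, data), the IDs are short placeholders, schema validation is omitted, and requests without a subpath (the index case) are not handled here.

```python
import json

MANIFEST_CONTENT_TYPE = "application/x.arweave-manifest+json"


def resolve(store, tx_id, subpath):
    """Return (status, content_type, body) for a request to /tx_id/subpath."""
    if tx_id not in store:
        return 404, None, b""
    content_type, data = store[tx_id]
    # Steps 1-2: only path manifests get additional processing.
    if content_type != MANIFEST_CONTENT_TYPE:
        return 200, content_type, data
    # Step 3: parse the manifest data.
    manifest = json.loads(data)
    # Step 4: look up the subpath and serve the target in place (no redirect).
    entry = manifest.get("paths", {}).get(subpath)
    if entry is None or entry.get("id") not in store:
        return 404, None, b""
    target_type, target_data = store[entry["id"]]
    return 200, target_type, target_data


# Placeholder IDs; real transaction IDs are 43-character strings.
store = {
    "manifest-tx": (
        MANIFEST_CONTENT_TYPE,
        json.dumps({
            "manifest": "arweave/paths",
            "version": "0.1.0",
            "paths": {"css/style.css": {"id": "css-tx"}},
        }).encode("utf-8"),
    ),
    "css-tx": ("text/css", b"body { margin: 0; }"),
}

print(resolve(store, "manifest-tx", "css/style.css")[:2])  # prints: (200, 'text/css')
print(resolve(store, "manifest-tx", "missing.txt")[0])     # prints: 404
```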

Gateways MUST NOT issue redirects (HTTP 301, 302, etc.) to resolved content, and the URL the user requested MUST be preserved in browsers.

If the manifest resolves to an invalid transaction ID, or a transaction ID that the gateway doesn't have access to, then the gateway MUST return an HTTP 404 status.

If a manifest is accessed directly (i.e. with no subpath, with or without a trailing slash) and it contains an index.path value, then the gateway MUST look up the value of index.path in the paths object, and resolve the content defined in paths[index.path].id in the same way other subpaths are resolved. The gateway MUST NOT issue a redirect to index.path.

Gateways SHOULD NOT assume index behaviours, i.e. if there is no index.path defined then the gateway SHOULD NOT inspect the paths and use a common value like index.html. Instead, the gateway SHOULD return a list of all the available paths from the manifest in a human-readable style (i.e. it SHOULD return an HTML page of links, so a user can quickly and easily click through to the paths).
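A human-readable listing could be generated along these lines. The markup is only one possible layout, and render_listing and its prefix parameter are hypothetical. Paths are escaped so that unusual keys cannot inject markup into the page.

```python
from html import escape
from urllib.parse import quote


def render_listing(manifest, prefix=""):
    """Render the manifest's paths as a minimal HTML page of links."""
    links = []
    for path in manifest["paths"]:
        # Percent-encode the path for the URL, then escape it for HTML.
        href = escape(prefix + quote(path), quote=True)
        links.append(f'<li><a href="{href}">{escape(path)}</a></li>')
    return "\n".join(
        ["<!DOCTYPE html>", "<html>", "<head></head>", "<body>", "<ul>"]
        + links
        + ["</ul>", "</body>", "</html>"]
    )


page = render_listing({
    "paths": {
        "index.html": {"id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"},
        "js/app.js": {"id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"},
    }
})
print(page)
```

Relative links like these resolve against the manifest only when the page URL ends with a slash, so a gateway may prefer to pass a prefix such as "/" plus the manifest ID plus "/".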

If index.path does not map to a valid key in the manifest paths, then the gateway SHOULD return an error.
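The three possible outcomes of a request with no subpath can be summarised in a small decision function. This is a sketch only; resolve_index and its return labels are hypothetical, and the error case is left as a label because this document does not name a status code for it.

```python
def resolve_index(manifest):
    """Decide what to serve when a manifest is requested with no subpath.

    Returns ("content", tx_id), ("listing", None) or ("error", None).
    """
    index = manifest.get("index")
    if index is None:
        # No index defined: do not guess index.html, list the paths instead.
        return "listing", None
    entry = manifest.get("paths", {}).get(index.get("path"))
    if entry is None:
        # index.path does not name a key in paths.
        return "error", None
    return "content", entry["id"]


paths = {"index.html": {"id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"}}

print(resolve_index({"index": {"path": "index.html"}, "paths": paths}))
print(resolve_index({"paths": paths}))
print(resolve_index({"index": {"path": "home.html"}, "paths": paths}))
```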

ETag headers are used to identify content for caching purposes. As such, the ETag value MUST be set to the transaction ID of the resolved content being served, and MUST NOT be the ID of the manifest itself, unless the manifest is being accessed without a subpath and no index has been defined.
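The ETag rule amounts to a simple choice between two IDs. The sketch below assumes a hypothetical response_headers helper that receives the manifest ID and, when a subpath or index was resolved, the ID of the resolved content.

```python
def response_headers(content_type, manifest_id, resolved_id=None):
    """Build response headers, picking the ETag according to the rule above."""
    # Use the resolved content's ID; fall back to the manifest ID only when
    # nothing was resolved (no subpath and no index, i.e. a path listing).
    etag = resolved_id if resolved_id is not None else manifest_id
    return {"Content-Type": content_type, "ETag": etag}


# Subpath resolved to the logo transaction.
print(response_headers(
    "image/png",
    "mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U",
    resolved_id="QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU",
))

# Listing served for a manifest with no index defined.
print(response_headers("text/html", "Vi1e4yKq1NFLYvtXj77d4hZqFoXgZLvEYsoIg7VX530"))
```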

Examples

Base path with no subpath specified

Manifest - mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U

{
  "manifest": "arweave/paths",
  "version": "0.1.0",
  "index": {
    "path": "index.html"
  },
  "paths": {
    "index.html": {
      "id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"
    },
    "js/app.js": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/style.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/mobile.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "assets/img/logo.png": {
      "id": "QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU"
    },
    "assets/img/icon.png": {
      "id": "0543SMRGYuGKTaqLzmpOyK4AxAB96Fra2guHzYxjRGo"
    }
  }
}

Request

GET /mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U

Response

The resolved data from cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI, as index.html is defined as the index.path value.

HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: public,immutable,max-age=31536000
ETag: cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI

Valid subpath specified

Manifest - mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U

{
  "manifest": "arweave/paths",
  "version": "0.1.0",
  "index": {
    "path": "index.html"
  },
  "paths": {
    "index.html": {
      "id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"
    },
    "js/app.js": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/style.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/mobile.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "assets/img/logo.png": {
      "id": "QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU"
    },
    "assets/img/icon.png": {
      "id": "0543SMRGYuGKTaqLzmpOyK4AxAB96Fra2guHzYxjRGo"
    }
  }
}

Request

GET /mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U/assets/img/logo.png

Response

The resolved data from QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU as assets/img/logo.png is a valid subpath defined in the manifest.

HTTP/1.1 200 OK
Content-Type: image/png
Cache-Control: public,immutable,max-age=31536000
ETag: QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU

Invalid subpath

Manifest - mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U

{
  "manifest": "arweave/paths",
  "version": "0.1.0",
  "index": {
    "path": "index.html"
  },
  "paths": {
    "index.html": {
      "id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"
    },
    "js/app.js": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/style.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/mobile.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "assets/img/logo.png": {
      "id": "QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU"
    },
    "assets/img/icon.png": {
      "id": "0543SMRGYuGKTaqLzmpOyK4AxAB96Fra2guHzYxjRGo"
    }
  }
}

Request

GET /mmJhKOPp2o73LXze05mj5qPVlgu4MsDzqCwpWKqoc7U/path/does/not/exist.txt

Response

No such subpath is defined, so an error is returned.

HTTP/1.1 404 Not Found

No default value defined

Manifest - Vi1e4yKq1NFLYvtXj77d4hZqFoXgZLvEYsoIg7VX530

{
  "manifest": "arweave/paths",
  "version": "0.1.0",
  "paths": {
    "index.html": {
      "id": "cG7Hdi_iTQPoEYgQJFqJ8NMpN4KoZ-vH_j7pG4iP7NI"
    },
    "js/app.js": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/style.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "css/mobile.css": {
      "id": "fZ4d7bkCAUiXSfo3zFsPiQvpLVKVtXUKB6kiLNt2XVQ"
    },
    "assets/img/logo.png": {
      "id": "QYWh-QsozsYu2wor0ZygI5Zoa_fRYFc8_X1RkYmw_fU"
    },
    "assets/img/icon.png": {
      "id": "0543SMRGYuGKTaqLzmpOyK4AxAB96Fra2guHzYxjRGo"
    }
  }
}

Request

GET /Vi1e4yKq1NFLYvtXj77d4hZqFoXgZLvEYsoIg7VX530

Response

As no index value has been defined, a listing of all the valid paths is displayed.

HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: public,immutable,max-age=31536000
ETag: Vi1e4yKq1NFLYvtXj77d4hZqFoXgZLvEYsoIg7VX530

<!DOCTYPE html>
<html>
<head></head>
<body>
<ul>
<li><a href="index.html">index.html</a></li>
<li><a href="js/app.js">js/app.js</a></li>
<li><a href="css/style.css">css/style.css</a></li>
<li><a href="css/mobile.css">css/mobile.css</a></li>
<li><a href="assets/img/logo.png">assets/img/logo.png</a></li>
<li><a href="assets/img/icon.png">assets/img/icon.png</a></li>
</ul>
</body>
</html>