Add a unique identifier for a PWA #586

Closed
adewale opened this issue Jun 22, 2017 · 122 comments · Fixed by #988

@adewale

adewale commented Jun 22, 2017

If one is building a PWA directory, app store, or search engine that detects PWAs, one needs a way to uniquely identify a PWA from just the manifest.

Currently the spec doesn't explicitly say what that identifier or tuple of identifiers should be, which leads to issues like: GoogleChromeLabs/gulliver#323

@marcoscaceres
Member

@adewale, thanks! We will see how the gulliver project handles that and whether a solution emerges.

@mgiuca
Collaborator

mgiuca commented May 2, 2018

The spec does specify a tuple that uniquely identifies the app, but unfortunately it's got a huge problem (I thought there was a bug on it but I can't find one, so I just filed #668).

The steps for processing a manifest are given by the following algorithm. The algorithm takes a string text as an argument, which represents a manifest, and a URL manifest URL, which represents the location of the manifest, and a URL document URL.

This means that the entire identity of the app is uniquely determined by the tuple (text, manifest URL, document URL). Though practically, since text can be derived from manifest URL, it means just the pair (manifest URL, document URL).

I think the fact that it is a function of document URL is a problem, as outlined in #668. If we fix that, then the manifest URL becomes the sole unique identifier of an app.
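To make the difference concrete, here is a minimal sketch of the two candidate keys. The function names and the example.com URLs are hypothetical and not taken from the spec; the sketch only shows that one manifest linked from two pages yields two identities under the pair, and a single identity under the manifest URL alone.

```javascript
// Key as the processing algorithm currently implies: (manifest URL, document URL).
function pairKey(manifestUrl, documentUrl) {
  return JSON.stringify([manifestUrl, documentUrl]);
}

// Key if the document URL is dropped, as suggested in #668.
function manifestOnlyKey(manifestUrl) {
  return manifestUrl;
}

// Hypothetical site: two pages link to the same manifest.
const manifestLocation = "https://example.com/manifest.json";
const pages = ["https://example.com/", "https://example.com/settings"];

const pairKeys = new Set(pages.map((page) => pairKey(manifestLocation, page)));
const manifestKeys = new Set(pages.map(() => manifestOnlyKey(manifestLocation)));

console.log(pairKeys.size);     // 2: each linking page looks like a separate app
console.log(manifestKeys.size); // 1: one app regardless of the linking page
```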

(Note: The Service Worker URL does not need to be included as part of the identifier. The SW is an implementation detail of the app --- a detail that we require, but we do not need to know where it lives, what its scope is, etc.)

@ithinkihaveacat

https://pwa-directory.appspot.com/ has a collection of 1366 manifests; of these around 71 (5%) look "versioned":

$ curl -sSL 'https://pwa-directory.appspot.com/api/pwa/?limit=4000' | jq -r '.[] | .manifestUrl' | sort | perl -ne 'print if /[0-9a-fA-F]{7}/ || /v[0-9]+/ || /v=/'
https://ademola.adegbuyi.me/_nuxt/manifest.1c4bdc21.json
https://app.mangahigh.com/fea_201803191329/misc/mobile-manifest.json
https://assets.production.spokeo.com/assets/v9/manifest-25a702bcac88b536992cff4cc78d9e75d7d40dc36f746ed69604a2c40d0aba5d.json
https://beta.mic.com/manifest.json?b=1478894131181397
https://betcruncher.com/manifest.3183cc2d8ff6fa85748fc8c6a4f796cd2a95d2e9.json
https://big-andy.co.uk/content/themes/v5/manifest.json
https://blackjack.io/manifest.9f463e8a23e16b31f7219dce967e1df6.json
https://blendle.com/manifest-5a96b3b4ec.json
https://boardom.io/manifest.json?v=3
https://bookourplane.com/manifest.json?v=LbbRAnjJQL
https://browsersync.io/manifest.json?v=qAqkxQaJm0
https://cdn.bloodhorse.com/current/favicons/manifest.json?v=KmbG9gpjz7
https://cdn.getyourguide.com/static/c6754d394589/customer/desktop/static/manifest.json
https://cdn.lyft.com/webclient/icons-463e5ce/manifest.json
https://cdn.shopify.com/s/files/1/0014/1962/t/21/assets/manifest.json?17982843544509738478
https://choualbox.com/manifest.json?v=1282
https://clay.io/manifest.json?data=eyJpY29ucyI6W3sic3JjIjoiaHR0cHM6Ly9jZG4ud3RmL2QvaW1hZ2VzL3N1cGVybm92YS9pY29uLnBuZyIsInNpemVzIjoiMjU2eDI1NiIsInR5cGUiOiJpbWFnZS9wbmcifV0sInNob3J0X25hbWUiOiJDbGF5IEdhbWVzIiwibmFtZSI6IkNsYXkgR2FtZXMiLCJzdGFydF91cmwiOiIuLz91dG1fc291cmNlPXdlYl9hcHBfbWFuaWZlc3QiLCJiYWNrZ3JvdW5kX2NvbG9yIjoiI2ZhZmFmYSIsInRoZW1lX2NvbG9yIjoiI2ZmOGEwMCIsImRpc3BsYXkiOiJzdGFuZGFsb25lIn0=
https://cs1.wettercomassets.com/wcomv5/images/icons/favicon/manifest.json?201708031719
https://d1c42d2bmccy49.cloudfront.net/manifest.json
https://dev-quests.appspot.com/static/manifest.b9d743cdb670650edbb180662a9443e56add2d7fcbc9e7c5d7f73c7bfd20ded5.json
https://developer.chrome.com/devsummit/static/manifest.32a1e88bd98d232c73fbf2f2c5ff552b4c9782f991d6114a3ffa17c5f9390528.json
https://devpractic.es/notifmanifest.php?v=635
https://direct.asda.com/on/demandware.static/-/Sites-ASDA-Library/default/dwb2a11ac9/Manifest/manifest.json
https://ephemeral.now.sh/manifest.91ccc2dacd83c8815c8286043c23a9ae.json
https://erwinandres.github.io/tudu/manifest.json?v=2
https://facerepo.com/app/images/favicons/manifest.json?v=a701bd98
https://feeddeck.glitch.me/manifest.json
https://flat.io/manifest.json?v1
https://grocery.walmart.com/js/icons-4b00caed44fcb95f57dd4efc82d1a2c2/manifest.json
https://hn.nuxtjs.org/_nuxt/manifest.d7491a08.json
https://hpbn.co/7a58c37113db4464699ec4f4646b5566.json
https://jimdo-dolphin-static-assets-prod.freetls.fastly.net/cms/static/manifest.c4bb9662.json
https://kuranz.com/manifest.0252de652255e03775ee2f57d96ec003.json
https://m-travel.jumia.com/manifest.9cd19691.json
https://m.apkpure.com/manifest_v10.json
https://m.avito.ru/s/mobile/web-app-manifest.json?5e1ff91
https://m.badoo.com/badoo/manifest-en.json?v101
https://m.gala.de/r1519124462501/manifest.json
https://magento-imagine-2018.firebaseapp.com/_nuxt/manifest.555d3617.json
https://magnetis.com.br/assets/magnetis_app/manifest-638829635f8669ddb668e944e37aee4241964bd32d7f2dd857d1d7c8e16e8bfd.json
https://memoui.com/static/20180412001537/manifest.json
https://motog3.com/wp-content/plugins/onesignal-free-web-push-notifications/sdk_files/manifest.json.php?gcm_sender_id=995691934152
https://preact-pwa-yfxiijbzit.now.sh/manifest-a57e627c89.json
https://prpl-dot-captain-codeman.appspot.com/20170806/es6-unbundled/manifest.json
https://quillie.net/manifest.38100eca.webmanifest
https://reittiopas.foli.fi/icons-turku-6aa88e8a010a06d1d30d24205371f8d3//manifest.json
https://rofr.in/manifest.json?v6=bOO8oaa856
https://schsrch.xyz/resources/0350094f9232803bcc0fd86c3cbd31f1.json
https://sp-web.search.auone.jp/manifest_v2.json
https://ssl.tzoo-img.com/res/favicon/manifest.json?v=2kq2msw2
https://static1-ssl.dmcdn.net/images/neon/favicons/manifest.json.vb58fcfa7628c92052
https://static3.1tv.ru/assets/web/favicon/manifest-1d3e08042839f3a7499da28ea190f0d5.json
https://theomg.github.io/Lifelike/manifest.ee9a11377982a365a8aeae5b9095fe11.json
https://townwork.net/js/manifest?v=20160302001
https://travel.jumia.com/manifest.9fa818c2.json
https://unacademy.com/dist/manifest.json?1487235791853
https://unacademy.com/dist/manifest.json?1505821603929
https://weather.com/weather/assets/manifest.507fcb498f4e29acfeed7596fe002857.json
https://webamp.org/manifest.60fc98cc18ea0b3ab073cda74610efa1.json
https://www.amarujala.com/manifest.json?v=85b484467f
https://www.boldsky.com/browser.json?v=1.0.1
https://www.buzzfeed.com/static-assets/data/manifest.0edfa72a42a9e70e5bf211f64eae9384.json
https://www.colorblindsim.com/manifest.8b7a3d31.webmanifest
https://www.cookscountry.com/_search_assets/cco-manifest-707681872ff6b432492f3fe509aaae89.json
https://www.elo7.com.br/v3/manifest/webapp.json
https://www.freecharge.in/mobile/manifest.json?v=1
https://www.ft.com/assets/manifest/manifest-v6.json
https://www.gp.se/polopoly_fs/3.200.1523348202!/sites/se.gp/images/manifest.json
https://www.iheart.com/manifest.6a2f10c7f194b2a76747f18937e42951.json?rev=7.44.0
https://www.imperialcarsupermarkets.co.uk/manifest.json?v=gAEgYPxJpw
https://www.istitlaa.me/_nuxt/manifest.57352a3d.json
https://www.johnlewis.com/assets/fc539d9/favicons/manifest.json
https://www.koolsol.com/manifest-20170311-01.json
https://www.koolsol.com/manifest-20170526-01.json.php
https://www.liverpoolecho.co.uk/manifest.json?v=548e74556b39b6b25a2b7a4828f7783e
https://www.nouvelobs.com/manifest.json?1510150956
https://www.onthemarket.com/assets/52bbb4af/gzip/js/manifest.json
https://www.onthemarket.com/assets/80f6edfa/gzip/js/manifest.json
https://www.openrent.co.uk/manifest.json?v=9BaGKJ78xe
https://www.otto.de/static/all/img/global-resources/fc44d9d421d3577b/favicons/manifest.json
https://www.otto.nl/3ce8d08884c912ec9b98774bab49a8eff3604010/assets/ottonl/resources/manifest.json
https://www.padpiper.com/manifest.ab5a95547c7ae8833813533907eb0631.json
https://www.pigiame.co.ke/assets/pi-site/favicon/site-ad611bc177.webmanifest
https://www.pitchup.com/manifest.json?v=4
https://www.pricehipster.com/manifest.json?v=1
https://www.reittiopas.fi/icons-hsl-18da13427c6e362f148f4a5b783ee98c//manifest.json
https://www.selcobw.com/skin/frontend/selco/default/assets/manifest.json?6335544
https://www.sho-yamane.me/_nuxt/manifest.7e00d6b4.json
https://www.stylewe.com/manifest.json?v=9255619
https://www.thekitchn.com/assets/tk/favicons/manifest-8afd9804080ba4ee9351cb5adc20383f47f40fe276d62bd25467bdadf5d5c0d6.json
https://www.viz.com/favicon/manifest.json?v=oLLRlE8ljO
https://www.walmart.ca/assets/9d1a7c78e21cc1c3c71ae9f8a8918b0d-home-screen-manifest-en.json
https://www.yiv.com/manifest.json?2017022101

So if Chrome and others switch to using manifest URL to uniquely identify PWAs (and this data is representative of PWAs in general), then around 5% of sites will generate a new A2HS prompt when the manifest URL changes (perhaps only when the content of the manifest changes, but potentially on every deployment).
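For anyone who wants to rerun the classification without perl, the filter from the command above can be restated roughly as follows. This is a sketch of the same three regular expressions only; the function name is made up here, and the final sample URL is a hypothetical unversioned one.

```javascript
// Same heuristic as the perl filter: a manifest URL "looks versioned" if it
// contains 7 consecutive hex characters, a "v<digits>" token, or a "v=" parameter.
const VERSIONED_PATTERNS = [/[0-9a-fA-F]{7}/, /v[0-9]+/, /v=/];

function looksVersioned(manifestUrl) {
  return VERSIONED_PATTERNS.some((pattern) => pattern.test(manifestUrl));
}

// Sample URLs taken from the list above, plus one hypothetical unversioned URL.
const sample = [
  "https://blendle.com/manifest-5a96b3b4ec.json",
  "https://boardom.io/manifest.json?v=3",
  "https://m.apkpure.com/manifest_v10.json",
  "https://example.com/manifest.json", // hypothetical
];

const versioned = sample.filter(looksVersioned);
console.log(`${versioned.length} of ${sample.length} look versioned`); // 3 of 4 look versioned
```

The heuristic is approximate. For instance, https://feeddeck.glitch.me/manifest.json matches only because "feeddec" in the hostname happens to be seven hex characters, so the 5% figure is best read as an upper-end estimate.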

(Is Chrome using manifest URL right now? I tried changing the manifest URL on a test site and didn't get the A2HS prompt. So I suspect Chrome is currently applying a different heuristic to identify new/updated PWAs.)

@mgiuca
Collaborator

mgiuca commented May 7, 2018

Interesting that so many of them are versioned. I wonder where this advice comes from? Could it be that "best practice" with service workers is to version all assets, and the manifest is just being versioned along with that?

> So if Chrome and others switch to using manifest URL to uniquely identify PWAs ... then around 5% of sites will generate a new A2HS prompt when the manifest URL changes

I think there's still some confusion here. Chrome already uses the manifest URL to uniquely identify an app. If the manifest URL changes, it's a different app. There are no changes to Chrome that need to be made along these lines (this bug is to document this in the spec, which I think is reasonable).

> Is Chrome using manifest URL right now? I tried changing the manifest URL on a test site and didn't get the A2HS prompt. So I suspect Chrome is currently applying a different heuristic to identify new/updated PWAs.

I think it is. If you change the manifest URL you should get a new app. Theories for why you aren't:

  • When you changed the manifest URL, the new manifest didn't satisfy the PWA checks.
  • Someone mentioned that Chrome for Android has a limit of 3 apps per origin. It's possible you hit that limit?
  • I'm mistaken and Chrome for Android is in fact using the start_url or something as the unique key.

@marcoscaceres
Member

@mgiuca wrote:

> Interesting that so many of them are versioned. I wonder where this advice comes from? Could it be that "best practice" with service workers is to version all assets, and the manifest is just being versioned along with that?

Yeah, @jakearchibald and friends were promoting this a while back as part of SW development (that's not to point fingers - caching is hard, and that approach works well). However, it's still a hack... and like all hacks, it has pros/cons. Additionally, the file hashing may be baked into some developer/command line tools, like webpack - but Jake probably knows more.

@jakearchibald

Pretty sure I never recommended versioning manifests specifically. But it's good practice to version assets and treat their URLs as immutable generally. This isn't anything to do with service worker, it's just good caching practice https://jakearchibald.com/2016/caching-best-practices/.

If manifest is an exception to the rule, we need to do some dev rel'ing so folks understand why. The service worker script url is one of these exceptions, and I documented it here https://developers.google.com/web/fundamentals/primers/service-workers/lifecycle#avoid_changing_the_url_of_your_service_worker_script.

@mgiuca
Collaborator

mgiuca commented May 8, 2018

@jakearchibald Yeah the manifest URL should have the same policy applied as the SW URL --- it's probably more important since you can migrate to a new SW URL (just takes some fiddling) but it's not generally possible to move to a new manifest URL (without segmenting your installed base).

(Aside: I think we should support updating manifest URL using HTTP 301 Moved Permanently; not sure if this needs to be specced or if we can just implement this.)

@alancutter
Contributor

This issue has come up again in the context of updating installed PWA manifest data.

  • When a site has changed its name/scope/theme_color/start_url/manifest URL, how do we know we're looking at the same app and not a different one that shares the same scope?
  • When a PWA installation is synced across devices, how do we know the sync has been satisfied when sites may have arbitrary device-specific differences in their metadata?

I don't think identifying an app by its manifest URL is reasonable long term. Sites should be able to re-architect their directory structure or web framework, which may necessitate a change of manifest URL during the lifetime of a user install.

I think we should add an optional "id" field to the manifest that defaults to the manifest URL but can be overridden with whatever the site likes. This ID will be scoped to the start_url's origin and cannot collide with IDs from other origins. This would enable sites to update any aspect of their manifest except their origin and the id.

@adewale
Author

adewale commented Jul 12, 2019 via email

@alancutter
Contributor

The id field will default to the manifest URL, e.g. "https://app.com/manifest.webmanifest", but can be any string, e.g. "jdklklfpinionkgpmghaghehojplfjio".
The actual app ID will be a tuple of (start_url origin, manifest id), e.g. ("https://app.com/", "jdklklfpinionkgpmghaghehojplfjio").
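A minimal sketch of how a user agent might compute that tuple, assuming the proposal exactly as described here. The helper name and the second manifest path are hypothetical, and the treatment of a missing start_url is an assumption of the sketch rather than part of the proposal.

```javascript
// Hypothetical helper: compute the proposed app ID pair for a manifest.
// `manifestUrl` is where the manifest was fetched from; `manifest` is the parsed JSON.
function proposedAppId(manifestUrl, manifest) {
  // start_url is resolved against the manifest URL; falling back to the
  // manifest URL itself when start_url is missing is an assumption of this sketch.
  const startUrl = new URL(manifest.start_url ?? manifestUrl, manifestUrl);
  // The proposal: "id" is optional and defaults to the manifest URL.
  const id = manifest.id ?? manifestUrl;
  return { origin: startUrl.origin, id };
}

const before = proposedAppId("https://app.com/manifest.webmanifest", {
  start_url: "/",
  id: "jdklklfpinionkgpmghaghehojplfjio",
});
// Same app after the site moves its manifest to a new (hypothetical) path.
const after = proposedAppId("https://app.com/static/v2/manifest.webmanifest", {
  start_url: "/",
  id: "jdklklfpinionkgpmghaghehojplfjio",
});

const sameApp = before.origin === after.origin && before.id === after.id;
console.log(sameApp); // true
```

Note that `URL.origin` serializes as "https://app.com" without the trailing slash used in the example tuple above; the comparison is unaffected as long as both sides use the same serialization.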

@mgiuca
Collaborator

mgiuca commented Jul 12, 2019

Note that the ID itself will be a totally meaningless (to the web platform) string; it's just an opaque token that uniquely identifies the app within the origin's namespace (so there are no naming conflicts between origins, but you must be careful to uniquely identify your app within your own origin).

We would probably recommend that the ID be a URL relative to the origin, since that would guarantee uniqueness, but we wouldn't derive any meaning from it.

The default of it being the manifest URL would be to preserve the historical fact that the manifest has uniquely identified the app.

@marcoscaceres changed the title from "Document the unique identifier or tuple of identifiers for a PWA" to "Add a unique identifier for a PWA" on Apr 2, 2020
@marcoscaceres
Member

I like @mgiuca's idea (#586 (comment)) of the id just being a meaningless URL resolved against the manifest URL.

@mgiuca
Collaborator

mgiuca commented Apr 2, 2020

> the id just being a meaningless URL resolved against the manifest URL

That's not quite what I was suggesting. I was saying it's a meaningless string (doesn't have to be a URL at all). It's an arbitrary character string, that isn't resolved against the manifest URL; it forms part of a unique key, in a pair with the origin (so that two origins with the same id won't collide).

@marcoscaceres
Member

ah, sorry, I misread. I still like the idea :)

@benfrancis
Member

I remember this topic being debated at some length in the sysapps working group in about 2013. My personal opinion has always been that the manifest URI alone should be treated as the identifier of a web application and a different manifest URI should be assumed to be a different application.

Some of the reasons being:

  1. It provides a simple URI as an identifier to use as an index in a database of apps, which also happens to resolve to the metadata describing the app
  2. The manifest URI can be resolved periodically by the user agent to check for updates
  3. No ambiguity over whether two applications within the same origin/sharing the same start_url/sharing the same scope/claiming the same internal ID are the same app or different apps

In implementing the manifest specification recently I found it a real pain trying to use some kind of combination of the origin/start URL/manifest URL/content hash as an identifier and in the end gave up and just used the manifest URL anyway.

I understand why people might want to version the manifest URL and caching is indeed hard, but I would argue there are other solutions to that problem. Cool URIs don't change.

@mgiuca
Collaborator

mgiuca commented Apr 3, 2020

I appreciate the sentiment that Cool URIs don't change (especially since that page seems to have existed for 22 years at the same URL). But the reality is, developers do want to change their URLs, including the manifest, not just for versioning but to keep their site organised.

The problem as I see it is that we've never specified what makes a unique identifier for an app. So implementations can use the manifest URL, but that's essentially creating a de facto standard that developers have to divine based on the (conflicting) implementations. This isn't just some user-agent-specific logic, it actually affects how developers are allowed to run their sites (i.e., am I allowed to change my manifest URL? The spec doesn't say, I just have to try it and see if it breaks browsers.) So whatever the answer is, it should be specified and consistent across browsers.

I do like the idea of the manifest URL being the key, for your reasons 1 and 2 (you can just point a store listing or admin install config at a manifest URL and it tells you everything you need to index and install the app).

But it has the significant drawback that developers can never change their manifest URL once the site is launched. We could possibly work around that by adding an explicit ability to migrate users to a new manifest URL (which could be as simple as stating that an HTTP 301 redirect on the manifest URL says to update to the new location). But it would be simpler if we didn't tie the key to the manifest URL in the first place.
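As a rough illustration of the redirect idea, the sketch below follows a chain of permanent redirects from an installed manifest URL to its current location. Nothing here is specified anywhere: the redirect table stands in for real HTTP 301 responses, all URLs are hypothetical, and the hop limit is an arbitrary choice.

```javascript
// Stand-in for server behaviour: old manifest URL -> Location of its 301 response.
// No network access; all URLs are hypothetical.
const permanentRedirects = new Map([
  ["https://example.com/manifest.json", "https://example.com/app/manifest.json"],
  ["https://example.com/app/manifest.json", "https://example.com/app/site.webmanifest"],
]);

// Follow permanent redirects from the installed manifest URL to its current home.
function resolveMigratedManifestUrl(installedUrl, redirects, maxHops = 5) {
  let current = installedUrl;
  const seen = new Set([current]);
  for (let hop = 0; hop < maxHops && redirects.has(current); hop++) {
    current = redirects.get(current);
    if (seen.has(current)) {
      throw new Error(`Redirect loop at ${current}`);
    }
    seen.add(current);
  }
  return current;
}

const installed = "https://example.com/manifest.json";
console.log(resolveMigratedManifestUrl(installed, permanentRedirects));
// https://example.com/app/site.webmanifest
```

Under this scheme every manifest URL a site has ever shipped needs to keep redirecting, since an installed app may still be keyed by any of them.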

> 3: No ambiguity over whether two applications within the same origin/sharing the same start_url/sharing the same scope/claiming the same internal ID are the same app or different apps

That is true of any standardized solution. The ambiguity comes from the current reality of it not being specified.

@alancutter
Contributor

I think we should expect to need to add an ID migration mechanism anyway to cover changing origins. Being able to ping the manifest directly is extremely attractive and perhaps having to perform a migration to change your manifest URL is worth it.

@mgiuca
Collaborator

mgiuca commented Apr 3, 2020

Yeah, that's true. Did we have any other reasons (@alancutter) to propose the explicit id scheme, besides being able to migrate your manifest?

I suppose we should consider two separate use cases here:

  1. A developer wanting to migrate their manifest URL once in a while.
  2. Manifest URL is versioned so it changes every time the manifest changes.

Doing an explicit migration is suitable for 1. But I don't think you'd want to do this for 2, otherwise you'd have to make your old manifests 301 to the new one every time. So this would probably preclude being able to version your manifests. Which as @jakearchibald said in 2018, is actually best practice (or would be, if it worked; at present it's best practice for everything but the manifest because of this problem).

@benfrancis
Member

@mgiuca wrote:

> whatever the answer is, it should be specified and consistent across browsers.

I agree.

See also: #446 and #384.

The "Updating the manifest" section of the specification has been empty since 2016 when the same-origin constraint was dropped for manifest URLs and default scope was defined as "unbounded" (later changed) which made things more complicated.

Whatever solution is eventually specified for updates will obviously be influenced by what is used as the unique identifier for an app. Having a relatively stable manifest URI that can be fetched periodically seems like the obvious solution to me. When apps can have overlapping navigation scopes and start URLs can change, something needs to be stable.

Migrating manifest URIs via redirects could work for occasional changes to app structure, but as you suggest it could get unwieldy if the developer tries to change the manifest URL every time the manifest's content is updated for caching purposes. In practice it might be simpler for a developer to just treat a significantly restructured version of the app as a new app, and use other strategies for caching/versioning.

@wanderview
Member

wanderview commented Apr 3, 2020

Note, I expect we will need an identifier mechanism for service workers as well for similar use cases; e.g. migrating from one scope to another. It would be difficult for sites to manage the teardown of one service worker and migration to another without something like this.

Do you plan to make your proposal work for service workers as well?

The strawman I had been thinking of was something like:

navigator.serviceWorker.register('sw.js', {
  scope: '/some/scope',
  token: 'my-origin-unique-token',
});

So if you call register again with the same token, but a different scope we would migrate the current service worker registration to the new scope.
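The migration behaviour described above can be modelled with a toy in-memory store. This is only a sketch of the strawman's semantics: the `RegistrationStore` class is invented for illustration, and the real `navigator.serviceWorker.register()` has no `token` option.

```javascript
// Toy in-memory model of the strawman; this is NOT the real
// navigator.serviceWorker API, which has no `token` option.
class RegistrationStore {
  constructor() {
    this.byToken = new Map(); // token -> { scriptUrl, scope, token }
  }

  register(scriptUrl, { scope, token }) {
    const existing = this.byToken.get(token);
    if (!existing) {
      const registration = { scriptUrl, scope, token };
      this.byToken.set(token, registration);
      return { registration, migrated: false };
    }
    // Same token: keep the registration object, move it if the scope changed.
    const migrated = existing.scope !== scope;
    existing.scope = scope;
    existing.scriptUrl = scriptUrl;
    return { registration: existing, migrated };
  }
}

const store = new RegistrationStore();
const first = store.register("sw.js", { scope: "/some/scope", token: "my-origin-unique-token" });
const moved = store.register("sw.js", { scope: "/new/scope", token: "my-origin-unique-token" });

console.log(moved.migrated);                            // true
console.log(moved.registration === first.registration); // true
console.log(store.byToken.size);                        // 1
```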

Edit: Sorry if this was already discussed in this thread. I've only been lightly following until recently.

@benfrancis
Member

> The strawman I had been thinking of was something like:
>
> navigator.serviceWorker.register('sw.js', {
>   scope: '/some/scope',
>   token: 'my-origin-unique-token',
> });

A couple of thoughts:

  1. I would find it odd if a web application was identified by anything other than a Uniform Resource Identifier. Apart from being what makes the web the web, URIs make great origin-unique tokens! I would hope there's no need to invent another type of ID namespace like Play Store/App Store style app IDs.
  2. What's the latest thinking on the mapping between a service worker and an app?

It would be really neat if there was a 1:1 mapping between the two and app scope == service worker scope, then you could use the manifest URI as the unique identifier for both and update both navigation scope and service worker scope together in a single update. (There used to be a similar kind of mapping in the manifest, but the other way around.)

But my understanding is that isn't the case and it's currently possible to have multiple service workers per app or multiple apps per service worker, or have one and not the other. If the two technologies are entirely de-coupled then maybe they have to have their own mechanisms for identifying an installed application vs. an installed service worker.

@mgiuca
Collaborator

mgiuca commented Apr 6, 2020

I'd rather not tie any of this to service workers; it increases the complexity by an order of magnitude. (That's why we ended up removing service worker from the manifest; they are unrelated, and that's by design, as with all the pieces of the web platform, they are separate and composable.)

The way I view it, the service worker is an implementation detail of the application (something the user can't see or interact with at all), while the manifest is the user-facing concept of an application. While generally websites will want to have them at the same scope, you can for instance have a top-level SW scope but with lots of smaller-scoped app manifests. I remember discussions early on in the desktop PWA project on Chrome to have links open in the app if they were within the service worker scope, which I shot down because I don't think service worker scope should have any bearing on the way the user experiences the app. (In much the same way as the user shouldn't care about whether there's a proxy server in between the client and the real backend.)

Under that philosophy, I don't see there being a particular need for a SW-to-SW migration. If you just tear down the old SW and spin up a new one, you can just re-cache everything (or find some mechanism to transfer the cache so it doesn't have to be redownloaded). That's a very different problem to manifest migration, which can't be done in user-space because it involves changing the URL that installed OS-level "apps" are pointing to, and potentially informing the user that the application is changing.

> I would find it odd if a web application was identified by anything other than a Uniform Resource Identifier. Apart from being what makes the web the web, URIs make great origin-unique tokens! I would hope there's no need to invent another type of ID namespace like Play Store/App Store style app IDs.

True. I like identifying things with URLs*. I would be OK with saying the "id" is a URL. But when we thought about it, the URL never actually gets resolved, so it would effectively just be an opaque string. If we did want to use URL syntax to express the ID, we shouldn't use the "https" scheme (since that implies it's an actual resolvable resource). We'd have to come up with our own scheme, like "webapp://example.com/user-specified-id". But in the id field, the origin would be implicit (since we can't let you specify the origin of your app's ID, it has to be the origin of your start_url/scope). So we may as well just make the id an opaque string, which if you like, can be formed into a URL like the above, but in practice, you'd never see the URL, and it would be easier to just state that "the unique identity of an application is the pair of (app origin, user-specified-id)", rather than inventing a whole new URL scheme.

*By the way, I'm avoiding use of the term "URI" simply because the URL Standard says that the term "URI" is deprecated in favour of "URL". I personally think it's useful having a distinction, but I fought this years ago, and gave up.

@benfrancis
Member

benfrancis commented Apr 6, 2020

> We'd have to come up with our own scheme, like "webapp://example.com/user-specified-id".

This is similar to what we did with Firefox OS, where we created URLs like webapp://1dd47458-abac-4637-b7e6-12c6e0ef9846. With hindsight I think creating a protocol scheme and namespace separate from the web was the biggest single technical mistake we made on the project, because over time it allowed applications to evolve into something which was missing many of the key benefits of the web, especially linkability. This is one of the key principles of "Progressive Web Apps".

> *By the way, I'm avoiding use of the term "URI" simply because the URL Standard says that the term "URI" is deprecated in favour of "URL". I personally think it's useful having a distinction, but I fought this years ago, and gave up.

Yeah I was just using the terms to distinguish between a URI which identifies an app and a URL which can be resolved to locate its metadata, but what I'm advocating for is a manifest URL which serves both functions.

My understanding is that the proposed use cases for an ID in this thread are:

  1. Uniquely identify a web app in a directory or app store
  2. Uniquely identify a web app when the metadata in its manifest is updated
  3. Sync installed web apps across devices, even if the server serves slightly different manifest metadata to different user agents

With the additional requirements:

  1. An ID which is guaranteed to be unique within its origin
  2. Ideally have backwards compatibility with user agents which have used manifest URL as an ID in the past
  3. Allow developers to change the URL of a web app manifest when needed and migrate to that new URL
  4. Enable caching a manifest and invalidating the cache of a manifest

Using the manifest URL to identify the app seems to me to fulfill all of these requirements. It can be used as a globally unique identifier (which is therefore also unique within its origin), can identify an app even when the contents of the manifest are updated or differ between user agents, allows cache control using cache headers, and can be migrated to a new URL if necessary using HTTP redirects.

It also has the benefit that it doesn't require inventing a new URL scheme for a new non-web namespace for installed web applications and doesn't require an algorithm to derive the identity of an app from multiple inputs. And finally, it has the benefit of providing a potential simple update mechanism which the manifest specification still doesn't have a solution for.

@benfrancis
Member

benfrancis commented Feb 18, 2021

@dmurph wrote:

> This would let malware 'take over' an app, as they would just set their id field to the manifest url.

Ah yes I see that's what I was missing, so an absolute URL wouldn't work.

@philloooo wrote:

> That's the reason we force the id to prefix with the start_url's origin. So if within the same origin, they serve multiple manifests with the same id, yeah they will be recognized as the same app. But I think that should be expected by the app developers for that site as this is spelled out in the spec.

Yes I agree that at least contains the problem to a single origin.

> The code for web app id is deeply embedded in Chromium in various places.

Which I imagine also makes changing the current behaviour to a different solution quite tricky.

> I updated the explainer with Firefox Android behavior

Thank you! Don't forget KaiOS browser.


OK, well I came here to offer my feedback, based on experience of implementing the specification a few times, that there's no need to add an id member to the manifest because manifests already have a natural universal identifier on the web which makes them directly linkable and discoverable and which de-references to a useful resource. It's unfortunate that we've gone so many years with no identifier being defined in the specification that it appears the obvious solution is no longer practical.

If there's now no option but to define an id member inside the manifest, then my recommendation is that if manifest URL can not be a default then scope is the next least worst option in the long term. If that makes manifest consistent with service workers then that's even better.

I wouldn't expect using scope as a default identifier to cause much breakage for existing applications because it is probably the most stable of all the available options inside the manifest, but I don't have any data available to me to validate that hypothesis. Start URL has always been a dubious choice as an identifier for a web application as it's the most likely to change, so it's unfortunate in my opinion that the Chrome team chose that option in the absence of a recommendation in the specification.

I also wouldn't expect breaking "implicit expectations" of developers to be a big problem because in practice developers don't just target a single browser, they are already targeting a range of desktop and mobile browsers which behave inconsistently. It may even fix problems they were previously having with changing start URL or manifest URL, without any action on their part.

As an independent Invited Expert I can only offer my feedback, I don't have a large user base behind me to give that feedback much leverage. So as long as Mozilla, Apple and KaiOS Technologies are not participating in the discussion, I expect the Chrome team will do what they believe is best for their users, and developers.

Thank you @philloooo and @dmurph for taking the time to explain your rationale.

@philloooo
Collaborator

philloooo commented Feb 18, 2021

Thanks benfrancis, do you know if KaiOS supports updating installed web apps?

@wanderview
Member

@wanderview wrote:

The plan of record in service-worker-land is to default to scope when using the legacy single scope attribute and there is no id. If you are using the new-fangled-scope attribute that supports multiple scope values, etc, then you are required to set an explicit id.

That's interesting, and eases my concern about multiple scopes inside a manifest. Is there a draft that we can read anywhere? Aligning manifest with Service Worker in this respect seems attractive.

I just want to highlight that in service-worker-land there is nothing like start_url, etc, to use as an id. So falling back to scope as the default id is mainly because scope is the identifier today (since there is nothing else unique to use). Since the fallback is for compatibility, it probably makes sense to fallback to whatever is currently being used as the id. In this case it sounds like at least two browsers use start_url for that (if I have followed correctly).

@fabricedesre

Thanks benfrancis, do you know if KaiOS supports updating installed web apps?

Yes we do. I need a bit more time to digest this thread, but I plan to give feedback. Sorry for the delay!

@dmurph
Collaborator

dmurph commented Feb 18, 2021

@fabricedesre 🙏 thank you! Excited to hear your thoughts

@benfrancis
Member

@wanderview wrote:

I just want to highlight that in service-worker-land there is nothing like start_url, etc, to use as an id. So falling back to scope as the default id is mainly because scope is the identifier today (since there is nothing else unique to use). Since the fallback is for compatibility, it probably makes sense to fallback to whatever is currently being used as the id. In this case it sounds like at least two browsers use start_url for that (if I have followed correctly).

Yes, and by my count there are currently about 10 user agents which use manifest URL, which isn't really reflected in the explainer.

@dmurph
Collaborator

dmurph commented Feb 22, 2021

@benfrancis wrote:
Yes, and by my count there are currently about 10 user agents which use manifest URL, which isn't really reflected in the explainer.

Just to close the loop, I think we end up with this list of user agents that use the manifest URL:

  • Android
    • Chromium (please correct me if any of these are forks of something else)
      • Chrome
      • Samsung
      • Edge
      • Opera
      • Brave
      • UC Browser
      • QQ Browser
      • Huawei Browser
      • Baidu
      • (etc)

And then I believe the following use start_url:

  • Android
    • Firefox
  • Desktop
    • Chromium
      • Chrome
      • Edge
      • Opera
      • Brave
      • UC Browser
      • QQ Browser
      • (etc)
    • ChromeOS
  • (previously Firefox, WebApps not supported anymore)

And the following don't really have a unique id for their manifest:

  • Safari iOS

WebApps not supported:

  • Safari Desktop
  • Firefox Desktop

Unknown:

  • KaiOS

You're correct - we didn't expand on the Chromium forks for Android or Desktop, instead trying to focus on different user agent implementations - Chromium, Firefox, Safari.

@dmurph
Collaborator

dmurph commented Feb 24, 2021

@fabricedesre Gentle ping on feedback here, do you think you could take a look by the end of the week?

@fabricedesre

@dmurph yes I'll answer tomorrow.

@fabricedesre

For KaiOS, the currently released devices use the Open Web Apps (OWA) manifest from Firefox OS, while the upcoming major release will use PWA manifests.

Let me talk a bit about the history here - keep in mind this work started in 2011 :)
In OWA, we always used the manifest url as the application id. This is explicit in the API itself, where many operations take the manifest url as the only parameter. I think overall it worked well because this was a very simple model. For instance, it makes managing app updates straightforward: fetch the new manifest, check with etag and/or hash whether it changed, and apply the update after verifying that the app name hasn't changed, to prevent spoofing.
We enforce that start_url is relative to the manifest url, and we don't care about the document url or origin. There is no scope either (we added something similar later for KaiOS though). We partitioned cookies and storage per manifest url, providing something close to the current Firefox containers.
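
A minimal sketch of the update check described above, assuming the manifest is available as raw JSON text. The function names are hypothetical and are not part of the OWA API; the sketch compares only a content hash and leaves out the ETag check, which would need an HTTP client.

```python
import hashlib
import json


def manifest_digest(manifest_text: str) -> str:
    """Hash the raw manifest text so two fetches can be compared cheaply."""
    return hashlib.sha256(manifest_text.encode("utf-8")).hexdigest()


def should_apply_update(installed_text: str, fetched_text: str) -> bool:
    """Return True when the fetched manifest changed but kept the same app name."""
    if manifest_digest(installed_text) == manifest_digest(fetched_text):
        return False  # Nothing changed, so there is nothing to apply.
    installed_name = json.loads(installed_text).get("name")
    fetched_name = json.loads(fetched_text).get("name")
    # Reject updates that rename the app, as a guard against spoofing.
    return installed_name == fetched_name
```

Because the manifest url is the app id in this model, the caller only needs the url to know which installed manifest to compare against.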

When work started on PWA manifests, I distinctly remember that a disagreement between the Mozilla and Google teams was about how to identify apps and the whole document origin/manifest origin/start url processing. To be honest I still don't understand why the document origin should have a role to play here. We felt that a lot of complexity with little justification was added to the spec. It's good to see that now there is a lot more data from the field to help make decisions.

The new PWA based implementation we do for KaiOS keeps the manifest url as the app id, and in this regard is similar to Chrome for Android. Because of the form factor of our devices (non touch, small screens), we only provide store initiated installs, which means that we have a bit more control over the manifests that are installable. However I find the list of scenarios quite interesting for when we'll offer browser initiated installs. I'm not sure I agree that all of them should be supported, but the intent is valid.
We don't support data: urls for manifests, and have no plans to change that. It looks like instead a service worker could generate them with a "real" url?

Out of all the proposals in the explainer, I could live with the first one "start_url_origin + specified id". I feel this is reasonable since that doesn't introduce a new global identifier namespace, and provides a fallback for backward compatibility. Ideally, I would use "manifest origin + specified id" instead, but that prevents support of data: urls for manifests.

Kind of a side question, but do you have data on how often start url has a different origin than the manifest url?

@dmurph
Collaborator

dmurph commented Feb 26, 2021

@fabricedesre wrote:
When work started on PWA manifests, I distinctly remember that a disagreement between the Mozilla and Google teams was about how to identify apps and the whole document origin/manifest origin/start url processing. To be honest I still don't understand why the document origin should have a role to play here. We felt that a lot of complexity with little justification was added to the spec. It's good to see that now there is a lot more data from the field to help make decisions.

This was before my time :)

I kind of see the 'rel' link as the source of truth here - if the start_url document has a <link rel="manifest" href="manifest.json"> and that manifest is the one we are using, then we are 💯 . For example, if I was running a web app directory and someone submitted a manifest, then I would want to check that the document at start_url had a rel link back to that manifest to verify it. Maybe this doesn't really matter though.
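
A rough sketch of that directory-side check, assuming the document at start_url has already been fetched as a string. The class and function names are made up for illustration, and the sketch ignores `<base>` elements and redirects.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ManifestLinkFinder(HTMLParser):
    """Collect the href of every <link rel="manifest"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated token list, so split before matching.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "manifest" in rel_tokens and href:
            self.hrefs.append(href)


def links_back_to_manifest(document_html, document_url, manifest_url):
    """Return True if the document links to manifest_url via rel="manifest"."""
    finder = ManifestLinkFinder()
    finder.feed(document_html)
    # Resolve each href against the document URL before comparing.
    return any(urljoin(document_url, href) == manifest_url for href in finder.hrefs)
```

A directory could call `links_back_to_manifest` with the body fetched from start_url and the submitted manifest URL, and reject the submission when it returns False.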

The new PWA based implementation we do for KaiOS keeps the manifest url as the app id, and in this regard is similar to Chrome for Android. Because of the form factor of our devices (non touch, small screens), we only provide store initiated installs, which means that we have a bit more control over the manifests that are installable. However I find the list of scenarios quite interesting for when we'll offer browser initiated installs. I'm not sure I agree that all of them should be supported, but the intent is valid.

I think this is also a similar model to how trusted web activities work in the Google Play store - since it is a curated model where the developer uploads their app.

We don't support data: urls for manifests, and have no plans to change that. It looks like instead a service worker could generate them with a "real" url?

Yes, but not without developers changing their code / infrastructure :(

Out of all the proposals in the explainer, I could live with the first one "start_url_origin + specified id". I feel this is reasonable since that doesn't introduce a new global identifier namespace, and provides a fallback for backward compatibility. Ideally, I would use "manifest origin + specified id" instead, but that prevents support of data: urls for manifests.

Kind of a side question, but do you have data on how often start url has a different origin than the manifest url?

Surprisingly I see none in the pwa-directory data, which is weird, as this used to be in there (see the comment above), and it still has a different origin manifest:
https://www.spokeo.com
There's a chance that these maybe got filtered out somehow?

Code I tried to find manifests that are not same origin as start_url (warning: I'm bad at jq):

curl -sSL 'https://pwa-directory.appspot.com/api/pwa/?limit=4200' | jq -r '.[] | (.manifestUrl | split("/")[2]) + " " + (.absoluteStartUrl | split("/")[2])' | awk '$1 != $2'
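
For anyone who prefers not to read jq, here is a rough Python equivalent of the filter, assuming the directory response has already been downloaded and parsed into a list of records. The field names `manifestUrl` and `absoluteStartUrl` come from the command above; the function name is made up. Like the jq version, it compares only the host part of each URL, not the full origin.

```python
from urllib.parse import urlsplit


def cross_host_entries(entries):
    """Yield (manifest_host, start_host) for records whose hosts differ."""
    for entry in entries:
        manifest_host = urlsplit(entry["manifestUrl"]).netloc
        start_host = urlsplit(entry["absoluteStartUrl"]).netloc
        if manifest_host != start_host:
            yield manifest_host, start_host
```

Calling `list(cross_host_entries(records))` on the parsed response should give the same pairs the shell pipeline prints.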

Maybe someone else can find another directory we can query, or other examples

@ralphch0

ralphch0 commented Mar 4, 2021

I just want to highlight that in service-worker-land there is nothing like start_url, etc, to use as an id. So falling back to scope as the default id is mainly because scope is the identifier today (since there is nothing else unique to use). Since the fallback is for compatibility, it probably makes sense to fallback to whatever is currently being used as the id. In this case it sounds like at least two browsers use start_url for that (if I have followed correctly).

Just a quick note here: I think it's definitely an advantage to try to reduce pain by making the fallback work for compatibility. But it feels like scope is likely to achieve this to a large extent as well, since it's likely the most stable field (maybe we can get numbers here?). The only thing that's meaningfully changing is the mental model for desktop devs (it's unclear if expectations are even well formed for most devs).

If we can make the id required at some point, as @dmurph suggested, then the non-intuitive fallback is limited to the short term. Is this likely though? Would we need to build some sort of allowlist of existing sites to exclude from the requirement, or have a migration period after which we will start breaking new installations?

If this fallback is likely to stay forever, and we think that there are a lot more PWAs to build in the future than we have today, then it might make sense to pick a fallback that makes sense in the future.

That said, this is a slight preference from a design point of view and consistency with service workers. It won't meaningfully affect our own apps, if it's decided to go ahead with the current proposal for fallback.

@philloooo
Collaborator

Hi ralphch0, I think making the id required at some point can be achieved by

  1. have a warning in Lighthouse about the new field and a date when this becomes required.
  2. make it required for the install icon to show up. The app can still be installed via shortcut if it's not installable.

I think it's quite reasonable to approach it this way, since we already have a model of constantly evolving the installability criteria to a higher standard.

@philloooo
Collaborator

Hi, thanks for all the feedback!
After going through the feedback here and incorporating it into the explainer, I believe the current proposal (global_id = start_url_origin + manifest_id_member, default global_id = start_url) is the one that supports the use cases the best and allows the smoothest transition from the existing ecosystem. I am moving forward with implementation on the Chromium side.
You can track the status of the implementation on https://www.chromestatus.com/feature/6064014410907648

@marcoscaceres
Member

marcoscaceres commented Mar 17, 2021

Thanks for the Chrome update @philloooo. The above turned into a bit of a monster thread... can someone give me a tl;dr of what the actual id is? Is it a URL or is it a UUID or something else? (i.e., what are some of the processing rules?)

@dmurph
Collaborator

dmurph commented Mar 17, 2021

Good question! I think it's easiest to think of these two categories of web apps:

If you are creating an app from scratch (no existing installs to update)

id is just a string, like a UUID, that uniquely identifies the app on the start_url_origin

(at https://www.new-app.com/):

{
  ...
  id: "NewAppId",
  start_url: "/index.html", // <- updatable now :)
  ...
}

The global id evaluates to https://www.new-app.com/ + NewAppId = https://www.new-app.com/NewAppId. This is not intended to be dereferenced; it just looks like a URL.

If you have an existing app (installs exist where the manifest did NOT have an id set)

id should be the relative path of the start_url so that the new manifest (with id specified) will apply to, and thus update, the existing install. For example:

old (at https://www.example.com/):

{
  ...
  start_url: "/index.html",
  ...
}

The default global id is: https://www.example.com/index.html

new:

{
  ...
  id: "index.html",
  start_url: "/index.html", // <- now this is updatable!
  ...
}

The global id evaluates to https://www.example.com/ + index.html = https://www.example.com/index.html, thus matching the old manifest.

Hopefully that makes sense?

(example of updating the start_url of the app)

{
  ...
  id: "index.html",
  start_url: "/nested/index.html",
  scope: "/nested/",
  ...
}
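
The rule in this comment can be sketched in a few lines. This is only an illustration of the string concatenation described above, not the algorithm from the spec pull request; `global_id`, `origin_of` and the `base_url` parameter are assumed names, and the manifest is taken to be an already-parsed dict.

```python
from urllib.parse import urljoin, urlsplit


def origin_of(url):
    """Return scheme://host[:port] for an absolute URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def global_id(manifest, base_url):
    """Compute start_url_origin + id, defaulting to start_url when id is absent."""
    start_url = urljoin(base_url, manifest.get("start_url", ""))
    app_id = manifest.get("id")
    if app_id is None:
        return start_url  # Fallback for manifests without an id.
    return origin_of(start_url) + "/" + app_id.lstrip("/")


old = {"start_url": "/index.html"}
new = {"id": "index.html", "start_url": "/index.html"}
moved = {"id": "index.html", "start_url": "/nested/index.html", "scope": "/nested/"}

base = "https://www.example.com/"
# All three manifests map to the same global id, so each updates the same install.
print(global_id(old, base), global_id(new, base), global_id(moved, base))
```

The `moved` manifest shows the point of the id member: start_url changed, but the computed global id did not.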

@marcoscaceres
Member

Ok cool. Thanks for the clear explanation, @dmurph! It seems pretty straightforward to implement/spec. Should we give it a few months before adding this, based on Chrome's rollout? Then at least we can be assured it works ok and that we've not overlooked, or at least that we hear about, any edge cases.

@dmurph
Collaborator

dmurph commented Mar 18, 2021

I think that sounds good. I imagine we'll make a pull request once we have a working / testable implementation.

@marcoscaceres
Member

I'm happy to draft up the pull request based on the above. No worries.

@marcoscaceres marcoscaceres removed the Defer Until after REC label Mar 18, 2021
@dmurph
Collaborator

dmurph commented Mar 24, 2021

FYI @philloooo is very interested in writing the pull request here when we get to that stage to get some good spec experience 😄

@marcoscaceres
Member

@philloooo, that's great to hear. If you need any assistance or have any questions, feel free to reach out. Happy to help!

@antonyf

antonyf commented Apr 9, 2021

I thought I would throw my 2 cents in:
I use localforage to create a db and place a unique id on the user's system. I'm using incremental values, and once I've added a unique id to the user's system/browser I update the "last_unique_id" table in my MySQL database.
I'm sure you could make some use of this method.

localforage will start with IndexedDB and, if it's not available, will move through a list of other storage backends until it can store the unique id.

Hope this can help you.

PS: this is a persistent storage method

@philloooo
Collaborator

philloooo commented Jul 22, 2021

Status update: The manifest id is implemented behind a flag in Chromium and going through launch review process.
I have a pull request for the spec. #988

@hober can you take a look at the pull request to see if it makes sense to WebKit?
