Nominal unique type brands #33038

Closed

weswigham
Member

@weswigham weswigham commented Aug 23, 2019

Fixes #202
Fixes #4895

We've talked about this on and off for the last three years, and it was a major reason we chose to use unique symbol for the individual-symbol-type, since we wanted to reuse the operator for a nominal tag later. What this PR allows:

type NormalizedPath = unique string;
type AbsolutePath = unique string;
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

declare function isNormalizedPath(x: string): x is NormalizedPath;
declare function isAbsolutePath(x: string): x is AbsolutePath;
declare function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): void;


const p = "/a/b/c";
if (isNormalizedPath(p)) {
    if (isAbsolutePath(p)) {
        consumeNormalizedAbsolutePath(p);
    }
}

unique T (where T is any type) is allowed in any position a type is allowed, and nominally tags T with a marker, so that only T's that originate from that location are assignable to the resulting type.

This is done by adding a new NominalBrand type flag: a type with no structure that is unique to each symbol it is manufactured from. The brand is then mixed into the argument type of the unique type operator via intersection, which is what produces all the useful relationships. (The brand can have an alias if it is directly constructed via type MyBrand = unique unknown)

This does so much with so little - this reduces the jankiness written into types to enable nominalness with unique symbols or enums, while adding zero new assignability rules.
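For comparison, the same flow can be approximated in released TypeScript by writing the intersection by hand. This is only a sketch: the `unique symbol` declarations stand in for the proposed brand, and the guard bodies and the `tryConsume` helper are made up for illustration, not taken from this PR.

```typescript
// Emulation sketch: each `declare const ...: unique symbol` plays the role of one brand.
// These declarations are type-only; nothing exists at runtime.
declare const normalizedBrand: unique symbol;
declare const absoluteBrand: unique symbol;

// Hand-written equivalent of `unique string`: the base type intersected with a brand.
type NormalizedPath = string & { readonly [normalizedBrand]: true };
type AbsolutePath = string & { readonly [absoluteBrand]: true };
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

// Hypothetical guard bodies; the real invariants are up to the author of the brand.
function isNormalizedPath(x: string): x is NormalizedPath {
    const segments = x.split("/");
    return x.indexOf("//") === -1 && segments.every(s => s !== "." && s !== "..");
}

function isAbsolutePath(x: string): x is AbsolutePath {
    return x.charAt(0) === "/";
}

function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): string {
    return x;
}

// Both guards must pass before the value is accepted as NormalizedAbsolutePath.
function tryConsume(p: string): string | undefined {
    if (isNormalizedPath(p) && isAbsolutePath(p)) {
        return consumeNormalizedAbsolutePath(p);
    }
    return undefined;
}
```

The difference from the proposal is that the brand here is an ordinary object type with a property, which is exactly the pattern that blocks eager reduction of primitive-and-object intersections.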

So, why bring this up now? I was thinking about how "brands" work today, with something like type MyBrand<T> = T & {[myuniquesym]: void} where T could then become a literal type like "a". We've wanted, for a while, to be able to more eagerly reduce an intersection of an object literal and a primitive to never (to make subtype reduction and intersection reduction produce less jank and recognize more types as mutually exclusive), but these "brand" patterns keep stopping us. (Heck, we use 'em internally.) Well, if we ever want to change object types to actually mean object, then we're going to need to provide an alternative for the brand pattern, and ideally that alternative needs to be available for a while. So looking on the horizon to breaks we could take into 4.0 in 9 months, this simplification of branding would be up there, provided we've had the migration path available for a while. So I'm trying to get the conversation started on this before we're too close to that deadline to plan something like that. Plus #202 is up there on our list of all-time most requested issues, and while we've always been open to it, we've never put forward a proposal of our own - well, here one is.

On unique symbol

unique symbol's current behavior takes priority for syntactically exactly unique symbol, however if a nominal subclass of symbols is actually desired, one can write unique (symbol) to get a nominally branded symbol type instead (or symbol & unique unknown - it's exactly the same thing). The way a unique symbol behaves is like a symbol that is locked to a specific declaration, and has special abilities when it comes to narrowing and control flow because of that. Nominally branded types are more flexible in their usage but do not have the strong control-flow linkage with a host statement that unique symbols do (in fact, they don't necessarily assume a value exists at all), so there is very much reason for them to coexist. They're similar enough, that I'm pretty comfortable sharing syntax between the two.

Alternative considerations

While I've used the unique type operator here, like we've oft spoken of, on implementation, it's become plain to me that I don't need to specify an argument to unique. We could just expose unique as a unique type factory on its own, and dispense with the indirection. The "uniqueness" we apply internally isn't actually tied to the input type argument through anything more than an intersection type, so simply shortening unique unknown to unique and reserving the argument form for just unique symbols may be preferable. All the same patterns would be possible, one would just need to write string & unique instead of unique string, thus dispensing with the sugar. It depends on the perceived complexity, I think. However, despite being exactly the same, string & unique is somehow uglier and harder to look at than unique string, which is why I've kept it around for now. It's probably worth discussing, though.

What this draft would still need to be completed:

  • New error messages for any errors involving brands (One or more unique brands is missing from type A with related information pointing at the brand location, rather than the current error involving unique unknown)
  • More tests exercising indexed access and exploring how indexed accesses on branded types are constructed (specifically, what should be done if you (attempt to) index nothing but a union of unique unknown brands)
  • Tests for unique (symbol) declaration emit (to ensure it's not rewritten as unique symbol)
  • Discussion on if keyof <unique brand type> should be never (as it is now), since the brand is top-ish (and contains no structure information itself), or if it should be preserved as an abstract keyof <unique brand> type, so that brand can apply keys-of-branded-types constraints in constraint positions
  • 🚲 🏠

@fatcerberus

While I've used the unique type operator here, like we've oft spoken of, on implementation, it's become plain to me that I don't need to specify an argument to unique.

I don't understand this part. If these are meant to replace branded types, then:

type UString = unique string;
type BString = string & { __brand: true };

declare let ustr: UString;
declare let bstr: BString;
let str1: string = bstr;  // ok because branded string is a subtype of string (this is generally desirable).
let str2: string = ustr;  // could be ok because unique string is also a string.
let num: number = ustr;  // never ok, should be error because unique string is NOT a number!

But if we just had unique without the argument, there would be no way to make this distinction. unique would just end up being a nominal unknown (effectively an opaque Skolem constant) which doesn't sound that useful to me?

@weswigham
Member Author

But if we just had unique without the argument, there would be no way to make this distinction.

It's useful because you can then intersect it with something. Which is all unique string is doing under the hood right now - making that intersection for you.
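To make that concrete in today's terms, here is a sketch that uses a `unique symbol` property as a stand-in for the argument-less `unique` (the names `Brand`, `UString`, and `asUString` are made up for illustration, and unlike the proposed brand this stand-in is an object type rather than something top-ish):

```typescript
declare const brand: unique symbol;
type Brand = { readonly [brand]: true };   // stand-in for a bare `unique`
type UString = string & Brand;             // stand-in for `unique string`

// Only way in: an explicit cast (or a type guard).
function asUString(s: string): UString {
    return s as UString;
}

const ustr = asUString("hello");
const str: string = ustr;        // ok: the intersection is still a string
// const num: number = ustr;     // error: the intersection is not a number
const charCount = str.length;
```

The distinction between `string` and `number` comes entirely from the other side of the intersection; the brand itself contributes only identity.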

@fatcerberus

fatcerberus commented Aug 23, 2019

Yeah, I need to read more closely. I just noticed the string & unique bit before you posted. But I don't like this because unique by itself isn't a type. It would be a special marker that doesn't work by itself as a type (or at least, doesn't make sense as one - what would it mean to take an argument of form arg: unique, e.g.), which I find very weird to then use in a position that nominally (no pun intended!) accepts only type operands. I guess there's precedent with ThisType<T>, but... I don't know, it rubs me the wrong way.

It might be represented as an intersection under the hood, but that strikes me as unnecessarily exposing implementation details.

@weswigham
Member Author

weswigham commented Aug 23, 2019

unique is essentially just shorthand for unique unknown under the current model, and it does function fully as a standalone type.

@cevek

cevek commented Aug 23, 2019

Does this proposal allow assigning literals to a variable/param that has a branded type?

type UserId = unique string
function foo(param: UserId) {}

foo("foo") // ok
var x: UserId = "foo"; // ok

let s = "str";
var y: UserId = s; // not ok
foo(s) // not ok

@goodmind

How would you make nominal classes with this?

@weswigham
Member Author

Not easily. You'd need to mix a distinct nominal brand into every class declaration, like via a property you don't use of type unique unknown or similar, and then ensure that your inheritance trees always override such a field (so a parent and child don't appear nominally identical).

I'd avoid it, if possible, tbh. Nominal classes sound like a pain :P

@weswigham
Member Author

Does this proposal allow assigning literals to a variable/param that has a branded type?

Branded types can only be "made" either via cast or typeguard, as I said in the OP, so no. This is because the typesystem doesn't know what invariants a given brand is meant to maintain, and can't implicitly know if some literal satisfies them.
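As a sketch of those two entry points in current TypeScript (the brand is emulated with a `unique symbol` property, and the `u-<digits>` validation rule is invented purely for illustration):

```typescript
declare const userIdBrand: unique symbol;
type UserId = string & { readonly [userIdBrand]: true };

// Entry point 1: a type guard that checks a (hypothetical) invariant.
function isUserId(x: string): x is UserId {
    return /^u-[0-9]+$/.test(x);
}

// A validating constructor built on the guard.
function toUserId(x: string): UserId {
    if (!isUserId(x)) {
        throw new Error("not a user id: " + x);
    }
    return x; // narrowed to UserId by the guard above
}

// Entry point 2: an explicit cast, for values the caller already trusts.
const trusted = "u-42" as UserId;

function describeUser(id: UserId): string {
    return "user " + id;
}

// describeUser("u-1");   // error: a plain string literal is not a UserId
const described = describeUser(toUserId("u-1"));
```

The commented-out call is the case asked about above: the literal is rejected because nothing has vouched for the brand's invariant.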

@jack-williams
Collaborator

jack-williams commented Aug 23, 2019

Would something like the following ever be meaningful? Probably not right..

interface Parent extends unique unknown { }
interface ChildA extends (Parent & unique unknown) { }
interface ChildB extends (Parent & unique unknown) { }

@AnyhowStep
Contributor

AnyhowStep commented Aug 23, 2019

This is probably obvious to everyone but unique T shouldn't replace branding.
There are scenarios where branding is the only viable solution (at the moment).

For example,

//Can be replaced with `unique number`
type LessThan256 = number & { __rangeLt : 256 }
//Cannot be replaced with `unique number`
type LessThan<N extends number> = number & { __rangeLt : N }

More complicated example here,
#15480 (comment)
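To illustrate why the type parameter matters, here is a minimal sketch (the helper names `isLessThan`, `acceptsLessThan256`, and `clampTo255` are hypothetical): a single generic guard can mint `LessThan<N>` for any bound, which is the part a per-declaration `unique number` would not express.

```typescript
// Structural, parameterized brand from the comment above.
type LessThan<N extends number> = number & { __rangeLt: N };

// Hypothetical guard: brands `x` with the bound it was checked against.
function isLessThan<N extends number>(x: number, bound: N): x is LessThan<N> {
    return x < bound;
}

function acceptsLessThan256(x: LessThan<256>): number {
    return x;
}

function clampTo255(x: number): number {
    // Narrowing carries the literal bound 256 into the type of `x`.
    return isLessThan(x, 256) ? acceptsLessThan256(x) : 255;
}
```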


There are a few reasons why I'm generally against the idea of unique T.
(And nominal typing, and instanceof, and using symbols)

Cross-library interop

Library A may have type Radian = unique number.
Library B may have type Radian = unique number.

Both types will be considered different, even though they have the same name and declaration.

If libraries start using this unique T all over the place, you'll start needing unsafe casts (as libA.Radian) more often. This means you can accidentally write myDegree as libA.Radian and cast incorrectly. Whoops!

So, one starts thinking that a no-op casting function would be safer,

function libARadianToLibBRadian (rad : libA.Radian) : libB.Radian {
  return rad as libB.Radian;
}

This is safer because you won't accidentally convert myDegree:libA.Degree to libB.Radian

But if you have N libraries with their own Radian type, you may end up needing up to N^2 functions to convert between each of the Radian types from each library.
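A runnable approximation of that conversion helper in current TypeScript, assuming each library's nominal type is emulated with its own `unique symbol` brand (all `LibA*`/`LibB*` names are placeholders, not real packages):

```typescript
// Emulated nominal brands, one declaration per "library" type.
declare const libARadianBrand: unique symbol;
declare const libADegreeBrand: unique symbol;
declare const libBRadianBrand: unique symbol;

type LibARadian = number & { readonly [libARadianBrand]: true };
type LibADegree = number & { readonly [libADegreeBrand]: true };
type LibBRadian = number & { readonly [libBRadianBrand]: true };

// No-op at runtime; the parameter type is what makes it safer than a bare cast.
function libARadianToLibBRadian(rad: LibARadian): LibBRadian {
    const raw: number = rad;   // widen to the base type first
    return raw as LibBRadian;  // then re-brand
}

const halfTurn = Math.PI as LibARadian;
const rightAngle = 90 as LibADegree;

const converted = libARadianToLibBRadian(halfTurn);
// libARadianToLibBRadian(rightAngle); // error: a LibADegree is not a LibARadian
```

One such function is needed per ordered pair of libraries, which is where the quadratic growth comes from.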


Cross-version interop

It's happened to me a bunch where I've had the same package, but at different versions, within a single project.

So,

v1.0.0's type Radian = unique number would be considered different from,
v1.1.0's type Radian = unique number

Now you need a casting function... Even though it's the same package.


With brands, if two libraries use the same brands, even if they're different types, they'll still be assignable to each other. (As long as they don't use unique symbol)

Library A may have type Rad = number & { __isRadian : void }.
Library B may have type Radian = number & { __isRadian : void }.

Even though they're different types, they're assignable to each other. No casting needed.
v1.0.0 and v1.1.0 of type Rad and type Radian will work with each other fine.
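A small sketch of that compatibility, with both "libraries" collapsed into one file for brevity (the function name `sinA` is made up):

```typescript
// Two independently declared structural brands (imagine one per library).
type Rad = number & { __isRadian: void };     // "library A"
type Radian = number & { __isRadian: void };  // "library B"

// A "library A" function that expects its own brand.
function sinA(x: Rad): number {
    return Math.sin(x);
}

const quarterTurn = (Math.PI / 2) as Radian;

// Accepted without a cast: both brands have the same structure.
const sine = sinA(quarterTurn);
```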


As an aside, I vaguely remember something from many, many years ago. I can't find it through Google anymore, though.

There was discussion about adding syntax to C++ to make typedef create a new type (rather than just functioning as a type alias),

typedef double Radian;

And this was rejected outright because of the issues I listed above.

Two libraries with their own typedef double Radian; type would be incompatible, the N^2 problem, etc.

@fatcerberus

Re: nominal classes - classes are already nominal if they contain any private members. Just throwing that out there. 😃
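A quick sketch of that behavior (the `Meters`/`Feet` classes are invented for the example): two classes with identical public shapes stop being interchangeable once each declares its own private member.

```typescript
class Meters {
    private readonly unit = "m";
    constructor(readonly value: number) {}
    describe(): string { return this.value + this.unit; }
}

class Feet {
    private readonly unit = "ft";
    constructor(readonly value: number) {}
    describe(): string { return this.value + this.unit; }
}

function addMeters(a: Meters, b: Meters): Meters {
    return new Meters(a.value + b.value);
}

const total = addMeters(new Meters(1), new Meters(2));
// addMeters(new Meters(1), new Feet(3)); // error: types have separate declarations of a private property 'unit'
```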

@be5invis

be5invis commented Aug 24, 2019

So we can finally have things like this? @weswigham

// Low-end refinement type :)
type NonEmptyArray<A> = unique ReadonlyArray<A>
function isNonEmpty<A>(a: ReadonlyArray<A>): a is NonEmptyArray<A> {
    return a.length > 0
}

// INTEGERS (sort of)
type integer = unique number
function isInteger(a: number): a is integer { return a === (a | 0) }

@AnyhowStep
Contributor

@fatcerberus It's also why I avoid classes entirely and avoid private members if I do have them =P

@AnyhowStep
Contributor

AnyhowStep commented Aug 25, 2019

Hmm...

const normalizedPathBrand = Symbol();
const absolutePathBrand = Symbol();

type NormalizedPath = string & typeof normalizedPathBrand;
type AbsolutePath = string & typeof absolutePathBrand;
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

declare function isNormalizedPath(x: string): x is NormalizedPath;
declare function isAbsolutePath(x: string): x is AbsolutePath;
declare function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): void;


const p = "/a/b/c";
consumeNormalizedAbsolutePath(p); //Error
if (isNormalizedPath(p)) {
    consumeNormalizedAbsolutePath(p); //Error
    if (isAbsolutePath(p)) {
        consumeNormalizedAbsolutePath(p); //OK
    }
}

Playground


I guess the downside to this is that it's not actually a symbol.

But Symbol doesn't have very many methods one would accidentally use.

@jack-williams
Collaborator

jack-williams commented Aug 26, 2019

If cross version compatibility really is an issue then an alternate solution would be to make naming explicit:

Let the type unknown "name" denote the set of all values with label name. This would make the intersection type approach mandatory so a nominal path must now be written:

type NormalizedPathOne = string & unique "NormalizedPath";

where an unlabelled unique denotes a generative type as defined in the OP.

type NormalizedPathTwo = string & unique // some label that we don't care about that is auto-generated.

so while two declarations of NormalizedPathTwo produce distinct types, two declarations of NormalizedPathOne produce identical types.

FWIW I have no real preference---for me the big win of this feature is being able reduce empty intersections more aggressively.

Discussion on if keyof unique brand type ....

IMO, for all brand oblivious operations a unique type should be equivalent to unknown (which is what is proposed AFAIK).

@resynth1943

I've implemented Opaque types like so:

type Opaque<V> = V & { readonly __opq__: unique symbol };

type AccountNumber = Opaque<number>;
type AccountBalance = Opaque<number>;

function createAccountNumber (): AccountNumber {
    return 2 as AccountNumber;
}

function getMoneyForAccount (accountNumber: AccountNumber): AccountBalance {
    return 4 as AccountBalance;
}

getMoneyForAccount(100); // -> error

@AnyhowStep
Contributor

@resynth1943

Your version breaks given the following,

type Opaque<V> = V & { readonly __opq__: unique symbol };

type NormalizedPath = Opaque<string>;
type AbsolutePath = Opaque<string>;
type NormalizedAbsolutePath = NormalizedPath & AbsolutePath;

declare function isNormalizedPath(x: string): x is NormalizedPath;
declare function isAbsolutePath(x: string): x is AbsolutePath;
declare function consumeNormalizedAbsolutePath(x: NormalizedAbsolutePath): void;


const p = "/a/b/c";
consumeNormalizedAbsolutePath(p); //Error
if (isNormalizedPath(p)) {
    consumeNormalizedAbsolutePath(p); //Expected Error, Actual OK
    if (isAbsolutePath(p)) {
        consumeNormalizedAbsolutePath(p); //OK
    }
}

Playground

Contrast with,
#33038 (comment)

@resynth1943

@AnyhowStep I know, but that's how I'm currently creating opaque types. I hope this Pull Request will incorporate this into the language, and make it even better than my implementation.

@mohsen1
Contributor

mohsen1 commented Sep 3, 2019

Nominal types are pretty useful. The example I often use is APIs that take latitude/longitude, and the bugs that result from mixing up latitude with longitude, which are both numbers. By making those unique types we can avoid that class of bugs.

However, unique types can cause so much pain when you have to keep importing those types to simply use an API. So I'm hoping that at least primitive types are assignable to unique primitives where I can still call my functions like this:

// lib.ts
export type Lat = unique number;
export type Lng = unique number;
export declare function distance(lat: Lat, lng: Lng): number;

// usage.ts
import { distance } from './lib';
distance(1234, 5678); // no need to assert types

As @AnyhowStep mentioned cross-lib and cross-version conflicting unique types can also be a source of pain. Can we limit uniqueness scope somehow? Would that be a viable solution?

@weswigham weswigham added the Experiment A fork with an experimental idea which might not make it into master label Sep 6, 2019
@weswigham weswigham marked this pull request as ready for review September 6, 2019 23:29
@weswigham
Member Author

#33290 is now open as well, so we can have a real conversation on what the non-nominal explicit tag would look like, and if we'd prefer it.

@weswigham
Member Author

BTW, this is now a dueling-features situation (though I've authored both) - we won't accept both a #33290 style brand and this PR's style brand, only one of the two (and we're leaning towards #33290 on initial discussion). We'll get to debating it within the team hopefully during our next design meeting (next Friday), but y'all should express ideas, preferences, and the reasoning behind them within both PRs.

@weswigham
Member Author

Only repo contributors have permission to request that @typescript-bot pack this

@typescript-bot
Collaborator

typescript-bot commented Sep 9, 2019

Heya @weswigham, I've started to run the tarball bundle task on this PR at 13968b0. You can monitor the build here. It should now contribute to this PR's status checks.

@typescript-bot
Collaborator

Hey @weswigham, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/43239/artifacts?artifactName=tgz&fileId=CC1A42393FB48F1DFF21222516C1DB28571C1E5134B16E3B234461EABAECAE1D02&fileName=/typescript-3.7.0-insiders.20190909.tgz"
    }
}

and then running npm install.

@xiaoxiangmoe
Contributor

xiaoxiangmoe commented Sep 9, 2019

type integer = unique number
function isInteger(a: number): a is integer { return a === (a | 0) }

function Integer(a: number): integer {
    if (!isInteger(a)) throw new Error("not an integer");
    return a; // narrowed to integer by the guard above
}

@shicks
Contributor

shicks commented May 28, 2020

To what extent does #31894 satisfy the same requirements as this proposal? I know @DanielRosenwasser mentioned in #38510 that while nominal brands would be nice, it's not clear whether they run counter to the goals of 4.0 at the moment.

That said, it seems to me that one could easily enough write

placeholder type NormalizedPathBrand;
placeholder type AbsolutePathBrand;
type NormalizedPath = string & NormalizedPathBrand;
type AbsolutePath = string & AbsolutePathBrand;

where the brands are module-local and never intended to be implemented by any concrete type. Branded types already need explicit casts to instantiate them, so it's easy enough to make them. Given that that proposal seems to have more traction, does it make this one obsolete?

@dead-claudia

Is there a status update on this PR?

@dead-claudia

@weswigham Couple questions:

  1. Could you elaborate further on how unique symbol and unique (symbol) differ with respect to control flow-sensitive typing? I would've thought neither would have special behavior on that front.
  2. Do you have any status update for this? It's been over a year with pretty much radio silence.

@weswigham
Member Author

Could you elaborate further on how unique symbol and unique (symbol) differ with respect to control flow-sensitive typing? I would've thought neither would have special behavior on that front.

unique symbol is a single runtime value. Specifically, the result of an invocation of Symbol() at some location. Because of this, it works as a well-known property name, and acts as a literal. Each place unique symbol is written is a distinct, single symbol. unique (symbol) is "a unique subset of all symbols" and represents an abstract set of potentially multiple symbols. This behaves differently - notably, it isn't usable as a well-known property name.
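The `unique symbol` half of that can be seen in released TypeScript today. A minimal sketch (identifier names are made up; the proposed `unique (symbol)` has no released equivalent, so a plain `symbol` is used below only to show what a set-of-symbols type cannot do):

```typescript
// A `const` initialized with Symbol() gets its own unique symbol type, `typeof tagKey`.
const tagKey = Symbol("tag");
const otherKey = Symbol("other");

// Usable as a property name in a type, because it denotes exactly one value.
interface Tagged {
    [tagKey]: string;
}

const tagged: Tagged = { [tagKey]: "hello" };
const tagValue: string = tagged[tagKey];

// A plain `symbol` stands for many possible symbols and cannot name a property.
const anySymbol: symbol = otherKey;
// interface Bad { [anySymbol]: string } // error: computed name must have a literal or 'unique symbol' type
```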

Do you have any status update for this? It's been over a year with pretty much radio silence.

Oh, wow, it's been that long. Uh. Last time I presented it to the team, the response was somewhat lukewarm - there's no real excitement or drive for it right now. So I guess what we're looking to see is overwhelming demand?

@kachkaev

kachkaev commented Oct 22, 2020

So I guess what we're looking to see is overwhelming demand?

Not sure how the demand is properly measured, but this PR is in the top 5 upvoted open PRs at the moment. Just saying 😁

@dead-claudia

@weswigham Thanks for the quick explanation! I see why both are necessary, now. Also, I definitely recommend looking into @kachkaev's comment above.

@alexweej

Just used branding to very quickly identify the inconsistencies in a code base that was using string for 3 different logical dynamically valued types. Consider this my +1...

@ProdigySim

ProdigySim commented Nov 30, 2021

Just a bit of data: The last codebase I was in changed from using unique symbol to string-based tag types due to their simplicity in creating compatible types across repos.

We wanted to move to a code generation system for our web API types. We had backend code that used nominal types ("tinytypes") in Scala which was the source of our branding.

Using unique-symbol based types had a couple of drawbacks here:

  1. We would need to have every nominal type exposed on the API to be predefined somewhere in TypeScript
  2. If the backend & frontend nominal type names mismatched at all, we would need to write custom mapping.
  3. One potential solution would be to have a shared library with all our "global" nominal types. But, this opens the possibility for "Dreaded Diamond" dependency patterns on these types, which would lead to incompatible nominal types (different "unique symbols" from different versions of the library)

We ended up converting all of our nominal types to string-based tag types overnight to support this project. They solved all of these problems:

  1. API code gen can just emit a generic Opaque<T, 'tag'> and not worry about references/imports
  2. If backend and frontend nomenclature mismatch, we can negotiate on the value of the 'tag' string, without any user-facing implications.
  3. Don't need a shared library in this solution. If we did opt for one, the dreaded diamond would not be an issue unless the type tags actually changed--which would probably mark an important compatibility change.

tl;dr string-based tag types seem a little more flexible with implementation details and are still good enough. They still have the hard problem of naming things, but they are fixable when there are name conflicts. We can also put out recommendations for tag patterns in shared libraries.
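For readers who have not seen the pattern, a sketch of what such a string-tagged `Opaque<T, 'tag'>` might look like; the `__tag` property name and the alias names are assumptions for illustration, not our actual codegen output:

```typescript
// Hypothetical shape of a string-tagged opaque type.
type Opaque<T, Tag extends string> = T & { readonly __tag: Tag };

// Imagine these two aliases living in different packages (or versions of one package).
type AccountIdFromCodegen = Opaque<string, "AccountId">;
type AccountIdFromFrontend = Opaque<string, "AccountId">;
type OrderId = Opaque<string, "OrderId">;

function loadAccount(id: AccountIdFromFrontend): string {
    return "account:" + id;
}

const fromApi = "acc-1" as AccountIdFromCodegen;
const loaded = loadAccount(fromApi);   // ok: same tag string, so the same structural type

const order = "ord-9" as OrderId;
// loadAccount(order);                 // error: "OrderId" is not assignable to "AccountId"
```

Because compatibility depends only on the tag string, no shared import is needed for the two `AccountId` aliases to agree.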

@sandersn
Member

This experiment is pretty old, so I'm going to close it to reduce the number of open PRs.

@negue

negue commented May 24, 2022

Will any form of these unique/opaque types (or whatever they are called) be added to TypeScript itself?

I needed something like that in a project where I had to juggle like a couple of simple "number" IDs and here something like unique number type per ID would've helped to save some time debugging :D

@shicks
Contributor

shicks commented May 25, 2022

Is there a particular reason this experiment stalled out?

@evelant

evelant commented May 26, 2022

@sandersn This is the 9th most voted on PR in the entire history of the repo, I think that's a clear indicator that a lot of people want it and it's probably worth pushing forward. Would you please reopen it?

IMO this is a very valuable feature. Nominal types can greatly increase type safety in any case where a value has a specific semantic meaning and thus should not be compatible with any similarly shaped type (or primitive) as is the default. Currently that's a difficult problem to solve with typescript. I would imagine nominal types could also greatly improve compiler performance by skipping structural comparisons.

IMO this should not be summarily closed simply because it hasn't had attention recently.

@weswigham since you authored this, do you think it is worth continuing the work you started or should this be closed in favor of a new discussion/implementation?

@weswigham
Member Author

I mean, personally, I think the structural version of this PR, the tag types PR, is the more general and better one. You can make those tags into nominal ones by including a unique symbol in your tag type, but conversely you can't make these into a structural tag, and structural tags have a lot of nice properties for interface merging and cross-library tag type management.
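Roughly, in terms of what can be written today (this is an emulation with an ordinary property, not the syntax of the tag types PR; `Tagged`, `__tag`, and the other names are illustrative only):

```typescript
// Generic structural tag.
type Tagged<T, Tag> = T & { readonly __tag: Tag };

// Structural: any declaration using the literal "Radian" yields a compatible type.
type StructuralRadian = Tagged<number, "Radian">;

// Nominal: the tag is a unique symbol type, so only this declaration can produce it.
declare const radianTag: unique symbol;
type NominalRadian = Tagged<number, typeof radianTag>;

function spin(by: NominalRadian): number {
    const raw: number = by;
    return raw * 2;
}

const angle = 1.5 as NominalRadian;
const doubled = spin(angle);

const loose = 1.5 as StructuralRadian;
// spin(loose); // error: "Radian" is not assignable to the unique symbol tag
```

Going the other way, from a nominal brand back to a structural one, has no equivalent move, which is the asymmetry described above.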

@evelant

evelant commented May 26, 2022

@weswigham I'm not familiar with the PR you're referring to, would you link to it please?

@weswigham
Member Author

#33290

@evelant

evelant commented May 26, 2022

Thanks, I agree that PR seems like it is a better approach. Unfortunately it looks like that one got summarily closed as well. Given that there seems to be a lot of interest in support for some form of nominal typing perhaps it should be reopened?

@sandersn
Member

From the last comment on the other PR:

This is effectively an implementation sketch for #202. We'd prefer people interact with the suggestion issues than the PRs, since the implementation of a feature is mechanical once the design has been worked out, and nominal types are still very much under discussion

Labels
Experiment A fork with an experimental idea which might not make it into master
Projects
Design Meeting Docket
  
Awaiting triage
Development

Successfully merging this pull request may close these issues.

Tag types
Support some non-structural (nominal) type matching