Enhancements to one-of construct #747

Open
Oblongs opened this issue Apr 26, 2024 · 24 comments
Labels
  • enhancement: New feature or request
  • open for discussion: There is no clear or immediate solution, so discussion is encouraged
  • subject: syntax: This issue is about the syntax of Rosetta

@Oblongs

Oblongs commented Apr 26, 2024

Background

  • The CDM Asset Refactoring Taskforce is working to enhance the modelling of financial assets in the CDM.
  • This includes adding additional data types to the existing product model which will introduce additional levels into the hierarchy and increased use of the “one of” syntax to provide conditional selection of multiple subsidiary product types.
  • Example
    • An example of a new data type introduced in the refactoring is Asset (simplified):
  type Asset:
    basket Basket (0..1)
    loan Loan (0..1)
    security Security (0..1)
    condition: one-of
  • The sub-types in the Asset definition will contain common attributes, for example an identifier.

Rationale

  • The proposed enhancements to the DSL will:
    • Increase understanding of modularity in models built using Rune.
    • Improve readability of Rune DSL code.
    • Differentiate one-of selections from other data types.
    • Reduce long path traversals in complex constructs.
    • Enable ease-of-use enhancements in DSL tools.

Requirements

  1. Simplify the one-of construct by enabling a special kind of data type which:
    • Is composed of two or more constituents (which are themselves defined as data types).
    • Contains a choice of one and only one of the constituents that can be used in any instance.
    • Restricts the cardinality of all constituents to one.
    • Imposes that the name of a constituent (i.e. the attribute) is the same as that of the data type.
  2. Where multiple child data types of a parent data type have common attributes, it should be possible to access the common attributes directly.
    • For example, where identifier is common across all Asset sub-types, the current syntax would require IF…THEN logic to identify which sub-type is present, e.g.:
  if asset -> basket exists
  then asset -> basket -> identifier
  else if asset -> loan exists
  then asset -> loan -> identifier
  else asset -> security -> identifier
  • A simpler access path is required, for example:

  asset -> identifier

Summary Deck

The attachment documents these requirements.

RUNE DSL Enhancement for One Of.pptx

@Oblongs
Author

Oblongs commented Apr 26, 2024

Related CDM initiative: finos/common-domain-model#2805

@SimonCockx
Contributor

SimonCockx commented May 2, 2024

Here is a comparison of three different solutions. I evaluate each of them based on the following five questions.

  1. How is the type represented in the model? How are common attributes represented? Is there duplication?
  2. How can one access common attributes in an expression?
  3. How is the type represented in serial form (JSON)?
  4. How can one discriminate between different types?
  5. Is it possible to guarantee that all cases are covered? (e.g., by red-underlining the Rune code when a modeller forgot to cover a case)

All comparisons are based on a toy version of the Asset (Basket/Loan/Security) model.

Enhance the current solution (one-of)

Summary:

  1. Common attributes are duplicated for each one-of case.
  2. Accessing common attributes is verbose. A proposed enhancement (->>) can eliminate this verbosity.
  3. Serialisation is nested. The upside is that the field name allows us to discriminate between different types during deserialisation.
  4. Discriminating types and accessing their attributes is verbose. A proposed enhancement (switch) can eliminate this problem.
  5. No validation to guarantee all cases are covered. A proposed enhancement (switch) can eliminate this problem.

Model

type Asset:
  basket Basket (0..1)
  loan Loan (0..1)
  security Security (0..1)

  condition Choice:
    one-of

type Basket:
  identifier string (1..1)
  basketAttribute int (0..1)

type Loan:
  identifier string (1..1)
  loanAttribute number (0..*)

type Security:
  identifier string (1..1)
  securityAttribute date (2..2)

Access common attributes

if asset -> basket exists
then asset -> basket -> identifier
else if asset -> loan exists
then asset -> loan -> identifier
else if asset -> security exists
then asset -> security -> identifier

Access common attributes (proposed enhancement: automatically detect common attributes and make them directly available)

// Option 1: implicit
asset -> identifier
// Option 2: explicit
asset ->> identifier
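
As a rough illustration of what "automatically detect common attributes" could mean, here is a small Python sketch (not Rune, and not the DSL's actual implementation). The dictionaries `TYPES` and `ONE_OF` and the helper `common_attributes` are hypothetical stand-ins for the toy model above. The sketch assumes an attribute counts as common only when every one-of option declares it with the same type.

```python
# Hypothetical, simplified description of the toy model above.
TYPES = {
    "Basket": {"identifier": "string", "basketAttribute": "int"},
    "Loan": {"identifier": "string", "loanAttribute": "number"},
    "Security": {"identifier": "string", "securityAttribute": "date"},
}
ONE_OF = {"Asset": ["Basket", "Loan", "Security"]}


def common_attributes(one_of_name):
    """Return {attribute: type} for attributes declared by every option."""
    options = [TYPES[name] for name in ONE_OF[one_of_name]]
    first, rest = options[0], options[1:]
    return {
        attr: attr_type
        for attr, attr_type in first.items()
        if all(other.get(attr) == attr_type for other in rest)
    }


print(common_attributes("Asset"))  # {'identifier': 'string'}
```

Only `identifier` survives the intersection, so it is the only attribute that `asset -> identifier` or `asset ->> identifier` would expose under this assumption.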

Serialisation (of a basket)

{
  "basket": {
    "identifier": "abc123",
    "basketAttribute": 42
  }
}

Discriminate types

if asset -> basket exists
then // do something with asset -> basket -> basketAttribute
else if asset -> loan exists
then // do something with asset -> loan -> loanAttribute
else if asset -> security exists
then // do something with asset -> security -> securityAttribute

Discriminate types (proposed enhancement: switch over one-of types to guarantee coverage and reduce verbosity)

switch asset
  case basket then // do something with basket -> basketAttribute
  case loan then // do something with loan -> loanAttribute
  case security then // do something with security -> securityAttribute

Using extends

Summary:

  1. Common attributes are not repeated. Good!
  2. Accessing common attributes is concise. Good!
  3. Cannot determine the actual type when deserialising. A proposed enhancement (@type) can eliminate this problem.
  4. Need a way to discriminate between different types, either via an of-type operation, or via "dispatching".
  5. Validation to guarantee all cases are covered is impossible since subtyping is open for extension.

Model

type Asset:
  identifier string (1..1)

type Basket extends Asset:
  basketAttribute int (0..1)

type Loan extends Asset:
  loanAttribute number (0..*)

type Security extends Asset:
  securityAttribute date (2..2)

Access common attributes

asset -> identifier

Serialisation (of a basket) (with proposed solution to determine the actual type)

{
  "@type": "Basket",
  "identifier": "abc123",
  "basketAttribute": 42
}
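
To make the role of `@type` concrete, here is a minimal Python sketch of a deserialiser that picks the subtype from that field. It is an illustration only: the `REGISTRY` lookup and the `deserialise` helper are hypothetical and say nothing about how Rune's generated code would actually do this.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    identifier: str


@dataclass
class Basket(Asset):
    basketAttribute: Optional[int] = None


@dataclass
class Loan(Asset):
    loanAttribute: List[float] = field(default_factory=list)


# Hypothetical registry: maps the "@type" value to a concrete class.
REGISTRY = {cls.__name__: cls for cls in (Basket, Loan)}


def deserialise(text):
    data = json.loads(text)
    # Without "@type", a payload carrying only `identifier` would be ambiguous.
    type_name = data.pop("@type")
    return REGISTRY[type_name](**data)


basket = deserialise(
    '{"@type": "Basket", "identifier": "abc123", "basketAttribute": 42}'
)
print(basket)
```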

Discriminate types (new feature!)

// Option 1: à la `instanceof` (with type narrowing?)
if asset of-type Basket
then // do something with asset -> basketAttribute
else if asset of-type Loan
then // do something with asset -> loanAttribute
else if asset of-type Security
then // do something with asset -> securityAttribute
// else ... (potentially add a default case here)

// Option 2: dispatching
dispatch func DoTheThing(asset Asset):
  output: result Foo (1..1)

  // set result: ... potentially add a default case here

implement func DoTheThing(basket Basket):
  set result: // do something with basket -> basketAttribute

implement func DoTheThing(loan Loan):
  set result: // do something with loan -> loanAttribute

implement func DoTheThing(security Security):
  set result: // do something with security -> securityAttribute

Introducing a new union type (alternative: choice type)

Summary:

  1. Common attributes are duplicated for each union case.
  2. Accessing common attributes is concise. Good!
  3. Need to serialise the type name as well.
  4. Discriminating types is concise. Good!
  5. Validation to guarantee coverage of all cases is possible. Good!

Model

union Asset:
  Basket
  Loan
  Security

type Basket:
  identifier string (1..1)
  basketAttribute int (0..1)

type Loan:
  identifier string (1..1)
  loanAttribute number (0..*)

type Security:
  identifier string (1..1)
  securityAttribute date (2..2)

Access common attributes

asset -> identifier

Serialisation (of a basket)

{
  "@type": "Basket",
  "identifier": "abc123",
  "basketAttribute": 42
}

Discriminate types

switch asset
  case Basket then // do something with asset -> basketAttribute
  case Loan then // do something with asset -> loanAttribute
  case Security then // do something with asset -> securityAttribute

@SimonCockx
Contributor

Based on this analysis, I would propose the following.

  • The end goal is to support union types as described in the last section.
  • As a migration strategy, we first support the two enhancements (-> for nested attributes and switch) for one-of types.

SimonCockx added the enhancement, subject: syntax, and open for discussion labels on May 2, 2024.

@Oblongs
Author

Oblongs commented May 3, 2024

Just a clarification: the model, as we are currently writing it, will look like this, with the common elements in AssetBase:

union Asset:
  Basket
  Loan
  Security
 
type AssetBase:
  identifier string (1..1)
 
type Basket extends AssetBase:
  basketAttribute int (0..1)
 
type Loan extends AssetBase:
  loanAttribute number (0..*)
 
type Security extends AssetBase:
  securityAttribute date (2..2)

Does this change your analysis at all?

In fact, we have also made this slightly worse, as follows

type AssetBase:
  identifier AssetIdentifier (1..1)
 
type AssetIdentifier extends Identifier:
  identifierType assetIdTypeEnum (1..1)
 
type Identifier:
  identifier string (1..1)

Which means, as it currently stands, when we need to reference an identifier, we need to do this:

basket -> identifier -> identifier

So there is an even stronger case for

asset ->> identifier

@Oblongs
Author

Oblongs commented May 3, 2024

On the proposed migration strategy, can we leverage Minesh’s “pre-processing” concept to implement union on the front end that is actually implemented as one-of in the DSL? That is:

View in Rosetta:

union Foo:     
  Bar1
  Bar2
  Bar3 

Implementation

type Foo:
  bar1 Bar1 (0..1)
  bar2 Bar2 (0..1)
  bar3 Bar3 (0..1)
    condition: one-of

@SimonCockx
Contributor

SimonCockx commented May 6, 2024

Just a clarification: the model, as we are currently writing it, will look like this, with the common elements in AssetBase:

union Asset:
  Basket
  Loan
  Security
 
type AssetBase:
  identifier string (1..1)
 
type Basket extends AssetBase:
  basketAttribute int (0..1)
 
type Loan extends AssetBase:
  loanAttribute number (0..*)
 
type Security extends AssetBase:
  securityAttribute date (2..2)

Does this change your analysis at all?

This should work fine!

In fact, we have also made this slightly worse, as follows

type AssetBase:
  identifier AssetIdentifier (1..1)
 
type AssetIdentifier extends Identifier:
  identifierType assetIdTypeEnum (1..1)
 
type Identifier:
  identifier string (1..1)

Which means, as it currently stands, when we need to reference an identifier, we need to do this:

basket -> identifier -> identifier

So there is an even stronger case for

asset ->> identifier

Hm, the current proposal adds ->> support for one-of and union types only. Since the type Identifier is neither, it wouldn't be possible to do that. What you describe here seems like a different use case from the ones in the original issue. Is this another requirement? Are there alternatives? E.g., typeAlias Identifier: string.

On the proposed migration strategy, can we leverage Minesh’s “pre-processing” concept to implement union on the front end that is actually implemented as one-of in the DSL? That is:

View in Rosetta:

union Foo:     
  Bar1
  Bar2
  Bar3 

Implementation

type Foo:
  bar1 Bar1 (0..1)
  bar2 Bar2 (0..1)
  bar3 Bar3 (0..1)
    condition: one-of

Currently investigating this. I took a quick look together with Minesh, and we came to the conclusion that it's easier said than done. There is a path I haven't explored yet - more to follow.

@lolabeis
Contributor

lolabeis commented May 7, 2024

Proposal looks good. A few comments and questions:

  1. In the 3rd proposal (union) when you use switch, does it assume that the path starts at (in the Basket case) asset -> basket, so you would directly start typing basketAttribute?
  2. Can some of a union's underlying types be unions as well? In this case, can you specify how the switch statement, which likely needs nesting, would work? Please use the following example:
union Instrument:
  Security
  Loan

union Asset:
  Basket
  Instrument
  3. Would you allow the following expression: asset -> basket -> basketAttribute (which means the DSL must associate some default name to each attribute), or would you only allow switch statements or calling the common attributes on a union type?
  4. Finally, I think the choice naming alternative is indeed more appropriate than union.

@lolabeis
Contributor

lolabeis commented May 7, 2024

@Oblongs With regards to your point about:

basket -> identifier -> identifier

I don't think the proposal would allow simply to shorten as:

asset ->> identifier

Instead, the approach we discussed to eliminate the extra level on this one is to define:

type AssetBase extends Identifier:
  identifierType assetIdTypeEnum (1..1)

But it's separate from the issue being discussed here.

@lolabeis
Contributor

lolabeis commented May 7, 2024

@SimonCockx There is another requirement that we'd like you to consider. Although it's another "killer-feature", it's independent from the above and not on the critical path of migration.

When defining a union, it should be possible to declare an associated enum:

union Asset:
  Basket
  Loan
  Security
  as-enum AssetTypeEnum

And then it would be possible to use AssetTypeEnum as if it were explicitly declared, e.g.:

type Collateral
  assetType AssetTypeEnum

Also how would that work in the "nested" union case?

@SimonCockx
Contributor

SimonCockx commented May 10, 2024

@lolabeis Great points, some of which I have been consciously "forgetting", given they were not listed as requirements yet.

Proposal looks good. A few comments and questions:

  1. In the 3rd proposal (union) when you use switch, does it assume that the path starts at (in the Basket case) asset -> basket, so you would directly start typing basketAttribute?

I see two options here. Just to recapitulate, the question is: how do I access basketAttribute in the following location? (xxx)

switch asset
  case Basket then xxx
  ...

Either:

  1. In each branch, we use item to indicate asset with its type narrowed down to the specific case, e.g., Basket. This would mean you could directly type basketAttribute, which would be syntactic sugar for item -> basketAttribute:
    switch asset
      case Basket then basketAttribute
      ...
    
    In case you are switching on a more complex expression, this improves conciseness even more, e.g.,
    switch this -> is -> some -> long -> path -> asset
      case Basket then basketAttribute        // instead of having to write this -> is -> some -> long -> path -> asset -> basketAttribute
      ...
    
    One potential downside is that it redefines item, which when combined with other operations that define item (such as extract) can be confusing. Fictive example:
    reportableEvents
      extract switch reportableInformation -> asset // suppose `reportableInformation` has an asset attached to it.
        case Basket then Process(basketAttribute, reportableInformation) // This won't work: reportableInformation suddenly becomes "unavailable". A modeller would have to name their `reportableEvent` explicitly. 
        ...
    
  2. In each branch, the type of asset will change to the actual narrower type, e.g., Basket. This would mean a modeller would have to refer to asset -> basketAttribute to access the attribute:
    switch asset
      case Basket then asset -> basketAttribute
      ...
    
    When switching over long expressions, this could be cumbersome, although rewriting the expression can be avoided using extract, e.g.,
    this -> is -> some -> long -> path
      extract
        switch asset
          case Basket then asset -> basketAttribute
          ...
    

Currently I'm leaning towards the first option.

  2. Can some of a union's underlying types be unions as well?

Yes, each of the union cases can be of any type, including data types, enumerations, basic types and other union types. In the long term I see additional benefits such as being able to conform to regulations that require us to either output a number or a string, e.g.,

union NumberOrString:
  number
  string

type Foo:
  bar NumberOrString (1..1)

which can then be serialised into

{
  "bar": 42
}

or

{
  "bar": "42 ounces"
}

This is something that is currently impossible to model with Rune, and for which clients have asked for support in the past.
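
For a choice between basic types such as this one, the JSON value itself already tells the reader which case applies, so no extra type tag is needed. A minimal Python sketch of that idea follows; the helper name `parse_number_or_string` is hypothetical, and mapping Rune's number to `Decimal` is an assumption made for the example.

```python
import json
from decimal import Decimal
from typing import Union


def parse_number_or_string(value) -> Union[Decimal, str]:
    """Pick the NumberOrString case from the JSON value's own type."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("expected a number or a string")
    if isinstance(value, (int, float)):
        return Decimal(str(value))
    if isinstance(value, str):
        return value
    raise TypeError("expected a number or a string")


as_number = parse_number_or_string(json.loads('{"bar": 42}')["bar"])
as_string = parse_number_or_string(json.loads('{"bar": "42 ounces"}')["bar"])
print(repr(as_number), repr(as_string))
```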

In this case, can you specify how the switch statement, which likely needs nesting, would work? Please use the following example:

union Instrument:
  Security
  Loan

union Asset:
  Basket
  Instrument

Great question! I think supporting a "flat" switch, even for nested unions, will be the easiest to read and write, so you would be able to do something like

switch asset
  case Basket then ...
  case Security then ...
  case Loan then ...
// or, if only the common attributes of Security and Loan are relevant:
switch asset
  case Basket then ...
  case Instrument then ...

Potentially, they could also be "mixed" to provide default cases for nested unions. Suppose that Instrument had a third option called AnotherInstrument, then one could write something like this:

switch asset
  case Basket then ...
  case Security then ... // this catches the first case of `Instrument`
  case Instrument then ... // this catches all other `Instrument` cases, i.e., `Loan` and `AnotherInstrument`

Note that the order of cases then starts to matter. Writing case Instrument and then case Security should be forbidden by the DSL, since the latter case will never be reached.
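
The ordering rule and the coverage guarantee can both be phrased as a simple check over the leaf types of the nested union. The following Python sketch is one possible way to express it, under the assumption that a case "catches" whichever leaf types have not been caught by an earlier case. `CHOICES`, `leaves`, and `check_switch` are hypothetical names, not part of the DSL.

```python
# Hypothetical description of the nested unions from the example,
# including the extra `AnotherInstrument` option.
CHOICES = {
    "Asset": ["Basket", "Instrument"],
    "Instrument": ["Security", "Loan", "AnotherInstrument"],
}


def leaves(name):
    """Return the set of non-union types reachable from `name`."""
    if name not in CHOICES:
        return {name}
    result = set()
    for option in CHOICES[name]:
        result |= leaves(option)
    return result


def check_switch(choice, cases):
    """Return (unreachable cases, leaf types left uncovered)."""
    remaining = leaves(choice)
    unreachable = []
    for case in cases:
        caught = leaves(case) & remaining
        if not caught:
            unreachable.append(case)  # an earlier case already caught everything
        remaining -= caught
    return unreachable, remaining


print(check_switch("Asset", ["Basket", "Security", "Instrument"]))
print(check_switch("Asset", ["Basket", "Instrument", "Security"]))
```

The first call reports nothing unreachable and nothing uncovered; the second reports `Security` as unreachable because `Instrument` comes first.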

  3. Would you allow the following expression: asset -> basket -> basketAttribute (which means the DSL must associate some default name to each attribute), or would you only allow switch statements or calling the common attributes on a union type?

This would be part of the migration strategy, but in the end I would disallow this kind of direct access of attributes that are not common, unless a compelling use case arises.

  4. Finally, I think the choice naming alternative is indeed more appropriate than union.

To give it a try, I will use choice in my following responses. :)

@SimonCockx
Contributor

On the proposed migration strategy, can we leverage Minesh’s “pre-processing” concept to implement union on the front end that is actually implemented as one-of in the DSL?

Update on this one: this is starting to look promising. We will probably follow this strategy as a quick win, and then incrementally start improving it.

@SimonCockx
Contributor

SimonCockx commented May 10, 2024

@SimonCockx There is another requirement that we'd like you to consider. Although it's another "killer-feature", it's independent from the above and not on the critical path of migration:

When defining a union, it should be possible to declare an associated enum:

union Asset:
  Basket
  Loan
  Security
  as-enum AssetTypeEnum

And then it would be possible to use AssetTypeEnum as if it were explicitly declared, e.g.:

type Collateral
  assetType AssetTypeEnum

Also how would that work in the "nested" union case?

Interesting. I think this wouldn't be too hard to add. Like you mention, the trickiness lies in how to handle nested choice types. To take your example from before:

choice Instrument as-enum InstrumentEnum: // another syntax suggestion
  Security
  Loan

choice Asset as-enum AssetEnum:
  Basket
  Instrument

I think the most useful interpretation is to flatten again. I assume the use case of representing a choice type as an enum is to indicate the actual type of an instance. Since an actual Asset will always be either a Basket, a Security or a Loan, and never an Instrument, I think that should be the case for the enum as well. I.e., AssetEnum would be equivalent to the following.

enum AssetEnum:
  Basket
  Security
  Loan

It depends on the use case of course. My interpretation could be wrong.
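
The flattening interpretation above can be sketched in a few lines of Python. This is only an illustration of the idea; `CHOICES` and `flatten` are hypothetical, and the sketch assumes the enum values keep the declaration order of the choice options.

```python
from enum import Enum

# Hypothetical description of the two nested choice types above.
CHOICES = {
    "Instrument": ["Security", "Loan"],
    "Asset": ["Basket", "Instrument"],
}


def flatten(name):
    """List the concrete (non-choice) types of `name` in declaration order."""
    if name not in CHOICES:
        return [name]
    result = []
    for option in CHOICES[name]:
        for leaf in flatten(option):
            if leaf not in result:
                result.append(leaf)
    return result


# Build the enum from the flattened names; `Instrument` itself never appears.
AssetEnum = Enum("AssetEnum", flatten("Asset"))
print([member.name for member in AssetEnum])  # ['Basket', 'Security', 'Loan']
```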

But perhaps it's best to continue this discussion in a separate issue.

@SimonCockx
Contributor

SimonCockx commented May 10, 2024

I would like to add an alternative switch syntax to the discussion that @lolabeis proposed, which I also quite like:

asset switch
  Basket then <expr>,
  Security then <expr>,
  Loan then <expr>,
  <default expr> // this is optional

This would be better aligned with other operators such as extract.

@SimonCockx
Contributor

SimonCockx commented May 10, 2024

One "use case" I didn't add to the comparison, but which would have been useful, is how to go from a specific type to a choice type, e.g., given a function that accepts an Asset as input, and given a variable of type Basket, how do I call this function?

Current solution (one-of)

Need to wrap it in an Asset constructor.

ProcessAsset(Asset { basket: basket, ... })

This is cumbersome!

Using extends

Works out of the box:

ProcessAsset(basket)

Using the proposed choice types

Works out of the box. No need to wrap!

ProcessAsset(basket)
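
For readers who know union types from general-purpose languages, the "no need to wrap" behaviour is analogous to the following Python sketch. It is an analogy only, not a description of the Rune implementation; `process_asset` is a hypothetical stand-in for ProcessAsset.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Basket:
    identifier: str


@dataclass
class Loan:
    identifier: str


@dataclass
class Security:
    identifier: str


# The union is just "one of these types"; there is no wrapper object.
Asset = Union[Basket, Loan, Security]


def process_asset(asset: Asset) -> str:
    if isinstance(asset, Basket):
        return f"basket {asset.identifier}"
    if isinstance(asset, Loan):
        return f"loan {asset.identifier}"
    if isinstance(asset, Security):
        return f"security {asset.identifier}"
    raise TypeError(f"not an Asset: {asset!r}")


basket = Basket("abc123")
print(process_asset(basket))  # a Basket is passed directly where an Asset is expected
```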

@SimonCockx
Contributor

SimonCockx commented May 10, 2024

Summary of the implementation plan

Below are four steps to get us from the current state to full support for choice types.

1. Choice types are supported as syntactic sugar of one-of types.

This is a quick win. E.g.,

choice Asset:
  Basket
  Loan
  Security

is syntactic sugar for

type Asset:
  Basket Basket (0..1)
  Loan Loan (0..1)
  Security Security (0..1)

  condition Choice:
    one-of
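
The desugaring in step 1 is mechanical, which is what makes it a quick win. As a sketch, here is a hypothetical Python helper (`desugar_choice`, not an actual part of the DSL tooling) that produces the expanded one-of type text from a choice name and its options:

```python
def desugar_choice(name, options):
    """Render the one-of type that a choice declaration stands for."""
    lines = [f"type {name}:"]
    # Each option becomes an optional attribute named after its type.
    lines += [f"  {option} {option} (0..1)" for option in options]
    lines += ["", "  condition Choice:", "    one-of"]
    return "\n".join(lines)


print(desugar_choice("Asset", ["Basket", "Loan", "Security"]))
```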

2. Killer-feature: common nested attributes of one-of (and choice types) can be accessed via a new ->> operator.

E.g.,

asset ->> identifier

This should also work for attributes that are nested with multiple levels of one-of types.
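
One way to picture the multi-level behaviour is as a loop that keeps descending into the single populated option until it reaches a type that is not a one-of. The Python sketch below illustrates this under that assumption; the `OneOf` marker class and the `deep_get` helper are hypothetical and only mimic the intended semantics of ->>.

```python
from dataclasses import dataclass
from typing import Optional


class OneOf:
    """Hypothetical marker for types that carry a one-of condition."""


@dataclass
class Basket:
    identifier: str


@dataclass
class Security:
    identifier: str


@dataclass
class Loan:
    identifier: str


@dataclass
class Instrument(OneOf):
    security: Optional[Security] = None
    loan: Optional[Loan] = None


@dataclass
class Asset(OneOf):
    basket: Optional[Basket] = None
    instrument: Optional[Instrument] = None


def deep_get(value, name):
    """Mimic `value ->> name`: skip through nested one-of levels first."""
    while isinstance(value, OneOf):
        populated = [v for v in vars(value).values() if v is not None]
        if len(populated) != 1:
            raise ValueError("a one-of must have exactly one option set")
        value = populated[0]
    return getattr(value, name)


asset = Asset(instrument=Instrument(security=Security("abc123")))
print(deep_get(asset, "identifier"))  # abc123
```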

3. Support dedicated choice types (remove syntactic sugar), add support for switch expressions, support accessing common attributes with ->, and support (de)serialisation with @type (except for basic types).

A couple of things change at this point.

  1. A basket can now directly be passed to a function expecting an asset, instead of wrapping it in Asset { basket: basket, ... }. The Asset {...} constructor syntax disappears.
  2. Accessing common attributes in choice types changes from ->> to ->. For one-of types, the ->> syntax stays the same.
  3. Discriminating a choice type now must happen by using a switch. All checks of the form if asset -> basket exists then will need to be refactored. E.g.,
    asset switch
      Basket then <item is now of type Basket>,
      Loan then <item is now of type Loan>,
      Security then <item is now of type Security>,
      default then <optional default case>
    
    Nested choice types are flattened out.
  4. Serialisation becomes less nested. E.g., what used to be
    {
      "Basket": {
        "identifier": "abc123",
        "basketAttribute": 42
      }
    }
    
    now becomes
    {
      "@type": "Basket",
      "identifier": "abc123",
      "basketAttribute": 42
    }
    

Note that this can only be done after the migration to Translate 2.0.
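
The serialisation change in point 4 of step 3 amounts to moving the wrapping field name into an `@type` entry. A minimal Python sketch of that conversion, with a hypothetical helper name `flatten_one_of`, could look like this:

```python
def flatten_one_of(nested):
    """Convert {"Basket": {...}} into {"@type": "Basket", ...}."""
    if len(nested) != 1:
        raise ValueError("a serialised one-of must contain exactly one field")
    (type_name, body), = nested.items()
    return {"@type": type_name, **body}


old_form = {"Basket": {"identifier": "abc123", "basketAttribute": 42}}
print(flatten_one_of(old_form))
```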

4. Killer-feature: add support for using choice types as enum.

E.g.,

choice Asset as-enum AssetTypeEnum:
  ...

type Underlier:
  assetType AssetTypeEnum (1..1)

In an expression:

if underlier -> assetType = AssetTypeEnum -> Basket
then ...

@lolabeis
Contributor

I would like to add an alternative switch syntax to the discussion that @lolabeis proposed, which I also quite like:

asset switch
  Basket then <expr>,
  Security then <expr>,
  Loan then <expr>,
  <default expr> // this is optional

This would be better aligned with other operators such as extract.

Fully support this, and I was about to suggest it 😄.

I think this would also allow you to do nested choice more elegantly - and I suggest using square bracket [] to be consistent with nesting of list operators. Re-using the same example as above :

asset switch
  Basket then <expr>,
  Instrument then switch [
    Security then <expr>,
    Loan then <expr>
    ],
  ...

@lolabeis
Contributor

Currently I'm leaning towards the first option.

Agree with this.

With regards to the issue of redefining item, I think it's consistent with the nesting of list operators: to access a previously defined item, it must be named.

@lolabeis
Contributor

lolabeis commented May 13, 2024

Also, with the switch syntax now redefined to align with the list operator syntax, all of the below should be allowed.

Direct attribute access (in line with simpler rule syntax):

asset switch
  Basket then basketAttribute -> ... ,
  Security then securityAttribute -> ... ,
  Loan then loanAttribute -> ...

Using default item:

asset switch
  Basket then item -> basketAttribute -> ... ,
  Security then item -> securityAttribute -> ... ,
  Loan then item -> loanAttribute -> ...

Using named item:

asset switch a [
  Basket then a -> basketAttribute -> ... ,
  Security then a -> securityAttribute -> ... ,
  Loan then a -> loanAttribute -> ...
  ]

@lolabeis
Contributor

But perhaps it's best to continue this discussion in a separate issue.

Agree, let's start a separate issue.

@lolabeis
Contributor

One "use case" I didn't add to the comparison, but which would have been useful, is how to go from a specific type to a choice type, e.g., given a function that accepts an Asset as input, and given a variable of type Basket, how do I call this function?

Current solution (one-of)

Need to wrap it in an Asset constructor.

ProcessAsset(Asset { basket: basket, ... })

This is cumbersome!

Using extends

Works out of the box:

ProcessAsset(basket)

Using the proposed choice types

Works out of the box. No need to wrap!

ProcessAsset(basket)

So you could pass a variable of type basket to a function that takes Asset as input - this is cool!

@lolabeis
Contributor

I think supporting a "flat" switch, even for nested unions, will be the easiest to read and write

With the way you redefined the switch syntax, it allows you to do nesting more easily - See above ☝️.

Your flat switch suggestion works and is quite concise, but at the expense of introducing an ordering concern, as you point out. It involves a little magic, whereas the explicit switch nesting is more transparent.

@lolabeis
Contributor

Below are four steps to get us from the current state to full support for choice types.

Your implementation plan looks sensible. There is potentially a step 5, where we may be able to get rid of the one-of syntax (and consequently of ->>) altogether, if we manage to replace all occurrences using choice - TBD.

@Oblongs
Author

Oblongs commented May 13, 2024

The inverse scenario, an enum that is also available as a choice, also exists.

We already have this enum (simplified):

enum currencyEnum:
   EUR
   GBP
   USD

It might be interesting to be able to say

enum currencyEnum as-choice Cash:
   EUR
   GBP
   USD

Of course, it would be possible to refactor currencyEnum to become a choice data type with as-enum, but its primary use will be as an enumeration, and only in edge cases as a choice data type.

@SimonCockx
Contributor

Thanks for all of the feedback. Since there is a consensus for the initial plan (steps 1 and 2 of #747 (comment)), I will start development for those. Once we get to a stage where we can start the rest of the proposal, we can summarise and continue these threads in a separate issue.
