Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement multi-package composite dars #18000

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

samuel-williams-da
Copy link
Contributor

@samuel-williams-da samuel-williams-da commented Dec 8, 2023

#18000! 馃帀

Implements composite-dars (Often unofficially called fat-dars)
This is an easy way to bunch together a set of Dars into a single dar, without needing an extra daml.yaml and source files that import every module.

It takes the form of an extension to multi-package.yaml and a flag.

multi-package.yaml

packages:
  - ./my-package1
  - ...
projects:
  - ./my-sub-project
# \/ New! \/
composite-dars:
  - name: main-production # Name of the package containing the listed sub-packages
    version: 0.0.0 # Version of above ^
    packages: # Paths to directories with daml.yamls to include in the composite dar
      - ./test-proj3 
      - ./test-proj4
    dars: # Paths to explicit dars to include in the composite dar
      - ./libs/dars/my-lib.dar
    # Note that both packages and dars are option, there must simply be at least one package OR dar
    # Finally, required output path
    path: ./my-composite-dar.dar
    # Several composite dars can be specified
  - name: another-composite-dar
    ...

To build this dar, we add the --composite-dar=NAME flag to daml build
daml build --composite-dar main-production
This will build the composite dar itself, and any packages it depends on.

We also add --all-composite-dars which does as you'd expect.

Notes:

  • --all does not build composite dars, but can be used with the composite-dar flag to give the union of their behaviours
  • composite dars can only be built from the "top" multi-package.yaml, sub-projects cannot have their composite dars built without explicitly setting --multi-package-path to point to them. The CLI will warn you if you attempt to do this, or if you shadow composite-dar names in sub-projects. This behaviour is to avoid awkward shadowing and requiring another flag to specify which composite-dar you want when they collide.
  • Since the root level packages: and projects: fields in multi-package.yaml are optional, the composite-dar feature can be used without using multi-package build behaviour, by only specifying composite-dars in the multi-package.yaml, and only using the dars list within each dar.
  • The --composite-dar flag can be specified multiple times to build many composite-dars in one daml build invocation.

Tested locally that codegen still works on the generated dars.

TODO:

  • Write proper tests for this new behaviour
  • Check with curtis if this covers the needs of our clients
  • Consider how this might play with an external build system

Comment on lines 131 to 132
extractNameAndPackageIdFromPath :: FilePath -> (T.Text, LF.PackageId)
extractNameAndPackageIdFromPath = bimap (T.pack . intercalate "-") (LF.PackageId . T.pack) . fromJust . unsnoc . wordsBy (=='-') . takeBaseName
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Reconsider this safety...
The PackageId can be derived from the ArchivePayload
The unit-id requires the entire package to be read through proto, so we can read the metadata. Not ideal, perhaps theres some way to lazily parse?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been looking into this for longer than I wanted but couldn't get anywhere; the main blocker is that proto3-suite/proto3-wire are strict by default because it reads in a Parser monad so later results depend on earlier ones.

One thing we could do is make a stripped down version of the proto files with just ArchivePayload, Package, PackageMetadata and the interned strings, plus the attending haskell files for that, and use that for this purpose, but that sounds like a lot of copying and pasting.

We could also take the daml2/3 break to move the metadata up into an intermediate level, I don't think we've made any promises for the daml3 encoded dars yet?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IThe proto format of LF2 is definitely not set in stone, but I'd double-check with @remyhaemmerle-da whether that's worth it because it would be our first truly breaking change.


let lfVersion =
case nubOrd $ LF.versionMajor <$> darLfVersions of
-- Note that this will not select dev. We may want to detect if any of the packages are `X.dev` and use this if so (and warn)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I think here we should just treat X.dev as the upper bound, not as a special case. If any of the included dars are on LF X.dev, that must have been an intentional choice in a nearby daml.yaml, or someone is trying to make a composite dar with an X.dev .dalf that doesn't otherwise interact with the rest, in which case canton will let them know.

From another point of view, other teams will soon start building projects on 2.dev, so this would prevent them from packaging those projects

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that .dev should work and be taken as the max - esp. for the dogfooding reasons Moises mentioned.

conf = mkConfFile name (Just version) [] Nothing [] pkgId
-- Dalfs included unique by packageId
dalfs = nubOrdOn thd3 $ concatMap dalfsToList darDalfs
dar = createArchive name (Just version) unresolvedBuiltinSdkVersion pkgId dalf dalfs "." [] [conf] []
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is using unresolvedBuiltinSdkVersion reasonable here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh tough one but I think that's the best we can do, I don't think it would make sense to take e.g. the max of each included dar's sdk because that wouldn't play well with 0.0.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but that said - we have custom ordering on our versions such that 0.0.0 is maximum, so that would work. But I do think it's misleading to say a dar was "created" with an sdk version it wasn't. We don't maintain sdk versions of dependencies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have custom ordering on our versions such that 0.0.0 is maximum

oh I didn't know/remember this, thanks!

But I do think it's misleading to say a dar was "created" with an sdk version it wasn't

yes exactly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree - the unresolved version is sufficient here. We don't need to resolve versions to order them as greatest.

run-all-tests: true
@samuel-williams-da
Copy link
Contributor Author

In a separate PR (after a little discussion), we will add glob support to all .dar fields in the daml.yaml, including composite dars, for ease of use and consistency.

Comment on lines +231 to +232
pkg = LF.Package { LF.packageLfVersion = lfVersion, LF.packageModules = NM.empty, LF.packageMetadata = Just pkgMeta }
(dalf, pkgId) = encodeArchiveAndHash pkg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when I started reviewing this PR I was very curious what would be the main dalf of a composite .dar; it turned out to be a package without any modules. I wonder if anything on daml or canton relies on there being at least one module in a dalf, but I guess we'll find out :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a bit anxious about this - atm all composite DAR functionality is only in multi-package.yaml, so we can expect the only people who use it will know it's experimental.

Comment on lines 130 to 131
extractNameAndPackageIdFromPath :: FilePath -> (T.Text, LF.PackageId)
extractNameAndPackageIdFromPath = second LF.PackageId . T.breakOnEnd "-" . T.pack . takeBaseName
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry I forgot to mention, fst . T.breakOnEnd "-" will keep a trailing hyphen so we need something like this

Suggested change
extractNameAndPackageIdFromPath :: FilePath -> (T.Text, LF.PackageId)
extractNameAndPackageIdFromPath = second LF.PackageId . T.breakOnEnd "-" . T.pack . takeBaseName
extractNameAndPackageIdFromPath :: FilePath -> Maybe (T.Text, LF.PackageId)
extractNameAndPackageIdFromPath = bimapM (T.stripSuffix "-") (pure . LF.PackageId) . T.breakOnEnd "-" . T.pack . takeBaseName

One benefit here is that it forces us to handle the Nothing case, where the basename breaks our assumptions and doesn't actually contain a hyphen

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh i saw this in the docs and just completely forgot when implementing, cheers.

Comment on lines +453 to +454
, test "Building a transitive composite dar gives an error including where the dar is" ["--composite-dar=unshadowed-transitive"] "./first" compositeDarShadowingProject
$ Left "Found definition for unshadowed-transitive in the following sub-projects"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this fail if there's a single definition of unshadowed-transitive?

Copy link
Contributor

@akrmn akrmn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this through @samuel-williams-da!
Just one remaining question about the test case Building a transitive composite dar gives an error including where the dar is

Copy link
Contributor

@dylant-da dylant-da left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me too, thank you!

compositeDarObjects <- fromMaybe [] <$> queryMultiPackageConfig ["composite-dars"] multiPackage
mpCompositeDars <- traverse parseCompositeDar compositeDarObjects
-- n^2 but this list is usually so small that its not worth complicating the logic
forM_ mpCompositeDars $ \cd ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth spinning this into its own utility function "findDuplicates", since we do this pretty often throughout the codebase I find. We can save it for gardening.

<$> (MultiPackageConfigFields <$> traverse canonicalizePath packagePaths)
<$>
( MultiPackageConfigFields
<$> traverse canonicalizePath (mpPackagePaths mpc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth getting rid of canonicalizePath here for something without IO, again potential gardening

[] -> do
hPutStrLn stderr $ "Couldn't find composite dar with the name " <> T.unpack (LF.unPackageName darName) <> " in " <> mpPath multiPackageConfig
warnTransitiveCompositeDar stderr multiPackageConfig darName
exitFailure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we exit with failure after forM is done running, s.t. an error/warning is printed for every composite dar asked for? This maximizes the amount of useful info for a call.

then toNormalizedFilePath' <$> mpPackagePaths multiPackageConfig
else nubOrd $ concatMap (fmap toNormalizedFilePath' . cdPackages) compositeDars
void $ uses_ BuildMulti packagePaths
liftIO $ traverse_ (buildAndWriteCompositeDar loggerH buildableDataDeps) compositeDars
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call and its behaviour feels like something that could live in Shake, but this PR is big already so best to put into another PR

Comment on lines +231 to +232
pkg = LF.Package { LF.packageLfVersion = lfVersion, LF.packageModules = NM.empty, LF.packageMetadata = Just pkgMeta }
(dalf, pkgId) = encodeArchiveAndHash pkg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a bit anxious about this - atm all composite DAR functionality is only in multi-package.yaml, so we can expect the only people who use it will know it's experimental.

conf = mkConfFile name (Just version) [] Nothing [] pkgId
-- Dalfs included unique by packageId
dalfs = nubOrdOn thd3 $ concatMap dalfsToList darDalfs
dar = createArchive name (Just version) unresolvedBuiltinSdkVersion pkgId dalf dalfs "." [] [conf] []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree - the unresolved version is sufficient here. We don't need to resolve versions to order them as greatest.


let lfVersion =
case nubOrd $ LF.versionMajor <$> darLfVersions of
-- Note that this will not select dev. We may want to detect if any of the packages are `X.dev` and use this if so (and warn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that .dev should work and be taken as the max - esp. for the dogfooding reasons Moises mentioned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants