Releases: digital-asset/canton

canton v2.8.5

26 Apr 16:30
bdbb9b2

Release of Canton 2.8.5

Canton 2.8.5 has been released on April 26, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

This is a maintenance release, containing a bugfix. Users are advised to upgrade during their maintenance window.

Bugfixes

(24-008, Major): Deadlock in Topology Dispatcher

Issue Description

When a topology change is sent to the sequencer and there is a problem with the transmission or sequencing, the topology dispatcher waits forever to receive back the corresponding topology transaction. Later topology changes queue behind this outstanding topology change until the node is restarted.

Affected Deployments

Participant, Domain and Domain Topology Manager nodes

Affected Versions

  • All 2.3-2.7
  • 2.8.0-2.8.4

Impact

The node cannot issue topology changes anymore.

Symptom

Varies with the affected topology transaction: party allocations time out, uploaded packages cannot be used in transactions (NO_DOMAIN_FOR_SUBMISSION), cryptographic keys cannot be changed, etc.

The failure of sequencing can be seen in DEBUG logging on the node:

DEBUG c.d.c.t.StoreBasedDomainOutbox:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Attempting to push .. topology transactions to Domain ...
DEBUG c.d.c.s.c.t.GrpcSequencerClientTransport:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Sending request send-async-versioned/f77dd135-9c6a-4bd8-a6ed-a9a9f3ef43ca to sequencer.
DEBUG c.d.c.s.c.SendTracker:... tid:1da8f7fff488dad2fc4c9c0177633a7e - Sequencer send [f77dd135-9c6a-4bd8-a6ed-a9a9f3ef43ca] has timed out at ...

where the last log line by default comes 5 minutes after the first two.
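
A minimal sketch of how these log lines could be correlated when triaging a node, assuming the DEBUG log has the shape quoted above. The function name and the regular expressions are illustrative and not part of Canton. A reported id is only a hint and should be cross-checked against the preceding StoreBasedDomainOutbox line with the same tid.

```python
import re

# Patterns modelled on the two log lines quoted above. Real log lines may
# carry extra fields (timestamps, node names), which these patterns ignore.
SENT = re.compile(r"Sending request send-async-versioned/([0-9a-f-]+) to sequencer")
TIMED_OUT = re.compile(r"Sequencer send \[([0-9a-f-]+)\] has timed out")


def timed_out_sends(lines):
    """Return message ids that were sent and later reported as timed out."""
    sent = set()
    timed_out = []
    for line in lines:
        match = SENT.search(line)
        if match:
            sent.add(match.group(1))
            continue
        match = TIMED_OUT.search(line)
        if match and match.group(1) in sent:
            timed_out.append(match.group(1))
    return timed_out
```

Passing the lines of a node's log file to timed_out_sends yields the ids of sends that never completed. An empty result does not prove the node is unaffected.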

Workaround

Restart the node.

Likeliness

Occurs under unstable network conditions between the node and the sequencer (e.g., frequent termination of a subset of the connections by firewalls) and when the sequencer silently drops submission requests.

Recommendation

Upgrade to 2.8.5

Compatibility

The following Canton protocol versions are supported:

Dependency | Version
Canton protocol versions | 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode)
Postgres | Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2)
Oracle | 19.20.0

canton v2.8.4

16 Apr 15:48
7f2f46b

Release of Canton 2.8.4

Canton 2.8.4 has been released on April 16, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

This is a maintenance release containing small improvements and bug fixes.

What’s New

Node's Exit on Fatal Failures

When a node encounters a fatal failure that Canton unexpectedly cannot handle gracefully,
the node now by default exits (stops its process) and relies on an external process or service monitor to restart it. Without this exit, the
node would remain in a half-broken state, requiring a manual restart.

Failures that are currently logged as an error but not automatically recovered from are:

  1. Unhandled exceptions when processing events from a domain, which currently leads to a disconnect from that domain.
  2. Failed transition from an active replica to a passive replica, which may result in an invalid state of the node.

The new behavior can be reverted by setting: canton.parameters.exit-on-fatal-failures = false in the configuration, but
we encourage users to adopt the new behaviour.
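
Since the node now relies on an external monitor, the restart loop such a monitor performs can be sketched as follows. This is an illustration only: the supervise function, its parameters, and the assumption that a fatal failure produces a non-zero exit code are hypothetical. A production deployment would normally use an existing service manager instead.

```python
import subprocess
import time


def supervise(cmd, max_restarts=3, backoff_s=1.0,
              run=subprocess.call, sleep=time.sleep):
    """Run cmd and restart it after each non-zero exit, up to max_restarts times.

    Returns the last exit code and the number of restarts performed.
    run and sleep are injectable so the loop can be exercised without
    starting a real process.
    """
    restarts = 0
    while True:
        code = run(cmd)
        # Stop on a clean exit, or when the restart budget is used up.
        if code == 0 or restarts >= max_restarts:
            return code, restarts
        restarts += 1
        # Linear backoff so a crash-looping node does not spin.
        sleep(backoff_s * restarts)
```

The command passed as cmd would be the node's own start command, given as a list of strings.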

Minor Improvements

Error code changes

  • When an access token expires and the Ledger API stream is terminated, an ABORTED(ACCESS_TOKEN_EXPIRED) error is returned.

Fixed error cause truncation

Previously, the causes of documented errors were truncated after 512 characters. This truncation is necessary when transporting errors through gRPC,
as gRPC imposes size limits, but it was also applied to errors in logs, which cut off relevant information.
The limit is no longer applied to errors written to the logs.

Configuration Changes

Token Expiry Grace Period for Streams

When a token used in a Ledger API request to open a stream expires, the stream is terminated. This normally happens
several minutes or hours after the stream was initiated.
Users can now configure a grace period that delays stream termination beyond the token expiry:

   canton.participants.participant1.parameters.ledger-api-server-parameters.token-expiry-grace-period-for-streams=600.seconds

The grace period can be any non-negative duration where both the value and the unit are given, e.g. "600.seconds" or "10.minutes".
When the parameter is omitted, the grace period defaults to zero. When the configured value is Inf, the stream is never terminated.
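
The rules above can be sketched as a small function that maps a configured value to the point at which a stream would be cut off. This is not Canton's configuration parser: the function names are hypothetical, and the sketch handles only seconds, minutes and hours, whereas the real parser may accept other units.

```python
import math
import re

# Units handled by this sketch only; the real configuration parser may accept more.
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600}
_DURATION = re.compile(r"^(\d+)\.(seconds|minutes|hours)$")


def grace_period_seconds(value=None):
    """Map a grace-period setting to seconds, following the rules described above."""
    if value is None:      # parameter omitted: no grace period
        return 0
    if value == "Inf":     # the stream is never terminated
        return math.inf
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"need a non-negative value and a unit, got {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def stream_cutoff(token_expiry_epoch_s, value=None):
    """Earliest time (epoch seconds) at which the stream would be terminated."""
    return token_expiry_epoch_s + grace_period_seconds(value)
```

With this sketch, "600.seconds" and "10.minutes" both map to the same cutoff, and a bare number without a unit is rejected.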

Bugfixes

(24-007, Major): Domain reconnect does not complete timely during participant failover

Issue Description

During participant failover, the newly active participant does not complete reconnecting to the domain and fails to declare itself active in a timely manner.
This can happen when the passive replica that has become active was idle for a long time while the former active replica processed many transactions.

Affected Deployments

Participant

Affected Versions

  • 2.3-2.6
  • 2.7.0-2.7.7
  • 2.8.0-2.8.3

Impact

During participant failover the newly active participant does not become active and does not serve commands and transactions in a timely manner.

Symptom

One of the last log entries before the participant starts logging only storage health checks looks like:

INFO c.d.c.p.s.d.DbMultiDomainEventLog:participant=participant1 Fetch unpublished from domain::122345... up to Some(1234)

The participant reports itself as not active and not connected to any domains in its health status.

Workaround

Restart the participant node.

Likeliness

It depends on how long the passive replica has been idle and on the number and size of transactions that the active replica has processed in the meantime.
These factors influence how quickly the failover, and in particular the reconnect to the domain, completes.

Recommendation

Upgrade to 2.8.4

Compatibility

The following Canton protocol versions are supported:

Dependency | Version
Canton protocol versions | 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode)
Postgres | Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2)
Oracle | 19.20.0

canton v2.3.19

28 Mar 14:35
a1281d7

Release of Canton 2.3.19

Canton 2.3.19 has been released on March 28, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

Canton 2.3.19 has been released as part of Daml 2.3.19 with no additional change on Canton.

Compatibility

The following Canton protocol and Ethereum sequencer contract versions are supported:

Dependency | Version
Canton protocol versions | 2.0.0, 3.0.0
Ethereum contract versions | 1.0.0, 1.0.1

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM 18.9 (build 11.0.16+8, mixed mode, sharing)
Postgres | postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2)
Oracle | 19.15.0
Besu | besu/v21.10.9/linux-x86_64/openjdk-java-11
Fabric | 2.2.2

canton v2.7.9

20 Mar 17:27
3fee380

Release of Canton 2.7.9

Canton 2.7.9 has been released on March 20, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

This is a maintenance release, providing a workaround for a participant crash recovery issue on protocol version 4.

Bugfixes

This release provides a workaround for a specific participant crash recovery issue under load on protocol version 4, triggered by bug 23-021 (fixed in protocol version 5).
The workaround can be enabled for a participant through the following configuration, but this should only be done when advised by Digital Asset:
canton.participants.XXX.parameters.disable-duplicate-contract-check = yes

Compatibility

The following Canton protocol and Ethereum sequencer contract versions are supported:

Dependency | Version
Canton protocol versions | 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.70+15-CA (build 11.0.22+7-LTS, mixed mode)
Postgres | postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2)
Oracle | 19.18.0

canton v2.7.8

13 Mar 12:57
027f3c2
Update 2024-03-12.23 (#144)

Reference commit: 3f22ca358d

Co-authored-by: Canton <canton@digitalasset.com>

canton v2.3.18

15 Mar 11:54
027f3c2

Release of Canton 2.3.18

Canton 2.3.18 has been released on March 15, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

Canton 2.3.18 has been released as part of Daml 2.3.18 with no additional change on Canton.

Compatibility

The following Canton protocol and Ethereum sequencer contract versions are supported:

Dependency | Version
Canton protocol versions | 2.0.0, 3.0.0
Ethereum contract versions | 1.0.0, 1.0.1

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM 18.9 (build 11.0.16+8, mixed mode, sharing)
Postgres | postgres (PostgreSQL) 14.11 (Debian 14.11-1.pgdg120+2)
Oracle | 19.15.0
Besu | besu/v21.10.9/linux-x86_64/openjdk-java-11
Fabric | 2.2.2

canton v2.8.3

21 Feb 14:20
beab3a5

Release of Canton 2.8.3

Canton 2.8.3 has been released on February 21, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

This is a maintenance release, fixing a critical issue that can occur if overly large transactions are submitted to the participant.
The bootstrap domain command has been slightly improved, but it may now throw an error if you try to bootstrap an already
initialized domain while the domain manager node is still starting up.

Console changes

Console commands that download an ACS snapshot now take a new mandatory argument indicating whether
the snapshot will be used in the context of a party offboarding (party replication) or not. This allows Canton to
perform additional checks and makes party offboarding safer.

Affected console commands:

  • participant.repair.export_acs
  • participant.repair_download_acs (deprecated method)

New argument: partiesOffboarding: Boolean.

Bugfixes

(24-003, Moderate): Cannot exercise keyed contract after party replication

Issue Description

When replicating a party who is a maintainer on a contract key, the contract cannot be exercised anymore.

Affected Deployments

Participant

Affected Versions

  • 2.7
  • 2.8.0-2.8.1

Impact

Contracts become unusable.

Symptom

The following error is emitted when trying to exercise a choice on the contract:

java.lang.IllegalStateException:
Unknown keys are to be reassigned. Either the persisted ledger state corrupted
or this is a malformed transaction. Unknown keys:

Workaround

None.

Likeliness

Deterministic.

Recommendation

Do not use the party migration macros with contract keys in version 2.8.1. Upgrade to 2.8.2 if you want to use them.

(24-002, Critical): Looping DB errors for transactions with many (32k) key updates

Issue Description

If a transaction with a very large number of key updates is submitted to the participant, the SQL query to update the contract keys table will fail, leading to a database retry loop, stopping transaction processing. The issue is caused by a limit of 32k rows in a prepared statement in the JDBC Postgres driver.
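
The failure mode can be illustrated with a batching sketch. It assumes the limit is 32767 bind parameters per prepared statement (the largest signed 2-byte integer, consistent with the out-of-range error reported for this issue). chunk_rows is a hypothetical helper and not the fix that was applied in Canton.

```python
MAX_BIND_PARAMS = 32767  # assumption: largest signed 2-byte integer


def chunk_rows(rows, params_per_row, max_params=MAX_BIND_PARAMS):
    """Yield slices of rows whose total bind-parameter count fits one statement."""
    if params_per_row < 1 or params_per_row > max_params:
        raise ValueError("params_per_row must be between 1 and max_params")
    rows_per_chunk = max_params // params_per_row
    for start in range(0, len(rows), rows_per_chunk):
        yield rows[start:start + rows_per_chunk]
```

Each yielded slice could then be written with its own statement, so that no single statement exceeds the assumed parameter limit regardless of how many key updates the transaction contains.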

Affected Deployments

Participant on Postgres

Affected Versions

  • 2.3-2.6
  • 2.7.0-2.7.6
  • 2.8.0-2.8.1

Impact

Transactions can be submitted to the participant, but the transaction stream stops emitting transactions and commands don't appear to succeed.

Symptom

A stuck participant, continuously logging about "Now retrying operation 'com.digitalasset.canton.participant.store.db.DbContractKeyJournal.addKeyStateUpdates'" and "Tried to send an out-of-range integer as a 2-byte value: 40225"

Workaround

Upgrade to a version containing the fix. Alternatively, such a transaction may be ignored, but this must be done with care to avoid a ledger fork in case of multiple involved participants. Contact support.

Likeliness

Deterministic with very large (32k key updates) transactions.

Recommendation

Upgrade at your convenience or if experiencing the error.

(24-004, Major): DB write operations fail due to read-only connections as part of a DB failover

Issue Description

In a Postgres HA setup with a write and a read replica on AWS RDS, Canton may connect and remain connected to the former write replica during a DB failover, which results in read-only DB connections. All DB write operations then fail and are retried indefinitely.

Affected Deployments

All nodes with Postgres HA. Only observed with AWS RDS so far, not on Azure.

Affected Versions

  • All 2.3-2.6
  • 2.7.0-2.7.6
  • 2.8.0-2.8.1

Impact

A node becomes unusable due to all DB write operations failing and being retried indefinitely.

Symptom

The following kind of exceptions are logged:
org.postgresql.util.PSQLException: ERROR: cannot execute UPDATE in a read-only transaction

Workaround

Restart the node to reconnect to the current write replica. Setting the JVM DNS TTL to zero (networkaddress.cache.ttl=0) reduces the likelihood of this problem, but disables DNS caching entirely.

Likeliness

Probable in case of DB failover, only observed in AWS RDS so far.

Recommendation

Upgrade to 2.8.2 if you are using AWS RDS

(24-005, Major): Race condition during domain manager startup may prevent startup after upgrade

Issue Description

A race condition in the domain bootstrap macro and domain manager node initialisation may result in a domain manager overriding the previously stored domain parameters. This can cause the domain manager to be unable to reconnect to the domain after upgrading to version 2.7 or beyond without a hard domain migration to pv=5, as with 2.7, the default protocol version changed to pv=5.

Affected Deployments

Domain Manager Nodes

Impact

The domain manager node is unable to connect to the sequencer. Topology transactions such as party additions have no effect on the domain, as they are not verified and forwarded by the domain manager.

Symptom

Parties added to the participant do not appear on the domain. Transactions referring to such parties are rejected with UNKNOWN_INFORMEE. The domain manager logs complain about "The unversioned subscribe endpoints must be used with protocol version 4"

Workaround

If you don't run bootstrap_domain after the domain was already initialised, the issue won't happen. If a node is affected, you can recover it explicitly by setting the protocol version to 4 in the configuration and using the reset-stored-static-config parameter to correct the wrongly stored parameters.

Likeliness

Race condition which can happen if you run the bootstrap_domain command repeatedly on an already bootstrapped domain during startup of the domain manager node. As there is no need to run the bootstrap_domain command repeatedly, this issue can easily be avoided.

Recommendation

Do not run the bootstrap_domain command after the domain was initialised.
A fix has been added to prevent future mistakes.

(24-006, Minor): Large error returned from ACS query cannot be stuffed into HTTP2 headers

Issue Description

A bad Ledger API client request results in an error that contains a large metadata resource list and fails serialization in the netty layer. As a consequence, the client receives an INTERNAL error that doesn't correspond to the actual problem.

Affected Deployments

Participant Nodes

Impact

An incorrect error is returned to the ledger client, which may complicate troubleshooting.

Symptom

The ledger client observes the following error:

INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR

At the same time, the participant node reports the following error:

io.grpc.netty.NettyServerHandler - Stream Error
io.netty.handler.codec.http2.Http2Exception$HeaderListSizeException: Header size exceeded max allowed size (8192)

Workaround

Rely on participant logs when investigating ledger client issues with logs containing the message

INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR

Likeliness

It is rather difficult for Ledger API clients to create bad requests that end up in errors with a long list of resources. One example is a client requesting a long list of non-existing templates in the filter of an ACS Ledger API request.
There was an unrelated bug in the trigger service that could produce a long list of templates when the AllInDar construction was used in the trigger installation invocation.

Recommendation

Upgrade at your convenience.

Compatibility

The following Canton protocol versions are supported:

Dependency | Version
Canton protocol versions | 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode)
Postgres | Recommended: PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2), PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2), PostgreSQL 15.6 (Debian 15.6-1.pgdg120+2)
Oracle | 19.20.0

canton v2.7.7

20 Feb 10:25
885c1da
Update 2024-02-14.08 (#133)

Reference commit: bb64a1d2e9

Co-authored-by: Canton <canton@digitalasset.com>

canton v2.8.1

31 Jan 16:12
aadfd21

Release of Canton 2.8.1

Canton 2.8.1 has been released on January 31, 2024. You can download the Daml Open Source edition from the Daml Connect Github Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

This is a maintenance release of Canton including reliability fixes and user experience improvements.
Users of the Fabric driver are encouraged to upgrade at their convenience.

What’s New

Minor Improvements

Improved reference configuration example

We have reworked the reference configuration example. The examples/03_advanced_configuration example has been replaced
by a set of reference configuration files which can be found in the config directory. While the previous example
contained pieces which could be assembled into a full configuration, the new examples now contain the full configuration
which can be simplified by removing parts which are not needed.
The installation guide (https://docs.daml.com/canton/usermanual/installation.html) has been updated accordingly.

Improved Party Replication Macros

The enterprise version now supports replicating a party from one participant node to another for either migration or
to have multiple participants hosting the same party. Please consult the documentation (https://docs.daml.com/2.9.0/canton/usermanual/identity_management.html#replicate-party-to-another-participant-node)
on how to use this feature.

Reduced Background Journal Cleaning Load

We have improved the background journal cleaning to produce less load on the database by using smaller transactions to clean up
the journal.

Executor Service Metrics removed

The metrics for the execution services have been removed:

  • daml.executor.runtime.completed*
  • daml.executor.runtime.duration*
  • daml.executor.runtime.idle*
  • daml.executor.runtime.running*
  • daml.executor.runtime.submitted*
  • daml_executor_pool_size
  • daml_executor_pool_core
  • daml_executor_pool_max
  • daml_executor_pool_largest
  • daml_executor_threads_active
  • daml_executor_threads_running
  • daml_executor_tasks_queued
  • daml_executor_tasks_executing_queued
  • daml_executor_tasks_stolen
  • daml_executor_tasks_submitted
  • daml_executor_tasks_completed
  • daml_executor_tasks_queue_remaining

Bugfixes

(24-001, Major): Fabric Block Sequencer may deadlock when catching up

Issue Description

Fabric Ledger block processing runs into an unintended shutdown and fails to process blocks when the
number of in-memory blocks exceeds 5000. This happens when catching up after downtime during which the Fabric Ledger
has grown substantially.

Affected Deployments

Fabric Sequencer Node

Affected Versions

2.3 - 2.7 and 2.8.0

Impact

The sequencer stops feeding new blocks.

Symptom

Participant gets stuck in an old state and does not visibly catch up against a Fabric Ledger.
Therefore, a domains.reconnect call on the participant may appear as if it is hanging.

On the sequencer node, each processed message is logged using Observed Send with messageId including a
timestamp. Once the emission of these log lines stops, the sequencer is stuck.

Workaround

Restart the sequencer node if it is stuck.

Likeliness

Rarely, only occurs when catching up to a Fabric Ledger that the sequencer node has been an active part of before.

Recommendation

Upgrade to 2.8.1 at your convenience.

Compatibility

The following Canton protocol versions are supported:

Dependency | Version
Canton protocol versions | 3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency | Version
Java Runtime | OpenJDK 64-Bit Server VM Zulu11.66+15-CA (build 11.0.20+8-LTS, mixed mode)
Postgres | Recommended: PostgreSQL 12.17 (Debian 12.17-1.pgdg120+1) – Also tested: PostgreSQL 11.16 (Debian 11.16-1.pgdg90+1), PostgreSQL 13.13 (Debian 13.13-1.pgdg120+1), PostgreSQL 14.10 (Debian 14.10-1.pgdg120+1), PostgreSQL 15.5 (Debian 15.5-1.pgdg120+1)
Oracle | 19.20.0

canton v2.8.0-rc4

14 Dec 14:23
067eebf

Release candidates such as 2.8.0-rc4 don't come with release notes