
RFC 100: Public API for information on synchronization #318

Open

preguica opened this issue Oct 3, 2017 · 11 comments


preguica commented Oct 3, 2017

RFC 100: Public API for information on synchronization

Authors: Nuno Preguiça, Gonçalo Cabrita, Jean Araújo

Status: Under discussion
(Under discussion / Rejected / Approved: assigned to X / Implemented: pull request X)

Context and Goal

Context

In AntidoteDB, updates are propagated asynchronously among DCs. A transaction accesses a snapshot of the stable updates known at the local DC (with stable meaning that atomicity of updates in a transaction and causal dependencies across transactions are satisfied).

Thus, when a transaction reads a value in a DC, it is possible that some updates to that value have already been committed in some other DCs. Additionally, during the execution of a transaction, some other transaction may modify the read value in the same (or another) DC -- these updates are also not visible to the running transaction.

Goal

Provide an API that gives applications information about the up-to-dateness of read data. This allows the application to reason about the staleness of read data, based on information available in the local DC.

We can define the potential staleness as the time period for which the read data may be stale. This can be computed from the time of the last synchronization with each remote DC, which establishes the earliest time at which an unseen update may have been issued (at some other DC).
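
As a rough illustration, the potential staleness of an object could be derived from the last_sync call proposed below. This is a sketch only: the antidote module name and a millisecond-granularity clock_time() are assumptions, not part of the proposal.

    %% Sketch: potential staleness of Object, derived from the proposed
    %% last_sync/2 call. Assumes clock_time() is a millisecond timestamp
    %% comparable to erlang:system_time/1 (an assumption).
    potential_staleness(Object, TxId) ->
        {ok, [LastSync]} = antidote:last_sync([Object], TxId),
        erlang:system_time(millisecond) - LastSync.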

Use Case

Synchronization status

A user-facing application may want to provide information to the user on the last time the local replica has been synchronized with the remote DCs.

Information on known changes

An application may want to prevent executing transactions on data that it knows has been modified concurrently.
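
For example, with the known_stale call proposed below, such an application could guard its logic as in this sketch (the antidote module name and process/2 are hypothetical placeholders):

    %% Sketch: run the application logic only if the transaction snapshot
    %% is not known to be behind for Object. process/2 stands in for the
    %% application's own work.
    maybe_process(Object, TxId) ->
        case antidote:known_stale([Object], TxId) of
            {ok, [false]} -> process(Object, TxId);
            {ok, [true]}  -> {error, concurrent_update_known}
        end.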

API changes

External: AntidoteDB + protobuf

  • last_sync([bound_object()], TxId) -> {ok, [clock_time()]}.
    For each object, it returns the earliest time of the last synchronization with any DC replicating the object -- let ti be the last time the local DC synchronized with DC i; the result for object o is min(ti), with i ranging over the DCs where o is replicated.

  • last_sync_detail([bound_object()], TxId) -> {ok, [vectorclock()]}.
    The same as before, but detailing the last-synchronization information for each DC.

  • known_stale([bound_object()], TxId) -> {ok, [boolean()]}.
    For each data item, it returns whether the local DC has received an update that has not been reflected in the transaction snapshot.
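
The following sketch shows how an Erlang client might combine the three calls inside an already-open transaction; the antidote module name is an assumption, and the protobuf interface would mirror these signatures:

    %% Sketch: query all synchronization information for two objects.
    sync_report(ObjA, ObjB, TxId) ->
        {ok, [T1, T2]}   = antidote:last_sync([ObjA, ObjB], TxId),
        {ok, [VC1, VC2]} = antidote:last_sync_detail([ObjA, ObjB], TxId),
        {ok, [S1, S2]}   = antidote:known_stale([ObjA, ObjB], TxId),
        #{last_sync => {T1, T2}, detail => {VC1, VC2}, stale => {S1, S2}}.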

Internal: transaction manager

  • last_sync([bound_object()], TxId) -> {ok, [clock_time()]}.
  • last_sync_detail([bound_object()], TxId) -> {ok, [vectorclock()]}.
  • known_stale([bound_object()], TxId) -> {ok, [boolean()]}.
    As in the external API.

Internal: log

  • has_new_updates([bound_object()], vectorclock()) -> {ok, [integer()]}.
    For each object, returns the number of known updates not reflected in the given vectorclock.

Design

  • last_sync([bound_object()], TxId) -> {ok, [clock_time()]}.

The return value can be computed from the snapshot time of the transaction. It is only necessary to process the operation in the component that records the snapshot time for the transaction.

    %% API layer: forward the call to the transaction manager (TM).
    last_sync(Objects, TxId) ->
        tm:last_sync(Objects, TxId).

    %% TM: for each object, take the minimum snapshot entry over the DCs
    %% replicating it (the snapshot is a list of {DC, Time} pairs here).
    last_sync(Objects, TxId) ->
        SnapshotTime = get_snapshot_time(TxId),
        Last = fun(Object) ->
                   Replicas = get_replica_set(Object),
                   lists:min([T || {DC, T} <- SnapshotTime,
                                   lists:member(DC, Replicas)])
               end,
        {ok, lists:map(Last, Objects)}.

NOTE: this implementation supports partial replication.

  • last_sync_detail([bound_object()], TxId) -> {ok, [vectorclock()]}.

The return value can be computed from the snapshot time of the transaction. It is only necessary to process the operation in the component that records the snapshot time for the transaction.

    %% API layer: forward the call to the transaction manager (TM).
    last_sync_detail(Objects, TxId) ->
        tm:last_sync_detail(Objects, TxId).

    %% TM: for each object, keep the full snapshot entries of the DCs
    %% replicating it.
    last_sync_detail(Objects, TxId) ->
        SnapshotTime = get_snapshot_time(TxId),
        Detail = fun(Object) ->
                     Replicas = get_replica_set(Object),
                     [{DC, T} || {DC, T} <- SnapshotTime,
                                 lists:member(DC, Replicas)]
                 end,
        {ok, lists:map(Detail, Objects)}.

  • known_stale([bound_object()], TxId) -> {ok, [boolean()]}.

    %% API layer: forward the call to the transaction manager (TM).
    known_stale(Objects, TxId) ->
        tm:known_stale(Objects, TxId).

    %% TM: an object is stale if the log holds updates that are not
    %% reflected in the transaction snapshot.
    known_stale(Objects, TxId) ->
        SnapshotTime = get_snapshot_time(TxId),
        Stale = fun(Object) ->
                    {ok, [N]} = log:has_new_updates([Object], SnapshotTime),
                    N > 0
                end,
        {ok, lists:map(Stale, Objects)}.

  • LOG.has_new_updates([bound_object()], vectorclock()) -> {ok, [integer()]}.

The return value is computed from the log information.
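
A minimal sketch of this computation, assuming a hypothetical read_log_records/1 accessor that yields each logged update with its commit vector clock, and a vector-clock comparison le/2 (assumed available from a vectorclock module):

    %% Sketch: per object, count the logged updates whose commit vector
    %% clock is not covered by the given snapshot vector clock.
    %% read_log_records/1 is a hypothetical accessor over the local log.
    has_new_updates(Objects, SnapshotTime) ->
        {ok, [count_unseen(Object, SnapshotTime) || Object <- Objects]}.

    count_unseen(Object, SnapshotTime) ->
        Records = read_log_records(Object),
        length([1 || {_Op, CommitVC} <- Records,
                     not vectorclock:le(CommitVC, SnapshotTime)]).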

Implementation

Straightforward from the design.

Propagate calls to clocksi_readitem_server (the same way a read is done), where the snapshot time of the transaction is known.
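
As a purely hypothetical sketch of that propagation (the server_for/1 lookup and the message shape are illustrative assumptions, not the actual clocksi_readitem_server interface):

    %% Sketch: forward the query along the same path a read takes, to
    %% the read server that knows the transaction's snapshot time.
    last_sync(Objects, TxId) ->
        Results = [gen_server:call(server_for(Object),
                                   {last_sync, Object, TxId})
                   || Object <- Objects],
        {ok, Results}.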

Prior discussion

This feature has been discussed in the past in:

  • SyncFree deliverable D3.2 [https://syncfree.lip6.fr/attachments/article/46/d3.2.pdf]
  • Deepthi Devaki Akkoorath, Viktória Fördős, Annette Bieniusa: Observing the consistency of distributed systems. Erlang Workshop 2016: 54-55.

Record of changes

04-10-2017: first complete version


bieniusa commented Oct 4, 2017

The last_sync information is very similar to the potential staleness that we are already calculating, right?
One issue that I see is that this information diverges between different nodes. This means that, depending on where the respective clocksi_readitem_server is located, you might get different results. Are you planning to adapt the meta-data distribution to prevent this behaviour?

known_stale should be comparatively easy to compute, though this will depend on the intra-DC replication that we will establish.

@deepthidevaki What is your take?

@deepthidevaki

  • last_sync, if calculated from the snapshot_time, can be used to calculate the potential staleness. last_sync gives the exact time, while staleness is the difference between the current time and what is being read (i.e., the snapshot time). In this case, we don't need to propagate the call to clocksi_readitem_server, because the transaction coordinator knows the snapshot time. Another way to get last_sync is to take it from the vector clock of the partition, which is updated every time it receives a remote update. In that case, it does not reflect what is being read in the transaction, but what is available in the DC.
  • known_stale can be computed by the materializer by checking the number of updates not included in the snapshot. Remote updates are immediately cached in the materializer, so it is up to date with what is in the log.


preguica commented Oct 4, 2017 via email


preguica commented Oct 4, 2017 via email


bieniusa commented Oct 4, 2017 via email


preguica commented Oct 4, 2017 via email

@deepthidevaki

> It seemed to me that this would be only true if the materializer receives
> all updates, and this might not be the case for remote updates. But this is
> an interesting question: what is the inter-relation between the
> materializer and the inter-DC replication protocol?

The materializer receives all updates received by the inter-DC protocol. Remote updates are cached in the materializer at the same time they are written to the log.


preguica commented Oct 6, 2017 via email


bieniusa commented Oct 6, 2017 via email

@deepthidevaki

In the proposed design, last_sync and last_sync_detail are calculated from the snapshot_time of the transaction, irrespective of the list of bound_objects being passed. So passing bound_objects as a parameter is unnecessary. Do you think this will be needed in a future implementation, where last_sync is calculated differently for each key, irrespective of the transaction?


preguica commented Oct 10, 2017 via email
