
Checking Snapshot Isolation of MongoDB

This is the Jepsen runner for MongoDB used in our paper, [2111.14946] Verifying Transactional Consistency of MongoDB (arxiv.org).

This project is responsible for running MongoDB transactions under both replica set and sharded cluster deployments and collecting their execution histories into local files.

The execution history is generated in ./store/latest/ after each test. Three main files are used for checking:

  1. history.edn: The execution history of transactions executed by the Jepsen test framework.
  2. txns.json: Transaction information parsed from the oplog stored in MongoDB's collection oplog.rs.
  3. mongod.json: Transaction information parsed from the log file mongod.log on each primary.

Our MongoDB-SI-Checker takes these files as input and checks the execution history against the corresponding snapshot isolation variant: REALTIME-SI for replica set and SESSION-SI for sharded cluster.
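
As a minimal sketch (not part of this project), the helper below gathers the three files from a run directory and picks the variant to check against. The file names and variant names come from the text above. The function name and the deployment label "sharded-cluster" are assumptions made for illustration only.

```python
from pathlib import Path

# File names are taken from the README; the deployment labels are assumptions.
REQUIRED_FILES = ("history.edn", "txns.json", "mongod.json")
SI_VARIANT = {
    "replica-set": "REALTIME-SI",
    "sharded-cluster": "SESSION-SI",  # hypothetical label for the sharded deployment
}


def collect_checker_inputs(store_dir, deployment):
    """Return (paths, variant) for one test run, or raise if something is missing."""
    if deployment not in SI_VARIANT:
        raise ValueError(f"unknown deployment: {deployment!r}")
    store = Path(store_dir)
    paths = {name: store / name for name in REQUIRED_FILES}
    # Report every missing file at once rather than failing on the first one.
    missing = [name for name, path in paths.items() if not path.is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {store}: {', '.join(missing)}")
    return paths, SI_VARIANT[deployment]
```

For example, collect_checker_inputs("./store/latest", "replica-set") would return the three paths together with "REALTIME-SI" once a test has finished.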

The architecture of the project is as follows.

[Figure: architecture of the project]

0. Update

  • Version v0.2, with proof, will be announced on arXiv on Tue, 7 Dec 2021 01:00:00 GMT.

1. Configuration

1.1 Direct Configuration

To run this project, you NEED:

  • 1 control node for running this Jepsen testing.
    • More details about the control node can be found in jepsen-io.
    • You also need a Python 3.6+ environment with the pymongo package installed.
  • At least 3 nodes for deploying a MongoDB replica set deployment.
  • At least 9 nodes for deploying a MongoDB sharded cluster deployment.
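
The node requirements above can be checked before starting a test. The sketch below is an illustration only: it assumes the nodes file lists one hostname per line, and the helper names and the "sharded-cluster" label are hypothetical rather than part of this project.

```python
# Minimum node counts stated in the README; the "sharded-cluster" label is an assumption.
MIN_NODES = {"replica-set": 3, "sharded-cluster": 9}


def read_nodes(text):
    """Parse nodes-file content, assuming one hostname per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_node_count(nodes, deployment):
    """Return the node count, or raise ValueError if there are too few nodes."""
    required = MIN_NODES[deployment]
    if len(nodes) < required:
        raise ValueError(
            f"{deployment} needs at least {required} nodes, got {len(nodes)}"
        )
    return len(nodes)
```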

1.2 Docker Compose Configuration

Since the requirements for direct configuration are fairly high, we provide a docker-compose setup that makes it as easy as possible to run this project and reproduce our experiments at low cost. Read docker-compose-README for more details.

2. Getting started

2.1 Usage

# Usage
lein run
lein run test --help

2.2 Examples

lein run test-all -w wr -d replica-set \
--nodes-file ~/nodes-replica -r 1000 --concurrency 3n --timeout-per-txn 10000 --max-txn-length 8 \
--time-limit 60 --max-writes-per-key 128 \
--txn-read-concern snapshot --txn-write-concern majority \
--nemesis-interval 1 --nemesis partition --test-count 1 --leave-db-running

This example runs a 60-second read-write-register test against a replica set, with write concern "majority" and read concern "snapshot" in each transaction. 3n clients concurrently issue operations to the database, where n is the size of the replica set under test. Each transaction contains at most 8 operations, and each key receives at most 128 write operations.

Note that the nemesis is disabled in the source code, so the partition setting here has no effect.

  • 60 seconds
    • (--time-limit 60)
  • read-write-register test
    • (-w wr)
  • with write concern "majority" and read concern "snapshot" in each transaction
    • (--txn-read-concern snapshot --txn-write-concern majority)
  • in a replica-set.
    • (-d replica-set)
  • 3n clients will concurrently issue operations to the database where n is the size of a replica set we test.
    • (--concurrency 3n)
  • There are at most 8 operations for each transaction
    • (--max-txn-length 8)
  • and for each key there are at most 128 write operations.
    • (--max-writes-per-key 128)
  • The nemesis is disabled in the source code, so the partition here has no effect.
    • (--nemesis partition)
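
To make the --concurrency value concrete, here is a small sketch of how a value such as 3n resolves to an absolute number of clients. The function name is hypothetical; it only mirrors the rule described above (a number followed by n is multiplied by the node count) and is not the code Jepsen itself uses.

```python
def resolve_concurrency(spec, node_count):
    """Turn a value such as "3n" or "12" into an absolute client count."""
    spec = spec.strip()
    if spec.endswith("n"):
        # "3n" means 3 clients per node.
        return int(spec[:-1]) * node_count
    return int(spec)
```

With a 3-node replica set, resolve_concurrency("3n", 3) gives 9 clients.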

More details on the parameters can be found by running lein run test --help.

2.3 Scripts

Checking requires more than a plain Jepsen test run: the oplog and mongod.log must also be parsed. We have wrapped the steps that produce these supplementary files, for both replica set and sharded cluster deployments, in scripts. You can quickly try them by running:

  • run-replica.sh
  • run-sharded.sh

Each script takes 2 additional arguments:

  • args[0]: time limit (in seconds) for the test.
  • args[1]: max-txn-length for each transaction.
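
The argument order can be illustrated with a short sketch that builds the command line for one of the scripts. The helper is hypothetical, and it assumes the script is executable and located in the current directory.

```python
def script_command(script, time_limit, max_txn_length):
    """Build the argv list: script, then time limit (seconds), then max-txn-length."""
    if time_limit <= 0 or max_txn_length <= 0:
        raise ValueError("both arguments must be positive integers")
    # Assumption: the script is invoked from the directory that contains it.
    return [f"./{script}", str(time_limit), str(max_txn_length)]
```

For a 60-second test with at most 8 operations per transaction, script_command("run-replica.sh", 60, 8) yields a list that can be passed to subprocess.run.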

3. Data

A sample of the run output is available in Original Data(Execution History) of MongoDB(replica set and sharded cluster).

4. License

Copyright © 2021 Hongrong Ouyang

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.
