
diku-dk/EventBenchmark


Online Marketplace Microservice Benchmark Driver
======================================

Online Marketplace is a benchmark modeling an event-driven microservice system in the marketplace application domain. It is designed to reflect emerging data management requirements and challenges faced by microservice developers in practice. This project contains the benchmark driver for Online Marketplace. The driver is responsible for managing the lifecycle of an experiment, including data generation, data population, workload submission, and metrics collection.

Table of Contents

Online Marketplace Benchmark Driver

Prerequisites

  • .NET 7
  • IDE (if you want to modify or debug the code): Visual Studio or VSCode
  • A multi-core machine with enough memory to hold the generated data, in case it is kept in memory
  • Linux- or macOS-based operating system

Application Domain

Online Marketplace models the workload of an online marketplace platform. Increasingly popular, such platforms offer an e-commerce technology infrastructure so that multiple retailers can offer their products or services to a large consumer base.

Required APIs

The driver requires the target data platform to expose a set of HTTP APIs, both to set up the platform prior to workload submission and to submit the transaction requests themselves.

| API | HTTP Request Type | Microservice | Description |
|---|---|---|---|
| /cart/{customerId}/add | PUT | Cart | Add a product to a customer's cart |
| /cart/{customerId}/checkout | POST | Cart | Checkout a cart |
| /cart/{customerId}/seal | POST | Cart | Reset a cart |
| /customer | POST | Customer | Register a new customer |
| /product | POST | Product | Register a new product |
| /product | PATCH | Product | Update a product's price |
| /product | PUT | Product | Replace a product |
| /seller | POST | Seller | Register a new seller |
| /seller/dashboard/{sellerId} | GET | Seller | Retrieve the dashboard of a given seller |
| /shipment/{tid} | PATCH | Shipment | Update packages to 'delivered' status |
| /stock | POST | Stock | Register a new stock item |

For the requests that modify microservices' state (POST/PATCH/PUT), refer to classes present in Common to understand the expected payload.
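As a rough illustration of how a client might address these endpoints, the sketch below builds the HTTP method and URL for each row of the table. It is written in Python purely for illustration (the driver itself is C#), the operation names and the `build_request` helper are hypothetical, and request payloads are left out on purpose because their shape is defined by the classes in Common.

```python
# Route table transcribed from the API table: operation -> (HTTP method, path template).
# The operation names (dictionary keys) are hypothetical labels, not driver identifiers.
ROUTES = {
    "add_to_cart": ("PUT", "/cart/{customerId}/add"),
    "checkout": ("POST", "/cart/{customerId}/checkout"),
    "seal_cart": ("POST", "/cart/{customerId}/seal"),
    "register_customer": ("POST", "/customer"),
    "register_product": ("POST", "/product"),
    "update_price": ("PATCH", "/product"),
    "replace_product": ("PUT", "/product"),
    "register_seller": ("POST", "/seller"),
    "seller_dashboard": ("GET", "/seller/dashboard/{sellerId}"),
    "update_shipment": ("PATCH", "/shipment/{tid}"),
    "register_stock": ("POST", "/stock"),
}


def build_request(base_url, operation, **path_params):
    """Return (method, url) for an operation, filling in path parameters."""
    method, template = ROUTES[operation]
    path = template.format(**path_params)
    return method, base_url.rstrip("/") + path


method, url = build_request("http://orleans:8081", "add_to_cart", customerId=42)
```

Keeping the routes in one table makes it easy to check that a target platform exposes every endpoint before an experiment starts.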

Implementations

There are two stable implementations of Online Marketplace available: Orleans and Statefun. In case you want to reproduce experiments, their repositories contain instructions on how to configure and deploy Online Marketplace.

A Dapr implementation is also available, but it is outdated and may contain bugs. Use it with caution. We intend to update Online Marketplace on Dapr as soon as time allows.

Benchmark Driver

Description

The driver is written in C# and takes advantage of the thread management facilities provided by .NET. It is strongly recommended to analyze the subprojects Orleans and Statefun to understand how to extend the driver to run experiments on other data platforms. Further instructions will be included soon.

Data Generation

The driver uses DuckDB to store and query generated data during workload submission. Besides storing data in a DuckDB database file, users can also generate data in memory for use in experiments. More information can be found in Config. The benefit of persisting data in DuckDB is that the data can be safely reused in other experiments, which reduces the overall time of experiment runs.

The DuckDB.NET library is used to bridge .NET with DuckDB. However, the library currently supports only Unix-based operating systems. Because the driver depends on the data stored in DuckDB, it is unfortunately not possible to run the benchmark on Windows-based operating systems.

Furthermore, we use additional libraries to support the data generation process. Dapper is used to map rows to objects. Bogus is used to generate realistic synthetic data.
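To give a feel for what data generation involves, here is a minimal Python sketch (not the driver's C# code, and not using DuckDB or Bogus). It assumes that the number of sellers is derived as numProducts divided by numProdPerSeller and that every product starts with qtyPerProduct units in stock. Both are assumptions about the driver, and the record field names are hypothetical.

```python
import random


def generate_dataset(num_products, num_prod_per_seller, qty_per_product, seed=0):
    """Generate sellers, products, and stock items deterministically from a seed."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    # Assumption: sellers are derived from the product count.
    num_sellers = num_products // num_prod_per_seller
    sellers = [{"id": s, "name": f"seller-{s}"} for s in range(1, num_sellers + 1)]
    products, stock_items = [], []
    for seller in sellers:
        for product_id in range(1, num_prod_per_seller + 1):
            price = round(rng.uniform(1.0, 100.0), 2)  # arbitrary price range
            products.append(
                {"seller_id": seller["id"], "product_id": product_id, "price": price}
            )
            stock_items.append(
                {
                    "seller_id": seller["id"],
                    "product_id": product_id,
                    "qty_available": qty_per_product,
                }
            )
    return sellers, products, stock_items


sellers, products, stock_items = generate_dataset(100, 10, 10000, seed=1)
```

Seeding the generator gives the same data set on every call, which is the in-memory counterpart of reusing a persisted database file across experiments.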

Configuration

The driver requires a configuration file to be passed as input at startup. The configuration prescribes several important aspects of the experiment, including the transaction ratio, the target microservice API addresses, the data set parameters, the degree of concurrency, and more. An example configuration, with comments included where the parameter name is not self-explanatory, is shown below.

{
    "connectionString": "Data Source=file.db", // defines the data source. if in-memory, set "Data Source=:memory:"
    "numCustomers": 100000,
    "numProdPerSeller": 10,
    "qtyPerProduct": 10000,
    "executionTime": 60000, // prescribes each experiment's run total time
    "epoch": 10000, // defines whether the output result will show metrics 
    "delayBetweenRequests": 0,
    "delayBetweenRuns": 0,
    // the transaction ratio
    "transactionDistribution": {
        "CUSTOMER_SESSION": 30,
        "QUERY_DASHBOARD": 35,
        "PRICE_UPDATE": 38,
        "UPDATE_PRODUCT": 40,
        "UPDATE_DELIVERY": 100
    },
    "concurrencyLevel": 48,
    "ingestionConfig": {
        "strategy": "WORKER_PER_CPU",
        "concurrencyLevel": 32,
        // these entries are mandatory
        "mapTableToUrl": {
            "sellers": "http://orleans:8081/seller",
            "customers": "http://orleans:8081/customer",
            "stock_items": "http://orleans:8081/stock",
            "products": "http://orleans:8081/product"
        }
    },
    // it defines the possible multiple runs this experiment contains
    "runs": [
        {
            "numProducts": 100000,
            "sellerDistribution": "UNIFORM",
            "keyDistribution": "UNIFORM"
        }
    ],
    // defines the APIs that should be contacted at the end of every run
    "postRunTasks": [
    ],
    // defines the APIs that should be contacted at the end of the experiment
    "postExperimentTasks": [
        {
            "name": "cleanup",
            "url": "http://orleans:8081/cleanup"
        }
    ],
    // defines aspects related to customer session
    "customerWorkerConfig": {
        "maxNumberKeysToAddToCart": 10,
        "minMaxQtyRange": {
            "min": 1,
            "max": 10
        },
        "checkoutProbability": 100,
        "voucherProbability": 5,
        "productUrl": "http://orleans:8081/product",
        "cartUrl": "http://orleans:8081/cart",
        // track which tids have been submitted
        "trackTids": true
    },
    "sellerWorkerConfig": {
        // adjust price percentage range
        "adjustRange": {
            "min": 1,
            "max": 10
        },
        "sellerUrl": "http://orleans:8081/seller",
        "productUrl": "http://orleans:8081/product",
        // track product update history
        "trackUpdates": false
    },
    "deliveryWorkerConfig": {
        "shipmentUrl": "http://orleans:8081/shipment"
    }
}

Other example configuration files are found in Configuration.
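The values under transactionDistribution in the example configuration (30, 35, 38, 40, 100) look like cumulative thresholds out of 100 rather than independent percentages. Under that assumption, which the text does not state explicitly, the Python sketch below shows how a random roll could be mapped to a transaction type and what share each type would effectively receive. The function names and the boundary rule (a roll equal to a threshold selects that entry) are hypothetical.

```python
import bisect

# Copied from the example configuration; insertion order is preserved.
TRANSACTION_DISTRIBUTION = {
    "CUSTOMER_SESSION": 30,
    "QUERY_DASHBOARD": 35,
    "PRICE_UPDATE": 38,
    "UPDATE_PRODUCT": 40,
    "UPDATE_DELIVERY": 100,
}


def pick_transaction(distribution, roll):
    """Map a roll in 1..100 to the first entry whose threshold is >= roll."""
    names = list(distribution)
    thresholds = list(distribution.values())  # must be non-decreasing
    return names[bisect.bisect_left(thresholds, roll)]


def effective_shares(distribution):
    """Convert cumulative thresholds into per-transaction percentages."""
    shares, previous = {}, 0
    for name, threshold in distribution.items():
        shares[name] = threshold - previous
        previous = threshold
    return shares


shares = effective_shares(TRANSACTION_DISTRIBUTION)
```

Read this way, the example configuration would assign 30% of requests to customer sessions, 5% to dashboard queries, 3% to price updates, 2% to product updates, and 60% to delivery updates.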

Running an Experiment

Once the configuration is set, and assuming the target data platform is up and running (i.e., ready to receive requests), we can start the benchmark driver. In the project root folder, run the command for the respective data platform:

  • Orleans
dotnet run --project Orleans <configuration file path>
  • Statefun
dotnet run --project Statefun <configuration file path>

Driver Menu

In both cases, the following menu will be shown to the user:

 Select an option:
 1 - Generate Data
 2 - Ingest Data
 3 - Run Experiment
 4 - Ingest and Run (2 and 3)
 5 - Parse New Configuration
 q - Exit

Through the menu, the user can select specific benchmark tasks, including data generation (1), data ingestion into the data platform (2), and workload submission (3). In case the configuration file has been modified, one can also request the driver to read the new configuration (5) without the need to restart the driver.
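If you want to inspect or validate a configuration file from a script before asking the driver to re-read it, note that the example configuration contains `//` comments, which strict JSON parsers reject. The Python sketch below is an illustration only and says nothing about how the C# driver parses its input: it strips line comments outside string literals (so URLs such as `http://...` survive) and then parses the result. The helper names are hypothetical.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip the comment up to, but not including, the end of the line.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_config(text):
    """Parse a configuration document that may contain // comments."""
    return json.loads(strip_line_comments(text))


sample = """{
    "connectionString": "Data Source=file.db", // defines the data source
    // the transaction ratio
    "concurrencyLevel": 48,
    "customerWorkerConfig": { "cartUrl": "http://orleans:8081/cart" }
}"""
config = parse_config(sample)
```

Tracking whether the scanner is inside a string is what keeps the `//` in URL values from being mistaken for a comment.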

At the end of an experiment cycle, the results collected during execution are shown on the screen and stored automatically in a text file. The text file indicates the execution time, as well as some of the parameters used, for faster identification of a specific run.

Supplemental Material

Design

Data Generation. Ingestion Manager. Workload Manager.

Workers

  • Customer worker. Simulates a customer session
  • Seller worker. Simulates a seller session
  • Delivery worker. Simulates an external system requesting package updates
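As a hedged illustration of what a customer session might look like, the Python sketch below plans one session from the customerWorkerConfig parameters shown in the example configuration. The session logic is an assumption rather than a description of the driver: it picks between 1 and maxNumberKeysToAddToCart distinct products, draws a quantity from minMaxQtyRange for each, and then either checks out (with checkoutProbability percent chance) or seals the cart. Voucher handling is omitted, and the action tuples are hypothetical.

```python
import random


def plan_customer_session(config, product_keys, rng):
    """Return a list of planned actions for one simulated customer session."""
    max_keys = config["maxNumberKeysToAddToCart"]
    qty_min = config["minMaxQtyRange"]["min"]
    qty_max = config["minMaxQtyRange"]["max"]

    # Choose how many distinct products go into the cart, then which ones.
    num_items = rng.randint(1, min(max_keys, len(product_keys)))
    chosen = rng.sample(product_keys, num_items)

    actions = [("add", key, rng.randint(qty_min, qty_max)) for key in chosen]

    # Assumption: a session that does not check out resets (seals) the cart.
    if rng.randint(1, 100) <= config["checkoutProbability"]:
        actions.append(("checkout",))
    else:
        actions.append(("seal",))
    return actions


customer_config = {
    "maxNumberKeysToAddToCart": 10,
    "minMaxQtyRange": {"min": 1, "max": 10},
    "checkoutProbability": 100,
}
session = plan_customer_session(customer_config, list(range(1, 101)), random.Random(1))
```

With checkoutProbability set to 100, as in the example configuration, every planned session ends in a checkout.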

Statistics Collection.

Tracking Replication Correctness

The Online Marketplace implementation targeting Microsoft Orleans supports tracking the cart history (make sure that the options StreamReplication and TrackCartHistory are set to true). By tracking the cart history, we can match the items in the carts with the history of product updates. That enables the identification of possible causal anomalies related to updates in multiple objects.

To enable such anomaly detection in the driver, make sure the options "trackTids" in customerWorkerConfig and "trackUpdates" in sellerWorkerConfig are set to true in the configuration file. By tracking the history of TIDs for each customer cart, we can query the customer actors in Orleans for the contents of the carts they submitted for checkout. With the cart history, we match historic cart items against the history of product updates (tracked by the driver's seller workers) to identify anomalies.
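The matching step can be pictured with the simplified Python sketch below. It is not the driver's algorithm: it assumes each cart item records a product version and a price, that the seller workers' history lists every (version, price) state a product has been in, and it flags any cart item whose combination never existed in that history. The data layout and the `find_price_anomalies` name are hypothetical.

```python
def find_price_anomalies(cart_items, update_history):
    """Return cart items whose (version, price) pair never occurred in the history.

    cart_items: list of dicts with keys tid, product_id, version, price.
    update_history: dict mapping product_id to a list of (version, price) states.
    """
    anomalies = []
    for item in cart_items:
        known_states = update_history.get(item["product_id"], [])
        if (item["version"], item["price"]) not in known_states:
            anomalies.append(item)
    return anomalies


# Product 1 was priced at 10.0, repriced to 12.0, then replaced by version v2.
history = {1: [("v1", 10.0), ("v1", 12.0), ("v2", 12.0)]}
carts = [
    {"tid": 7, "product_id": 1, "version": "v1", "price": 12.0},
    {"tid": 8, "product_id": 1, "version": "v2", "price": 10.0},
]
suspicious = find_price_anomalies(carts, history)
```

In this toy data, the second cart item pairs the newer version with the older price, a state the product never had, so it is the one reported.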

We understand these settings are sensitive and error-prone. We plan to improve them in the near future.

Driver Scalability

The DriverBench project can run a simulated workload to test the driver's scalability, that is, the driver's ability to submit more requests as more computational resources are added.

Three impediments can keep the driver from scaling:

  • (a) Insufficient computational resources. This can be mitigated with more CPUs and memory (to hold data in memory if necessary).
  • (b) Contended workload. This does not occur if a uniform distribution is used. With a non-uniform distribution, however, the task is trickier because the driver may need some level of synchronization to make sure updates to a product are linearizable. If a non-uniform distribution is really necessary, adjusting the zipfian constant can alleviate the problem.
  • (c) The target platform itself. This can be mitigated by (i) tuning the target data platform, (ii) increasing computational resources in the target platform, or (iii) co-locating the driver with the data platform (to remove network latency).

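To see why the zipfian constant matters for contention, the Python sketch below computes the fraction of requests that would target the single hottest key under a Zipf distribution, where the key at rank r has weight 1 / r^s. This is the textbook definition, not necessarily the generator the driver uses, and the function names are hypothetical.

```python
def zipf_probabilities(num_keys, s):
    """Return the access probability of each key rank under Zipf with constant s."""
    weights = [1.0 / (rank ** s) for rank in range(1, num_keys + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def hottest_key_share(num_keys, s):
    """Fraction of all requests that hit the most popular key."""
    return zipf_probabilities(num_keys, s)[0]


# Compare the hottest key's share for a few constants over 1000 keys.
for constant in (0.0, 0.5, 1.0):
    print(constant, hottest_key_share(1000, constant))
```

A constant of 0 degenerates to the uniform distribution, and the hottest key's share grows as the constant increases, so lowering the constant spreads updates over more products and reduces the synchronization the driver needs.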
Future Work

We intend to count the "add item to cart" operation as a measured query in the driver. In the current implementation, the add item operation is not counted as part of the latency of a customer checkout. Capturing the cost of each "add item" would allow measuring the latency of the customer session as a whole, not only that of the checkout operation.

Etc

Useful links
