This repository has been archived by the owner on Oct 7, 2021. It is now read-only.
Andy Pavlo edited this page Aug 8, 2017 · 13 revisions

Welcome to the Carnegie Mellon Database Application Catalog (CMDBAC) wiki.

What is CMDBAC?

The CMDBAC is a collection of open-source database applications that you can run locally for benchmarking and experimentation. We have created an online repository that lets you search for applications whose workload properties are relevant to your research.

How Does CMDBAC Accomplish Its Goals?

The first component is a crawler that finds database applications hosted on open-source repositories (e.g., GitHub). The crawler uses heuristics that allow it to identify whether a project uses a database for storage. We target web-based applications that use well-known web frameworks, so we can treat a project as relevant if its source code references libraries from one of these frameworks.
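As a rough illustration of this kind of heuristic, the sketch below scans a project's files for references to a framework's libraries. It is not the crawler's actual implementation: the `FRAMEWORK_MARKERS` table, the `detect_frameworks` function, and the two patterns are hypothetical and cover only Django imports and a Rails entry in a Gemfile.

```python
import re

# Hypothetical marker patterns, chosen for illustration only.
# The real CMDBAC heuristics are not described on this page.
FRAMEWORK_MARKERS = {
    # A Python import of the django package, e.g. "from django.db import models".
    "Django": re.compile(r"^\s*(from|import)\s+django\b", re.MULTILINE),
    # A Gemfile entry such as: gem 'rails', '4.2.0'
    "Ruby on Rails": re.compile(r"^\s*gem\s+[\"']rails[\"']", re.MULTILINE),
}


def detect_frameworks(files):
    """Return the sorted names of frameworks referenced by a project.

    `files` maps a file path to that file's text content.
    """
    found = set()
    for text in files.values():
        for name, pattern in FRAMEWORK_MARKERS.items():
            if pattern.search(text):
                found.add(name)
    return sorted(found)
```

Under these assumptions, a project is a candidate for the catalog when `detect_frameworks` returns a non-empty list. A real crawler would need more markers per framework and would likely restrict each pattern to the file types where it is meaningful.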

We then developed a tool for automatically deploying an application in a VM sandbox. Targeting applications that use common web frameworks (listed under Current Status) makes this step easier because these frameworks provide an object-relational mapping library that does not depend on a particular DBMS. Their configurations are also likely to be the same (e.g., setting the DBMS credentials in a common configuration file).
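To make the configuration point concrete, here is a minimal sketch of how a deployment tool could build the database section of a Django settings file from a set of sandbox credentials. The helper `django_database_settings` and the `DJANGO_ENGINES` table are hypothetical and are not taken from the CMDBAC code; the dictionary keys follow Django's documented `DATABASES` setting, and the engine paths are those used by recent Django releases.

```python
# Hypothetical mapping from a DBMS name to a Django database backend.
DJANGO_ENGINES = {
    "postgresql": "django.db.backends.postgresql",
    "mysql": "django.db.backends.mysql",
}


def django_database_settings(dbms, name, user, password,
                             host="127.0.0.1", port=""):
    """Build a value for Django's DATABASES setting.

    Because the framework's ORM hides the DBMS behind a backend name,
    switching the target DBMS only changes the ENGINE entry.
    """
    if dbms not in DJANGO_ENGINES:
        raise ValueError("unsupported DBMS: " + dbms)
    return {
        "default": {
            "ENGINE": DJANGO_ENGINES[dbms],
            "NAME": name,
            "USER": user,
            "PASSWORD": password,
            "HOST": host,
            "PORT": port,
        }
    }
```

For example, `django_database_settings("mysql", "appdb", "sandbox", "secret")` yields a dictionary that could be written into the application's settings module before the application is started in the sandbox. Other frameworks keep the same information in their own files, so a deployer would need one such helper per framework.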

Current Status

The CMDBAC currently contains over 1,000 applications of varying complexity. We target web applications based on popular programming frameworks because (1) they are easier to find and (2) we can automate the deployment process. We support applications that use the Django, Ruby on Rails, Drupal, Node.js, and Grails frameworks.