Skip to content

Releases: dataspread/releases

Release 0.1

05 Jan 15:55
Compare
Choose a tag to compare

DataSpread: A Spreadsheet-Database Hybrid System

dataspread-fiverr2-cropped

DataSpread is a spreadsheet-database hybrid system, with a spreadsheet frontend, and a database backend. Thus, DataSpread inherits the flexibility and ease-of-use of spreadsheets, as well as the scalability and power of databases. A paper describing DataSpread's architecture, design decisions, and optimization can be found here. DataSpread is a multi-year project, supported by the National Science Foundation via award number 1633755.

Version

The current version is 0.1.

Features

DataSpread is built using PostgreSQL and ZKSpreadsheet, an open-source web-based spreadsheet tool.

DataSpread's version 0.1 enables users to scale to billions of cells and return results for common spreadsheet operations within seconds. It does so via on-demand loading of spreadsheet data.

Like traditional spreadsheet software, DataSpread supports standard spreadsheet book and sheet operations like Load, Rename, Delete, and Import (via XLS and XLSX, and CSV). Any updates to the spreadsheets are automatically saved.

Like traditional spreadsheet software, DataSpread supports the use of 225+ spreadsheet functions, along with formatting and styling operations. It also supports row and column operations like insert, delete, cut, copy, and paste; during insertion and deletion, formulae are updated as is the case in traditional spreadsheet software.

It supports all these operations while scaling to arbitrarily large spreadsheets.

In future releases, DataSpread will support SQL on the spreadsheet frontend, along with other relational algebra-based interactions. It will also support joint formula evaluation and optimization.

Key Design Innovations

  • DataSpread employs a flexible hybrid data model to represent spreadsheet data within a database.
  • DataSpread uses positional indexing techniques to both locate data by position, and keep it up-to-date as the data is updated.
  • DataSpread also employs a LRU caching mechanism to retrieve and keep in memory data from the database on demand.
  • DataSpread also employs speculative fetching to fetch additional data beyond the user's current spreadsheet window.

Setup Instructions:

You can directly use DataSpread via our cloud-hosted site.

To host DataSpread locally you can either use one of the pre-build war files, available here, or build the war file yourself from the source.

Required Software

Building Instructions (To generate a war file)

  1. Clone the DataSpread repository. Alternatively, you can download the source as a zip or tar.gz.

  2. Use Ant to build the war file using the following command. After the build completes the war is available at out/artifacts/DataSpread_war.

    ant
    

Deploying DataSpread locally.

  1. Install PostgreSQL database. Postgres.app is a quick way to get PostgreSQL working on Mac. For other operating systems check out the guides here.

  2. Create a database and an user who has access to the database. Note the database name, username and password. Typically when you have PostgreSQL installed locally the password is blank.

  3. Install Apache Tomcat. You can use the guide here. Make a note of the directory where tomcat is installed. This is known as TOMCAT_HOME in all documentation.

  4. Update the Tomcat configuration. You need to update the following two files, which are present in conf folder under TOMCAT_HOME folder.

    1. web.xml by adding the following text at the end of the file before the closing XML tag.
    <listener>
        <listener-class>org.zkoss.zss.model.impl.DBHandler</listener-class>
    </listener>
    
    1. context.xml by adding the following text at the end of the file before the closing XML tag.
    <Resource name="jdbc/ibd" auth="Container"
              type="javax.sql.DataSource" driverClassName="org.postgresql.Driver"
              url="jdbc:postgresql://127.0.0.1:5432/<database_name>"
              username="<username>" password="<password>"
                  maxTotal="20" maxIdle="10" maxWaitMillis="-1" defaultAutoCommit="false" accessToUnderlyingConnectionAllowed="true"/>
    

    Replace <database_name>, <username> and <password> with your PostgreSQL's database name, user name and password respectively.

  5. Copy postgresql-9.4.1208 (Download from here) to lib folder under TOMCAT_HOME. It is crucial to have the exact version of this file.

  6. Deploy the war file within Tomcat. This can be done via Tomcat's web interface or by manually copying the war file in the webapps folder under TOMCAT_HOME.

  7. Now you are ready to run the program. Visit the url where Tomcat is installed. It will be typically http://localhost:8080/DataSpread_war/ for a local install.

License

MIT