Skip to content

TsinghuaDatabaseGroup/DITA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 

Repository files navigation

DITA: Distributed In-Memory Trajectory Analytics

DITA is a distributed in-memory trajectory analytics system based on Apache Spark 2.2.0.

Development

Since we use IntelliJ for development, you can consult the official guide for setuping up the IDE. Besides, you should do the following things:

  • Go to View > Tool Windows > Maven Projects and add hadoop-2.6, hive-provided, hive-thriftserver, yarn in Profiles (there are some default profiles as well, don't change them). Then Reimport All Maven Projects (the first button on upper-right corner), Generate Sources and Update Folders For All Projects (the second button on upper-right corner).
  • Rebuild the whole project, which would fail but is essential for following steps.
  • Marking Generated Sources:
    • Go to File > Project Structure > Project Settings > Modules. Find spark-streaming-flume-sink, and mark target/scala-2.11/src_managed/main/compiled_avro as source. (Click on the Sources on the top to mark)
    • Go to File > Project Structure > Project Settings > Modules. Find spark-hive-thriftserver, and mark src/gen/java as source. (Click on the Sources on the top to mark)
  • Rebuild the whole project again, which should work well now. If there still exist some compilation errors for not finding some classes, you may return to last step and marking corresponding sources if not included.

Examples

Usage

The master branch is the version integrated with Spark SQL, and the standalone branch is a stand-alone version just with DITA code.

Contributors

  • Zeyuan Shang: zeyuanxy [at] gmail [dot] com
  • Guoliang Li: liguoliang [at] tsinghua [dot] edu [dot] cn
  • Zhifeng Bao: zhifeng.bao [at] rmit [dot] edu [dot] au

About

Distributed In-Memory Trajectory Analytics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published