curated list of awesome tools and libraries for specific domains
Updated Jun 1, 2024
Apache DataFusion SQL Query Engine
汐洛彖夲肜矩阵 (Sillot T☳Converbenk Matrix), dedicated to serving 智慧新彖乄
Scalable, redundant, and distributed object store for Apache Hadoop
YTsaurus is a scalable and fault-tolerant open-source big data platform.
The library for developing distributed Erlang applications
🚄 FASTJSON2 is a Java JSON library with excellent performance.
ClickHouse® is a real-time analytics DBMS
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld's 2023 BOSSIE Award for best open source software.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
An open source time-series database for fast ingest and SQL queries
A fast, scalable, high-performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression, and other machine learning tasks in Python, R, Java, and C++. Supports computation on CPU and GPU.
AI + Data, online. https://vespa.ai
Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features.