ClickHouse® is a real-time analytics DBMS
Updated Jun 1, 2024 - C++
Apache DataFusion SQL Query Engine
Scalable, redundant, and distributed object store for Apache Hadoop
The Open Source Feature Store for Machine Learning
An open source time-series database for fast ingest and SQL queries
A curated list of awesome tools and libraries for specific domains
🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
Module 22 challenge: Using Google Colab to work on Big Data queries with PySpark SQL, parquet, and cache partitions
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.
A collection of my data science journey - projects, code, and notes.
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
YTsaurus is a scalable and fault-tolerant open-source big data platform.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Sillot T☳Converbenk Matrix (汐洛彖夲肜矩阵), dedicated to serving the intelligent new 彖乄