big-data
Here are 3,977 public repositories matching this topic...
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Updated Mar 20, 2024 - Python
ClickHouse® is a free analytics DBMS for big data
Updated May 13, 2024 - C++
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Updated Apr 25, 2024
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Updated May 13, 2024 - Java
CMAK is a tool for managing Apache Kafka clusters
Updated Aug 2, 2023 - Scala
The Data Engineering Cookbook
Updated Mar 20, 2024
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Updated May 13, 2024 - Jupyter Notebook
PredictionIO, a machine learning server for developers and ML engineers.
Updated Jan 9, 2021 - Scala
Apache Ignite
Updated May 12, 2024 - Java
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
Updated May 10, 2024 - Java
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Updated May 6, 2024 - Java
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
Updated May 13, 2024 - Java