big-data-processing

Here are 65 public repositories matching this topic...

impresso / impresso-text-acquisition

Python library to import OCR data in various formats into the canonical JSON format defined by the Impresso project.

big-data-processing historical-newspapers impresso-project

Updated May 1, 2024
Jupyter Notebook

john-fotis / Movie-Recommender

A movie recommender written in Go that suggests movies considering various factors within a particular dataset, encompassing users, movies, and movie ratings.

go golang big-data web-application recommender-system cosine-similarity cli-application jaccard-similarity movie-recommendation-system pearson-correlation dice-coefficient corellation big-data-processing

Updated Apr 21, 2024
Go

Lefteris-Souflas / Redis-MongoDB-Assignment

Analyzing classified ads data from the used motorcycles market. Tasks involve utilizing Redis Bitmaps for analytics on seller actions and MongoDB for analyzing bike listings. Includes data installation, cleaning, and analysis.

redis json r bitmap mongo-database big-data-processing redis-vs-rdbms-comparison

Updated Apr 17, 2024
R

drshahizan / BDM

Course covers big data fundamentals, processes, technologies, platform ecosystem, and management for practical application development.

big-data big-data-analytics big-data-processing big-data-architecture

Updated Apr 7, 2024
Jupyter Notebook

Srking501 / csc8101_coursework

A summative coursework for CSC8101 Engineering for AI

data-science big-data apache-spark pyspark parquet-files apache-parquet big-data-analytics databricks-notebooks nyc-taxi-dataset azure-databricks big-data-processing databri delta-file

Updated Mar 8, 2024
Jupyter Notebook

adnanrahin / NFL-Big-Data-Bowl-2022

The 2022 Big Data Bowl data contains Next Gen Stats player tracking, play, game, player, and PFF scouting data for all 2018-2020 Special Teams plays. Here, you'll find a summary of each data set in the 2022 Data Bowl, a list of key variables to join on, and a description of each variable.

scala big-data spark rdd spark-sql big-data-processing

Updated Feb 26, 2024
Scala

almersesunan / Portofolio

Welcome, feel free to navigate through my projects. Detailed information about each project can be found in its directory.

cloud-computing data-engineer big-data-processing

Updated Feb 19, 2024
Jupyter Notebook

Adi3042 / Data_Science

Data Science Assignment file

machine-learning natural-language-processing computer-vision deep-learning clustering exploratory-data-analysis data-visualization statistical-analysis classification dimensionality-reduction ensemble-learning feature-engineering anomaly-detection model-deployment regression-analysis big-data-processing data-cleaning-and-preprocessing time-series-analysis-and-forecasting model-selection-and-evaluation

Updated Feb 12, 2024
Jupyter Notebook

IncredibleProgress / sweetheart.py

rock-solid pillars for enterprise-grade solutions

python vue jupyter ubuntu rethinkdb rhel rust-lang nginx-unit tailwindcss big-data-processing py-script

Updated Feb 5, 2024
Python

Rifat392000 / BigDataAnalytics

visualization sql clustering eclipse virtual-machine python3 rdbms hue hadoop-filesystem hadoop-mapreduce cloudera-hadoop pyspark-notebook big-data-analytics java-mapreduce big-data-processing google-colab-notebook

Updated Jan 17, 2024
Jupyter Notebook

RghdE / CapstoneTwo_EducationalLandscape

Big Data and AI Engineering bootcamp 2nd capstone project. Using Big Data Tools to predict the probability of university enrollment for Egypt's High School students. 🏫 📚 🔬

data-science machine-learning big-data pyspark apache-pig big-data-analytics big-data-visualization big-data-projects big-data-processing

Updated Dec 14, 2023
Jupyter Notebook

airscholar / FlinkCommerce

This repository contains an Apache Flink application for real-time sales analytics. It uses Docker Compose to orchestrate the necessary infrastructure components, including Apache Flink, Elasticsearch, and Postgres.

python big-data apache-flink big-data-processing realtime-streaming

Updated Dec 4, 2023
Java

JamesHanZhang / table-data-format-transform-app

Batch or single-file format conversion between Excel, Markdown, CSV, and SQL data sources.

easy-to-use data-preprocessing etl-framework big-data-processing csv-to-excel csv-to-sql multifileupload data-cleaning-pipeline excel-to-md

Updated Nov 23, 2023
Python

Neri-kun / Licenta

Degree diploma project

machine-learning recommender-systems big-data-analytics big-data-processing

Updated Oct 4, 2023
Jupyter Notebook

VincianeDesbois / Hopitaux_Production

Study of French hospital production. (2021)

python econometrics big-data-processing

Updated Sep 19, 2023
Jupyter Notebook

eskimo-sh / eskimo

Eskimo is a state-of-the-art Big Data Infrastructure and Management Web Console to build, manage, and operate Big Data 2.0 Analytics clusters on Kubernetes. This is the git repository of Eskimo Community Edition.

Updated Sep 14, 2023
Java

vvittis / FlinkSampling

Reservoir sampling for group-by queries on the Flink platform. Effectively answering single-aggregate queries.

java topic stratum apache-flink sampling reservoir-sampling streaming-data big-data-analytics group-by big-data-processing streaming-tuples

Updated Aug 12, 2023
Java

pranjalihande / Ethereum-Analysis

Analysis of Ethereum Transactions and Smart Contracts

hadoop ethereum pyspark big-data-processing

Updated Jul 26, 2023
Jupyter Notebook

zaid-24 / Crack-Detection-using-CNN

Crack detection model using YOLOv7

python cnn pytorch big-data-processing yolov7

Updated Jul 2, 2023
Jupyter Notebook

Ayoub-etoullali / Activites-Pratiques-BigData

MapReduce job development, RDD programming, medical data management, sales analysis, and efficient data integration for big data analysis. Spark: big data processing, Sqoop integration, and Spark Structured Streaming for real-time data.

real-time spark apache-spark pyspark data-integration mapreduce real-time-data sqoop mapreduce-jobs sales-analysis spark-structured-streaming mapreduce-java real-time-database big-data-processing rdds sqoop-export sqoop-import big-data-analysis medical-data-management

Updated Jun 7, 2023
Java
