Helps a company make appropriate business strategies to enhance its revenue by analyzing customer behavior and sending offers and loyalty rewards to customers accordingly
A machine learning model is built using PySpark's MLlib library to automatically flag suspicious job postings on Indeed.com. The dataset includes 18,000 job descriptions, out of which about 800 are fake.
Writing dummy snippets of code to read, manipulate, and build a simple ML model with PySpark.
Given a set of documents and a minimum required similarity threshold, find the number of document pairs whose similarity exceeds the threshold
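A simplified, non-Spark sketch of the task: compare every document pair with Jaccard similarity over word sets and count the pairs that meet the threshold. Function names and sample documents are illustrative.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    # Ratio of shared words to total distinct words across both documents.
    return len(a & b) / len(a | b) if a | b else 0.0

def similar_pairs(docs, threshold):
    word_sets = [set(d.lower().split()) for d in docs]
    return sum(
        1
        for i, j in combinations(range(len(word_sets)), 2)
        if jaccard(word_sets[i], word_sets[j]) >= threshold
    )

docs = [
    "big data processing with spark",
    "data processing with apache spark",
    "cooking recipes for beginners",
]
print(similar_pairs(docs, 0.5))  # → 1 (only the first two documents match)
```

A Spark version would distribute the pairwise comparison (or avoid it entirely with MinHash/LSH), but the counting logic is the same.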
This notebook contains detailed code for Spark, machine learning, and Databricks
A laboratory to carry out experiments with PySpark
An ETL pipeline for I94 immigration, global land temperatures and US demographics datasets is created to form an analytics database on immigration events. A data model is established with pandas and pyspark to find patterns of immigration to the United States.
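A toy sketch (with made-up rows and column names) of the kind of pandas join such an analytics model relies on: a fact table of arrival events keyed to a dimension table of destination states.

```python
import pandas as pd

# Fact table: one row per immigration arrival event (illustrative).
arrivals = pd.DataFrame({
    "arrival_id": [1, 2, 3],
    "state": ["CA", "NY", "CA"],
    "visa_type": ["B2", "F1", "H1B"],
})

# Dimension table: demographics per destination state (illustrative).
demographics = pd.DataFrame({
    "state": ["CA", "NY"],
    "median_age": [36.5, 38.2],
})

# Left-join facts to dimensions, then aggregate arrivals per state.
fact = arrivals.merge(demographics, on="state", how="left")
per_state = fact.groupby("state").size().to_dict()
print(per_state)  # → {'CA': 2, 'NY': 1}
```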
Exploring a best-practice Apache Spark working environment for robust data pipelines
An academic project carried out for the Distributed Data Analysis and Mining course (academic year 2022/2023)
MapReduce job development, RDD programming, medical data management, sales analysis, and efficient data integration for big data analysis. Spark: big data processing, Sqoop integration, and Spark Structured Streaming for real-time data.
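A minimal pure-Python simulation of the classic MapReduce word count behind such jobs: a map phase emitting (word, 1) pairs, a shuffle grouping pairs by key, and a reduce phase summing the counts. This is illustrative only; a real job would run on Hadoop or Spark.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit (word, 1) for every word in every input line.
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the grouped counts per word.
    return {key: sum(values) for key, values in groups.items()}

lines = ["spark spark hadoop", "hadoop streaming"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # → {'spark': 2, 'hadoop': 2, 'streaming': 1}
```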
A project for the development of rich geospatial data from the city of São Paulo for use in machine learning models.
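A hedged sketch of one common geospatial feature for such ML models: great-circle (haversine) distance between two coordinates. The sample points are illustrative coordinates near São Paulo, not data from the project.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance on a sphere of radius ~6371 km (Earth).
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))

# Two points 0.1 degrees of latitude apart (≈ 11.1 km).
d = haversine_km(-23.55, -46.63, -23.45, -46.63)
print(round(d, 1))  # → 11.1
```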
Using the Thunder library for image processing with Spark MLlib
Demo notebooks showing Jupyter Notebook's capabilities for teaching, learning, and research in programming
An Apache Spark application for OpenShift using PySpark, Flask, and the Dataverse API