spark-java-udf-demo

This project is built on the Spark 2.0.2 API and uses the Dataset approach instead of traditional RDDs.

This project demonstrates the capabilities of Spark user-defined aggregate functions (UDAFs), which can be used to extend the built-in aggregate functions.

The input data source has the following fields, shown first as a header and then as comma-separated sample records:

swid|device_category|time_spent|video_start

A,mob,5,1
A,desk,5,2
A,desk,5,3
A,mob,5,2
A,mob,20,16
B,desk,5,2
B,mob,5,2
B,mob,5,2
B,desk,5,2
B,desk,5,2
C,desk,5,2
C,OTT,5,2

This file is available in the project at /src/main/resources/user_activities.txt

The expected output is:

[B,{"mob":"40.00%","desk":"60.00%"},25,10]
[C,{"desk":"50.00%","OTT":"50.00%"},10,4]
[A,{"mob":"75.00%","desk":"25.00%"},40,24]

This is an aggregation by user (swid): for each user it computes total_time_spent, total_video_starts, and the device usage distribution as a percentage of the user's time spent. For example, user A spends 30 of 40 time units on mob, which gives 75.00%. This gives more insight into each user's device usage and can be used to build a usage-driven recommendation system.
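The arithmetic behind that output can be sketched in plain Java, without Spark. This is only an illustration of the aggregation logic, not the project's UDAF code: the class and method names (UserUsageSketch, UserTotals, aggregate, deviceShare) are hypothetical, and the sketch assumes the device percentages are weighted by time_spent, which is what the sample output for user A implies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java sketch (no Spark) of the per-user aggregation; all names are hypothetical.
class UserUsageSketch {

    // Running totals for a single swid.
    static final class UserTotals {
        long totalTimeSpent = 0;
        long totalVideoStarts = 0;
        // time_spent summed per device_category, in order of first appearance
        final Map<String, Long> timeByDevice = new LinkedHashMap<>();

        void add(String device, long timeSpent, long videoStarts) {
            totalTimeSpent += timeSpent;
            totalVideoStarts += videoStarts;
            timeByDevice.merge(device, timeSpent, Long::sum);
        }

        // Each device's share of the user's total time, formatted like "75.00%".
        Map<String, String> deviceShare() {
            Map<String, String> share = new LinkedHashMap<>();
            for (Map.Entry<String, Long> e : timeByDevice.entrySet()) {
                double pct = totalTimeSpent == 0 ? 0.0 : 100.0 * e.getValue() / totalTimeSpent;
                share.put(e.getKey(), String.format(Locale.ROOT, "%.2f%%", pct));
            }
            return share;
        }
    }

    // Groups records of the form swid,device_category,time_spent,video_start by swid.
    static Map<String, UserTotals> aggregate(List<String> lines) {
        Map<String, UserTotals> byUser = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split(",");
            if (f.length != 4) {
                continue; // skip blank or malformed lines
            }
            byUser.computeIfAbsent(f[0], k -> new UserTotals())
                  .add(f[1], Long.parseLong(f[2].trim()), Long.parseLong(f[3].trim()));
        }
        return byUser;
    }

    public static void main(String[] args) {
        // The records for user A from the sample input.
        List<String> sample = Arrays.asList(
                "A,mob,5,1", "A,desk,5,2", "A,desk,5,3", "A,mob,5,2", "A,mob,20,16");
        for (Map.Entry<String, UserTotals> e : aggregate(sample).entrySet()) {
            UserTotals t = e.getValue();
            System.out.println(e.getKey() + " " + t.deviceShare()
                    + " " + t.totalTimeSpent + " " + t.totalVideoStarts);
        }
    }
}
```

For user A this yields a mob share of 75.00%, a desk share of 25.00%, and totals of 40 and 24, matching the expected row. The ordering of users and of devices within the map may differ from what the Spark job prints.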

You also need to provide the S3 or HDFS location of the input file.

How to run this:

spark-submit --class com.parmarh.driver.UserAggDriver spark-java-udf-demo-0.0.1-SNAPSHOT.jar
