comp90024-project2

COMP90024 - Cluster and Cloud Computing - 2020 S1 - Project 2

Contributors

_XuLinYang
💻

_{rudy renjie meng}
💻

_{Morry Niu}
💻

_{Mingyu Su}
💻

_Olivia
💻

Videos

https://www.youtube.com/playlist?list=PL3rzNLhlw3lEjaleheqtzTwldK0IhOWBx

report

report

Repository Structure

| /.github 
      - github app configure file
  /ansible
      - ansible scripts
  /backend
      - backend server
  /crawler
      - twitter crawler
  /docs 
      - documentations
  /frontend
      - frontend application
  /sentiment-analysis
      - sentiment-analyser
  .all-contributorsrc 
      - all contributers infomation
  .travis.yml
      - travic.ci setup

COVID-19 data resource

https://github.com/covid-19-au/covid-19-au.github.io

Due: Wednesday 20th May (by 12 noon!).

results

32/40
The report is pleasant to read but contains a few impressions; there are may attempted analysis of correlation of tweets with various socio-economic variables, but the statistical analysis is lacking and the charts are sometimes poorly chosen; no use of Docker Swarm, only HA/scalability is given by the CouchDB cluster; the homemade load-balancer is nice, but of little help with CouchDB views; redundant code in the MapReduce views; Python dependencies are tracked via requirements.txt; the application has many functionalities and nice animations but a bit unpolished.

Then we ask Richard for our marking detail

  发件人: Richard Sinnott
  发送时间: 2020年6月11日 15:28
  收件人: Xulin
  主题: Re: COMP90024 Assignment 2 mark query Group 03
  
  I can remark if you want, but I note that if the score goes down then you can’t complain, e.g. say you wanted the previous score etc.
  
  Shall I proceed?
  
  >>>comments below…
  
  R.
  
  --
  Prof. Richard O. Sinnott
  Professor of Applied Computing Systems
  School of Computing and Information Systems, The University of Melbourne
  333 Exhibition Street (lvl 2), Melbourne, Vic 3010, Australia
  Tel: +61-(0)435-964-844 Email: rsinnott@unimelb.edu.au
  Web: www.eresearch.unimelb.edu.au
  
  
  
  From: Xulin <xuliny@student.unimelb.edu.au>
  Date: Thursday, 11 June 2020 at 2:04 pm
  To: Richard Sinnott <rsinnott@unimelb.edu.au>
  Subject: COMP90024 Assignment 2 mark query Group 03
  
  Hi Richard,
  We have receive the mark and the feedback of the assignment 2. We get 32/40 with the feedback: 
  “The report is pleasant to read but contains a few impressions; there are may attempted analysis of correlation of tweets with various socio-economic variables, but the statistical analysis is lacking and the charts are sometimes poorly chosen; no use of Docker Swarm, only HA/scalability is given by the CouchDB cluster; the homemade load-balancer is nice, but of little help with CouchDB views; redundant code in the MapReduce views; Python dependencies are tracked via requirements.txt; the application has many functionalities and nice animations but a bit unpolished.”
   
  I have several questions:
  1.	“no use of Docker Swarm, only HA/scalability is given by the CouchDB cluster;” As noted by the specification, the use of docker technology is not mandatory. So why we lose marks here for not using docker?
  
  >>>pretty sure you didn’t lose marks for this – it was more of a comment
  
  2.	“Python dependencies are tracked via requirements.txt;” I think it is ok to track python dependencies in requirements.txt. What’s more, in lecture it seems that we haven’t discuss such technology. And track dependencies in txt is quite comfortable in my perspective (This is the first time I track dependencies for python application). So why we lose marks here?
  
  >>>not sure you did
  
  3.	Application is unpolished. I am not sure what it means? We have error handling for the application and required content displayed. I think the application is OK. Can you explain it?
  4.	I wonder where our 8 marks lost? How much in the 2 demonstrations? How much in the report?
  
  >>>see attached for the score breakdown
   
  By XuLin Yang
  Kind Regards

The focus of this assignment

harvest tweets from across the cities of Australia on the UniMelb Research Cloud
and undertake a variety of social media data analytics scenarios that tell interesting stories of life in Australian cities
and importantly how the Twitter data can be used alongside/compared with/augment the data available within the AURIN platform to improve our knowledge of life in the cities of Australia.

Implementation Requirements

develop a Cloud-based solution that exploits a multitude of virtual machines (VMs) across the UniMelb Research Cloud for harvesting tweets through the Twitter APIs (using both the Streaming and the Search API interfaces).
produce a solution that can be run (in principle) across any node of the UniMelb Research Cloud to harvest and store tweets and scale up/down as required.
may want to explore other sources of data they find on the Internet
- e.g. information on weather, sport events, TV shows, visiting celebrities, stock market rise/falls, images from Instagram etc, however these are not compulsory to complete the work.
For the implementation, teams are recommended to use a commonly understood language across team members – most likely Java or Python.
Information on building and using Twitter harvesters can be found on the web, e.g. see https://dev.twitter.com/ and related links to resources such as Tweepy and Twitter4J.

Demo requirements

A working demonstration of the Cloud-based solution with dynamic deployment – 25% marks
A working demonstration of – 25% marks
1. tweet harvesting and
2. CouchDB utilization for specific analytics scenarios

Name		Name	Last commit message	Last commit date
Latest commit History 525 Commits
.github		.github
ansible		ansible
backend		backend
crawler		crawler
docs		docs
frontend		frontend
geojson		geojson
sentiment-analysis		sentiment-analysis
.all-contributorsrc		.all-contributorsrc
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE		LICENSE
README.md		README.md

License

yangxvlin/comp90024-project2

Folders and files

Latest commit

History

Repository files navigation

comp90024-project2

Contributors

Videos

report

Repository Structure

COVID-19 data resource

Due: Wednesday 20th May (by 12 noon!).

results

The focus of this assignment

Implementation Requirements

Demo requirements

About

Topics

Resources

License

Stars

Watchers

Forks

Languages