Master Thesis (e4 RCP) - Echo Chamber Detection in Social Media

MacHundt/SocialOcean

SocialOcean

SocialOcean lets users explore geo-tagged social media data. It was built for my Master Thesis and is tailored to echo chamber detection, but depending on the pre-processed features it can be adapted to other purposes. The tool uses a Lucene index and a corresponding PostgreSQL database; a script to create the Lucene index is included. The repository is an Eclipse RCP project, so it can be extended with plug-ins.

SocialOcean Tool Interface

The initial idea and a prototype were presented at EuroVis 2017. A demonstration video, a poster, and a short paper can be downloaded at: https://lighthouse-bodensee.de/michaelhundt/eurovis2017/
The Master Thesis, the slides of my final presentation, and a short introduction video can be downloaded at: https://lighthouse-bodensee.de/michaelhundt/socialocean/

Setup

Possible Problems

Depending on your operating system, you may have to adapt the configuration of the target platform or the .product file:

SocialOcean.product --> Configuration --> Configuration File (macosx, solaris, win32)
SocialOcean.product --> Contents --> Add Required Plug-ins

Pre-Processing

Pre-processing requires a tweets table and a users table; the obligatory fields are described below. Three scripts, (1) to (3), offer basic pre-processing, and a fourth, (4), creates the Lucene index.

src/scripts:
	(1) AddCategoryScript.java
	(2) AddSentimentScript.java
	(3) Geocoding.java
	(4) IndexTweets.java

Scripts (1) and (2) need the following database fields:

  • tweet_id, long
  • tweet_content, String

In its current form, the indexing script (4) needs the following database fields from the tweets table:

  • tweet_id, long
  • tweet_creationdate, String, timestamp of the form "yyyy-dd-MM hh:mm:ss", example: "2013-08-01 01:15:00"
  • tweet_content, String
  • relationship, String (Tweet, Followed)
  • latitude, double
  • longitude, double
  • hasurl, boolean
  • user___screenname, String
  • source, String
  • user___language, String
  • positive, int (result of SentiStrength.jar)
  • negative, int (result of SentiStrength.jar)
  • category, String (1)
  • sentiment, String (2)
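The timestamp pattern in the list above is worth checking against your own data before indexing. The following is a minimal sketch, not code from this repository: it feeds the documented pattern to `java.text.SimpleDateFormat` exactly as written, where `dd` precedes `MM`, so the example value is read as 8 January 2013 rather than 1 August 2013. Which of the two the indexing script expects cannot be determined from this README. The class name `TweetTimestampSketch` and the choice of UTC are assumptions made only for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration; it is not part of src/scripts.
class TweetTimestampSketch {

    // Pattern copied verbatim from the tweet_creationdate description above.
    static final String DOCUMENTED_PATTERN = "yyyy-dd-MM hh:mm:ss";

    // Assumption for this sketch: timestamps are interpreted as UTC.
    static final TimeZone ZONE = TimeZone.getTimeZone("UTC");

    /** Parses a tweet_creationdate value according to the documented pattern. */
    static Calendar parse(String value) {
        SimpleDateFormat format = new SimpleDateFormat(DOCUMENTED_PATTERN, Locale.ROOT);
        format.setTimeZone(ZONE);
        Calendar calendar = Calendar.getInstance(ZONE, Locale.ROOT);
        try {
            calendar.setTime(format.parse(value));
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "Not of the form " + DOCUMENTED_PATTERN + ": " + value, e);
        }
        return calendar;
    }

    public static void main(String[] args) {
        Calendar c = parse("2013-08-01 01:15:00");
        // Calendar.MONTH is zero-based, so add 1 for display.
        System.out.printf("year=%d month=%d day=%d%n",
                c.get(Calendar.YEAR),
                c.get(Calendar.MONTH) + 1,
                c.get(Calendar.DAY_OF_MONTH));
    }
}
```

If the printed month and day are swapped relative to what your data means, the pattern used for your own import needs to be adjusted accordingly.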

And the following fields from the users table:

  • gender, String (default: unknown)
  • user___statusescount, int
  • user___followerscount, int
  • user___friendscount, int
  • user___listedcount, int
  • desc___score, double (value in [0, 1] that rates the text of the user description)
  • latitude, double (3)*
  • longitude, double (3)*

*Import reference data: cities1000 from GeoNames and timezone_shapes.
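To set up a matching PostgreSQL table, the field list above can be translated into a `CREATE TABLE` statement. The following is a minimal sketch under stated assumptions: the class `UsersTableSketch`, the table name `users`, and the mapping from the Java-style type names to PostgreSQL column types are illustrative choices, not taken from this repository. The statement contains only the fields listed above; any key column that links users to tweets is not listed here and would have to be added.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration; it is not part of src/scripts.
class UsersTableSketch {

    /** Maps the Java-style type names used in the lists above to PostgreSQL types. */
    static String toPostgresType(String javaType) {
        switch (javaType) {
            case "long":    return "bigint";
            case "int":     return "integer";
            case "double":  return "double precision";
            case "boolean": return "boolean";
            case "String":  return "text";
            default: throw new IllegalArgumentException("Unmapped type: " + javaType);
        }
    }

    /** The users-table fields from the list above, in the listed order. */
    static Map<String, String> usersFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("gender", "String"); // the README names "unknown" as the default value
        fields.put("user___statusescount", "int");
        fields.put("user___followerscount", "int");
        fields.put("user___friendscount", "int");
        fields.put("user___listedcount", "int");
        fields.put("desc___score", "double");
        fields.put("latitude", "double");
        fields.put("longitude", "double");
        return fields;
    }

    /** Builds a CREATE TABLE statement with one column per field. */
    static String createTable(String table, Map<String, String> fields) {
        StringJoiner columns =
                new StringJoiner(",\n  ", "CREATE TABLE " + table + " (\n  ", "\n);");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            columns.add(field.getKey() + " " + toPostgresType(field.getValue()));
        }
        return columns.toString();
    }

    public static void main(String[] args) {
        System.out.println(createTable("users", usersFields()));
    }
}
```

The same `createTable` helper could be fed the tweets-table fields in the same way.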

Useful Tools

  • If you would like a GUI tool for the database and do not have one installed yet, you could download DataGrip or pgAdmin.

Further Reading

http://wiki.eclipse.org/Eclipse4/RCP

http://www.vogella.com/tutorials/EclipseRCP/article.html