Visualizing the evolution of social networks with node2vec + UMAP + bqplot

annaaniol/evolNET

EvolNET

This project presents one possible approach to visualizing the evolution of social networks. The workflow consists of the following steps:

  • Creating a networkx graph from the dataset including a year label for each edge
  • Performing community detection based on the graph structure using the Greedy Modularity Communities algorithm (Girvan-Newman or any other community detection algorithm may be used instead). Community membership determines the color of each node in the later visualization steps
  • For each year Y:
    • extracting a subgraph S containing the edges labeled with year <= Y
    • obtaining 32-dim embeddings for all nodes of S using the node2vec algorithm
    • reducing the embeddings from 32 to 2 dimensions with UMAP to obtain x and y coordinates for each node of S
    • visualizing all nodes of S using the obtained coordinates and precomputed community-related coloring
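
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: the function names, the toy edge list, and the choice of an undirected graph are all assumptions. The `embed_2d` helper additionally assumes the `node2vec` package from PyPI and `umap-learn`, which may differ from the implementations the notebook actually uses.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def build_graph(edges):
    """Build an undirected graph from (source, target, year) triples."""
    graph = nx.Graph()
    for source, target, year in edges:
        graph.add_edge(source, target, year=year)
    return graph


def community_ids(graph):
    """Map each node to the index of its greedy-modularity community."""
    communities = greedy_modularity_communities(graph)
    return {node: idx for idx, members in enumerate(communities) for node in members}


def cumulative_subgraph(graph, year):
    """Return the subgraph S built from all edges labeled with a year <= `year`."""
    kept = [(u, v) for u, v, y in graph.edges(data="year") if y <= year]
    return graph.edge_subgraph(kept).copy()


def embed_2d(subgraph, dimensions=32, seed=42):
    """Embed nodes with node2vec, then project them to 2-D with UMAP.

    Assumes the `node2vec` and `umap-learn` packages (hypothetical choice);
    they are imported lazily so the rest of the sketch runs without them.
    """
    import numpy as np
    import umap
    from node2vec import Node2Vec

    nodes = list(subgraph.nodes())
    model = Node2Vec(subgraph, dimensions=dimensions, quiet=True).fit(window=10, min_count=1)
    # This node2vec package stores nodes as strings in the trained vocabulary.
    vectors = np.array([model.wv[str(node)] for node in nodes])
    coords = umap.UMAP(n_components=2, random_state=seed).fit_transform(vectors)
    return {node: tuple(xy) for node, xy in zip(nodes, coords)}


# Toy data (made up for illustration): two triangles joined by a late edge.
EDGES = [
    ("a", "b", 2000), ("b", "c", 2000), ("a", "c", 2001),
    ("d", "e", 2001), ("e", "f", 2002), ("d", "f", 2002),
    ("c", "d", 2003),
]

graph = build_graph(EDGES)
colors = community_ids(graph)  # computed once on the full graph, reused for every year

for year in sorted({y for _, _, y in EDGES}):
    s = cumulative_subgraph(graph, year)
    print(year, s.number_of_nodes(), s.number_of_edges())
```

Computing the communities once on the full graph keeps each node's color stable across years, while only the coordinates change from one cumulative subgraph to the next. On a real dataset, `embed_2d(s)` would be called inside the loop; UMAP generally needs more than a handful of nodes to produce a meaningful layout.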

The visualisation is based on the bloomberg/bqplot project.

Set up

Package requirements

All required packages are listed in the req.txt file. You can create an identical conda environment by running:

conda create -n your_env_name --file req.txt

Enabling Jupyter extensions

In order to enable the required notebook extensions, run:

jupyter nbextension enable --py widgetsnbextension

Dataset

I use the DBLP Computer Science citation dataset. If you want to use the same dataset, please download DBLP-citation-Jan8.tar.bz2 from here. Extract it and place the DBLP-citation-Jan8.txt file in the main folder. Next, open the parser.ipynb notebook and run all the cells. It will produce the dblp_nodes.csv and dblp_edges.csv files, which represent the graph structure of the dataset. The attached notebook keeps only articles that were written by the top 100 publishing authors and have at least 10 citations. You can adjust both parameters in the notebook.
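
The filtering rule can be illustrated with a small sketch. It does not reproduce parser.ipynb: the record layout (`authors`, `n_citations`), the function name, and the reading that an article qualifies when at least one of its authors is among the top publishers are all assumptions made for this example.

```python
from collections import Counter


def filter_articles(articles, top_n_authors=100, min_citations=10):
    """Keep articles with enough citations and at least one top-publishing author."""
    # Count how many articles each author has published.
    paper_counts = Counter(author for art in articles for author in art["authors"])
    top_authors = {author for author, _ in paper_counts.most_common(top_n_authors)}
    return [
        art for art in articles
        if art["n_citations"] >= min_citations and top_authors.intersection(art["authors"])
    ]


# Hypothetical records, only to show the shape of the input.
ARTICLES = [
    {"id": 1, "authors": ["alice"], "n_citations": 12},
    {"id": 2, "authors": ["alice", "bob"], "n_citations": 3},
    {"id": 3, "authors": ["alice"], "n_citations": 10},
    {"id": 4, "authors": ["bob"], "n_citations": 50},
]

kept = filter_articles(ARTICLES, top_n_authors=1, min_citations=10)
print([art["id"] for art in kept])
```

Lowering either threshold yields a larger, denser graph, which in turn increases the time needed for the per-year embedding step.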

In order to use any other dataset, you should create your own parser and adjust the main bqplot notebook accordingly.

Result

TODO

[-] dynamically adjust the node2vec embedding size according to the graph size
