BiGraph

BiGraph is a Python package for link prediction in bipartite networks.

Node-based similarity measures and Katz similarity have been implemented. You can find the algorithms in the bigraph module. Algorithms implemented so far:

Algorithms table
Number Algorithm
1 Jaccard
2 Adamic-Adar
3 Common neighbors
4 Preferential attachment
5 Katz similarity
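
For orientation, the textbook definitions of these indices can be sketched in plain Python. This is a minimal sketch, not BiGraph's implementation: the toy adjacency dict, the function names, and the bipartite adaptation (comparing a left node's neighbors with a right node's second-hop neighbors) are all assumptions made for illustration, and the Katz index is truncated at a fixed walk length.

```python
import math

# Toy bipartite graph as an adjacency dict (hypothetical data, not BiGraph output).
adj = {
    "u0": {"v1"},
    "u1": {"v1", "v2"},
    "u2": {"v2"},
    "v1": {"u0", "u1"},
    "v2": {"u1", "u2"},
}


def second_hop(adj, node):
    """Neighbors of the neighbors of `node`, excluding `node` itself."""
    return {w for n in adj[node] for w in adj[n]} - {node}


def candidate_sets(adj, u, v):
    """Assumed bipartite adaptation: u's neighbors and v's second-hop
    neighbors both live on the right side, so they can be compared."""
    return adj[u], second_hop(adj, v)


def common_neighbors(adj, u, v):
    a, b = candidate_sets(adj, u, v)
    return len(a & b)


def jaccard(adj, u, v):
    a, b = candidate_sets(adj, u, v)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    a, b = candidate_sets(adj, u, v)
    # Degree-1 nodes are skipped because log(1) == 0.
    return sum(1.0 / math.log(len(adj[z])) for z in a & b if len(adj[z]) > 1)


def preferential_attachment(adj, u, v):
    return len(adj[u]) * len(adj[v])


def katz(adj, u, v, beta=0.1, max_len=5):
    """Truncated Katz index: sum of beta**l times the number of walks
    of length l from u to v, for l = 1..max_len."""
    walks = {u: 1}  # walks[x] = number of walks of the current length from u to x
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for node, count in walks.items():
            for nb in adj[node]:
                nxt[nb] = nxt.get(nb, 0) + count
        walks = nxt
        score += beta ** length * walks.get(v, 0)
    return score


# Score the missing link (u0, v2) with every index.
for fn in (common_neighbors, jaccard, adamic_adar, preferential_attachment, katz):
    print(fn.__name__, fn(adj, "u0", "v2"))
```

Higher scores suggest a more likely link; how BiGraph thresholds or ranks the scores is not covered by this sketch.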

Installation

Install the latest version of BiGraph:

$ pip install bigraph

Simple example

Predicting new links in a randomly generated graph using Adamic-Adar algorithm:

from bigraph.predict import aa_predict
from bigraph.preprocessing import import_files, make_graph


def adamic_adar_prediction():
    """
    Link prediction on bipartite networks
    :return: A dictionary containing predicted links
    """

    df, df_nodes = import_files()
    print(df)
    print("Graph nodes:", df_nodes)
    G = make_graph(df)
    print(G)
    predicted = aa_predict(G)  # Adamic-Adar predictor from bigraph.predict
    return predicted


# Executing the function

if __name__ == '__main__':
    adamic_adar_prediction()
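
Once you have the predictions, you will usually want the highest-scoring candidates first. The sketch below assumes the returned dictionary maps node pairs to numeric scores; that layout, the `top_k` helper, and the sample values are hypothetical, so check the actual return value before relying on it.

```python
def top_k(predicted, k=3):
    """Return the k highest-scoring (pair, score) items, best first."""
    return sorted(predicted.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical scores standing in for the output of aa_predict(G).
sample_predictions = {
    ("u0", "v2"): 1.44,
    ("u2", "v1"): 0.91,
    ("u3", "v1"): 0.0,
    ("u4", "v2"): 0.25,
}

for pair, score in top_k(sample_predictions, k=2):
    print(pair, score)
```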

Evaluating the Adamic-Adar algorithm:
You can try the other prediction algorithms by changing the method argument from 'aa' to 'jc', 'pa', or 'cn'.

from bigraph.evaluation.evaluation import evaluate
from bigraph.preprocessing import import_files, make_graph


def adamic_adar_evaluation():
    """
    Evaluate the Adamic-Adar algorithm using 10-fold cross-validation
    :return: A dictionary containing the evaluation results
    """
    df, df_nodes = import_files()
    G = make_graph(df)
    # Evaluate Adamic-Adar with the evaluation module.
    # Supported method codes: 'jc', 'aa', 'pa', 'cn'
    results = evaluate(G, k=10, method='aa')
    return results


# Executing the function
if __name__ == '__main__':
    adamic_adar_evaluation()
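
To make the k-fold idea concrete, here is a minimal sketch of how an edge list can be split for cross-validation: each fold is held out once as the test set while the remaining edges form the training graph. The `k_fold_edge_splits` helper and the sample edges are hypothetical, and BiGraph's `evaluate` may split the data differently.

```python
import random


def k_fold_edge_splits(edges, k=10, seed=0):
    """Yield (train_edges, test_edges) pairs, one per fold."""
    edges = list(edges)
    random.Random(seed).shuffle(edges)  # seeded shuffle for reproducibility
    folds = [edges[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [e for j, fold in enumerate(folds) if j != i for e in fold]
        yield train, test


# Hypothetical edge list: 20 distinct left-right pairs.
sample_edges = [(f"u{i}", f"v{i % 3}") for i in range(20)]
splits = list(k_fold_edge_splits(sample_edges, k=10))

for train, test in splits[:2]:
    print(len(train), "training edges,", len(test), "held-out edges")
```

A predictor is then scored on how well it ranks the held-out edges above pairs that were never linked.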

Call for Contributions

The Bigraph project welcomes your expertise and enthusiasm!

Ways to contribute to Bigraph:

  • Write code
  • Review pull requests
  • Develop tutorials, presentations, and other educational materials
  • Translate documentation and README contents

Issues

If you encounter any issue in the code, please report it on the project's GitHub issue tracker. Better still, fork the repository on GitHub and open a pull request.

Metrics

Metrics that are calculated during evaluation:

Metrics table
Number Evaluation metric
1 Precision
2 AUC
3 ROC
4 fpr (false positive rate)*
5 tpr (true positive rate)*
  • Items marked * are returned for further use, such as calculating other metrics.
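
The relationship between these metrics can be sketched in plain Python: sweeping a threshold over the prediction scores yields the fpr and tpr arrays, the ROC curve is tpr plotted against fpr, and AUC is the area under that curve. The function names and sample data below are hypothetical and assume distinct scores; this is not the code BiGraph runs.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists from a threshold sweep, highest score first."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


def precision(labels, scores, threshold):
    """Fraction of pairs scored at or above the threshold that are true links."""
    predicted = [label for label, s in zip(labels, scores) if s >= threshold]
    return sum(predicted) / len(predicted) if predicted else 0.0


# Hypothetical held-out pairs: 1 = real link, 0 = no link.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]

fpr, tpr = roc_points(labels, scores)
print("AUC:", auc(fpr, tpr))
print("Precision at 0.75:", precision(labels, scores, 0.75))
```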

Dataset format

Your dataset should be in the following format (Exclude the 'Row' column):

Sample edges (links) dataset
Row left_side right_side Weight*
1 u0 v1 1
2 u2 v1 1
3 u1 v2 1
4 u3 v3 1
5 u4 v3 2
  • Note that running
    from bigraph.preprocessing import import_files
    df, df_nodes = import_files()
    will create a sample graph for you and will place it in the inputs directory.
  • Although the weight is not used in the current version, the format will stay the same.
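
As a quick sanity check on your own data, the sample table can be parsed with the standard library. This sketch assumes a comma-separated file with the headers `left_side`, `right_side`, and `weight`; the delimiter, the exact header spelling, and the `read_edges` helper are assumptions, since the on-disk format is not specified here.

```python
import csv
import io

# Sample edge list from the table above; the 'Row' column is omitted.
SAMPLE = """left_side,right_side,weight
u0,v1,1
u2,v1,1
u1,v2,1
u3,v3,1
u4,v3,2
"""


def read_edges(handle):
    """Parse (left, right, weight) triples from a CSV file object."""
    reader = csv.DictReader(handle)
    return [
        (row["left_side"], row["right_side"], int(row["weight"]))
        for row in reader
    ]


edges = read_edges(io.StringIO(SAMPLE))
left_nodes = sorted({u for u, _, _ in edges})
right_nodes = sorted({v for _, v, _ in edges})

# In a bipartite edge list, no node appears on both sides.
overlap = set(left_nodes) & set(right_nodes)
print(edges)
print("Nodes on both sides:", overlap)
```

To check a real file, pass `open(path, newline="")` to `read_edges` instead of the in-memory sample.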

More examples

Predicting new links in a randomly generated graph using the following algorithms:

  • Preferential attachment
  • Jaccard similarity
  • Common neighbors
from bigraph.predict import pa_predict, jc_predict, cn_predict
from bigraph.preprocessing import import_files, make_graph


def main():
    """
    Link prediction on bipartite networks
    :return:
    """
    df, df_nodes = import_files()
    G = make_graph(df)
    pa_predict(G)  # Preferential attachment
    jc_predict(G)  # Jaccard coefficient
    cn_predict(G)  # Common neighbors


# Executing the function
if __name__ == '__main__':
    main()

References

References table
Number Reference Year
1 Yang, Y., Lichtenwalter, R.N. & Chawla, N.V. Evaluating link prediction methods. Knowl Inf Syst 45, 751–782 (2015). https://doi.org/10.1007/s10115-014-0789-0 2015
2 Liben-Nowell, David & Kleinberg, Jon. (2003). The Link Prediction Problem for Social Networks. Journal of the American Society for Information Science and Technology. https://doi.org/10.1002/asi.20591 2003
2 ... ...

Future work

  • Modularize the functions
  • Add more algorithms
  • Run on CUDA cores
  • Make it faster using vectorization etc.
  • Add more preprocessors
  • Add dataset, graph, and dataframe manipulations
  • Unify and reconstruct the architecture and eliminate redundancy

Notes

  • The graph can be exported in .json and .gexf formats for further use, for instance in Gephi.
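
To give a feel for what these two formats contain, here is a minimal standard-library sketch that serializes a small edge list as node-link JSON and as a bare-bones GEXF 1.2 document. The helper names and sample edges are hypothetical, and BiGraph's own export functions are not shown and may produce richer output.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical edge list; node IDs follow the sample dataset naming.
edges = [("u0", "v1"), ("u2", "v1"), ("u1", "v2")]
nodes = sorted({n for edge in edges for n in edge})


def to_node_link_json(nodes, edges):
    """Serialize the graph as a node-link JSON string."""
    return json.dumps({
        "nodes": [{"id": n} for n in nodes],
        "links": [{"source": s, "target": t} for s, t in edges],
    })


def to_gexf(nodes, edges):
    """Serialize the graph as a minimal GEXF 1.2 string."""
    root = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(root, "graph", defaultedgetype="undirected")
    nodes_el = ET.SubElement(graph, "nodes")
    for n in nodes:
        ET.SubElement(nodes_el, "node", id=n, label=n)
    edges_el = ET.SubElement(graph, "edges")
    for i, (s, t) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i), source=s, target=t)
    return ET.tostring(root, encoding="unicode")


json_text = to_node_link_json(nodes, edges)
gexf_text = to_gexf(nodes, edges)
print(json_text)
print(gexf_text)
```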

If you found it helpful, please give us a star on GitHub.

License

Released under the BSD license

Copyright © 2017-2021 BiGraph Developers
Soran Ghadri (soran.gdr.cs@gmail.com)
Taleb Zarhesh (taleb.zarhesh@gmail.com)