Earth-movers distance based graph distance metric for financial statements.

snoels/earth-movers-graph-distance-metric

An Earth Mover's Distance Based Graph Distance Metric For Financial Statements

This repository contains a Python implementation of the distance metric described in the paper: An Earth Mover's Distance Based Graph Distance Metric For Financial Statements

Paper: https://ieeexplore.ieee.org/document/9776204

If you find the code useful, please consider citing this paper.

@INPROCEEDINGS{9776204,
  author={Noels, Sander and Vandermarliere, Benjamin and Bastiaensen, Ken and De Bie, Tijl},
  booktitle={2022 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr)}, 
  title={An Earth Mover's Distance Based Graph Distance Metric For Financial Statements}, 
  year={2022},
  volume={},
  number={},
  pages={1-8},
  doi={10.1109/CIFEr52523.2022.9776204}}

Table of Contents
  1. About The Project
  2. Getting Started
  3. Dataset
  4. Contact

About The Project

Quantifying the similarity between companies has proven useful for several purposes, including company benchmarking, fraud detection, and searching for investment opportunities. This can be done using a variety of data sources, such as company activity data and financial data. Ledger account data in particular is widely available and standardized to a large extent. The ledger accounts within a financial statement can be represented as a tree, i.e. a special type of graph, that captures both the values of the ledger accounts and the relationships between them. Given their broad availability and rich information content, financial statements are a prime data source from which company similarities or distances can be computed.
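
To make the tree representation concrete, here is a minimal Python sketch of a vertex-weighted account tree. The `Account` class, the account names, and the amounts are all invented for illustration; they are not taken from the repository's code or data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Account:
    """One ledger account: a vertex carrying a monetary value (hypothetical helper)."""
    name: str
    value: float = 0.0
    children: List["Account"] = field(default_factory=list)

    def total(self) -> float:
        """Value of a leaf, or the sum of the children's totals for an aggregate account."""
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)


# Invented example figures, not real company data.
assets = Account("Assets", children=[
    Account("Fixed assets", children=[
        Account("Tangible", 120.0),
        Account("Intangible", 30.0),
    ]),
    Account("Current assets", children=[
        Account("Inventory", 40.0),
        Account("Cash", 10.0),
    ]),
])

print(assets.total())  # 200.0
```

The parent-child links encode how accounts roll up into one another, while the values on the vertices carry the amounts.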

We present a graph distance metric that enables one to compute the similarity between the financial statements of two companies. This method may be useful for investors looking for investment opportunities, government officials attempting to identify fraudulent companies, and accountants looking to benchmark a group of companies based on their financial statements.
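
As a rough illustration of the idea, the sketch below computes an Earth Mover's Distance between two vertex weightings of the same tree. It is a generic textbook construction, not the repository's implementation: it assumes both companies share one account tree, that every edge has length 1, that weights are normalised to sum to 1, and that all weighted vertices appear in `parent`. The function name `tree_emd` and the example data are hypothetical; the metric in the paper may differ in its ground distance and normalisation.

```python
from typing import Dict, Optional


def tree_emd(parent: Dict[str, Optional[str]],
             weights_a: Dict[str, float],
             weights_b: Dict[str, float]) -> float:
    """EMD between two normalised vertex weightings of one tree (unit edge lengths)."""
    total_a = sum(weights_a.values())
    total_b = sum(weights_b.values())
    # Signed surplus of mass at each vertex after normalising both sides to 1.
    surplus = {v: weights_a.get(v, 0.0) / total_a - weights_b.get(v, 0.0) / total_b
               for v in parent}

    def depth(v: str) -> int:
        d = 0
        while parent[v] is not None:
            v = parent[v]
            d += 1
        return d

    cost = 0.0
    # Visit the deepest vertices first so each subtree surplus is complete
    # before it is pushed up to the parent.
    for v in sorted(parent, key=depth, reverse=True):
        p = parent[v]
        if p is not None:
            cost += abs(surplus[v])  # mass that has to cross the edge (v, p)
            surplus[p] += surplus[v]
    return cost


# Invented account tree: child -> parent, the root maps to None.
parent = {
    "Assets": None,
    "Fixed": "Assets", "Current": "Assets",
    "Tangible": "Fixed", "Intangible": "Fixed",
    "Inventory": "Current", "Cash": "Current",
}
company_a = {"Tangible": 0.5, "Cash": 0.5}
company_b = {"Intangible": 0.5, "Cash": 0.5}

print(tree_emd(parent, company_a, company_b))  # 1.0
```

On a tree, the optimal transport cost reduces to summing, for every edge, the net mass that must cross it. In the example, half of the mass moves from `Tangible` to `Intangible` over two edges, which gives a distance of 1.0.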

Built With

The following frameworks/libraries were utilized to get this project started:

Getting Started

Follow the installation steps below to get a local copy of this project up and running.

Installation

  1. Clone the repo
    git clone https://github.com/snoels/earth-movers-graph-distance-metric.git
  2. Change your directory to the repo
    cd earth-movers-graph-distance-metric/
  3. Create and activate the conda environment env-edm-gdm
    conda env create -f environment.yml
    conda activate env-edm-gdm
  4. Install pygraphviz (Ubuntu and Debian)
    sudo apt-get install graphviz graphviz-dev
    pip install pygraphviz==1.6
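
Once the steps above are done, a quick way to confirm that pygraphviz is visible to the interpreter is the following check, run inside the activated environment. The helper name `is_installed` is hypothetical; the snippet only uses the standard library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


print("pygraphviz available:", is_installed("pygraphviz"))
```

If this prints `False`, the most likely cause is that the environment was not activated before running `pip install`.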

Dataset

Ten example vertex-weighted company representations can be found in the following file: ./synthetic_data/synthetic_company_graph_data.pkl.

This data is synthetic. It is inspired by the vertex-weighted graph representation of a balance sheet and by no means represents real company data.
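
A minimal sketch for loading the file with the standard `pickle` module is shown below. The README does not document the structure of the pickled object, so the snippet only inspects its type; the helper name `load_company_graphs` is hypothetical. Unpickling will probably need to happen inside the project's conda environment, since the object may depend on libraries installed there, and pickle files should only be loaded from sources you trust.

```python
import pickle
from pathlib import Path

# Path as given in this README, relative to the repository root.
DATA_PATH = Path("synthetic_data/synthetic_company_graph_data.pkl")


def load_company_graphs(path=DATA_PATH):
    """Unpickle the example data and return whatever object the file contains."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if DATA_PATH.exists():
    companies = load_company_graphs()
    # The container type is not documented, so inspect it before using it.
    print(type(companies))
```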

Contact

This repository is currently maintained by me. You can reach me at sander.noels@ugent.be.

You can find me on LinkedIn.
