Commit 9fe0757

Release v2.3.0
- add support for MCT algorithm
- update documentation
- fix minor bugs
1 parent 8e5235e commit 9fe0757

5 files changed: +51 -28 lines changed

README.md

Lines changed: 34 additions & 20 deletions
@@ -1,11 +1,11 @@
 # Markov Chain Type 4 Rank Aggregation
-**implementation of MC4 Rank Aggregation algorithm using Python**
+**implementation of MC4 and MCT Rank Aggregation algorithm using Python**

 ## Description

-This project is all about implementing one of the most popular rank aggregation algorithms **Markov Chain Type 4** or **MC4**. In the field of Machine Learning and many other scientific problems, several items are often needed to be ranked based on some criterion. However, different ranking schemes order the items based on different preference criteria. Hence the rankings produced by them may differ greatly.
+This project is all about implementing two of the most popular rank aggregation algorithms, **Markov Chain Type 4** or **MC4** and **MCT**. In the field of Machine Learning and many other scientific problems, several items are often needed to be ranked based on some criterion. However, different ranking schemes order the items based on different preference criteria. Hence the rankings produced by them may differ greatly.

-Therefore a rank aggregation technique is often used for combining the individual rank lists into a single aggregated ranking. Though there are many rank aggregation algorithms, MC4 is one of the most renowned ones.
+Therefore a rank aggregation technique is often used for combining the individual rank lists into a single aggregated ranking. Though there are many rank aggregation algorithms, MC4 and MCT are two of the most renowned ones.

 ## Resource

@@ -23,24 +23,31 @@ For a specific release, `pip install mc4=={version}` such as `pip install mc4==1

 ## General Usage

-Using this package is very easy. You just need the following three lines of code to use the package.
+Using this package is very easy.
+
+1. Prepare a dataset containing ranks of all the items provided by different algorithms. See [here](https://github.com/kalyaniuniversity/MC4/blob/master/test_datasets/README.md) for sample datasets and more info.
+
+2. Use following lines of code to use the package. Make sure to pass arguments according to your dataset otherwise answers will be incorrect.

 ```python
 from mc4.algorithm import mc4_aggregator
+import pandas as pd

-aggregated_ranks = mc4_aggregator('dataset.csv')
+# Method 1
+aggregated_ranks = mc4_aggregator('test_dataset_1.csv', header_row = 0, index_col = 0)

-# or
-
-aggregated_ranks = mc4_aggregator(df)
+# or Method 2
+df = pd.read_csv('test_dataset_1.csv', header = 0, index_col = 0)
+aggregated_ranks = mc4_aggregator(df, header_row = 0, index_col = 0)

 print(aggregated_ranks)
 ```
-here `dataset.csv` or `df` are lists of ranks provided by different ranking algorithms or rank lists. *You can refer [here](https://github.com/kalyaniuniversity/MC4/blob/master/test_datasets/datasets.md) for more info and some test datasets.*
+here `test_dataset_1.csv` is a sample dataset containing ranks of different items provided by different algorithms.

-`mc4_aggregator` takes some additional arguments as well.
+`mc4_aggregator` takes some mandatory and optional arguments -

-* `order (string)`: order of the dataset, default is `'row'`. More on this, [here](https://github.com/kalyaniuniversity/MC4/blob/master/test_datasets/datasets.md).
+* `algo (string)`: algorithm for rank aggregation, `mc4` or `mct`, default is `mc4`
+* `order (string)`: order of the dataset, `row` or `column`, default is `row`. More on this, [here](https://github.com/kalyaniuniversity/MC4/blob/master/test_datasets/README.md).
 * `header_row (int or None)`: row number of the dataset containing the header, default is `None`
 * `index_col (int or None)`: column number of the dataset containing the index, default is `None`
 * `precision (float)`: acceptable error margin for convergence, default is `1e-07`
@@ -49,49 +56,56 @@ here `dataset.csv` or `df` are lists of ranks provided by different ranking algo

 ## Command Line Usage

+You can directly use this package from command line if you have the dataset prepared already.
+
 * To get help and usage details,
 ```shell
 ~$ mc4_aggregator -h or --help
 ```

 * Use with default settings,
 ```shell
-~$ mc4_aggregator <data source> e.g. mc4_aggregator dataset.csv
+~$ mc4_aggregator dataset.csv
+```
+
+* Specify the algorithm for rank aggregation using `-a` or `--algo`, options: `mc4` or `mct`, default is `mc4`
+```shell
+~$ mc4_aggregator dataset.csv -a mct
 ```

-* Specify order using `-o`or `--order`, default is `row`
+* Specify order using `-o`or `--order`, options: `row` or `column`, default is `row`
 ```shell
-~$ mc4_aggregator <data source> -o <order> e.g. mc4_aggregator dataset.csv -o column
+~$ mc4_aggregator dataset.csv -o column
 ```

 * Specify header row using `-hr` or `--header_row`, default is `None`
 ```shell
-~$ mc4_aggregator <data source> -hr <header row> e.g. mc4_aggregator dataset.csv -hr 1
+~$ mc4_aggregator dataset.csv -hr 0
 ```

 * Specify index column using `-ic` or `--index_col`, default is `None`
 ```shell
-~$ mc4_aggregator <data source> -ic <index column> e.g. mc4_aggregator dataset.csv -ic 1
+~$ mc4_aggregator dataset.csv -ic 0
 ```

 * Specify precision using `-p` or `--precision`, default is `1e-07`
 ```shell
-~$ mc4_aggregator <data source> -p <precision> e.g. mc4_aggregator dataset.csv -p 0.000001
+~$ mc4_aggregator dataset.csv -p 0.000001
 ```

 * Specify iterations using `-i` or `--iterations`, default is `200`
 ```shell
-~$ mc4_aggregator <data source> -i <iterations> e.g. mc4_aggregator dataset.csv -i 300
+~$ mc4_aggregator dataset.csv -i 300
 ```

 * Specify ergodic number using `-e` or `--erg_number`, default is `0.15`
 ```shell
-~$ mc4_aggregator <data source> -p <precision> e.g. mc4_aggregator dataset.csv -e 0.20
+~$ mc4_aggregator dataset.csv -e 0.20
 ```

 * All together,
 ```shell
-~$ mc4_aggregator dataset.csv -o column -hr 1 -ic 1 -p 0.000001 -i 300 -e 0.20
+~$ mc4_aggregator dataset.csv -a mct -o column -hr 0 -ic 0 -p 0.000001 -i 300 -e 0.20
 ```

 ## Output

mc4/algorithm.py

Lines changed: 14 additions & 5 deletions
@@ -48,12 +48,13 @@ def get_matrix_shape(df):
     return rows, cols


-def get_partial_transition_matrix(df, items, lists):
+def get_partial_transition_matrix(df, algo, items, lists):

     """Returns the partial transition matrix from the dataframe containing different ranks

     Args:
         df (pandas.core.DataFrame): dataframe object containing different ranks
+        algo (string): mc4 or mct
         items (int): number of items
         lists (int): number of lists

@@ -70,10 +71,13 @@ def get_partial_transition_matrix(df, items, lists):

             if result == 0 and i==j:
                 val = -1
-            elif result >= (lists/2):
+            elif result > (lists/2):
                 val = 0
             else:
-                val = 1
+                if algo == 'mc4':
+                    val = 1
+                else:
+                    val = (lists-result) / lists

             matrix_input = val

@@ -216,12 +220,13 @@ def get_mapped_final_ranks(df, final_ranks, index_col):
     return ranks


-def mc4_aggregator(source, order = 'row', header_row=None, index_col=None, precision=0.0000001, iterations=200, erg_number=0.15):
+def mc4_aggregator(source, algo='mc4', order = 'row', header_row=None, index_col=None, precision=0.0000001, iterations=200, erg_number=0.15):

     """Performs aggregation on different ranks using Markov Chain Type 4 Rank Aggeregation algorithm and returns the aggregated ranks

     Args:
         file_path (string): path of the dataset file containing all different ranks
+        algo (string): mc4 or mct, default is mc4
         order (string): order of the dataset, default is row i.e. row-major
         header_row (int or None): row number of the dataset containing the header, default is None
         index_col (int or None): column number of the dataset containing the index, default is None
@@ -233,6 +238,9 @@ def mc4_aggregator(source, order = 'row', header_row=None, index_col=None, preci
         list: contestantwise aggregated ranks
     """

+    if algo not in ['mc4', 'mct']:
+        raise Exception(f"Invalid ranking algorithm '{algo}'")
+
     if isinstance(source, str) and is_csv(source):

         if is_valid_path(source):
@@ -251,9 +259,10 @@ def mc4_aggregator(source, order = 'row', header_row=None, index_col=None, preci
     else:
         raise Exception(f"Unsupported data source '{get_filename(source)}'")

+
     rows, cols = get_matrix_shape(df)

-    partial_transition_matrix = get_partial_transition_matrix(df, rows, cols)
+    partial_transition_matrix = get_partial_transition_matrix(df, algo, rows, cols)

     normalized_transition_matrix = get_normalized_transition_matrix(partial_transition_matrix, rows)
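
Review note: the behavioural change in `get_partial_transition_matrix` is confined to the per-entry value rule, so it can be looked at in isolation. The sketch below is only an illustration under assumptions. `entry_value`, `same_item`, and `strict` are hypothetical names that do not exist in the package, and `result` is treated as an opaque count because its computation lies outside the hunks shown. The `strict` switch reproduces the old `>=` comparison so the tie case can be compared before and after this commit.

```python
def entry_value(result, lists, algo="mc4", same_item=False, strict=True):
    """Value for one (i, j) pair, mirroring the branch shown in the diff.

    Hypothetical helper: `result` is whatever count the surrounding loop
    computed (not visible in the hunk); `same_item` stands in for `i == j`.
    """
    if algo not in ("mc4", "mct"):
        raise ValueError(f"Invalid ranking algorithm '{algo}'")

    if result == 0 and same_item:
        return -1  # same-item entry, marked -1 as in the diff

    # strict=True is the new `>` test, strict=False the old `>=` test
    over_half = result > lists / 2 if strict else result >= lists / 2
    if over_half:
        return 0
    if algo == "mc4":
        return 1  # MC4 keeps the binary value
    return (lists - result) / lists  # MCT uses a fractional value


# Tie case: result is exactly half of the lists (2 of 4)
print(entry_value(2, 4, "mc4", strict=False))  # old comparison
print(entry_value(2, 4, "mc4"))                # new comparison, mc4
print(entry_value(2, 4, "mct"))                # new comparison, mct
```

Under this sketch, a tie with four lists mapped to 0 before the commit and now maps to 1 for `mc4` and 0.5 for `mct`, so switching `>=` to `>` changes results for existing `mc4` users whenever the number of lists is even.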

mc4/command_line.py

Lines changed: 2 additions & 2 deletions
@@ -4,6 +4,7 @@
 parser = argparse.ArgumentParser(description='Takes necessary inputs for mc4_aggegator')

 parser.add_argument('source', type=str, help='source of the lists of ranks')
+parser.add_argument('-a', '--algo', type=str, default='mc4', help='rank aggregation algorithm, mc4 or mct, default is mc4', choices=['mc4', 'mct'])
 parser.add_argument('-o', '--order', type=str, default='row', help='order of the dataset, default is row', choices=['row', 'column'])
 parser.add_argument('-hr', '--header_row', type=int, help='row number of the header, default is None')
 parser.add_argument('-ic', '--index_col', type=int, help='column number of the index, default is None')
@@ -14,5 +15,4 @@
 args = parser.parse_args()

 def main():
-    print(mc4_aggregator(args.source, args.order, args.header_row, args.index_col, args.precision, args.iterations, args.erg_number))
-
+    print(mc4_aggregator(args.source, args.algo ,args.order, args.header_row, args.index_col, args.precision, args.iterations, args.erg_number))
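
Review note: `algo` is inserted as the second parameter of `mc4_aggregator`, ahead of `order`, so a positional call written against v2.2.1 such as `mc4_aggregator(path, 'column')` would now pass `'column'` as `algo` and hit the new validation. One way to make the CLI independent of parameter order is to forward by keyword. The sketch below is an assumption-laden illustration, not the package's code: `build_parser` and `to_kwargs` are hypothetical helpers, and only the `source`, `-a`, and `-o` options from the diff are reproduced.

```python
import argparse


def build_parser():
    """Parser with a subset of the options shown in the diff."""
    parser = argparse.ArgumentParser(description="rank aggregation CLI sketch")
    parser.add_argument("source", type=str, help="source of the lists of ranks")
    parser.add_argument("-a", "--algo", type=str, default="mc4",
                        choices=["mc4", "mct"])
    parser.add_argument("-o", "--order", type=str, default="row",
                        choices=["row", "column"])
    return parser


def to_kwargs(args):
    """Keyword arguments for the aggregator, independent of parameter order."""
    return {"algo": args.algo, "order": args.order}


# Parse an explicit argument list instead of sys.argv so the sketch is testable
args = build_parser().parse_args(["dataset.csv", "-a", "mct", "-o", "column"])
print(args.source, to_kwargs(args))
```

With this shape, a call such as `mc4_aggregator(args.source, **to_kwargs(args))` would presumably keep working if further parameters are later added ahead of `order`.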

setup.py

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@

 setup(
     name="mc4",
-    version="2.2.1",
+    version="2.3.0",
     author="Ayan Kumar Saha",
     author_email="ayankumarsaha96@gmail.com",
     description="A python package for implementing Markov Chain Type 4 rank aggregation",
File renamed without changes.
