KEFE

KEFE is an approach that uses app descriptions and user reviews (written in Chinese) to identify key features, that is, features that have a significant relationship with app rating scores.

Applying KEFE involves three main steps: 1) applying a textual pattern-based approach and a deep learning classifier to extract features from the app description; 2) applying another classifier to match features with their relevant user reviews; and 3) applying regression analysis to identify key features.

More details of KEFE can be found in the following paper:

Huayao Wu, Wenjun Deng, Xintao Niu, and Changhai Nie. Identifying Key Features from App User Reviews. International Conference on Software Engineering (ICSE), pp. 922-932, 2021

Usage

KEFE is developed and tested using Python 3, pyltp and tensorflow. Please install the following specific versions of these packages:

pip install pyltp==0.2.1
pip install tensorflow==1.15.0

More instructions for installing pyltp and tensorflow can be found on their respective websites: pyltp, tensorflow.

Download the model files, which include:
- pyltp model files: ltp-model
- pre-trained BERT model: chinese_L-12_H-768_A-12
- classification model of feature extraction: model-extract
- classification model of user review matching: model-match
Place ltp-model in the pyltp-resource directory, and place the other three in the bert-master directory.
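
Before running the scripts, it can help to confirm that the model files are in the expected places. The following is a minimal sketch and not part of KEFE: the helper name `missing_model_paths` is hypothetical, and it assumes each downloaded item is unpacked under exactly the name listed above.

```python
from pathlib import Path

# Expected locations, following the placement rule described above.
# Assumption: each item keeps the name shown in the download list.
EXPECTED = [
    Path("pyltp-resource") / "ltp-model",
    Path("bert-master") / "chinese_L-12_H-768_A-12",
    Path("bert-master") / "model-extract",
    Path("bert-master") / "model-match",
]

def missing_model_paths(repo_root="."):
    """Return the expected model paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [str(p) for p in EXPECTED if not (root / p).exists()]
```

Calling `missing_model_paths()` from the repository root returns an empty list when all four items are in place, and otherwise lists the paths that still need to be downloaded or moved.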

To extract feature-describing phrases from a given app description, run:

python feature_extraction.py -i [app_description].csv
# for example
# python feature_extraction.py -i example/description.csv

To identify key features of a given app, run:
```
python feature_identification.py -f [features].txt -r [reviews].txt
# for example
# python feature_identification.py -f example/alipay_features.txt -r example/alipay_reviews.txt
```
The above command first applies the classification model to match features with user reviews (this can take a long time when there is a large volume of user reviews), and then identifies key features. If a file of matches between features and user reviews already exists, skip the matching step by running:
```
python feature_identification.py -f [features].txt -m [matching].txt
# for example
# python feature_identification.py -f example/alipay_features.txt -m example/alipay_matching.txt
```
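
When the two invocations above are driven from another script, the choice between `-r` and `-m` can be made in one place. This is a sketch only: the function name `build_identification_command` is hypothetical, and it assumes that `-r` and `-m` are alternatives, as the two commands above suggest.

```python
import subprocess
import sys

def build_identification_command(features, reviews=None, matching=None):
    """Build the argument list for feature_identification.py.

    Exactly one of `reviews` (-r) or `matching` (-m) must be given.
    """
    if (reviews is None) == (matching is None):
        raise ValueError("pass exactly one of reviews or matching")
    cmd = [sys.executable, "feature_identification.py", "-f", features]
    if matching is not None:
        cmd += ["-m", matching]   # reuse an existing matching file
    else:
        cmd += ["-r", reviews]    # run the (slow) matching step first
    return cmd

# Example, to be run from the repository root:
# subprocess.run(build_identification_command(
#     "example/alipay_features.txt",
#     matching="example/alipay_matching.txt"), check=True)
```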
Format of Files
- The [reviews].txt file should contain one review per line, in the following format: [review_text]-*-[review_date]-*-[rating_score]
- The [matching].txt file should contain one feature-review pair per line, in the following format: [feature]-*-[review_text]-*-[review_date]-*-[rating_score]-*-[label], where label = 0 indicates a non-matching pair and label = 1 indicates a matching pair.
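
The two line formats above can be read with a few lines of Python. This is an illustrative sketch, not code from KEFE: the names `Review`, `Match`, `parse_review_line` and `parse_matching_line` are hypothetical, and it assumes that the rating score is numeric and that the feature itself never contains the `-*-` separator. The date is kept as a plain string because its format is not specified here.

```python
from collections import namedtuple

SEP = "-*-"

Review = namedtuple("Review", "text date rating")
Match = namedtuple("Match", "feature text date rating label")

def parse_review_line(line):
    """Parse '[review_text]-*-[review_date]-*-[rating_score]'."""
    # Split from the right so that a '-*-' inside the review text
    # does not shift the date and rating fields.
    text, date, rating = line.rstrip("\n").rsplit(SEP, 2)
    return Review(text, date, float(rating))

def parse_matching_line(line):
    """Parse '[feature]-*-[review_text]-*-[review_date]-*-[rating_score]-*-[label]'."""
    # The feature is the first field; the last three fields are fixed,
    # so whatever remains in between is the review text.
    feature, rest = line.rstrip("\n").split(SEP, 1)
    text, date, rating, label = rest.rsplit(SEP, 3)
    return Match(feature, text, date, float(rating), int(label))
```

Since the reviews are written in Chinese, files would typically be opened with an explicit encoding, for example `open(path, encoding="utf-8")`; the actual encoding of the example files is an assumption to verify.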

Dataset and Replication Package

The dataset (app descriptions and raw user reviews collected) and the replication package can be downloaded from the following links:
