Implementation of KL Divergence and inverted vector model for plagiarism detection in text files

Moozzaart23/PlagiarismChecker

Plagiarism Checker

A plagiarism checker for text documents. It compares an input (query) document against every file in a corpus and reports how similar the query is to each one. It first builds a vector space model using tf-idf weights, then uses KL divergence to measure the similarity between the query document and each of the other documents in the corpus.
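The two steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in check.py: the function names, the regex tokenizer, the smoothed idf formula, and the `eps` smoothing constant are all assumptions made for the example. Each tf-idf vector is normalised into a probability distribution over the shared vocabulary, and a lower KL divergence is read as higher similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """docs maps a document name to its token list; returns name -> {term: weight}."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency: count each term once per doc
    # Smoothed idf (an assumption) so terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        vectors[name] = {t: (c / len(tokens)) * idf[t] for t, c in tf.items()}
    return vectors


def to_distribution(vector, vocab, eps=1e-9):
    # Add a small eps to every term so no probability is zero, then normalise to sum to 1.
    total = sum(vector.get(t, 0.0) + eps for t in vocab)
    return [(vector.get(t, 0.0) + eps) / total for t in vocab]


def kl_divergence(p, q):
    # D_KL(P || Q) = sum_i p_i * log(p_i / q_i); 0 when P and Q are identical.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_by_kl(query_name, docs):
    """Return (name, divergence) pairs for all other documents, most similar first."""
    vectors = tfidf_vectors(docs)
    vocab = sorted({t for v in vectors.values() for t in v})
    p = to_distribution(vectors[query_name], vocab)
    scores = {
        name: kl_divergence(p, to_distribution(v, vocab))
        for name, v in vectors.items()
        if name != query_name
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    corpus = {
        "query": tokenize("the cat sat on the mat"),
        "copy": tokenize("the cat sat on a mat"),
        "other": tokenize("stock prices fell sharply today"),
    }
    for name, score in rank_by_kl("query", corpus):
        print(name, round(score, 4))
```

KL divergence is asymmetric, so this sketch always measures the divergence from the query to each corpus document. A document that shares no vocabulary with the query gets a very large score because of the `eps` smoothing.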

Getting Started

  • Python (Version 3.7 and above)
  • Pip (Latest Version recommended)
  • Git

Installation

  • Clone the repository in your preferred directory using the following command
	git clone https://github.com/Moozzaart23/PlagiarismChecker.git
  • Change your working directory to the cloned repository (git clone names the directory PlagiarismChecker by default)
	cd PlagiarismChecker
  • Open a Python interpreter and download the required NLTK data
	>>> import nltk
	>>> nltk.download('punkt')
	>>> nltk.download('stopwords')
	>>> exit()
  • Copy the documents you want to check against, along with the query document, into the same directory
  • Run the following command to check the query document
	python check.py
