Skip to content

jeng/invertedisam

Repository files navigation

Inverted ISAM files

This program allows you to index a directory of files for rapid searching of keywords. This set of programs is good for a directory of static files. Ever time a new file is added to the directory the complete index will need to be built.

The search program can return results based on the frequency of keywords in a document or by exact phrases.

buildInvertedFile

Usage

buildInvertedFile -d directory -o output [-e Extension] [-s Stop List File]

The program buildInvertedFile takes the following option:

  • The directory of the files that you want to index must be given as the first argument.
  • The 2nd argument must be the base name of the output files. Three files will be create:
    1. The index file will have the extension ‘if’. This file contains the keywords, the number of posting entries and the pointer into the posting file.
    2. The posting file will have the extension ‘pf’. The posting file contains the pointer into the document table for each document that contains this keyword. It also contain the word frequency of this word in each document.
    3. The document file will have the extension ‘df’. It contains the name of all of the documents currently indexed.
  • -e Extension of the files that you want to index. If the option is not provided all files in the given directory will be indexed.
  • -s Stop word list. None of the words in the stop word list file will be added to the index.

searchInvertedFile

Usage

searchInvertedFile indexname [-e] [keyword1] [keyword2] … [keywordn]

  • The indexname is the base name of the index file. You will want to use the same name that you used for output when running the buildInvertedFile.
  • If the -e parameter is passed the following keywords will be taken as an exact phrase. Only document that have the exact phrase will be returned.
  • You can follow the indexname with as many keyword as you want.

About

Programs to build and search inverted isam file for static directories.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published