Mblakey/wiswesser


The Wiswesser Line Notation (WLN) Project

  • WLN Parser - read and write WLN to and from SMILES, InChI, mol files and other chemical line notations.
  • WLN FSM - extract chemical terms from documents. This finite state machine uses greedy matching to return the WLN sequences it finds.
  • WLN Compressor - compress WLN strings using Markov decision processes.

This software supports Linux and macOS only.

Note: This project was created solely by Michael as part of his PhD work. If you are interested in using the project, or find any bugs or issues, reporting them would be extremely helpful.

Requirements

git, cmake, make and a C++ compiler are all required.
graphviz is an optional install for viewing WLN graphs (not needed for the build).

OpenBabel (see its repo) will be installed as an external dependency.
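
Before building, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of the project: the function name missing_tools is hypothetical, and it only relies on the standard `command -v` shell builtin.

```shell
# Print each tool from the argument list that is not found on PATH.
# Returns 0 when everything is present, 1 when at least one tool is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}
```

A typical call would be `missing_tools git cmake make c++`. Here `c++` is assumed to be the name of your compiler driver; substitute `g++` or `clang++` if that is what your system provides.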

Build

Run ./bootstrap.sh from the project directory. This clones and builds OpenBabel and links the library to the parser in cmake. OpenBabel files are installed to external. Building the project places all executables into build/.
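
The build step can be wrapped in a small helper that checks for the script first and lists the resulting executables afterwards. This is a sketch under stated assumptions: the function name build_wln is hypothetical, and the only project details it uses are the ones given above (bootstrap.sh in the project directory, executables in build/).

```shell
# Run bootstrap.sh in the given project directory (default: current directory),
# then list the contents of its build/ directory.
build_wln() {
  project_dir="${1:-.}"
  if [ ! -x "$project_dir/bootstrap.sh" ]; then
    echo "bootstrap.sh not found or not executable in $project_dir" >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$project_dir" && ./bootstrap.sh) || return 1
  ls "$project_dir/build"
}
```

From a fresh checkout, `build_wln .` is equivalent to running ./bootstrap.sh by hand and then inspecting build/.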

Project Structure

This repository contains a broad range of functionality built around WLN. Please read the individual README.txt file for the area you need.

Unit Testing

All unit tests are contained in the /test directory.
These include:

  1. compare.sh
  2. reading.sh
  3. writing.sh
  4. file.sh

Unit tests 1-3 operate on the data files in /data. For comparisons against the old parser in OpenBabel, run 1; for reading count tests, run 2; for writing round-trip tests, run 3. To parse a file of WLN strings, use file.sh, which attempts a conversion on every line.
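
To run several of these scripts in one go, a small wrapper along the following lines may be convenient. It is only a sketch: the function name run_suite is hypothetical, and it assumes that each script is executable, can be run without arguments from inside the test directory, and exits non-zero on failure. None of this is confirmed above, so check the scripts themselves before relying on it.

```shell
# Run each named script from inside test_dir and print PASS or FAIL per script.
# Assumption: scripts take no arguments and signal failure via their exit status.
# Returns the number of scripts that failed.
run_suite() {
  test_dir="$1"
  shift
  failed=0
  for script in "$@"; do
    if (cd "$test_dir" && "./$script"); then
      echo "PASS $script"
    else
      echo "FAIL $script"
      failed=$((failed + 1))
    fi
  done
  return "$failed"
}
```

For example, `run_suite test compare.sh reading.sh writing.sh` would run tests 1-3 in order. file.sh is left out because it works on a file of WLN strings that you supply.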