ssc-decryptor

A Python 3 script for decrypting text encrypted with monoalphabetic/simple substitution ciphers, where word boundaries remain in place, using a technique that I will call combined pattern deduction.

Installation

Install required packages using pip:

pip install -r requirements.txt

About

Each encrypted word from an encrypted message will have zero or more matching plaintext words that share the same word pattern. This script attempts to decipher encrypted messages by reducing the number of plaintext matches for each encrypted word to one (ideally).

Example:

Encrypted Word	Pattern	No. Plaintext Matches
WSNNHGDK	8-12334567	675
SJJQGSJB	8-12234125	4
FTKJPAHG	8-12345678	5762
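
The patterns in the table above can be reproduced with a few lines of Python. This is a minimal sketch and not the script's own code: the function name `word_pattern` and the symbol alphabet `DIGITS` (1 to 9, then A, B, ...) are assumptions inferred from the patterns shown in this README.

```python
# Symbols handed out to distinct letters, in order of first appearance.
# Assumption: after 9 the pattern continues with A, B, C, ...
DIGITS = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def word_pattern(word: str) -> str:
    """Return '<length>-<symbols>', one symbol per letter of the word."""
    seen = {}
    symbols = []
    for letter in word.upper():
        if letter not in seen:
            seen[letter] = DIGITS[len(seen)]  # next unused symbol
        symbols.append(seen[letter])
    return f"{len(word)}-{''.join(symbols)}"


for word in ("WSNNHGDK", "SJJQGSJB", "FTKJPAHG"):
    print(word, word_pattern(word))
# WSNNHGDK 8-12334567
# SJJQGSJB 8-12234125
# FTKJPAHG 8-12345678
```

Any plaintext word with the same pattern as an encrypted word is a candidate match for it.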

The number of tuples in the 3-fold cartesian product of the plaintext matches for these 3 words is 675 x 4 x 5762 = 15,557,400 (roughly 15.6 million).

However, by sequentially combining encrypted words and comparing the longer pattern with the pattern of each combined tuple in the cartesian product of the plaintext matches, the number of matches for each word can be reduced to one by examining ~50,000 tuples.

Decryption process: High-level Overview

The decryption process utilises word patterns and combines words to create larger, more useful, patterns. Consider the following two encrypted words and their corresponding patterns:

Encrypted Word	Pattern
`S`JJQG`S`JB	8-`1`2234`1`25
W`S`NNHGDK	8-1`2`334567

In the first word, S is identified with a 1. In the second word, S is identified with a 2. If we combine both words and create a new pattern, such as:

Encrypted Word	Pattern
`S`JJQG`S`JBW`S`NNHGDK	16-`1`2234`1`256`1`77849A

Then in the new pattern we can now see that S is identified with a 1 for all three occurrences.
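
The combined pattern above can be checked with the same kind of helper. This is a sketch only; `word_pattern` and `DIGITS` are illustrative names, not taken from the script.

```python
DIGITS = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed symbol order


def word_pattern(text: str) -> str:
    """Return '<length>-<symbols>' for a word or for several joined words."""
    seen = {}
    symbols = []
    for letter in text.upper():
        if letter not in seen:
            seen[letter] = DIGITS[len(seen)]
        symbols.append(seen[letter])
    return f"{len(text)}-{''.join(symbols)}"


first, second = "SJJQGSJB", "WSNNHGDK"
print(word_pattern(first + second))       # 16-122341256177849A
print(sorted(set(first) & set(second)))   # ['G', 'S']
```

Note that G is also shared by the two words, so it too receives a single symbol (4) in the combined pattern. Every shared letter adds a constraint that a pair of plaintext candidates has to satisfy.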

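It is now possible to say that any tuple in the cartesian product of the plaintext matches for the individual words whose combined pattern differs from the combined pattern of the encrypted words cannot be the solution. Any plaintext word that appears only in such tuples can be deducted from the original set of plaintext matches for its encrypted word.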
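
As a small illustration of this rule, the sketch below tests individual tuples against the combined pattern of two encrypted words. The encrypted words SVOOL and DLIOW and the candidate words are a made-up toy example, and the helper names are hypothetical.

```python
DIGITS = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed symbol order


def word_pattern(text: str) -> str:
    seen = {}
    symbols = []
    for letter in text.upper():
        if letter not in seen:
            seen[letter] = DIGITS[len(seen)]
        symbols.append(seen[letter])
    return f"{len(text)}-{''.join(symbols)}"


def same_combined_pattern(cipher_words, plain_words) -> bool:
    """True if the joined plaintext words share the joined ciphertext pattern."""
    return word_pattern("".join(cipher_words)) == word_pattern("".join(plain_words))


cipher_words = ("SVOOL", "DLIOW")  # toy ciphertext, not from the repository
print(same_combined_pattern(cipher_words, ("HELLO", "WORLD")))  # True
print(same_combined_pattern(cipher_words, ("JELLY", "WORLD")))  # False
print(same_combined_pattern(cipher_words, ("HELLO", "HOUSE")))  # False
```

JELLY and HOUSE each match the pattern of their own encrypted word, but neither survives once the two words are considered together.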

We can keep building and comparing larger patterns as we iterate through the encrypted words in sequence, and through this process of deduction, we can determine a cyphertext alphabet to decrypt the message.
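
The iteration described above might look something like the following. This is a rough sketch under stated assumptions, not the script's implementation: the function names `deduce` and `build_key`, the toy ciphertext, and the tiny candidate lists are all hypothetical.

```python
from itertools import product

DIGITS = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumed symbol order


def word_pattern(text: str) -> str:
    seen = {}
    symbols = []
    for letter in text.upper():
        if letter not in seen:
            seen[letter] = DIGITS[len(seen)]
        symbols.append(seen[letter])
    return f"{len(text)}-{''.join(symbols)}"


def deduce(cipher_words, candidates):
    """Prune candidate lists by combining one more word per iteration.

    Returns the pruned lists and the number of tuples compared.
    """
    candidates = [list(words) for words in candidates]
    compared = 0
    for k in range(2, len(cipher_words) + 1):
        target = word_pattern("".join(cipher_words[:k]))
        survivors = []
        for combo in product(*candidates[:k]):
            compared += 1
            if word_pattern("".join(combo)) == target:
                survivors.append(combo)
        # Keep only the words that appear in at least one surviving tuple.
        for i in range(k):
            kept = {combo[i] for combo in survivors}
            candidates[i] = [word for word in candidates[i] if word in kept]
    return candidates, compared


def build_key(cipher_words, plain_words):
    """Map each ciphertext letter to its plaintext letter."""
    key = {}
    for cipher_word, plain_word in zip(cipher_words, plain_words):
        key.update(zip(cipher_word, plain_word))
    return key


# Toy data: each candidate already matches its encrypted word's own pattern.
cipher_words = ["SVOOL", "DLIOW", "SLOWVI"]
candidates = [
    ["HELLO", "JELLY", "HAPPY", "FUNNY", "CELLO"],
    ["WORLD", "HOUSE", "THINK"],
    ["HOLDER", "WINTER", "BOLDER"],
]
pruned, compared = deduce(cipher_words, candidates)
print(pruned)    # [['HELLO'], ['WORLD'], ['HOLDER']]
print(compared)  # 21
key = build_key(cipher_words, [words[0] for words in pruned])
print(key["S"], key["O"])  # H L
```

In this toy run the first iteration compares 5 x 3 = 15 tuples and leaves HELLO and CELLO for the first word; the second compares 2 x 1 x 3 = 6 tuples and leaves one match per word, against 45 tuples in the full cartesian product.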

Note: You can run the script with the verbose (-v) flag to see this deduction in action.

Two Word Example

PQACEIAMNSXU has 23 word pattern matches
RWCTJSXSZA has 261 word pattern matches

Creating a combined pattern from PQACEIAMNSXU and RWCTJSXSZA, and comparing it with the combined pattern of each tuple in the cartesian product of the two words' pattern matches, leaves only 1 possible match for each word:

PQACEIAMNSXU has 1 word pattern match
RWCTJSXSZA has 1 word pattern match

"OVERWHELMING SCRUTINIZE"

Total tuples compared: (23 x 261) = 6,003
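
Once each word is down to a single match, the letter mapping follows directly. The sketch below derives it for this example and checks that it is a consistent one-to-one substitution; `build_key` is an illustrative name rather than a function from the script.

```python
def build_key(cipher_words, plain_words):
    """Derive a ciphertext-to-plaintext letter mapping, checking consistency."""
    key = {}
    for cipher_word, plain_word in zip(cipher_words, plain_words):
        for c, p in zip(cipher_word, plain_word):
            # setdefault returns the existing mapping if c was seen before.
            if key.setdefault(c, p) != p:
                raise ValueError(f"{c} maps to both {key[c]} and {p}")
    if len(set(key.values())) != len(key):
        raise ValueError("two ciphertext letters map to the same plaintext letter")
    return key


key = build_key(["PQACEIAMNSXU", "RWCTJSXSZA"], ["OVERWHELMING", "SCRUTINIZE"])
print(len(key))                               # 16
print(key["S"], key["X"], key["A"], key["C"])  # I N E R
```

The four letters shared by the two encrypted words (A, C, S and X) are what make the combined pattern so much more selective than the two individual patterns.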

Three Word Example

OCUBICBP has 53 word pattern matches
KCXXPQUN has 675 word pattern matches
PUBOMNV has 8585 word pattern matches

First iteration

OCUBICBP has 53 word pattern matches
KCXXPQUN has 675 word pattern matches

Creating a combined pattern from OCUBICBP and KCXXPQUN, and comparing it with the combined pattern of each tuple in the cartesian product of the two words' pattern matches, leaves 6 possible matches for OCUBICBP and 7 possible matches for KCXXPQUN.

Second iteration after deductions

OCUBICBP has 6 word pattern matches
KCXXPQUN has 7 word pattern matches
PUBOMNV has 8585 word pattern matches

Creating a combined pattern from OCUBICBP, KCXXPQUN and PUBOMNV, and comparing it with the combined pattern of each tuple in the cartesian product of the three words' remaining pattern matches, leaves only 1 possible match for each word:

OCUBICBP has 1 word pattern match
KCXXPQUN has 1 word pattern match
PUBOMNV has 1 word pattern match

"ENGLISH LANGUAGE PATTERNS"

Total tuples compared: (53 x 675) + (6 x 7 x 8585) = 396,345
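
To put that total in context, the short calculation below compares the staged approach with examining the full cartesian product of the three original match sets. It uses only the counts quoted in this example.

```python
# Match counts from the three word example above.
first_iteration = 53 * 675          # OCUBICBP x KCXXPQUN
second_iteration = 6 * 7 * 8585     # reduced sets x PUBOMNV
staged_total = first_iteration + second_iteration

full_product = 53 * 675 * 8585      # no intermediate deductions

print(f"{staged_total:,}")   # 396,345
print(f"{full_product:,}")   # 307,128,375
print(f"{full_product / staged_total:.1f}x fewer tuples compared")
```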

Encrypted Word Ordering for Examination

The first step of the decryption process is to split the encrypted message into normalised alphabetic words. This can involve converting Unicode characters to ASCII, removing punctuation, transforming to uppercase, removing duplicates, etc. We then need to consider the order in which the words are examined, bearing in mind that the more items there are in the cartesian product, the more iterations/comparisons are required.
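
One possible way to carry out the normalisation step is sketched below. The function name `normalise_words` and the exact choices (NFKD decomposition, dropping every non-letter, keeping first occurrences) are assumptions for illustration and may differ from what the script does.

```python
import unicodedata


def normalise_words(message: str) -> list:
    """Split a message into unique, uppercase, ASCII-only alphabetic words."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", message)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = []
    for token in ascii_text.split():
        letters = "".join(ch for ch in token if ch.isalpha())
        if letters:  # tokens made only of digits/punctuation are skipped
            words.append(letters.upper())
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(words))


print(normalise_words("Café, café! The naïve cat; 23 cats."))
# ['CAFE', 'THE', 'NAIVE', 'CAT', 'CATS']
```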

There are 3 ordering options:

LONGEST_TO_SHORTEST: Longest word to the shortest. In general, combining the longest words first results in a large number of deductions early in the process. This is the default ordering option.
FEWEST_TO_MOST_MATCHES: Word with the fewest plaintext matches to the word with the most. In general, LONGEST_TO_SHORTEST decrypts with the fewest iterations and is therefore quicker. However, message_example_2.txt is an example where FEWEST_TO_MOST_MATCHES is better.
MATCHES_DIVIDED_BY_LENGTH: Number of plaintext matches divided by the length of the word.
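
The three options can be thought of as three sort keys. The sketch below is illustrative only: `SORT_KEYS` and `examination_order` are hypothetical names, the word XY with 10 matches is invented, and sorting MATCHES_DIVIDED_BY_LENGTH in ascending order is an assumption.

```python
# Each key takes (word, number_of_plaintext_matches); smaller sorts first.
SORT_KEYS = {
    "LONGEST_TO_SHORTEST": lambda word, matches: -len(word),
    "FEWEST_TO_MOST_MATCHES": lambda word, matches: matches,
    # Assumption: the smallest ratio is examined first.
    "MATCHES_DIVIDED_BY_LENGTH": lambda word, matches: matches / len(word),
}


def examination_order(match_counts, order="LONGEST_TO_SHORTEST"):
    """Return the encrypted words in the order they would be examined."""
    key = SORT_KEYS[order]
    return sorted(match_counts, key=lambda word: key(word, match_counts[word]))


# Counts from the three word example, plus one invented short word.
match_counts = {"OCUBICBP": 53, "KCXXPQUN": 675, "PUBOMNV": 8585, "XY": 10}

print(examination_order(match_counts))
# ['OCUBICBP', 'KCXXPQUN', 'PUBOMNV', 'XY']
print(examination_order(match_counts, "FEWEST_TO_MOST_MATCHES"))
# ['XY', 'OCUBICBP', 'KCXXPQUN', 'PUBOMNV']
```

Python's sort is stable, so words that tie on the key keep their original relative order.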

Weaknesses

Depends on a word list (dictionary).
When a word isn't present in the dictionary, it can cause issues if the ordering places that word among the first two words to be examined.
Can struggle with some shorter sentences where each word has a lot of pattern matches.
Doesn't attempt to decrypt numbers.

Strengths

Can often handle words not being present in the dictionary. For example:

Cyphertext:

Zozm Nzgsrhlm Gfirmt LYV UIH; 23 Qfmv 1912 – 7 Qfmv 1954) dzh zm Vmtorhs nzgsvnzgrxrzm, xlnkfgvi hxrvmgrhg, oltrxrzm, xibkgzmzobhg, ksrolhlksvi, zmw gsvlivgrxzo yrloltrhg.

Decrypts to:

Alan Mathison Turing OBE [F,K,Q]RS; 23 June 1912 – 7 June 1954) [k,v,w,f]as an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist.

This isn't perfect as U mapped to F, K, or Q and d mapped to k, v, w, or f. This is because frs, krs, qrs, kas, vas, was, and fas are all words present in the dictionary.txt file, and there isn't enough information to determine a one-to-one mapping for U and d.

However, the word Mathison is not present in the dictionary, but we can still decrypt it because we have deciphered its individual characters from other words that are in the dictionary.
Can be fast and very accurate.
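
The bracketed output in the Turing example can be illustrated with a small rendering function. This is a sketch under assumptions: the name `render` and the shape of the `candidates` mapping (uppercase ciphertext letter to a list of possible plaintext letters) are hypothetical and not taken from the script.

```python
def render(ciphertext: str, candidates: dict) -> str:
    """Apply a partial key, showing unresolved letters as [X,Y,Z]."""
    out = []
    for ch in ciphertext:
        options = candidates.get(ch.upper())
        if not ch.isalpha() or options is None:
            out.append(ch)  # digits, punctuation and unknown letters pass through
            continue
        shown = options[0] if len(options) == 1 else "[" + ",".join(options) + "]"
        out.append(shown.upper() if ch.isupper() else shown.lower())
    return "".join(out)


# A fragment of the key from the example above.
candidates = {
    "Z": ["A"], "H": ["S"], "I": ["R"],
    "U": ["F", "K", "Q"],
    "D": ["K", "V", "W", "F"],
}
print(render("UIH; 23", candidates))  # [F,K,Q]RS; 23
print(render("dzh", candidates))      # [k,v,w,f]as
```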

Usage

usage: decrypt.py [-h] [-m MESSAGE | -f MESSAGE_FILE] [-v | -vv] [-s]
                  [-w [WORDS_FILES [WORDS_FILES ...]]]
                  [-o {LONGEST_TO_SHORTEST,FEWEST_TO_MOST_MATCHES,MATCHES_DIVIDED_BY_LENGTH}]

optional arguments:
  -h, --help            show this help message and exit
  -m MESSAGE, --message MESSAGE
                        An encrypted message
  -f MESSAGE_FILE, --message-file MESSAGE_FILE
                        An encrypted message text file
  -v, --verbose-info    Makes the decryptor output more verbose
  -vv, --verbose-debug  Makes the decryptor output even more verbose
  -s, --suppress-encrypted-text-output
                        Suppresses the encrypted text from the output
  -w [WORDS_FILES [WORDS_FILES ...]], --words-files [WORDS_FILES [WORDS_FILES ...]]
                        Dictionary (words) files. Example: -w
                        words/dictionary.txt words/names.txt
  -o {LONGEST_TO_SHORTEST,FEWEST_TO_MOST_MATCHES,MATCHES_DIVIDED_BY_LENGTH}, --order {LONGEST_TO_SHORTEST,FEWEST_TO_MOST_MATCHES,MATCHES_DIVIDED_BY_LENGTH}
                        The word examine order

Example usage:

python3 decrypt.py -f message_examples/message_example_10.txt

Attempts to decrypt the contents of the message_examples/message_example_10.txt file using default word ordering LONGEST_TO_SHORTEST.

python3 decrypt.py -f message_examples/message_example_10.txt -o FEWEST_TO_MOST_MATCHES

Attempts to decrypt the contents of the message_examples/message_example_10.txt file using FEWEST_TO_MOST_MATCHES word ordering.
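
For readers curious how a command line like this can be declared, here is a sketch using Python's argparse that mirrors the help text above. It is a reconstruction from the usage output, not the contents of decrypt.py; `build_parser` and `ORDERS` are hypothetical names, and leaving the words files without a default is an assumption.

```python
import argparse

ORDERS = ["LONGEST_TO_SHORTEST", "FEWEST_TO_MOST_MATCHES", "MATCHES_DIVIDED_BY_LENGTH"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="decrypt.py")

    # -m and -f are alternatives, as are -v and -vv.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-m", "--message", help="An encrypted message")
    source.add_argument("-f", "--message-file", help="An encrypted message text file")

    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-v", "--verbose-info", action="store_true",
                           help="Makes the decryptor output more verbose")
    verbosity.add_argument("-vv", "--verbose-debug", action="store_true",
                           help="Makes the decryptor output even more verbose")

    parser.add_argument("-s", "--suppress-encrypted-text-output", action="store_true",
                        help="Suppresses the encrypted text from the output")
    parser.add_argument("-w", "--words-files", nargs="*",
                        help="Dictionary (words) files")
    parser.add_argument("-o", "--order", choices=ORDERS, default=ORDERS[0],
                        help="The word examine order")
    return parser


args = build_parser().parse_args(
    ["-f", "message_examples/message_example_10.txt", "-o", "FEWEST_TO_MOST_MATCHES"]
)
print(args.message_file, args.order)
```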

Message Examples:

There are a number of encrypted message examples provided in this repository. The screenshots below show the output for each one.

[Screenshots of the output for message_example_1.txt through message_example_15.txt]
