Shannon's Entropy Calculator

Named after Boltzmann's Η-theorem, Shannon defined the entropy $\mathrm{H}$ (Greek capital letter eta) of a discrete random variable $X$ with possible values $\{x_1, \ldots, x_n\}$ and probability mass function $P(X)$ as:

$$\mathrm{H}(X) = \mathbb{E}[\mathrm{I}(X)] = \mathbb{E}[-\log_b P(X)]$$

Here $\mathbb{E}$ is the expected value operator, and $\mathrm{I}$ is the information content of $X$. $\mathrm{I}(X)$ is itself a random variable. The entropy can explicitly be written as:

$$\mathrm{H}(X) = \sum_{i=1}^{n} P(x_i)\,\mathrm{I}(x_i) = -\sum_{i=1}^{n} P(x_i)\log_b P(x_i)$$

where $b$ is the base of the logarithm used. Common values of $b$ are 2, Euler's number $e$, and 10; the corresponding units of entropy are bits for $b = 2$, nats for $b = e$, and bans for $b = 10$.
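As a concrete illustration of the explicit formula, the entropy of a string can be computed directly from its symbol counts. This is a minimal sketch, not the project's shannon-entropy.py; the function name and `base` parameter are illustrative:

```python
import math
from collections import Counter

def shannon_entropy(message, base=2):
    """H(X) = -sum of p(x) * log_b(p(x)) over the distinct symbols x."""
    n = len(message)
    return -sum((count / n) * math.log(count / n, base)
                for count in Counter(message).values())

print(shannon_entropy("abracadabra"))          # ~2.0404 bits (b = 2)
print(shannon_entropy("abracadabra", math.e))  # ~1.4143 nats (b = e)
```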

One may also define the conditional entropy of two random variables $X$ and $Y$, taking values $x_i$ and $y_j$ respectively, as:

$$\mathrm{H}(X|Y) = -\sum_{i,j} p(x_i, y_j)\log_b \frac{p(x_i, y_j)}{p(y_j)}$$

where $p(x_i, y_j)$ is the probability that $X = x_i$ and $Y = y_j$. This quantity should be understood as the amount of randomness remaining in the random variable $X$ given knowledge of $Y$ [SOURCE].
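The conditional-entropy formula can be evaluated the same way given a joint distribution. A minimal sketch for background only (the calculator itself computes just the plain entropy above); representing the joint distribution as a dict is an assumption for illustration:

```python
import math
from collections import defaultdict

def conditional_entropy(joint, base=2):
    """H(X|Y) for a joint distribution given as {(x, y): p(x, y)}."""
    p_y = defaultdict(float)          # marginal distribution of Y
    for (x, y), p in joint.items():
        p_y[y] += p
    # H(X|Y) = -sum over (x, y) of p(x, y) * log_b(p(x, y) / p(y))
    return -sum(p * math.log(p / p_y[y], base)
                for (x, y), p in joint.items() if p > 0)

# X and Y are independent fair bits, so observing Y tells us nothing
# about X and one full bit of randomness remains: H(X|Y) = H(X) = 1.
uniform = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
print(conditional_entropy(uniform))  # 1.0
```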

This project calculates the Shannon entropy of a given text message based on symbol frequencies.

Table of Contents

- Installation
- Features
- Usage
- Support
- License

Installation

All the code required to get started is in shannon-entropy.py. Only a working installation of Python 3 is necessary [LINK].
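Assuming the usual invocation for a single-file script (the repository documents no flags), running it would look like:

```
python3 shannon-entropy.py
```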

Features

After the user enters a message, the program iterates over the given string (m), separating out each character (symbol) and computing its frequency, i.e., its probability of occurrence over the length of m. Besides Shannon's entropy, the program also reports the number of bits needed to optimally encode the message and the metric entropy. Such an optimal encoding would allocate shorter bit sequences to the frequently occurring symbols and longer bit sequences to the more infrequent ones [SOURCE], as sketched below.
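The sketch below condenses that flow end to end. It mirrors the steps just described but is not a copy of shannon-entropy.py; in particular, rounding to the nearest whole bit per symbol is inferred from the sample output in the Usage section:

```python
import math
from collections import Counter

m = input("Enter the message: ")
n = len(m)
counts = Counter(m)

print("\nSymbol-occurrence frequencies:\n")
for symbol, count in counts.items():
    print(f"{symbol} --> {count / n:.5f} -- {count}")

# Shannon entropy H(X) in bits per symbol (b = 2)
h = -sum((c / n) * math.log2(c / n) for c in counts.values())

# Cost of a fixed-length code at the nearest whole number of bits per symbol
bits_per_symbol = round(h)
print(f"\nH(X) = {h:.5f} bits. Rounded to {bits_per_symbol} bits/symbol,")
print(f'it will take {bits_per_symbol * n} bits to encode "{m}"')

# Metric entropy: Shannon entropy normalized by the message length
print(f"\nMetric entropy: {h / n:.5f}")
```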

Usage

This is sample output for the string "abracadabra":

```
Enter the message: abracadabra

Symbol-occurrence frequencies:

b --> 0.18182 -- 2
d --> 0.09091 -- 1
a --> 0.45455 -- 5
r --> 0.18182 -- 2
c --> 0.09091 -- 1

H(X) = 2.04039 bits. Rounded to 2 bits/symbol (bits per byte),
it will take 22 bits to optimally encode "abracadabra"

Metric entropy: 0.18549
```
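The metric entropy on the last line is the Shannon entropy divided by the message length (2.04039 / 11 ≈ 0.18549), i.e., the entropy per character of the input rather than per distinct symbol.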

Support

For questions or comments:

License

License: MIT
