
Bayesian surprise arises from mismatches between our expectations and actual results, so the degree of surprise, or anomalousness, attached to a pattern varies with the size of that mismatch. Large surprise values identify the patterns most likely to be useful and interesting to the user.


kenmcgarry/BayesSurprise


BayesSurprise

In this work we employ Bayesian surprise to detect interesting or anomalous patterns in discrete sequence data. Many domains produce discrete sequential data, including DNA analysis, online transactions, web click-stream navigation, cyber-attacks, financial transactions and, especially, sociological life-course data. The difficulty is that each data set has its own unique characteristics and many anomalies defy categorization. Because anomalies are by nature infrequent and elusive, we often do not have enough examples for a supervised approach. Novelty and surprise, however, play a fundamental role in human and animal behavior for survival, attention and adaptation.

We use regular expressions to collect the longest repeating sequences and define these as motifs (which may or may not represent novel patterns). The sequences are then re-expressed in terms of these simpler motifs, which are used to build Probabilistic Suffix Trees (PSTs) that can capture complex relationships based on motif location and frequency of occurrence. New data that deviates from the established motifs in location of appearance, frequency of appearance, or motif composition may represent patterns that differ in some way from those already seen.

Bayesian surprise arises from mismatches between our expectations and actual results, so the degree of surprise, or anomalousness, attached to a pattern varies with the size of that mismatch. Large surprise values identify the patterns most likely to be useful and interesting to the user.
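As a rough illustration of two of the steps described above, the sketch below finds the longest repeated motif with a regular expression and scores new data by Bayesian surprise. It is not the repository's code and it omits the PST stage entirely. It assumes surprise is measured in the usual way, as the KL divergence from the prior to the posterior, and it swaps in a simple Dirichlet model over symbol frequencies in place of a suffix tree. The function names (`longest_repeated_motif`, `kl_dirichlet`, `bayesian_surprise`), the pseudo-count of 1 and the toy sequences are all hypothetical.

```python
import math
import re
from collections import Counter


def longest_repeated_motif(sequence: str) -> str:
    """Return the longest substring that occurs at least twice without overlap."""
    # At each position the lookahead captures the longest run that reappears later.
    candidates = re.findall(r"(?=(.+).*\1)", sequence)
    return max(candidates, key=len, default="")


def digamma(x: float) -> float:
    """Digamma for x > 0: shift x upwards by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def kl_dirichlet(alpha, beta):
    """KL divergence KL(Dir(alpha) || Dir(beta)) in nats."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a) + (a - b) * (digamma(a) - digamma(a0))
    return kl


def bayesian_surprise(prior_counts, new_counts, symbols, pseudo=1.0):
    """Surprise of new_counts: KL(posterior || prior) under a Dirichlet model.

    Assumption: expectations are summarised by symbol counts plus a pseudo-count.
    """
    prior = [prior_counts.get(s, 0) + pseudo for s in symbols]
    posterior = [p + new_counts.get(s, 0) for p, s in zip(prior, symbols)]
    return kl_dirichlet(posterior, prior)


# Toy data (hypothetical): a history sequence and two new windows.
history_seq = "AABAABAABCAABAAB"
motif = longest_repeated_motif(history_seq)

history = Counter(history_seq)
typical = Counter("AABAAB")   # resembles the history
unusual = Counter("CCCCCC")   # dominated by a rare symbol
symbols = sorted(set(history) | set(typical) | set(unusual))

s_typical = bayesian_surprise(history, typical, symbols)
s_unusual = bayesian_surprise(history, unusual, symbols)
print("motif:", motif)
print("surprise (typical window):", round(s_typical, 4))
print("surprise (unusual window):", round(s_unusual, 4))
```

Under this toy model the window dominated by the rare symbol scores a higher surprise than the window that resembles the history, which is the behavior the description relies on. A fuller version would replace the symbol counts with predictions from the PST.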
