Apriori Algorithm

The Apriori algorithm mines frequent itemsets and derives association rules from a transactional database. Two parameters govern these rules: 'Support' and 'Confidence'.

Let's see what all these terms mean.

Terms

Transactional databases

A transactional database is a collection of records, each representing one transaction and typically tagged with a timestamp or date. For Apriori, each transaction is the set of items that occur together, for example the products in one purchase. Such a database can roll back (undo) an operation when a transaction is not completed or committed. It is also flexible: users can modify information without changing any sensitive information.

Itemsets

An itemset is a set of one or more items; an itemset with k items is called a k-itemset. The items in a single transaction form an itemset.

Support

Support is how frequently an itemset occurs: the number of transactions that contain it. It can also be expressed as a fraction of all transactions.
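
As a minimal sketch, support can be computed by scanning the transactions and counting those that contain the itemset. The four-transaction dataset and the name `support_count` are made up for illustration.

```python
def support_count(itemset, transactions):
    """Return the number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


# Hypothetical dataset: each transaction is a set of single-letter items.
transactions = [
    frozenset("ACD"),
    frozenset("BCE"),
    frozenset("ABCE"),
    frozenset("BE"),
]

count = support_count({"B", "E"}, transactions)
print(count)                      # 3 (absolute support)
print(count / len(transactions))  # 0.75 (relative support)
```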

Confidence

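Confidence is a conditional probability. For a rule X → Y, it is the probability that a transaction contains Y given that it contains X, that is, support(X ∪ Y) / support(X).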
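
For a rule X → Y, confidence is commonly computed as the support of X and Y together divided by the support of X alone. The sketch below assumes that definition; the dataset and the function name `confidence` are hypothetical.

```python
def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from transaction counts."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    count_x = sum(1 for t in transactions if antecedent <= t)
    count_xy = sum(1 for t in transactions if both <= t)
    if count_x == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    return count_xy / count_x


# Hypothetical dataset: each transaction is a set of single-letter items.
transactions = [
    frozenset("ACD"),
    frozenset("BCE"),
    frozenset("ABCE"),
    frozenset("BE"),
]

# B appears in 3 transactions and E appears in all 3 of them.
print(confidence({"B"}, {"E"}, transactions))  # 1.0
```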

The algorithm in action

The algorithm is a sequence of steps for finding the frequent itemsets in a given database. It repeats a join step and a prune step, growing the itemsets by one item per iteration, until no larger frequent itemset can be found. A minimum support threshold is either given in the problem or chosen by the user.

Step 1

In the first iteration of the algorithm, each item is taken as a candidate 1-itemset. The algorithm counts the occurrences of each item.
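
One way to sketch this first pass in Python is with `collections.Counter`. The dataset is hypothetical, and each item is assumed to appear at most once per transaction.

```python
from collections import Counter

# Hypothetical dataset: each transaction is a set of single-letter items.
transactions = [
    frozenset("ACD"),
    frozenset("BCE"),
    frozenset("ABCE"),
    frozenset("BE"),
]

# Every distinct item is a candidate 1-itemset; count how many
# transactions each one appears in.
item_counts = Counter(item for t in transactions for item in t)

print(sorted(item_counts.items()))
# [('A', 2), ('B', 3), ('C', 3), ('D', 1), ('E', 3)]
```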

Step 2

Let the minimum support be minSupport (for example, 2). The 1-itemsets whose counts satisfy minSupport are determined. Only candidates with a count greater than or equal to minSupport are carried forward to the next iteration; the others are pruned.
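
The filtering itself is a single comparison per candidate. In this sketch the item counts are hypothetical values, hard-coded so the block stands alone.

```python
min_support = 2

# Hypothetical counts of each candidate 1-itemset.
item_counts = {"A": 2, "B": 3, "C": 3, "D": 1, "E": 3}

# Keep only the items whose count reaches the threshold.
frequent_1 = {
    frozenset([item]): n
    for item, n in item_counts.items()
    if n >= min_support
}

print(sorted("".join(s) for s in frequent_1))  # ['A', 'B', 'C', 'E']
```

Item D occurs only once, so it is pruned and never appears in any larger candidate.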

Step 3

Next, the frequent 2-itemsets are discovered. In the join step, candidate 2-itemsets are generated by joining the set of frequent 1-itemsets with itself, that is, by forming every pair of frequent items.
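
For 2-itemsets, the join amounts to listing every unordered pair of frequent items, which `itertools.combinations` does directly. The list of frequent items below is a hypothetical input.

```python
from itertools import combinations

# Hypothetical frequent 1-itemsets, written as plain items.
frequent_items = ["A", "B", "C", "E"]

# Join step: every unordered pair of frequent items is a candidate 2-itemset.
candidates_2 = [frozenset(pair) for pair in combinations(sorted(frequent_items), 2)]

print(["".join(sorted(c)) for c in candidates_2])
# ['AB', 'AC', 'AE', 'BC', 'BE', 'CE']
```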

Step 4

The candidate 2-itemsets are pruned using the minSupport threshold. The table now contains only the 2-itemsets that meet minSupport.
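
Unlike single items, pairs have to be counted by scanning the transactions again. A sketch under the assumption of a small hypothetical dataset and candidate list:

```python
min_support = 2

# Hypothetical dataset: each transaction is a set of single-letter items.
transactions = [
    frozenset("ACD"),
    frozenset("BCE"),
    frozenset("ABCE"),
    frozenset("BE"),
]

# Hypothetical candidate 2-itemsets produced by the join step.
candidates_2 = [frozenset(p) for p in ("AB", "AC", "AE", "BC", "BE", "CE")]

# Count the transactions that contain each candidate pair.
counts = {c: sum(1 for t in transactions if c <= t) for c in candidates_2}

# Prune the pairs that fall below the threshold.
frequent_2 = {c: n for c, n in counts.items() if n >= min_support}

print({"".join(sorted(c)): n for c, n in frequent_2.items()})
# {'AC': 2, 'BC': 2, 'BE': 3, 'CE': 2}
```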

Step 5

The next iteration forms 3-itemsets using the join and prune steps. This iteration relies on the antimonotone property: every subset of a frequent itemset must itself be frequent. For each candidate 3-itemset, check whether all of its 2-itemset subsets meet minSupport. If any of them is infrequent, the candidate cannot be frequent and is pruned. If all of them are frequent, the candidate is kept and its own support is then counted against minSupport.
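
The subset check can be sketched as follows. The helper name `has_infrequent_subset` and the input set are hypothetical, and the join here is simplified to all 3-item combinations of the remaining items rather than the prefix-based join found in textbook descriptions.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    """True if any (k-1)-subset of a k-item candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_prev
        for subset in combinations(candidate, k - 1)
    )


# Hypothetical frequent 2-itemsets from the previous iteration.
frequent_2 = {frozenset("AC"), frozenset("BC"), frozenset("BE"), frozenset("CE")}

# Simplified join: all 3-item combinations of the items still in play.
items = sorted(set().union(*frequent_2))
candidates_3 = [frozenset(c) for c in combinations(items, 3)]

# Prune: drop any candidate with an infrequent 2-item subset.
survivors = [c for c in candidates_3 if not has_infrequent_subset(c, frequent_2)]

print(["".join(sorted(c)) for c in survivors])  # ['BCE']
```

Candidates such as {A, B, C} are discarded without scanning the transactions, because {A, B} is already known to be infrequent. The survivor {B, C, E} still needs its own support counted.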

Step 6

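The next step forms 4-itemsets by joining the frequent 3-itemsets with themselves and pruning any candidate that has a subset which does not meet minSupport. The algorithm stops when no further frequent itemsets can be generated; the itemsets found in the last non-empty iteration are the largest frequent ones.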
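
Putting the steps together, the whole loop might look like the sketch below. It is an unoptimized illustration, not a reference implementation: the function name `apriori`, the dataset, and the union-based join are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset to its support count."""
    transactions = [frozenset(t) for t in transactions]

    def count_and_prune(candidates):
        # Scan the transactions, then keep candidates that meet min_support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    # Steps 1 and 2: frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    current = count_and_prune({frozenset([i]) for i in items})
    frequent = dict(current)

    k = 2
    while current:
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = count_and_prune(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical dataset: each transaction is a set of single-letter items.
transactions = [set("ACD"), set("BCE"), set("ABCE"), set("BE")]
result = apriori(transactions, min_support=2)

largest = max(result, key=len)
print("".join(sorted(largest)), result[largest])  # BCE 2
```

The loop ends once an iteration produces no frequent itemsets, which here happens when no 4-itemset candidate can be formed.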
