Fpgrowth fails with only one transaction #1049

Open
yhdelgado opened this issue Jun 6, 2023 · 0 comments
@yhdelgado

I have a large dataset of real data. After several attempts, I found that execution fails on one particular transaction. I isolated that transaction and re-ran the algorithm on it alone, and it always fails. I can't understand why it fails at this point, even with the isolated transaction.

Example:

from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns.fpgrowth import fpgrowth
import pandas as pd

transactions = [[
    114367, 116953, 123213, 125589, 128047, 128579, 130407, 132025, 132082,
    134190, 136097, 136098, 136181, 136357, 136656, 136658, 136659, 136992,
    137180, 137181, 137395, 138215, 139339, 139520, 139551, 140008, 140012,
    140021
]]

def get_fpgrowth_associated_products(product_name):
    # Filter out transactions that don't include the target product
    filtered_transactions = [t for t in transactions if product_name in t]
    te = TransactionEncoder()
    te_ary = te.fit(filtered_transactions).transform(filtered_transactions)

    # Convert the one-hot encoded array into a pandas DataFrame
    df = pd.DataFrame(te_ary, columns=te.columns_)

    # Compute frequent itemsets using the FP-growth algorithm (min_support = 0.5)
    freq_itemsets = fpgrowth(df, min_support=0.5, use_colnames=True)

    itemsets = set(freq_itemsets.itemsets)

    # Find the sets that include the target product
    target_sets = [s for s in itemsets if product_name in s]

    # Combine the other items from those sets into a single set
    associated_items = set()
    for s in target_sets:
        associated_items |= s - {product_name}

    return list(associated_items)

get_fpgrowth_associated_products(136181)
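
A possible explanation, not confirmed anywhere in this report: with a single transaction, every item has support 1.0, so every non-empty subset of its 28 items clears min_support=0.5. The sketch below only counts how many itemsets that implies; it does not call mlxtend, and the helper name count_frequent_itemsets is hypothetical. It also shows how the count shrinks if itemset length is capped, which is what the max_len argument of fpgrowth is meant to do.

```python
from math import comb


def count_frequent_itemsets(n_items, max_len=None):
    """Count the non-empty itemsets over n_items items that co-occur in every transaction.

    Hypothetical helper for illustration: when all items appear together in
    every transaction, each non-empty subset is frequent at any min_support <= 1.
    """
    longest = n_items if max_len is None else min(max_len, n_items)
    # Sum the number of subsets of each size from 1 up to the length cap.
    return sum(comb(n_items, k) for k in range(1, longest + 1))


n_items = 28  # number of items in the isolated transaction above
print(count_frequent_itemsets(n_items))             # 2**28 - 1 itemsets, uncapped
print(count_frequent_itemsets(n_items, max_len=2))  # singletons and pairs only
```

If this is indeed the cause, passing something like max_len=2 to fpgrowth, or skipping the mining step when only one transaction remains after filtering, may be worth trying; both are suggestions rather than verified fixes.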

Versions

MLxtend 0.22.0
Linux-5.19.0-43-generic-x86_64-with-glibc2.35
Python 3.8.16
Scikit-learn 1.2.2
NumPy 1.24.3
SciPy 1.9.3

@yhdelgado yhdelgado added the Bug label Jun 6, 2023