Error while fitting the model #52

Open
sachinlodhi opened this issue Apr 3, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@sachinlodhi

sachinlodhi commented Apr 3, 2024

The dataframe is as follows:
[image: screenshot of the dataframe]
and this is the additional code:

df.rename(columns={'result': 'Decision'}, inplace=True)

Output:

Index(['Date', 'Country', 'League', 'Season', 'HomeTeam', 'AwayTeam',
       'home_goal', 'away_goal', 'Decision'],
      dtype='object')

from chefboost import Chefboost as chef
config = {"algorithm": "C4.5"}
model = chef.fit(df, config, target_label="Decision")

I am getting error:

[INFO]:  4 CPU cores will be allocated in parallel running
C4.5  tree is going to be built...
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_22413/130574452.py in ?()
----> 1 model = chef.fit(df, config,  target_label = "Decision")

~/anaconda3/envs/rover/lib/python3.10/site-packages/chefboost/Chefboost.py in ?(df, config, target_label, validation_df)
    209                 if enableParallelism == True:
    210                         json_file = "outputs/rules/rules.json"
    211                         functions.createFile(json_file, "[\n")
    212 
--> 213 		trees = Training.buildDecisionTree(df, root = root, file = file, config = config
    214                                 , dataset_features = dataset_features
    215 				, parent_level = 0, leaf_id = 0, parents = 'root', validation_df = validation_df, main_process_id = process_id)
    216 

~/anaconda3/envs/rover/lib/python3.10/site-packages/chefboost/training/Training.py in ?(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    432                 pivot = pd.DataFrame(subdataset.Decision.value_counts()).reset_index()
    433                 pivot = pivot.rename(columns = {"Decision": "Instances","index": "Decision"})
    434                 pivot = pivot.sort_values(by = ["Instances"], ascending = False).reset_index()
    435 
--> 436                 else_decision = "return '%s'" % (pivot.iloc[0].Decision)
    437 
    438                 if enableParallelism != True:
    439                         functions.storeRule(file,(functions.formatRule(root), "else:"))

~/anaconda3/envs/rover/lib/python3.10/site-packages/pandas/core/generic.py in ?(self, name)
   6200             and name not in self._accessors
   6201             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   6202         ):
   6203             return self[name]
-> 6204         return object.__getattribute__(self, name)

AttributeError: 'Series' object has no attribute 'Decision'

Even if I do not rename the column, it makes no difference: it always throws this error.
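
One plausible reading of the traceback, assuming pandas 2.x is installed (the thread does not state the pandas version): since pandas 2.0, `value_counts()` returns a Series named `count` whose index carries the original column name, so after `reset_index()` the columns are `Decision` and `count` rather than `index` and `Decision`. The rename at Training.py line 433 would then turn `Decision` into `Instances` and leave no `Decision` column, which matches the AttributeError raised at line 436. Below is a minimal sketch of a way to pick the majority class that does not depend on those default names; `majority_class` is a hypothetical helper for illustration, not part of chefboost.

```python
import pandas as pd


def majority_class(df: pd.DataFrame, target: str = "Decision"):
    """Return the most frequent value of `target` (hypothetical helper, not chefboost API)."""
    counts = df[target].value_counts()
    # Name the index and the counts explicitly instead of relying on the
    # default column names, which differ between pandas 1.x and 2.x.
    pivot = counts.rename_axis(target).reset_index(name="Instances")
    pivot = pivot.sort_values(by="Instances", ascending=False).reset_index(drop=True)
    return pivot.iloc[0][target]


# Tiny made-up sample, only to exercise the helper.
sample = pd.DataFrame({"Decision": ["H", "A", "H", "D", "H"]})
print(majority_class(sample))
```

If this is indeed the cause, the error would appear with any target column name, which is consistent with renaming making no difference.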

sachinlodhi changed the title from "Erro while fitting the model" to "Error while fitting the model" on Apr 3, 2024
@serengil
Owner

serengil commented Apr 5, 2024

What about this?

model = chef.fit(df, config,  target_label = "result")

serengil added the bug label on Apr 5, 2024
@sachinlodhi
Author

What about this?

model = chef.fit(df, config,  target_label = "result")

I still see the same error.

@VPK02

VPK02 commented Apr 12, 2024

I also got the same error. Please help me solve it.

@sachinlodhi
Author

I got it working, but not consistently: sometimes it works and sometimes it doesn't. These are the steps I took:

  1. Clone the repository.
  2. Create a virtual environment, or activate an existing one.
  3. Install the requirements.
  4. Run the script.
  5. If you still get the error: because chefboost was installed from the clone, the package lives in the cloned directory and can be edited directly. While backtracking the calls, I added print("some random text") in Chefboost.py at lines #274 and #155, and in Training.py at line #395.
  6. This may seem ridiculous, but once I started printing the status or some random text, it started working. It is still not consistent, though; sometimes it fails.

@serengil
Owner

The package on pip is old; I need to publish the recent changes. Meanwhile, you can pull the source code and run it instead of the pip package.

git clone https://github.com/serengil/chefboost.git
cd chefboost
pip install -e .
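
After the editable install, it may be worth confirming that Python really imports chefboost from the cloned directory rather than from an older copy in site-packages. The following is a minimal sketch using only the standard library; `describe_install` is a hypothetical helper name, and pandas is included only because its version may also matter here.

```python
import importlib.util
from importlib import metadata


def describe_install(package: str) -> dict:
    """Report the installed version and import location of `package` (hypothetical helper)."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # No installed distribution with this name.
        version = None
    spec = importlib.util.find_spec(package)
    # `origin` is the file the package would be imported from, if it can be found.
    location = spec.origin if spec is not None else None
    return {"package": package, "version": version, "location": location}


for name in ("chefboost", "pandas"):
    print(describe_install(name))
```

If the reported location for chefboost is under site-packages instead of the clone, the old pip package is probably still the one being imported.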

@sachinlodhi
Author

The package on pip is old; I need to publish the recent changes. Meanwhile, you can pull the source code and run it instead of the pip package.

git clone https://github.com/serengil/chefboost.git
cd chefboost
pip install -e .

Yes, I tried that. Sometimes it works, but sometimes everything freezes and I have to stop and rerun the cell before it gives any output. It freezes on large dataframes, for example around 20k rows with 10-15 features.

@serengil
Owner

Would you please try disabling parallelism, as follows:

config = {'algorithm': 'C4.5', 'enableParallelism': False}
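
Put together with the original call, the suggestion might look like the sketch below. It assumes the target column is named "Decision" as in the first post; `make_config` and `fit_serial` are hypothetical wrapper names, while the `enableParallelism` key and the `chef.fit(df, config, target_label=...)` call are taken from this thread.

```python
def make_config(algorithm: str = "C4.5", parallel: bool = False) -> dict:
    """Build a chefboost config dict with parallelism off by default (hypothetical helper)."""
    return {"algorithm": algorithm, "enableParallelism": parallel}


def fit_serial(df, target_label: str = "Decision"):
    """Fit a model in a single process (hypothetical wrapper around chef.fit)."""
    # Imported lazily so this sketch can be loaded without chefboost installed.
    from chefboost import Chefboost as chef

    return chef.fit(df, make_config(), target_label=target_label)
```

Calling `fit_serial(df)` then runs the same fit as in the first post, just without the parallel workers, which should help show whether the freezes come from the parallel code path.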
