"ValueError: DataFrame.dtypes for data must be int, float or bool. Did not expect the data types in fields" even for the columns with type as #407

Open
aakash086 opened this issue Sep 23, 2023 · 1 comment


aakash086 commented Sep 23, 2023

The DataFrame has the following columns:

Age int64
Worked_in_field int64
Major int64
GPA float64
Projects int64
Experience int64
Bootcamp int64
Ethnicity int8
BusinessTravel int8
DailyRate int64
Department int8
DistanceFromHome int64
Education int64
EducationField int8
EnvironmentSatisfaction int64
Gender int8
HourlyRate int64
JobInvolvement int64
JobLevel int64
JobRole int8
JobSatisfaction int64
Attrition int64

I trained a model to predict Attrition and am using DiCE to generate counterfactuals:

d = dice_ml.Data(dataframe=df, outcome_name='Backed_out_before_Round2',continuous_features=['Age', 'Worked_in_field', 'Major', 'GPA', 'Projects', 'Experience'])
m = dice_ml.Model(model=xgb, backend="sklearn")
exp = dice_ml.Dice(d, m, method="random")

query=df[1:2].drop(['Attrition'],axis=1)

e = exp.generate_counterfactuals(query, total_CFs=1, desired_class="opposite",random_seed=0)
e.visualize_as_dataframe(show_only_changes=True)

0%| | 0/1 [00:00<?, ?it/s]

ValueError Traceback (most recent call last)
Cell In[162], line 1

.......

183 msg = """DataFrame.dtypes for data must be int, float or bool.
184 Did not expect the data types in fields """
--> 185 raise ValueError(msg + ', '.join(bad_fields))
187 if feature_names is None and meta is None:
188 if isinstance(data.columns, MultiIndex):

ValueError: DataFrame.dtypes for data must be int, float or bool.
Did not expect the data types in fields Bootcamp, Ethnicity, BusinessTravel, DailyRate, Department, DistanceFromHome, Education, EducationField, EnvironmentSatisfaction, Gender, HourlyRate, JobInvolvement, JobLevel, JobRole, JobSatisfaction

Issue: The columns named in the error are all of dtype int64 or int8, so it is not clear why the error is raised.
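One pattern worth checking, offered as an observation rather than a confirmed diagnosis: the fields named in the error appear to be exactly the columns that were not passed in continuous_features. The sketch below uses only the column names, the continuous_features list, and the error text from this report, and it assumes the outcome column is Attrition (the dice_ml.Data call above names a different outcome_name). If the two lists match, DiCE is probably treating those columns as categorical, and they may reach the model with a non-numeric dtype.

```python
# Column names copied from the dtype listing in this report.
columns = [
    "Age", "Worked_in_field", "Major", "GPA", "Projects", "Experience",
    "Bootcamp", "Ethnicity", "BusinessTravel", "DailyRate", "Department",
    "DistanceFromHome", "Education", "EducationField",
    "EnvironmentSatisfaction", "Gender", "HourlyRate", "JobInvolvement",
    "JobLevel", "JobRole", "JobSatisfaction", "Attrition",
]

# Same list that was passed to dice_ml.Data(continuous_features=...).
continuous_features = [
    "Age", "Worked_in_field", "Major", "GPA", "Projects", "Experience",
]

# Assumption: Attrition is the outcome column (it is what the query drops).
outcome = "Attrition"

# Fields listed in the ValueError message, in the order reported.
bad_fields = [
    "Bootcamp", "Ethnicity", "BusinessTravel", "DailyRate", "Department",
    "DistanceFromHome", "Education", "EducationField",
    "EnvironmentSatisfaction", "Gender", "HourlyRate", "JobInvolvement",
    "JobLevel", "JobRole", "JobSatisfaction",
]

# Features that were not declared continuous (and are not the outcome).
not_continuous = [
    c for c in columns if c not in continuous_features and c != outcome
]

# True when the flagged fields are exactly the non-continuous features.
print(not_continuous == bad_fields)
```

If the comparison prints True, one thing to try is listing the remaining numeric columns in continuous_features as well. Whether that resolves the error here has not been verified.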

gaugup (Collaborator) commented Sep 25, 2023

@aakash086 could you send the entire stack trace? Sharing a sample notebook with the dataset would also help a lot.
