-
Hi, the DoubleMLIRM model only supports a single treatment. For multiple treatments, you have to fit several models separately.
-
Thank you Mr. Klaassen, I very much appreciate your input. I have written a function that uses your library to compute the ATE for various datasets, where the outcome and treatment variables can be continuous or binary. It also includes a method to find the top n treatments if the user does not know which factors are important to include in their treatment list. Would you like to inspect this functionality and potentially add it to your library? Thanks!
-
Hi all,
I am currently trying to estimate the ATE of three binary treatments on a binary outcome. Which model should I use for this? I know the DoubleMLIRM model supports binary treatments, but I can only pass one binary treatment in the d_cols parameter of DoubleMLData. Does anyone have a solution or any suggestions? Thanks in advance!
Here is my code:
"""
import numpy as np
import pandas as pd
import doubleml as dml
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
# Load the dataset into a pandas DataFrame
df = pd.read_csv('/content/ai4i2020.csv')
# Flag a machine failure if any of the individual failure modes occurred
df['Machine failure'] = np.where((df['TWF'] == 1) | (df['HDF'] == 1) | (df['PWF'] == 1) | (df['OSF'] == 1) | (df['RNF'] == 1), 1, 0)
# Convert 'Machine failure' to integer
df['Machine failure'] = df['Machine failure'].astype(int)
# One-hot encode the categorical treatment variable and drop the ID columns
df = pd.get_dummies(df, columns=['Type']).drop(['UDI', 'Product ID'], axis=1)
# Cast the dummy columns to integer
df['Type_H'] = df['Type_H'].astype(int)
df['Type_L'] = df['Type_L'].astype(int)
df['Type_M'] = df['Type_M'].astype(int)
# Define treatment, outcome, and covariates
treatments = ['Type_M']
# I would like to use this instead: ['Type_M', 'Type_H', 'Type_L']
outcome = 'Machine failure'
covariates = ['Air temperature [K]', 'Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]']
# Split the data into training and testing sets
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
# Create a DoubleMLData object for training
train_data_dml_base = dml.DoubleMLData(train_df, y_col=outcome, d_cols=treatments, x_cols=covariates)
# Boosted trees for classification
boost_class = XGBClassifier()
np.random.seed(123)
dml_irm = dml.DoubleMLIRM(train_data_dml_base, ml_g=boost_class, ml_m=boost_class)
dml_irm.fit(store_predictions=True)
irm_summary = dml_irm.summary
print(irm_summary)
"""