I am working on a multi-class classification problem using TensorFlow and am struggling to reach satisfactory accuracy. I have 7 classes. Each class is a folder containing 2000 .csv files (each file has two columns). When I train the model as a binary classifier, one class against another, accuracy and val_accuracy are high (0.85 to 0.95), but when I train on all classes at once, accuracy peaks at 0.47. Below is the code for the data preparation and the multi-class model.
```python
# Assumed imports (not shown in the original post)
import os
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.utils import normalize, to_categorical

# Class folders containing the CSV files
folder_paths = [
    '/content/drive/MyDrive/medical_chem/Aa',
    '/content/drive/MyDrive/medical_chem/Ab',
    '/content/drive/MyDrive/medical_chem/Ac',
    '/content/drive/MyDrive/medical_chem/Ba',
    '/content/drive/MyDrive/medical_chem/Bb',
    '/content/drive/MyDrive/medical_chem/Cc',
    '/content/drive/MyDrive/medical_chem/DD'
]
data = []
labels = []

# Load every CSV file in each folder into a DataFrame
for class_index, folder_path in enumerate(folder_paths):
    for file in os.listdir(folder_path):
        file_path = os.path.join(folder_path, file)
        df = pd.read_csv(file_path)
        data.append(df)
        labels.append(class_index)

X = data
y = labels

# Find the length of the shortest DataFrame
min_length = min(len(df) for df in X)

# Truncate all DataFrames to the same length
truncated_dfs = [df.head(min_length) for df in X]

# Convert the DataFrames to a NumPy array
X = np.array([df.values for df in truncated_dfs])

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Normalize the data
X_train = normalize(X_train, axis=1)
X_test = normalize(X_test, axis=1)

# One-hot encode the labels
y_train = to_categorical(y_train, num_classes=7)
y_test = to_categorical(y_test, num_classes=7)

X_train.shape, y_train.shape, X_test.shape, y_test.shape
# Output: ((8943, 2906, 2), (8943, 7), (2236, 2906, 2), (2236, 7))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(7, activation='softmax')  # Output layer for 7 classes
])

# Checkpoint for the trained model
checkpoint_path = "training_checkpoint/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
checkpoint_callback = ModelCheckpoint(filepath=checkpoint_path,
                                      save_weights_only=True,
                                      save_best_only=True,
                                      monitor='val_loss',
                                      verbose=1)

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# model.load_weights(checkpoint_path)
history = model.fit(X_train, y_train,
                    epochs=100,
                    validation_data=(X_test, y_test),
                    callbacks=[checkpoint_callback])
```
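A note on the `normalize(..., axis=1)` step: the imports are not shown, so this assumes the call is `tf.keras.utils.normalize`, which divides by the L2 norm along the given axis. On an array of shape `(samples, rows, 2)` that would scale each column of each file to unit norm across its rows, which removes amplitude differences between files rather than standardizing features across the dataset. A minimal NumPy sketch of that behavior, using a hypothetical toy array and a hypothetical helper `l2_normalize`:

```python
import numpy as np

def l2_normalize(x, axis):
    """Divide x by its L2 norm along `axis`; zero norms are treated as 1."""
    norms = np.linalg.norm(x, ord=2, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0
    return x / norms

# Hypothetical toy batch: 2 samples, 3 rows, 2 columns
toy = np.array([
    [[3.0, 1.0], [4.0, 0.0], [0.0, 0.0]],
    [[1.0, 2.0], [0.0, 2.0], [0.0, 1.0]],
])

normalized = l2_normalize(toy, axis=1)

# After normalizing along axis 1, every column of every sample has norm 1,
# regardless of how large its original values were.
column_norms = np.linalg.norm(normalized, axis=1)
print(column_norms)
```

If the absolute scale of the two columns carries class information, this per-file scaling could discard it; whether that matters for this dataset is not something the sketch can show.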
I have tried adjusting the architecture of the neural network, experimenting with different activation functions, and tuning hyperparameters such as learning rate and batch size. However, I am still not achieving the desired accuracy. I am fairly sure the problem lies in the data pre-processing or in the model, since binary training gives good results.
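One thing that may be worth checking in the model: Keras `Dense` layers act on the last axis of their input only, and the posted model has no flattening or pooling layer. With input of shape `(batch, 2906, 2)`, a stack of `Dense` layers would therefore produce one 7-way output per row, not one per file. The NumPy sketch below imitates this with a hypothetical `dense` helper (random weights, no activation) purely to illustrate the shapes; it is not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, units):
    """Stand-in for a Dense layer without activation: x @ W + b on the last axis."""
    w = rng.normal(size=(x.shape[-1], units))
    b = np.zeros(units)
    return x @ w + b

# 4 hypothetical samples shaped like the rows of X_train
batch = rng.normal(size=(4, 2906, 2))

# The posted layer sizes applied directly to the 3D input
out_3d = dense(dense(dense(dense(batch, 128), 64), 32), 7)

# The same layer sizes after flattening each sample into one vector
flat = batch.reshape(len(batch), -1)
out_flat = dense(dense(dense(dense(flat, 128), 64), 32), 7)

print(out_3d.shape)    # (4, 2906, 7)
print(out_flat.shape)  # (4, 7)
```

Only the flattened variant matches the `(batch, 7)` shape of the one-hot labels. Calling `model.summary()` after building the model would show which case applies here; a `tf.keras.layers.Flatten()` first layer, or a convolutional or pooling front end, would be possible candidates to try.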
For comparison, binary training reaches an accuracy of 0.85 to 0.95. **Expected accuracy in multi-class: above 0.90**
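To put these numbers in context, chance level with balanced classes is 1/2 for a binary task and 1/7 (about 0.14) for seven classes, so the two accuracies are not directly comparable. The reported shapes also add up to 11,179 samples (8943 + 2236) rather than 7 × 2000 = 14,000, so the per-class counts may not be equal. A small sketch for checking this, where `class_baselines` is a hypothetical helper and `toy_labels` stands in for the real `labels` list:

```python
from collections import Counter

def class_baselines(labels):
    """Return per-class counts, uniform chance level, and majority-class accuracy."""
    counts = Counter(labels)
    total = len(labels)
    chance = 1 / len(counts)                  # accuracy of uniform random guessing
    majority = max(counts.values()) / total   # accuracy of always predicting the largest class
    return counts, chance, majority

# Hypothetical, deliberately imbalanced labels for illustration
toy_labels = [0] * 5 + [1] * 5 + [2] * 2

counts, chance, majority = class_baselines(toy_labels)
print(dict(counts))
print(f"chance level: {chance:.3f}, majority-class baseline: {majority:.3f}")
```

Running the same function on the real `labels` list before the train/test split would show whether any class is under-represented.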
The dataset: