Problem getting accuracy predictions #1
Your usage of the predict function is wrong; look at the line: y_pred = DSL_learner.predict(X_train, y_train)
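The traceback later in this thread shows the method signature as `predict(self, X, return_base_learners_probs)`, so passing `y_train` as the second positional argument puts an array into the boolean flag slot, and the bare `if return_base_learners_probs:` then raises the ambiguous-truth-value error. A minimal sketch reproducing that mechanism (the `predict` stub here is a simplified stand-in, not the library's actual code):

```python
import numpy as np

def predict(X, return_base_learners_probs=False):
    # simplified stand-in for the library method: the optional flag is
    # tested with a bare `if`, which fails if it receives an array
    if return_base_learners_probs:
        return "base learner probs"
    return "avg probs"

y_train = np.array([0, 1, 0, 1])
try:
    # passing y_train positionally puts an array in the flag slot
    predict(np.zeros((4, 2)), y_train)
except ValueError as e:
    print(e)  # The truth value of an array with more than one element is ambiguous...
```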
I get a different error when I change it. FYI, here's the new error I get:
Iteration: 0 Loss: 0.6686762701198394
ValueError Traceback (most recent call last)
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py in accuracy_score(y_true, y_pred, normalize, sample_weight)
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py in _check_targets(y_true, y_pred)
/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
ValueError: Found input variables with inconsistent numbers of samples: [3655, 14618]
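This error means `y_true` and `y_pred` have different lengths (3655 vs. 14618) before any values are compared, consistent with predicting on one split but scoring against the labels of another. A simplified mimic of sklearn's length check (not the actual library code) shows how the message arises:

```python
import numpy as np

def check_consistent_length(*arrays):
    # simplified mimic of sklearn.utils.validation.check_consistent_length:
    # every input must have the same number of samples
    lengths = [len(a) for a in arrays]
    if len(set(lengths)) > 1:
        raise ValueError("Found input variables with inconsistent "
                         "numbers of samples: %r" % lengths)

# lengths taken from the reported error
y_true = np.zeros(3655)
y_pred = np.zeros(14618)
try:
    check_consistent_length(y_true, y_pred)
except ValueError as e:
    print(e)  # Found input variables with inconsistent numbers of samples: [3655, 14618]
```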
It seems that the error is raised by the sklearn accuracy_score function; can you add some prints of y_test/y_pred before calling accuracy_score? Anyway, if you are patient I'll add more metrics by the end of this week.
It is fine to wait until you add the accuracy yourself, but here's the info you asked for:
1.4546618 0.03125 0.625 0.27828315 1.30357143 -0.73214286
Classification and regression are the two major prediction problems usually dealt with in data mining and machine learning. accuracy_score is a classification metric; you cannot use it for a regression problem. You get the error because regression models do not produce binary outcomes, but continuous (float) numbers (as all regression models do); so, when scikit-learn attempts to calculate the accuracy by comparing a binary number (true label) with a float (predicted value), it not unexpectedly raises an error. This cause is clearly hinted at in the error message itself. The sklearn.metrics.accuracy_score(y_true, y_pred) method defines y_pred as: "y_pred : 1d array-like, or label indicator array / sparse matrix. Predicted labels, as returned by a classifier." Which means y_pred has to be an array of 1's and 0's (predicted labels), not probabilities.
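In other words, continuous scores have to be converted to hard labels before any accuracy calculation. A minimal sketch (using plain numpy; `np.mean` of the matches is equivalent to sklearn's accuracy_score for this case):

```python
import numpy as np

y_true = np.array([0, 1, 1, 0])
# continuous scores (e.g. class-1 probabilities) are what accuracy_score rejects
y_scores = np.array([0.12, 0.87, 0.64, 0.33])
# threshold them (or argmax a probability matrix) to get hard labels first
y_pred = (y_scores >= 0.5).astype(int)
# fraction of matching labels == accuracy
print(np.mean(y_true == y_pred))  # 1.0
```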
When I add the code to get an accuracy score, I get an error. The code I added was:
y_pred = DSL_learner.predict(X_train, y_train)
y_pred = numpy.argmax(y_pred,axis=1)
I am using numpy 1.14 and Ubuntu 16.04.4.
See below:
===========================
from deepSuperLearner import *
import numpy
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score
ETC = ExtraTreesClassifier()
GB = GradientBoostingClassifier()
Base_learners = {'ETC':ETC, 'GB':GB}
np.random.seed(100)
DSL_learner = DeepSuperLearner(Base_learners)
DSL_learner.fit(X_train, y_train)
y_pred = DSL_learner.predict(X_train, y_train)
y_pred = numpy.argmax(y_pred,axis=1)
print('Final prediction accuracy score: [%.4f]' % accuracy_score(y_test, y_pred))
DSL_learner.get_precision_recall(X_test, y_test, show_graphs=True)
Iteration: 0 Loss: 0.6936235359636144
Weights: [0.95464725 0.04535275]
Iteration: 1 Loss: 0.6922511148906573
Weights: [0.96612301 0.03387699]
Iteration: 2 Loss: 0.6930091286990414
Weights: [1. 0.]
ValueError Traceback (most recent call last)
in ()
13 DSL_learner = DeepSuperLearner(Base_learners)
14 DSL_learner.fit(X_train, y_train)
---> 15 y_pred = DSL_learner.predict(X_train, y_train)
16 y_pred = numpy.argmax(y_pred,axis=1)
17 print('Final prediction accuracy score: [%.4f]' % accuracy_score(y_test, y_pred))
/usr/local/lib/python3.5/dist-packages/deepSuperLearner/deepSuperLearnerLib.py in predict(self, X, return_base_learners_probs)
239 X = np.hstack((X, avg_probs))
240
--> 241 if return_base_learners_probs:
242 return avg_probs, base_learners_probs
243
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
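Per the traceback, `predict` is defined as `predict(self, X, return_base_learners_probs)`, so the fix is to pass only `X` and to score against the labels of the same split. A hedged sketch of the intended call order, with a hypothetical mock standing in for DeepSuperLearner (assuming `predict(X)` returns an `(n_samples, n_classes)` probability matrix, as the argmax step in the snippet above implies):

```python
import numpy as np

class MockLearner:
    """Hypothetical stand-in for DeepSuperLearner: predict takes only X
    and returns an (n_samples, n_classes) probability matrix."""
    def predict(self, X):
        rng = np.random.RandomState(0)
        probs = rng.rand(len(X), 2)
        return probs / probs.sum(axis=1, keepdims=True)  # rows sum to 1

X_test = np.zeros((5, 3))
y_test = np.array([0, 1, 0, 1, 1])

learner = MockLearner()
y_probs = learner.predict(X_test)       # X only -- no y passed to predict
y_pred = np.argmax(y_probs, axis=1)     # probabilities -> hard labels
print('accuracy: %.4f' % np.mean(y_test == y_pred))
```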