Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added random_forest_regressor #3624

Closed

Conversation

arthiondaena
Copy link

addresses issue #3598

I implemented random_forest_regressor class.

Copy link

mlpack-bot bot commented Feb 18, 2024

Thanks for opening your first pull request in this repository! Someone will review it when they have a chance. In the mean time, please be sure that you've handled the following things, to make the review process quicker and easier:

  • All code should follow the style guide
  • Documentation added for any new functionality
  • Tests added for any new functionality
  • Tests that are added follow the testing guide
  • Headers and license information added to the top of any new code files
  • HISTORY.md updated if the changes are big or user-facing
  • All CI checks should be passing

Thank you again for your contributions! 👍

@shaojunjie0912
Copy link

Compared with the example of decision tree regression, it is found that there is a problem with the prediction value of random forest regression, and the specific code is as follows.

#include <armadillo>
#include <mlpack/methods/random_forest.hpp>

int main() {
    arma::mat dataset(10, 1000, arma::fill::randu);            // 1000 points
    arma::rowvec responses = arma::randn<arma::rowvec>(1000);  // responses

    responses.print();  // print responses

    arma::mat testDataset(10, 500, arma::fill::randu);  // 500 test points
    mlpack::RandomForestRegressor rf;
    rf.Train(dataset, responses);  // train
    arma::rowvec predictions;
    rf.Predict(testDataset, predictions);  // predict

    predictions.print();  // print predictions
}

The predicted values for decision tree regression are as follows.

decisiontree

The predicted values for random forest regression are as follows.

randomforest

@arthiondaena
Copy link
Author

@shaojunjie0912 Thank you for informing, I forgot to change the type of prediction variable. The size_t is trying to store double, the result was the large integer value you were checking.

If there are anymore errors inform me, I will try my best to fix them.

@arthiondaena
Copy link
Author

Hey @zoq, I completed everything that a random forest regressor needs, if there is any need for modification or addition, ping me.

Now I want to add an adaboost regressor which is another part of that issue, these are the things I will be implementing:

  • Prune function for decision tree regressor
  • Loss functions(linear, square law, exponential)
  • Atlast the implementation
    What are your thoughts about it?

Copy link
Member

@zoq zoq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code, do you think it would be possible to add another Predict function to the existing RandomForest implementation?

namespace mlpack {

/**
* This class implements a random forest regressor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to have a description for the none trivial template parameters.

Comment on lines 58 to 63
//Calculate r2 score for random forest regressor
double rfrR2Score = 1 - arma::sum(square(Ytest-rfrPred))/arma::sum(square(Ytest-TestMean));

arma::Row<double> dtrPred;
dtr.Predict(Xtest, dtrPred);
//Calculate r2 score for decision tree regressor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//Calculate r2 score for random forest regressor
double rfrR2Score = 1 - arma::sum(square(Ytest-rfrPred))/arma::sum(square(Ytest-TestMean));
arma::Row<double> dtrPred;
dtr.Predict(Xtest, dtrPred);
//Calculate r2 score for decision tree regressor
// Calculate r2 score for random forest regressor.
double rfrR2Score = 1 - arma::sum(square(Ytest-rfrPred))/arma::sum(square(Ytest-TestMean));
arma::Row<double> dtrPred;
dtr.Predict(Xtest, dtrPred);
// Calculate r2 score for decision tree regressor.

Minor style adjustments, to follow https://github.com/mlpack/mlpack/wiki/DesignGuidelines.

Comment on lines 71 to 87
// Loading data.
data::DatasetInfo info;
arma::mat trainData, testData;
arma::rowvec trainResponses, testResponses;
LoadBostonHousingDataset(trainData, testData, trainResponses, testResponses,
info);

// Train a random forest.
RandomForestRegressor<> rfr(trainData, info, trainResponses, 25 /* 25 trees */, 1,
1e-7, 0, MultipleRandomDimensionSelect(4));

REQUIRE(rfr.NumTrees() == 25);

rfr.Train(trainData, info, trainResponses, 20 /* 20 trees */, 1, 1e-7, 0,
true /* warmStart */, MultipleRandomDimensionSelect(4));

REQUIRE(rfr.NumTrees() == 25 + 20);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation is off here, see https://github.com/mlpack/mlpack/wiki/DesignGuidelines#line-length-and-wrapping for more details.

arma::Row<double> dtrPred;
dtr.Predict(Xtest, dtrPred);
//Calculate r2 score for decision tree regressor
double dtrR2Score = 1 - arma::sum(square(Ytest-dtrPred))/arma::sum(square(Ytest-TestMean));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
double dtrR2Score = 1 - arma::sum(square(Ytest-dtrPred))/arma::sum(square(Ytest-TestMean));
double dtrR2Score = 1 - arma::sum(square(Ytest-dtrPred)) / arma::sum(square(Ytest-TestMean));

Use space between operations, also all names should follow the naming scheme https://github.com/mlpack/mlpack/wiki/DesignGuidelines#naming-conventions.

@arthiondaena
Copy link
Author

Looking at the code, do you think it would be possible to add another Predict function to the existing RandomForest implementation?

What should be the functionality for this new predict function that is different from other?
And do you mean random forest or random forest regressor, as random forest already has 5-6 predict functions

@shaojunjie0912
Copy link

Hi~

Pls when can this branch be merged into the master branch?

@arthiondaena

@arthiondaena
Copy link
Author

Hi~

Pls when can this branch be merged into the master branch?

@arthiondaena

Should probably ask the reviewer.

@arthiondaena
Copy link
Author

Hey @zoq, when can this be merged?

Copy link

mlpack-bot bot commented Apr 22, 2024

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@mlpack-bot mlpack-bot bot added the s: stale label Apr 22, 2024
@mlpack-bot mlpack-bot bot closed this Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants