Error in matches predictions #71

Open
andony-arrieula opened this issue Feb 28, 2024 · 8 comments

Comments

@andony-arrieula
Contributor

There are 2 errors in the match prediction section:

  • the data used to make the prediction is not updated: the data from the previous match is reused
  • the data is passed as a list of values, but the list is not in the same order as during training

As a result, the match prediction is not usable.

@andony-arrieula
Contributor Author

It seems the problem is not the order of the data, but that the preprocessing done in preprocess_dataset is not performed on the prediction data.
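A minimal sketch of the mismatch described here, assuming the preprocessing is a per-column standardization. `SimpleScaler` and the sample rows are hypothetical stand-ins and not the project's preprocess_dataset; the point is only that the scaler fitted on the training rows has to be reused on the prediction row.

```python
from statistics import mean, pstdev


class SimpleScaler:
    """Hypothetical stand-in for whatever scaling preprocess_dataset applies."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means = [mean(c) for c in columns]
        # Fall back to 1.0 so constant columns do not divide by zero.
        self.stds = [pstdev(c) or 1.0 for c in columns]
        return self

    def transform(self, rows):
        return [
            [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]


# Illustrative training rows (made-up values, two features per row).
train_rows = [[1.5, 10.0], [2.5, 20.0], [3.5, 30.0]]
scaler = SimpleScaler().fit(train_rows)

# The model was trained on scaled rows, so the prediction row must be scaled
# with the SAME fitted statistics before it reaches the model.
raw_prediction_row = [2.5, 20.0]
scaled_prediction_row = scaler.transform([raw_prediction_row])[0]
```

Passing `raw_prediction_row` directly would feed the model values on a different scale from anything it saw during training.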

@kochlisGit
Owner

This should happen only during the cross-validation process, and it's perfectly normal. The idea is that we randomly select a train set and an evaluation set to measure the performance of the models.

However, during training (ONLY), it should use the first 100 matches as the test set, and the rest of the dataset should be processed in the correct order!
Otherwise, can you print the results of preprocess_dataset for the evaluation data to verify this?

@andony-arrieula
Contributor Author

The problem is not in the evaluation data, but in the prediction data passed to the predict match dialog.

For example, I run my trained model on the French Ligue 1. I select Paris SG as the home team (the best team in the league) and Clermont as the away team (the worst team in the league), and I enter the same odds (3.00) for all possible results. The algorithm gives me probabilities of 0.32 for Paris, 0.23 for a draw and 0.44 for Clermont, which is totally incoherent. This is because the data is not preprocessed before it is passed to the model.
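To make the example concrete: equal odds carry no preference for any outcome, so any skew in the prediction has to come from the team statistics. The sketch below converts decimal odds to normalized implied probabilities. `implied_probabilities` is an illustrative helper, and whether the project uses odds this way is not stated in this thread.

```python
def implied_probabilities(odds):
    """Convert decimal odds to probabilities that sum to 1."""
    inverse = [1.0 / o for o in odds]
    total = sum(inverse)  # dividing by the total removes any bookmaker margin
    return [p / total for p in inverse]


# Home win, draw, away win, all entered as 3.00 as in the example above.
probs = implied_probabilities([3.00, 3.00, 3.00])
print([round(p, 3) for p in probs])
```

With the odds neutral at roughly one third each, a result that favours the weakest team over the strongest one points at the statistics fed to the model rather than at the odds.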

@kochlisGit
Owner

So the program grabs the features from the history tables, but does not preprocess them before passing them to the model for prediction?

@andony-arrieula
Contributor Author

Exactly!

But the program also does not update the statistics with the data from the previous match!
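A small sketch of the stale-statistics problem, assuming the statistics are rolling counts over a team's most recent matches. The `Match` record, `recent_wins` and the team names are hypothetical; the thread does not show how the project stores its history tables.

```python
from dataclasses import dataclass


@dataclass
class Match:
    home: str
    away: str
    home_goals: int
    away_goals: int


def recent_wins(history, team, last_n=3):
    """Count wins for `team` over its `last_n` latest matches (oldest first)."""
    team_matches = [m for m in history if team in (m.home, m.away)]
    wins = 0
    for m in team_matches[-last_n:]:
        if m.home == team and m.home_goals > m.away_goals:
            wins += 1
        elif m.away == team and m.away_goals > m.home_goals:
            wins += 1
    return wins


history = [
    Match("A", "B", 2, 0),  # A wins
    Match("C", "A", 1, 1),  # draw
    Match("A", "D", 3, 1),  # A wins (the most recent match)
]

stale = recent_wins(history[:-1], "A")  # latest match ignored
fresh = recent_wins(history, "A")       # latest match included
```

Here `stale` and `fresh` differ by one win, which is the kind of gap that appears when the features for the next prediction are built without the latest result.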

@kochlisGit
Owner

Thanks. I will take a look into it.

@andony-arrieula
Contributor Author

I am also looking into this issue on my side. I think the best way to proceed is to modify the construct_input() method so that it processes the data before giving it to the model to make the prediction.

@kochlisGit
Owner

Yeah, I think that would be the best way. The rows should be processed before being returned, using the scaler from the model's config (if any).
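One way this could look, as a sketch only: the function below is named `construct_input_sketch` to make clear it is not the project's construct_input(), and its signature, the optional `scaler` callable and the min-max helper are all assumptions. It returns the row unchanged when the config has no scaler and applies the scaler otherwise.

```python
from typing import Callable, List, Optional, Sequence

Rows = List[List[float]]


def construct_input_sketch(
    raw_row: Sequence[float],
    scaler: Optional[Callable[[Rows], Rows]] = None,
) -> Rows:
    """Build a one-row batch and apply the model's scaler if one exists."""
    batch = [[float(v) for v in raw_row]]
    if scaler is None:
        return batch  # model was trained on unscaled data
    return scaler(batch)


def make_min_max_scaler(mins, maxs):
    """Hypothetical scaler built from statistics stored at training time."""
    def transform(rows: Rows) -> Rows:
        return [
            [(v - lo) / (hi - lo) if hi != lo else 0.0
             for v, lo, hi in zip(row, mins, maxs)]
            for row in rows
        ]
    return transform


scaler = make_min_max_scaler(mins=[1.0, 0.0], maxs=[5.0, 10.0])
scaled = construct_input_sketch([3.0, 5.0], scaler)
unscaled = construct_input_sketch([3, 5])
```

Keeping the scaler optional mirrors the "(if any)" above: models trained without scaling keep receiving raw rows, while scaled models get rows transformed with the statistics saved at training time.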
