Expanding the benchmark coverage of this repository for all the toolkits #143

Open
braceletboy opened this issue Dec 19, 2019 · 3 comments

Comments

@braceletboy
Contributor

@zoq @rcurtin I think this is a fantastic project: people can see which ML toolkits are better for certain algorithms, and the toolkits can see where they need to improve. I have been doing some work on my own that might be useful for this project: I have put together a Google Sheet of the data I have been collecting. The sheet contains:

  • the names of various machine learning and statistical analysis algorithms supported by the toolkits benchmarked in this repository
  • which libraries provide each algorithm and which do not
  • the API classes or functions that correspond to each algorithm
  • which algorithms have benchmarks, and for which libraries those benchmarks are written.
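
To make the layout concrete, one row of such a sheet could be modeled roughly as below. This is only a sketch: the class name `AlgorithmEntry`, its fields, and the toolkit and API names in the example are hypothetical placeholders, not taken from the actual sheet or from any toolkit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class AlgorithmEntry:
    """One row of the coverage sheet (hypothetical structure)."""
    name: str
    # toolkit name -> API class/function, or None if the toolkit lacks the algorithm
    api: Dict[str, Optional[str]] = field(default_factory=dict)
    # toolkits for which a benchmark has already been written
    benchmarked: Set[str] = field(default_factory=set)

    def supported_by(self) -> List[str]:
        """Toolkits that provide this algorithm."""
        return sorted(t for t, a in self.api.items() if a is not None)

    def missing_benchmarks(self) -> List[str]:
        """Toolkits that provide the algorithm but have no benchmark yet."""
        return [t for t in self.supported_by() if t not in self.benchmarked]


# Placeholder data for illustration only.
entry = AlgorithmEntry(
    name="example-algorithm",
    api={
        "toolkit_a": "toolkit_a.ExampleClass",
        "toolkit_b": "ExampleFunction",
        "toolkit_c": None,
    },
    benchmarked={"toolkit_a"},
)
print(entry.supported_by())
print(entry.missing_benchmarks())
```

A query like `missing_benchmarks()` is the kind of question the sheet is meant to answer: where an implementation exists but no benchmark has been written for it yet.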

So far I have covered all the algorithms provided by scikit-learn and mlpack, and I am in the process of adding the algorithms provided by Shogun. This is a work in progress: I plan to add more algorithms to the list and hopefully complete it. This is the Google Sheet that I am preparing:

[four images of the Google Sheet]

In this regard, I have some questions:
a) Is the aim of this project limited to benchmarking the algorithms supported by mlpack? If not, I feel that having a sheet like this one would help. (I got the idea of consolidating all this in a Google Sheet after I saw one on TensorFlow's GitHub when they were making TensorFlow 2.0 and had to list all the API classes that needed some specific change.)
b) Also, would it be possible for contributors from mlpack to contribute to this sheet? I can give edit access. Currently, around 166 algorithms are listed, many more are not yet covered, and I haven't yet gone through all the library APIs. Would appreciate the help :)

@rcurtin
Member

rcurtin commented Dec 24, 2019

> a) Is the aim of this project limited to benchmarking the algorithms supported by mlpack? If not, I feel that having a sheet like this one would help. (I got the idea of consolidating all this in a Google Sheet after I saw one on TensorFlow's GitHub when they were making TensorFlow 2.0 and had to list all the API classes that needed some specific change.)

When we originally started on this project (it was @zoq's GSoC project many years ago :)) the idea was to use this benchmarking system to compare mlpack's implementations against other implementations. But it's grown somewhat since then, and honestly, it's a pretty general-purpose benchmarking system, so I don't see any need to limit it to algorithms that mlpack supports.

> b) Also, would it be possible for contributors from mlpack to contribute to this sheet? I can give edit access. Currently, around 166 algorithms are listed, many more are not yet covered, and I haven't yet gone through all the library APIs. Would appreciate the help :)

Sure, I would imagine that there would be some interest. You might try posting it on the mlpack chat channel (IRC/Matrix/gitter/etc.): https://www.mlpack.org/community.html#real-time-chat

@zoq
Member

zoq commented Dec 28, 2019

> a) Is the aim of this project limited to benchmarking the algorithms supported by mlpack? If not, I feel that having a sheet like this one would help. (I got the idea of consolidating all this in a Google Sheet after I saw one on TensorFlow's GitHub when they were making TensorFlow 2.0 and had to list all the API classes that needed some specific change.)

Awesome, thanks for putting everything together.

> b) Also, would it be possible for contributors from mlpack to contribute to this sheet? I can give edit access. Currently, around 166 algorithms are listed, many more are not yet covered, and I haven't yet gone through all the library APIs. Would appreciate the help :)

Happy to help, just sent you a request.

@braceletboy
Contributor Author

braceletboy commented Jan 13, 2020

@zoq and @rcurtin Sorry for the late response. I was on an extended vacation. I have approved your request @zoq. Please have a look at it and let me know what you think about it. Also let me know if you have any questions.
