EP comparison to other PPLs and roadmap (HMC etc) #208

Open
ksachdeva opened this issue Jan 3, 2020 · 2 comments

Comments

@ksachdeva

Hi,

I have an extensive background in the .NET stack; however, for about a year and a half I have been using Python for all things machine learning. For the last few months I have been studying Bayesian statistics and probabilistic programming. Needless to say, for me C# is a far superior language to Python, but what matters most is the libraries available in a particular domain. Therefore, seeing libraries like Infer.NET gives me a ray of hope of using a better technology stack (i.e. .NET).

That said, I have explored this library and have a decent idea of how it works. I have read (and executed) some chapters of the MBML book. I am familiar with PyMC3, Pyro, WebPPL, Edward and TensorFlow Probability. All these PPLs or embedded PPLs have their own take on how models should be expressed, on executors/compilers and on accelerator support, and each provides implementations of various inference algorithms.

While assessing feasibility, I noticed the following:

  • Infer.NET has three inference algorithms (EP, Gibbs and VI), with EP being the default. While running the code for the MBML book, I tried changing the algorithm to Gibbs and VI, and almost every time (I tried 3-4 projects) I got an unsupported error. I also see a lot of examples in Infer.NET that assert or throw an exception if EP is not the selected algorithm. This suggests that even though the other inference algorithms are there, they are not first-class citizens.

  • PyMC3 and Pyro do not support EP. They do implement a variety of MCMC- and VI-based algorithms. I know you cannot speak for why these libraries have not (yet?) implemented EP, but it suggests to me that they value other algorithms over EP. Does this mean EP is not state of the art?

  • Edward (which I am least familiar with) lists various inference algorithms here: http://edwardlib.org/api/ed/inferences. I do not see an explicit mention of EP, but I have seen an example of it here: http://edwardlib.org/api/inference-compositionality

  • Stan (again, I have only superficial knowledge of the Stan ecosystem): on the landing page (https://mc-stan.org/) I do not see any mention of EP, although other algorithms are mentioned. Maybe they support it; I am not sure about this one.

Maybe my understanding is incorrect, but it seems that the expectation propagation algorithm is not favored by many other PPLs and frameworks. At the least, it is not a high priority for them, whereas Infer.NET is primarily EP.

Seeing this contradiction, i.e. Infer.NET making EP a first-class citizen while others do not even support it, I feel I am missing something fundamental here. I would sincerely appreciate it if you could correct my understanding and guide me.

Even if my understanding is incorrect and there are scenarios where EP works better than other inference algorithms, I find that, in comparison to other frameworks, Infer.NET is limited in the number of supported algorithms. Is there a roadmap or plan to implement more inference engines (exact and approximate inference: junction tree, ADVI, HMC, etc.)?

Many thanks for this hard work

Regards
Kapil

@tminka
Contributor

tminka commented May 28, 2020

There isn't any roadmap for Infer.NET. Microsoft folks make changes based on what they need at the time. Folks outside Microsoft can make pull requests anytime for any part of the code or documentation.

The original vision for Infer.NET was to start with 3 inference algorithms and include more over time. But as people in Microsoft started using Infer.NET, they preferred EP for most models. This created a feedback loop where EP received the most attention and development, making it better and thus even more attractive relative to the other two algorithms. After a certain point, no one in Microsoft was using any inference algorithm besides EP in Infer.NET. There are still plenty of Microsoft products that use EP internally to this day, and they depend on the continual improvements we are making to Infer.NET.

After other PPLs rose up, anyone who wanted to use a different algorithm like ADVI or HMC would just use those other PPLs, so there was no pressure to add those algorithms to Infer.NET (and there still isn't, in my opinion). Essentially what has happened is that people first decide what inference algorithm they want to use, then choose the appropriate PPL.

@solna86

solna86 commented Jun 15, 2020

I thought it would be useful to add my outsider view for @ksachdeva.

Infer.NET is less expressive than Pyro and the other languages you mention as it is not a Turing-complete probabilistic programming language. It uses factor graphs, which cannot represent all possible probability distributions.

However, for those problems where factor graphs are sufficient (and that includes a vast space of probability distributions) it's simply a lot more efficient.
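To illustrate the efficiency point, here is a minimal sketch of message passing on a small chain-shaped factor graph. It is generic sum-product in plain Python, not Infer.NET's implementation or its EP algorithm; the model, the factor values and the function names are all hypothetical and chosen only for illustration. The forward pass does work linear in the number of variables, whereas brute-force enumeration visits all 2**n joint states.

```python
from itertools import product

# Hypothetical model: a chain x0 - x1 - ... of binary variables.
prior = [0.6, 0.4]                  # factor on x0 (assumed values)
pair = [[0.9, 0.1], [0.2, 0.8]]     # factor f(x_prev, x_next), reused on every link


def marginal_by_messages(n):
    """Forward sum-product pass: O(n) work for a chain of n variables."""
    msg = prior[:]
    for _ in range(n - 1):
        # Sum out the previous variable to get the message to the next one.
        msg = [sum(msg[a] * pair[a][b] for a in range(2)) for b in range(2)]
    total = sum(msg)
    return [m / total for m in msg]


def marginal_by_enumeration(n):
    """Brute force over all 2**n joint states, for comparison only."""
    out = [0.0, 0.0]
    for xs in product(range(2), repeat=n):
        p = prior[xs[0]]
        for a, b in zip(xs, xs[1:]):
            p *= pair[a][b]
        out[xs[-1]] += p
    total = sum(out)
    return [o / total for o in out]


fast = marginal_by_messages(3)      # marginal of the last variable
slow = marginal_by_enumeration(3)   # same quantity, computed exhaustively
```

Both functions return the same marginal for the last variable; only the amount of work differs, and that gap grows quickly as the chain gets longer.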

I find Infer.NET extremely good for problems where I have small or medium-sized datasets, I have good domain knowledge to build a model, and/or I want quick and predictable inference.

The only library that competes in the same space as Infer.NET right now is, I think, ForneyLab.jl. It's nice but considerably less mature at the moment. I like that it generates message-passing schedules using macros, but its documentation is still too sparse and it has efficiency problems with medium-sized datasets (one workaround is to split the data into mini-batches and perform online learning).
