A Discrete Bayesian Graph #210
Hi @erlebach! I am not sure I understand your model. Can you provide the equations for your generative model? Or you can try to explain what you are trying to model with this graph? It seems strange to me that you have a placeholder attached to g, the output of the Categorical node.

First of all, indeed there is no closed-form solution (a.k.a. an SP message) for a Categorical node whose parameter arrives as a sample list. To circumvent the problem you need to introduce a recognition distribution (the "variational technique" you mention) around the Categorical node. In the code below I've added one more line before the algorithm construction:

# The model
gr = FactorGraph()
# @RV b1 ~ Clamp(0.4)
# @RV b2 ~ Clamp(0.3)
@RV d ~ Bernoulli(0.4)
@RV i ~ Bernoulli(0.3)
@RV grade_ ~ Nonlinear{Sampling}(i, d, g=grade, n_samples=50)
@RV g ~ Categorical(grade_)
@RV letter_ ~ Nonlinear{Sampling}(g, g=letter, n_samples=50)
@RV l ~ Bernoulli(letter_)
@RV score_ ~ Nonlinear{Sampling}(i, g=score, n_samples=50)
@RV s ~ Bernoulli(score_)
# placeholder(d, :d)
placeholder(i, :i)
placeholder(s, :s)
placeholder(l, :l)
placeholder(g, :g)
# The line that produces the error:
q = PosteriorFactorization(g, grade_, ids=[:g, :grade_])
algo = messagePassingAlgorithm([d]) # <<<<< ERROR

However, this code won't work either, because one of your recognition factors contains the observed variable g. What you can do is remove the placeholder(g, :g) line:

# The model
gr = FactorGraph()
# @RV b1 ~ Clamp(0.4)
# @RV b2 ~ Clamp(0.3)
@RV d ~ Bernoulli(0.4)
@RV i ~ Bernoulli(0.3)
@RV grade_ ~ Nonlinear{Sampling}(i, d, g=grade, n_samples=50)
@RV g ~ Categorical(grade_)
@RV letter_ ~ Nonlinear{Sampling}(g, g=letter, n_samples=50)
@RV l ~ Bernoulli(letter_)
@RV score_ ~ Nonlinear{Sampling}(i, g=score, n_samples=50)
@RV s ~ Bernoulli(score_)
# placeholder(d, :d)
placeholder(i, :i)
placeholder(s, :s)
placeholder(l, :l)
# placeholder(g, :g)
# The lines that previously produced the error:
q = PosteriorFactorization(g, grade_, ids=[:g, :grade_])
algo = messagePassingAlgorithm([d])
source_code = algorithmSourceCode(algo)
eval(Meta.parse(source_code)); # THIS WORKS!
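If you want to see exactly what was generated (including the exact names of the step!/init functions that the ids :g and :grade_ produce), note that the generated source is just a string, so you can print it before eval'ing it:

# Inspect the generated algorithm before eval'ing it
println(source_code)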
Hi @albertpod:
I was hoping initially to be able to compute marginals on any variable not observed, without appealing to variational inference. I wished to experiment with choosing different subsets of observed variables and examine their effect. Furthermore, I want to be able to specify which variables are observed via a dictionary, passed to some function, and have the appropriate inference executed. It should be noted that I am only providing a single observation at this stage, and the various distributions of the nodes are specified, so an analytical solution is possible. By analytical, I mean that the marginal of I is a Bernoulli whose parameter can be computed in closed form from the specified tables. Note that I might want to observe a different subset of variables on another run. I have implemented the model successfully in PyMC, if that is of interest to you. These implementations serve as exercises to develop expertise in a variety of frameworks in Julia and Python. For reference, I have never heard the term "recognition distribution" before.
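To make "analytical" concrete, here is a minimal brute-force enumeration sketch of P(i | s = 1, l = 0), using the conditional tables from my model (the same ones quoted later in this thread) and assuming grade index 1 corresponds to an A:

# Minimal exact inference by enumeration; bern(b, p) is the Bernoulli pmf.
bern(b, p) = b == 1 ? p : 1 - p

grade_probs(i, d) =            # P(g | i, d), g ∈ {1, 2, 3} for grades A, B, C
    i == 0 && d == 0 ? [0.3, 0.4, 0.3] :
    i == 0 && d == 1 ? [0.05, 0.25, 0.7] :
    i == 1 && d == 0 ? [0.9, 0.08, 0.02] :
                       [0.5, 0.3, 0.2]
letter_prob(g) = (0.9, 0.6, 0.01)[g]   # P(l = 1 | g)
score_prob(i)  = i == 1 ? 0.8 : 0.05   # P(s = 1 | i)

s_obs, l_obs = 1, 0
post_i = zeros(2)                      # unnormalized P(i | s, l), i ∈ {0, 1}
for i in 0:1, d in 0:1, g in 1:3
    post_i[i + 1] += bern(i, 0.3) * bern(d, 0.4) * grade_probs(i, d)[g] *
                     bern(l_obs, letter_prob(g)) * bern(s_obs, score_prob(i))
end
post_i ./= sum(post_i)
println(post_i)                        # [P(i = 0 | s, l), P(i = 1 | s, l)]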
How about we move this to the Discussions?
Sorry, I only noticed the new topic on the Discussions page just now. As for "recognition distribution": we had a short discussion with @ThijsvdLaar about your issue. It seems like you can be just fine with Categorical nodes instead of Bernoulli ones. I understand your wish to observe an arbitrary subset of variables. In the meantime, I will try to come up with an inference solution for your model using a Categorical node.
Thank you! Quick question on something I never understood. If a variable is observed, its probability distribution is quite irrelevant, is that not true? However, when estimating a likelihood, its probability distribution still impacts the posterior through the remaining non-observed variables. If true, P(G|D, I) is still needed. In any case, ideally, I would like, if possible, the flexibility of observing or not observing any subset of random variables without having to write different methods for each combination of observables. Couldn't an observable be implemented in a general manner through multiplication by a Delta function (see the sketch below)? This is possible with MCMC, but perhaps not with message passing.
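To make the delta idea concrete, here is a minimal sketch on a toy 2x2 joint (names and numbers purely illustrative): conditioning a discrete joint on an observation is exactly multiplication by an indicator (delta) followed by renormalization.

p_xy = [0.1 0.2; 0.3 0.4]   # toy joint p(x, y), x ∈ {1, 2}, y ∈ {1, 2}
y_obs = 2
delta = [y == y_obs ? 1.0 : 0.0 for y in 1:2]  # point-mass "observation message"
unnorm = p_xy .* delta'     # zero out the columns with y ≠ y_obs
p_x_given_y = vec(sum(unnorm, dims=2)) ./ sum(unnorm)
println(p_x_given_y)        # posterior p(x | y = y_obs), here [1/3, 2/3]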
I am not sure I've understood your question. What do you mean by an irrelevant distribution? (Let's continue this part in the Discussions; it is always helpful to draw a graph and inspect how the messages flow depending on your graph structure.)

As for the solution with a Categorical node, here it is:

using ForneyLab
using ProgressMeter
# Grade point values for A, B, C (kept for reference; the one-hot encoding
# used below does not rely on them directly)
gA = 3
gB = 5
gC = 8
# Three deterministic functions
# Samples from the Categorical nodes arrive as one-hot basis vectors; here we
# assume the first entry encodes the "1" outcome of the original Bernoulli
# (e.g. i = [1, 0] means i == 1).
function grade(i, d)
    if i[1] == 0 && d[1] == 0
        g1, g2, g3 = [0.3, 0.4, 0.3]
    elseif i[1] == 0 && d[1] == 1
        g1, g2, g3 = [0.05, 0.25, 0.7]
    elseif i[1] == 1 && d[1] == 0
        g1, g2, g3 = [0.9, 0.08, 0.02]
    else
        g1, g2, g3 = [0.5, 0.3, 0.2]
    end
    return [g1, g2, g3]
end
function letter(g)
    # g arrives as a one-hot 3-vector over the grades (A, B, C)
    if g[1] == 1        # grade A
        l = 0.9
    elseif g[2] == 1    # grade B
        l = 0.6
    else                # grade C
        l = 0.01
    end
    return [l, 1 - l]   # changed to comply with Categorical distribution
end
function score(i)
    # i arrives as a one-hot 2-vector; i[1] == 0 encodes i == 0
    if i[1] == 0
        s = 0.05
    else
        s = 0.8
    end
    return [s, 1 - s]   # changed to comply with Categorical distribution
end
# The model
gr = FactorGraph()
@RV d ~ Categorical([0.4, 0.6]) # equivalent to Bernoulli(0.4)
@RV i ~ Categorical([0.3, 0.7]) # equivalent to Bernoulli(0.3)
@RV grade_ ~ Nonlinear{Sampling}(i, d, g=grade, n_samples=50, dims=[(3,),(2,),(2,)]) # we need to specify the dimensionality for each interface, i.e. grade_, i, d
@RV g ~ Categorical(grade_)
@RV letter_ ~ Nonlinear{Sampling}(g, g=letter, n_samples=50, dims=[(2,),(2,),(2,)])
@RV l ~ Categorical(letter_)
@RV score_ ~ Nonlinear{Sampling}(i, g=score, n_samples=50, dims=[(2,),(2,),(2,)])
@RV s ~ Categorical(score_)
placeholder(s, :s, dims=(2,))
placeholder(l, :l, dims=(2,))
q = PosteriorFactorization(g, grade_, ids=[:G, :Grade])
algo = messagePassingAlgorithm([d])
source_code = algorithmSourceCode(algo)
eval(Meta.parse(source_code))
# Prior statistics
vmp_its = 100
data = Dict(:s => [1.0, 0.0], :l=>[0.0, 1.0])
marginals = Dict{Any, Any}(:g => ProbabilityDistribution(Univariate, Categorical, p=ones(3)/3)) # uniform initial marginal; note ones(3)/3, a 3-vector
messages = initGrade()
@showprogress for i in 1:vmp_its
stepGrade!(data, marginals, messages)
stepG!(data, marginals, messages)
end
println(ForneyLab.unsafeMean(marginals[:i]))
println(ForneyLab.unsafeMean(marginals[:d]))
println(ForneyLab.unsafeMean(marginals[:g]))

As a side note, at the moment, our lab focuses on […]
Thanks! One question: you wrote:

@RV letter_ ~ Nonlinear{Sampling}(g, g=letter, n_samples=50, dims=[(2,),(2,),(2,)])

Shouldn't this be written as

@RV letter_ ~ Nonlinear{Sampling}(g, g=letter, n_samples=50, dims=[(2,),(2,)])

I am surprised this would have worked for you.
@erlebach ah, right, I made a typo. Your variant is correct. FL just ignored the last dimensionality specification. We should throw an error or warning when a user does this.
Great. I ran the code, and it does not work. But I have a question. You wrote:

data = Dict(:s => [1.0, 0.0], :l => [0.0, 1.0])

Why is that? When sampled, a Categorical returns a single number, (0, 1) or (1, 2), doesn't it? Why are you specifying a vector of length 2? Or are you specifying two observed values? I just want to make sure, since in my original problem I only assigned one observed value.

When executing the inference loop I get a TypeError. At first the error looked like it came from ProgressMeter, but I removed ProgressMeter and still get the same error:

TypeError: in keyword argument m, expected Number, got a value of type Vector{Float64}
Stacktrace:
[1] stepGRADE!(data::Dict{Symbol, Vector{Float64}}, marginals::Dict{Any, Any}, messages::Vector{Message})
@ Main ./none:17
[2] top-level scope
@ ./In[69]:3
[3] eval
@ ./boot.jl:373 [inlined]
[4] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base ./loading.jl:1196
I have a feeling that you haven't specified the dimensionality of your placeholders as I did, i.e. placeholder(s, :s, dims=(2,)) and placeholder(l, :l, dims=(2,)).

As for the data: the Categorical node in FL defines a one-dimensional probability distribution over the standard basis vectors, so an observation is encoded as a one-hot vector of the appropriate dimension rather than as an integer outcome.
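For illustration, a tiny sketch of that one-hot encoding (the onehot helper is just an illustrative name, not part of FL):

# Encode a discrete outcome k ∈ {1, …, d} as a standard basis vector
onehot(k, d) = [j == k ? 1.0 : 0.0 for j in 1:d]

data = Dict(
    :s => onehot(1, 2),  # s in its first state, i.e. [1.0, 0.0]
    :l => onehot(2, 2),  # l in its second state, i.e. [0.0, 1.0]
)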
You are correct, @albertpod; I had not specified the dimensionality of the placeholders. I did not copy/paste your code, but instead modified mine; that is how I learn faster. I did not realize that a Categorical was one-hot encoded. Makes sense (different from PyMC). The code now runs. Whether it runs correctly is another matter. I am surprised that 100 iterations of a variational method run slower than PyMC running 5,000 iterations of a Monte-Carlo sampler. Of course, the variational approach provides a parametrized distribution, and I have not been careful to make precise measurements. I guess the answer is that I am using the nonlinear sampler three times in my model. :-) Why do you use variational message passing here rather than plain sampling?
I understand, no worries. Well, actually you also use sampling: ForneyLab knows how to combine different message passing techniques. Here you have used hybrid message passing, i.e. SP, VMP, and sampling-based (around the nonlinear nodes). It's hard to compare the iterations of one package against another, but I am also surprised by your observation. I would also be suspicious about the correctness of the result. Unfortunately, for this model we cannot inspect the free energy (our minimization target), so we don't know when the algorithm converges: within 1, 2, 3, or 100 iterations. Maybe you can check how your marginals change over iterations. We still need to check a few things on our side. Perhaps we'll come up with a better solution. As for […]
All good points. Actually, I am not convinced of convergence either, though the marginals look fine. To check my marginals over iterations, I would have to output data using a callback function. Is that possible? I can also create a LaTeX document explaining more clearly what I am doing at this point. You'll note that I have not created priors for the various probabilities that I am "stating" when stating my model, so I am not doing inference in the normal sense of the word: I have a joint distribution that is known and am simply fixing some parameter values. By the way, fixing the data as :s => [1., 0.] is as expected, since the output of a Bernoulli is [1, 0] or [0, 1]. What happens if I set my data to [0, 0] or [0.5, 0.5]? I would not expect [0, 0] to work, since the elements do not sum to 1; [0.5, 0.5] probably won't work either, since my samples can only be [0, 1] or [1, 0]. At least that is my expectation when using Bernoullis. If you have any experience with PyMC, I can send you my code (which works) so you can get a better idea of what I am doing. Or I can email you a PDF that explains things in more detail.
Hi @erlebach! Here's pseudocode for tracking your marginals:

marginals = Dict{Any, Any}(:g => ProbabilityDistribution(Univariate, Categorical, p=ones(3)/3))
marginal_storage = Dict{Any, Any}(:g => [], :i => [], :d => []) # or anything else you are interested in
messages = initGrade()
@showprogress for i in 1:vmp_its
marginals = stepGrade!(data, marginals, messages)
push!(marginal_storage[:i], marginals[:i])
push!(marginal_storage[:d], marginals[:d])
marginals = stepG!(data, marginals, messages)
push!(marginal_storage[:g], marginals[:g])
end Having a rigorous explanation of the model will be helpful, but I am afraid we won't have much time to look into that atm. As for data, in FL, the Categorical node does not care how you encode the observation. For example, you can say I recommend inspecting the generated code: P.S. Changing Bernoulli distributions to Categorical is a temporary solution. Your issue indicated that there is a bug in FL. We will start working on it soon. |
Thank you for this explanation. I will try all this out!
I am implementing a discrete Bayesian network using a few Nonlinear functions and cannot get the messagePassingAlgorithm() method to run without error. The resulting graph has all its edges terminated either by Clamp or via a placeholder. Here is the code, which is rather simple: […]

The error produced is: […]
Clearly, this means that there is no SumProduct rule to handle Categorical variables with a sample message as input. So my question is: how is this case treated? Would a straightforward variational technique be appropriate? Thanks.