Cannot calculate model.probability() #672

Closed
sviperm opened this issue Jan 7, 2020 · 11 comments

@sviperm

sviperm commented Jan 7, 2020

I am following the tutorial and get an error on inputs 7 and 8. I made no changes to the tutorial code.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-c1e53d06d14d> in <module>
----> 1 model.probability(['A', 'B', 'C'])

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/base.pyx in pomegranate.base.Model.probability()

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/BayesianNetwork.pyx in pomegranate.BayesianNetwork.BayesianNetwork.log_probability()

TypeError: list indices must be integers or slices, not tuple

pip list

Package            Version
------------------ -------
attrs              19.3.0 
backcall           0.1.0  
bleach             3.1.0  
cycler             0.10.0 
decorator          4.4.1  
defusedxml         0.6.0  
entrypoints        0.3    
importlib-metadata 1.3.0  
ipykernel          5.1.3  
ipython            7.11.1 
ipython-genutils   0.2.0  
jedi               0.15.2 
Jinja2             2.10.3 
joblib             0.14.1 
json5              0.8.5  
jsonschema         3.2.0  
jupyter-client     5.3.4  
jupyter-core       4.6.1  
jupyterlab         1.2.4  
jupyterlab-server  1.0.6  
kiwisolver         1.1.0  
MarkupSafe         1.1.1  
matplotlib         3.1.2  
mistune            0.8.4  
more-itertools     8.0.2  
nbconvert          5.6.1  
nbformat           5.0.3  
networkx           2.4    
notebook           6.0.2  
numpy              1.18.1 
pandas             0.25.3 
pandocfilters      1.4.2  
parso              0.5.2  
pexpect            4.7.0  
pickleshare        0.7.5  
Pillow             7.0.0  
pip                9.0.1  
pkg-resources      0.0.0  
pomegranate        0.12.0 
prometheus-client  0.7.1  
prompt-toolkit     3.0.2  
ptyprocess         0.6.0  
Pygments           2.5.2  
pygraphviz         1.5    
pyparsing          2.4.6  
pyrsistent         0.15.6 
python-dateutil    2.8.1  
pytz               2019.3 
PyYAML             5.3    
pyzmq              18.1.1 
scipy              1.4.1  
seaborn            0.9.0  
Send2Trash         1.5.0  
setuptools         39.0.1 
six                1.13.0 
terminado          0.8.3  
testpath           0.4.4  
tornado            6.0.3  
traitlets          4.3.3  
watermark          2.0.2  
wcwidth            0.1.8  
webencodings       0.5.1  
wheel              0.33.6 
zipp               0.6.0  

Python 3.6

@koxu1996

Same here

@martin-riedl

martin-riedl commented Jan 22, 2020

Can confirm that issue as well. Downgrading from 0.12.0 to 0.11.2 solves that ...

@sviperm
Author

sviperm commented Jan 23, 2020

> Can confirm that issue as well. Downgrading from 0.12.0 to 0.11.2 solves that ...

Downgrading solves that, but there is another bug...

CODE

from pomegranate import BayesianNetwork, DiscreteDistribution, ConditionalProbabilityTable, Node, State
import matplotlib.pyplot as plt

angina_pectoris = DiscreteDistribution({0: 0.99, 1: 0.01})
heart_attack = DiscreteDistribution({0: 0.99, 1: 0.01})

pain_syndrome = ConditionalProbabilityTable([
    
    [0, 0, 1, 0.01],
    [0, 0, 0, 0.99],
    
    [0, 1, 1, 0.9],
    [0, 1, 0, 0.1],
    
    [1, 0, 1, 0.9],
    [1, 0, 0, 0.1],
    
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    
], [angina_pectoris, heart_attack])

relief_of_pain = ConditionalProbabilityTable([
    
    [0, 0, 'yes', 0.1],
    [0, 0, 'not_completely', 0.1],
    [0, 0, 'no', 0.1],
    
    [0, 1, 'yes', 0.05],
    [0, 1, 'not_completely', 0.35],
    [0, 1, 'no', 0.6],
    
    [1, 0, 'yes', 0.9],
    [1, 0, 'not_completely', 0.05],
    [1, 0, 'no', 0.05],
    
    [1, 1, 'yes', 0.33],
    [1, 1, 'not_completely', 0.33],
    [1, 1, 'no', 0.33],
    
], [angina_pectoris, heart_attack])

ps = State(pain_syndrome, name='pain_syndrome')
rp = State(relief_of_pain, name='relief_of_pain')
ap = State(angina_pectoris, name='angina_pectoris')
ha = State(heart_attack, name='heart_attack')

model = BayesianNetwork('Medical decision support system')

model.add_states(ps, rp, ap, ha)

model.add_edge(ap, ps)
model.add_edge(ap, rp)

model.add_edge(ha, ps)
model.add_edge(ha, rp)

model.bake()

model.probability([0, 'yes', 0, 1])

ERROR

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

<ipython-input-19-cfc278c8d52f> in <module>
----> 1 model.probability([0, 'yes', 0, 1])
      2 

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/base.pyx in pomegranate.base.Model.probability()

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/BayesianNetwork.pyx in pomegranate.BayesianNetwork.BayesianNetwork.log_probability()

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/distributions/ConditionalProbabilityTable.pyx in pomegranate.distributions.ConditionalProbabilityTable.ConditionalProbabilityTable.log_probability()

KeyError: ('0', '1', '0')

@ghost

ghost commented Jan 24, 2020

@sviperm

Your error appears to originate from the mixed datatypes in your assignments. I suggest converting all integers (0, 1) to strings ('0', '1').

So this is a different issue from the one you initially stumbled upon.
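
To illustrate the datatype mismatch, here is a minimal pure-Python sketch. It does not use pomegranate. The dictionaries are toy stand-ins for a conditional probability table, and the assumption (labeled in the comments) is that casting a mixed int/str sample to a single array type turns every value into a string, which would explain the KeyError: ('0', '1', '0') above.

```python
# Toy stand-ins for a conditional probability table. Keys are
# (angina_pectoris, heart_attack, pain_syndrome), mirroring two rows
# of the pain_syndrome table from the report.
int_keyed = {(0, 1, 1): 0.9, (0, 1, 0): 0.1}
str_keyed = {("0", "1", "1"): 0.9, ("0", "1", "0"): 0.1}

# Sample in state order: pain_syndrome, relief_of_pain,
# angina_pectoris, heart_attack.
sample = [0, "yes", 0, 1]

# Assumption: a cast of this mixed list to one array type yields strings.
# str() emulates that coercion here without importing numpy.
coerced = [str(value) for value in sample]

# Reorder to (angina_pectoris, heart_attack, pain_syndrome).
key = (coerced[2], coerced[3], coerced[0])

try:
    int_keyed[key]
    missing_key = None
except KeyError as exc:
    # The integer-keyed table has no entry for a tuple of strings.
    missing_key = exc.args[0]

print(missing_key)
print(str_keyed[key])  # the string-keyed table resolves the same key
```

With string keys in the table, the coerced sample and the table agree, which is why switching every 0 and 1 to '0' and '1' is suggested.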

@jmschrei
Owner

I would recommend making the input to probability and log_probability always be 2D even when there's a single example. I have a paper deadline on the 30th but I'll look into this shortly after. Sorry for the inconvenience.

@martin-riedl

Thanks to @SebastianBelkner, we found the cause of the 0.12 incompatibility:
the model.probability(...) documentation and examples require a (2D) array-like structure.

In the v0.11.2 implementation, the array-like structure was first converted to a NumPy array and then accessed via numpyarr[i, j]:

def log_probability(self, X, n_jobs=1):
    <...>
    X = numpy.array(X, ndmin=2)
    <...>
    for i in range(n):
        for j, state in enumerate(self.states):
            logp[i] += state.distribution.log_probability(X[i, self.idxs[j]])

    return logp if n > 1 else logp[0]

The conversion to a NumPy array was removed later, so X[i, j]-style indexing no longer works when X is a plain list. **Converting the 2D array to a NumPy array before passing it to model.probability(...) solves the problem.**

IMHO there are two solutions:

  • either include X = numpy.array(X, ndmin=2) again, or
  • update documentation and examples.
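
The failure mode described above can be reproduced without pomegranate or NumPy. The sketch below is only an illustration: index_2d and ensure_2d are hypothetical helper names, and ensure_2d is a pure-Python stand-in for what numpy.array(X, ndmin=2) does to a single example, not the library's code.

```python
def index_2d(X, i, j):
    """Index X the way the v0.11.2 loop does, with a tuple index."""
    return X[i, j]


def ensure_2d(X):
    """Hypothetical stand-in for numpy.array(X, ndmin=2).

    A single example (flat list) becomes a batch of one row;
    a list of rows is returned as a list of lists.
    """
    if len(X) > 0 and isinstance(X[0], (list, tuple)):
        return [list(row) for row in X]
    return [list(X)]


rows = [["A", "B", "C"]]

try:
    index_2d(rows, 0, 1)
    error_message = None
except TypeError as exc:
    # Plain lists reject tuple indices, matching the reported traceback.
    error_message = str(exc)

print(error_message)

# Promoting a single example to a batch of one row.
batch = ensure_2d(["A", "B", "C"])
print(batch)
```

A nested list supports rows[0][1] but not rows[0, 1], so code written for a NumPy array breaks as soon as the cast is dropped.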

@jmschrei
Owner

jmschrei commented Feb 7, 2020

I will restore the cast to a NumPy array in the next version. Thanks for catching this and posting a temporary solution.

@hanumantha03

> Can confirm that issue as well. Downgrading from 0.12.0 to 0.11.2 solves that ...

It really solved my problem. If you are using Kaggle, install pomegranate in the first cell with the following command: "!pip install pomegranate==0.11.2"

@jmschrei
Owner

Passing in a single vector now raises an error. You should pass in a 2D matrix or a list of lists (even when there is only one example). In v0.12.1 print(model.probability([[0, 'yes', 0, 1]])) will return 4.95e-5, which looks like the right answer.
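
As a sanity check on that number, the joint probability can be worked out by hand from the tables posted earlier in this thread. This sketch assumes the state order given to add_states (pain_syndrome, relief_of_pain, angina_pectoris, heart_attack) and the usual Bayesian network factorization. The variable names are illustrative only.

```python
import math

# Sample [0, 'yes', 0, 1] in state order:
# pain_syndrome=0, relief_of_pain='yes', angina_pectoris=0, heart_attack=1.
p_angina_0 = 0.99          # angina_pectoris: {0: 0.99, 1: 0.01}
p_heart_attack_1 = 0.01    # heart_attack:    {0: 0.99, 1: 0.01}
p_pain_0_given = 0.1       # pain_syndrome row  [0, 1, 0, 0.1]
p_relief_yes_given = 0.05  # relief_of_pain row [0, 1, 'yes', 0.05]

# Joint probability = product of each node's probability given its parents.
joint = p_angina_0 * p_heart_attack_1 * p_pain_0_given * p_relief_yes_given

print(joint)
print(math.isclose(joint, 4.95e-5, rel_tol=1e-9))
```

The product 0.99 * 0.01 * 0.1 * 0.05 equals 4.95e-5, which agrees with the value reported for v0.12.1.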

@koxu1996

@jmschrei In the current release, neither a list nor a list of lists works. Has the source code already been changed, or will the example be updated?

@jmschrei
Owner

v0.12.1 will have the fix. I will release that soon. For now, pass in a 2D numpy array and it should work.
