add learnPattern ClassifierRegion #800

Open
Lupino wants to merge 6 commits into master
Conversation

@Lupino commented May 3, 2020

When I try the REST API with the ClassifierRegion learning from TM.activeCells and inferring from TM.activeCells, the result is always the same.

From the example `hello_tm.py`, the TM's prediction of the next value is in predictedActiveCells.

So ClassifierRegion needs a learnPattern input for learning, while the original pattern input is used for inference.

@dkeeney commented May 3, 2020

I am trying to understand why you need a 'learnPattern'.
The 'pattern' input is used as the learn pattern if the 'learn' flag is on.
The 'pattern' input is also the 'infer' pattern in all cases.
I don't understand why you would need them to be different.

Keep in mind that the Classifier does not use an HTM algorithm; it uses a classical NN to match buckets to patterns. This means that in learn mode it takes LOTS of input for each bucket value to learn enough to guess which bucket value would produce any given 'pattern'.
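The bucket-to-pattern matching dkeeney describes can be illustrated with a toy stand-in. Everything below (the class, the sizes, the learning rule) is invented for illustration and is not the htm.core implementation; it only shows why many noisy samples per bucket are needed before inference becomes reliable:

```python
import random

class ToyClassifier:
    """Minimal single-layer classifier mapping sparse bit patterns to buckets."""
    def __init__(self, n_bits, n_buckets, lr=0.1):
        # one weight per (bucket, input bit)
        self.w = [[0.0] * n_bits for _ in range(n_buckets)]
        self.lr = lr
        self.n_buckets = n_buckets

    def infer(self, pattern):
        # score each bucket by summing the weights of the active bits
        scores = [sum(row[i] for i in pattern) for row in self.w]
        return scores.index(max(scores))

    def learn(self, pattern, bucket):
        # reward the correct bucket's weights, penalize the other buckets
        for b, row in enumerate(self.w):
            delta = self.lr if b == bucket else -self.lr / self.n_buckets
            for i in pattern:
                row[i] += delta

random.seed(42)
clf = ToyClassifier(n_bits=100, n_buckets=3)
# each bucket has a characteristic sparse pattern, seen only in noisy form
prototypes = {b: random.sample(range(100), 10) for b in range(3)}
for _ in range(200):  # many samples per bucket are needed in learn mode
    b = random.randrange(3)
    noisy = [i for i in prototypes[b] if random.random() > 0.2]
    clf.learn(noisy, b)
results = [clf.infer(prototypes[b]) for b in range(3)]
```

After enough noisy examples, inferring on each clean prototype recovers its bucket; with only a handful of samples the weights are still dominated by noise.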

@Lupino (Author) commented May 4, 2020

    {network: [
       {addRegion: {name: "encoder", type: "RDSEEncoderRegion", params: {size: 1000, sparsity: 0.2, radius: 0.03, seed: 2019, noise: 0.01}}},
       {addRegion: {name: "sp", type: "SPRegion", params: {columnCount: 2048, globalInhibition: true}}},
       {addRegion: {name: "tm", type: "TMRegion", params: {cellsPerColumn: 8, orColumnOutputs: true}}},
       {addRegion: {name: "clsr", type: "ClassifierRegion", params: {learn: true}}},
       {addLink:   {src: "encoder.encoded", dest: "sp.bottomUpIn"}},
       {addLink:   {src: "sp.bottomUpOut", dest: "tm.bottomUpIn"}},
       {addLink:   {src: "tm.bottomUpOut", dest: "clsr.learnPattern"}},
       {addLink:   {src: "tm.predictedActiveCells", dest: "clsr.pattern"}},
       {addLink:   {src: "encoder.bucket", dest: "clsr.bucket"}}
    ]}

I use the ClassifierRegion to infer from predictedActiveCells and to learn from activeCells.

@dkeeney commented May 4, 2020

I don't think this is going to give you what you expect.
The idea is to train the Classifier on the pattern that you want it to infer on.

For example, take any network of SP, TM, or whatever, and feed a set of buckets from the encoder into that network. Then hook up a Classifier to some output that you are interested in (predictedActiveCells in your case, but it could be any output). When the encoder passes in bucket A, the network generates pattern A'; when it passes in bucket B, the network returns pattern B'. If the Classifier sees enough examples of A into A' and B into B' while in learning mode, then when learning is turned off it will continue to infer that A was the bucket when it sees pattern A'. For this to work, the pattern passed to the classifier while learning must be the same pattern the classifier uses to infer.

@breznak (Member) commented May 4, 2020

I guess I can see what @Lupino is trying to achieve.
I'm not sure it's a correct approach, or if/how we should implement it.

  • The TM is more complicated than the SP, with more outputs: active, winner, predictive, and active OR predictive cells (cue the problems with the "default" output in TMRegion).

  • It might be correct to assume some of these outputs are (semantically, structurally) the same (correlated), so you could use "source A" for learning and expect reasonable results when processing data from "source B".

  • Basically this problem boils down to the ability to learn and infer in the same step (from different sources, obviously).

In code this would be:

classifier.compute(patternA, learn=True)
classifier.compute(pattB, learn=False)
...

I guess we cannot do that with the current NetworkAPI architecture(?); we cannot do it with either the SP or the TM.
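Outside NetworkAPI, the two-call snippet above can already be driven directly at the algorithm level, since learning and inference are separate calls there. A self-contained sketch with a toy stand-in classifier (the class and the bit values are invented for illustration, not htm.core code):

```python
from collections import defaultdict

class StandInClassifier:
    """Counts bit/bucket co-occurrences; a stand-in for the real classifier,
    which likewise exposes separate learn() and infer() operations."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, pattern, bucket):
        for bit in pattern:
            self.counts[bit][bucket] += 1

    def infer(self, pattern):
        # vote over buckets previously seen together with these bits
        votes = defaultdict(int)
        for bit in pattern:
            for bucket, n in self.counts[bit].items():
                votes[bucket] += n
        return max(votes, key=votes.get) if votes else None

clf = StandInClassifier()
active_cells = [2, 5, 9]      # imagine tm.activeCells at this step
predicted_cells = [2, 5, 7]   # imagine tm.predictedActiveCells, same step
clf.learn(active_cells, 3)            # learn from source A ...
guess = clf.infer(predicted_cells)    # ... and infer from source B, same step
```

The overlapping bits (2 and 5) carry the vote, so inference on source B still recovers the bucket learned from source A; this is exactly the correlation assumption in the bullet points above.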

Review threads (resolved): src/htm/algorithms/SDRClassifier.cpp (outdated), src/htm/regions/ClassifierRegion.cpp
type: Real64, count: 0},
pattern: { description: "An SDR output bit pattern for a sample. Usually the output of the SP or TM. For example: activeCells from TM",
type: SDR, count: 0},
learnPattern: { description: "An SDR output bit pattern for a sample. Usually the output of the SP or TM. For example: activeCells from TM",
@breznak (Member) commented on this spec:

How would this learn/infer pattern input work with async data? I mean, say you want to learn a single pattern, and then run inference on 10 samples. What would be on the learnPattern input, and would this behave correctly?

We might need (we already have) separate learn & infer () const methods, and could call them separately in the ClassifierRegion.

@dkeeney commented May 4, 2020

But I still don't understand why you would want to train with one stream and match patterns from another stream, even if the outputs are correlated. But perhaps you see a use for this...

I am a little concerned that it complicates the connections for the majority of users, who would want to train on the same pattern stream as the one used for inference. If this change goes in, they will have to provide two connections from the TM output to the Classifier.

This would break our compatibility with the NuPIC API, so I don't want to be too quick to change it. I would rather step back, think about this a little, and see if there is another way to give you what you need without changing the API.

@Lupino (Author) commented May 5, 2020

> I don't think this is going to give you what you expect. The idea is to train the Classifier on the pattern that you want it to infer on. [...] For this to work, the pattern passed to the classifier while learning must be the same pattern the classifier uses to infer.

tm.predictedActiveCells gives the wrong pattern size, but tm.predictiveCells is fine.

@Lupino (Author) commented May 5, 2020

[image: plot of a sin(x) run with its predicted values]

This is my sin(x) run and its prediction result.

@breznak (Member) commented May 5, 2020

> I am a little concerned that it complicates the connections for the majority of users that would want to train on the same pattern stream as that used for inference. If this change goes in, they will have to provide two connections from the TM output to the Classifier.

This is a valid concern.

  • Ideally, we would achieve the desired intention with the current functionality (API).

  • Alternatively, we could add an optional pattern to the Classifier, as classifier.compute(learnPattern, bool learn, [inferPattern]).

Is it possible to re-link the inputs to a region (Classifier) at runtime? As in

step: 
  {link: tm.A -> classifier}
  {classifier: learn=true}
  //then at the same step:
  {re-link: tm.B -> classifier}
  {classifier: learn=false} //inference

...but I guess this is not the usage pattern; NetworkAPI is like a (TensorFlow) graph, constructed once, with data then flowing through the designed structure (without the user's active control).

@dkeeney commented May 5, 2020

> NetworkAPI is like a (TensorFlow) graph, constructed once, with data then flowing through the designed structure (without the user's active control)

It is not quite that rigid, but nearly so. The NetworkAPI does have a removeLink() function, although it has not been implemented in the REST API. Even so, rewiring in the middle of a run may still not do what you want. I assume you need to always infer, even while in learning mode, so this would require that two streams of data somehow be connected to the ClassifierRegion.

To avoid breaking existing code, we cannot rename or change the behavior of the 'pattern' input. I agree with @breznak that 'inferPattern' and 'learnPattern' would be logical choices, but that breaks code.

Perhaps a compromise is to have both a 'pattern' and a 'learnPattern' input. The latter is optional; if it is not present, 'pattern' is used for both inference and learning when in learn mode. I fear that this will still be confusing for folks, especially since even I don't understand why there need to be separate learning and inference pattern inputs. I see that it works, but why can't the learning pattern also be the infer pattern in all cases?
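The compromise described here reduces to a simple fallback rule. A hedged sketch of that rule (all names and the stub classifier below are illustrative, not the actual ClassifierRegion code):

```python
class RecordingClassifier:
    """Tiny stub that records what it was asked to learn and infer on."""
    def __init__(self):
        self.learned = []
        self.inferred = []

    def learn(self, pattern, bucket):
        self.learned.append((tuple(pattern), bucket))

    def infer(self, pattern):
        self.inferred.append(tuple(pattern))
        return None  # a real classifier would return a PDF over buckets

def classifier_region_compute(inputs, classifier, learn_enabled):
    pattern = inputs["pattern"]                     # required input
    # optional input: fall back to 'pattern' to preserve the old behavior
    learn_pattern = inputs.get("learnPattern", pattern)
    if learn_enabled:
        classifier.learn(learn_pattern, inputs["bucket"])
    return classifier.infer(pattern)                # inference always uses 'pattern'

clf = RecordingClassifier()
# with 'learnPattern' linked, learning and inference use different streams
classifier_region_compute(
    {"pattern": [1, 2], "learnPattern": [3, 4], "bucket": 0}, clf, True)
# without it, both use 'pattern', matching the existing behavior
classifier_region_compute({"pattern": [1, 2], "bucket": 0}, clf, True)
```

Existing networks never connect learnPattern, so the second call path keeps them working unchanged; only networks that opt in get the split streams.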

When learnPattern is set, use learnPattern; when inferPattern is set, use inferPattern.
@Lupino (Author) commented May 6, 2020

I reverted 'pattern' for compatibility.

@dkeeney commented May 6, 2020

I don't want you to give up just because I object. There still might be some way to get you what you need. Perhaps if I understood the need better, it would suggest a possible solution.

As I see it, tm.predictiveCells is a subset of tm.bottomUpOut, and you want to train using tm.bottomUpOut because it may have a greater variety of output patterns than tm.predictiveCells and would therefore train faster. Is this correct?

@breznak (Member) commented May 11, 2020

Can we reach some consensus on this one? I can see both pros and cons here.

@dkeeney commented May 11, 2020

I still do not understand the need to train on one stream and then use that to infer on another stream. It seems to me that the best stream to train on is the one you will be inferring on. So I must be missing something.

If we need to do this, we could add an optional "AlternativeTrainingPattern" input: if a stream is connected to that input, it is used for training; otherwise the 'Pattern' input is used for both training and inference. This is the only way I can think of to provide a separate training pattern without breaking someone.

@breznak (Member) commented May 11, 2020

> If we need to do this, we could add an optional "AlternativeTrainingPattern" input. If there is a stream on that input it would use it for training, otherwise it would use the 'Pattern' input for both training and inference.

I guess this would work nicely; let's wait for the OP to confirm whether this solves the issue and is actually needed.

> I still do not understand the need to train on one stream and then use that to infer on another stream. It seems to me that the best stream to train on would be the one on which you will be inferring.

My guess is it is either "because we can", or: since the TM outputs both T (active) and T+1 (predictive) cells, we might want to train on the active cells, for which we know the ground truth at time T, and at the same time predict (predictive at T roughly equals active at T+1). So at time T we can get a classification of the predicted value.

To me this is only a question of implementation/enablement in the NetworkAPI, as in the C++/Python code you could already do either manually. It's up to us (mostly you?) to judge whether the extension does or does not overcomplicate the API.
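The second guess above, training on the active cells (known ground truth at T) while inferring on the predictive cells (roughly the active cells at T+1), can be mocked up with synthetic data. Nothing below is real TM output; the patterns are a toy repeating sequence invented for illustration:

```python
from collections import defaultdict

# Stand-ins for TM outputs on a repeating bucket sequence 0 -> 1 -> 2 -> 0 ...
def active_pattern(bucket):        # imagine tm.activeCells at time T
    return [bucket * 10 + i for i in range(10)]

def predictive_pattern(bucket):    # imagine tm.predictiveCells at time T:
    return active_pattern((bucket + 1) % 3)   # the TM predicts the next item

counts = defaultdict(lambda: defaultdict(int))

def learn(pattern, bucket):        # train on ground truth known at time T
    for bit in pattern:
        counts[bit][bucket] += 1

def infer(pattern):                # vote over buckets seen with these bits
    votes = defaultdict(int)
    for bit in pattern:
        for b, n in counts[bit].items():
            votes[b] += n
    return max(votes, key=votes.get)

for t in range(30):
    bucket = t % 3
    learn(active_pattern(bucket), bucket)   # learn from activeCells each step

# at a step where the current bucket is 0, inferring on the predictive cells
# yields the bucket expected at T+1
forecast = infer(predictive_pattern(0))
```

Because the classifier was trained only on active cells, feeding it the predictive cells turns its classification into a one-step forecast, which is exactly the use case argued for in this thread.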

@breznak (Member) commented Jun 3, 2020

Have we reached a solution, or does this just die off from lack of interest? Closing?

@breznak breznak added the triage considering priority,acceptance,closure of this issue label Jun 3, 2020
@Lupino Lupino force-pushed the master branch 2 times, most recently from 3d93a23 to 2f47541 Compare November 24, 2020 03:11
Labels: NetworkAPI, triage (considering priority, acceptance, closure of this issue)

3 participants