
Modelling Card Correlations #27

Open
rasenmaeher92 opened this issue Apr 22, 2020 · 3 comments

@rasenmaeher92

Hey fasiha,
I recently read your ebisu article again and really enjoyed it! I would be really interested to know whether it actually reduces the review load empirically.

Card Correlations

But to the main matter of this issue: one aspect that should influence recall of cards is correlation between them. For example:

  • an increase/decrease of a card's interval/recall probability could also increase/decrease another card's interval/recall probability (positive correlation).
  • an increase/decrease of a card's interval/recall probability could also decrease/increase another card's interval/recall probability (negative correlation).

I am no expert in memory, but I believe one such correlation effect is called interference.

Card Correlations within native Ebisu

So my first question would be whether I understood correctly that Ebisu cannot model such correlation effects, because a correct answer to a card B that is solely due to a correlation with a previously reviewed card A would wrongly be attributed to card B alone.
A later review of card B without the previous cue of card A would then lead to an unexpected failure.

Bayesian model of card correlations

Therefore, I was wondering whether it would be possible to model card correlations within your Bayesian framework.

Correlation Modelling

A very rough sketch of this model could work like this:
For each (directed) card pair, a prior zero-mean distribution is initialized, and when a card changes its estimates (interval/recall probability), the previous changes of other cards are used to estimate posterior correlation distributions for those respective cards. Obviously, larger time differences between these interval/recall-probability changes should produce smaller changes to the prior.
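
To make this a bit more concrete, here is a minimal Python sketch of what I have in mind; all names, the Gaussian form, and the exponential time-weighting are just placeholder assumptions of mine, not a proposal for Ebisu's actual API:

    import math

    class CorrelationPrior:
        """Zero-mean Gaussian prior over the correlation of one (directed) card pair."""
        def __init__(self, mean=0.0, variance=1.0):
            self.mean = mean
            self.variance = variance

        def update(self, observed_comovement, time_gap_hours, obs_variance=1.0, halflife_hours=24.0):
            """Move the estimate towards the observed co-movement of the two cards'
            recall estimates; a larger time gap between the two changes down-weights
            the observation (i.e. inflates its variance)."""
            weight = math.exp(-time_gap_hours / halflife_hours)
            effective_var = obs_variance / max(weight, 1e-9)
            # standard Gaussian conjugate update with known observation variance
            precision = 1.0 / self.variance + 1.0 / effective_var
            self.mean = (self.mean / self.variance + observed_comovement / effective_var) / precision
            self.variance = 1.0 / precision

    # one prior per directed card pair
    priors = {("A", "B"): CorrelationPrior()}
    # card B's recall estimate moved in the same direction as card A's, two hours later
    priors[("A", "B")].update(observed_comovement=0.3, time_gap_hours=2.0)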

Another way to see this would be that cards form nodes in a graph, and the correlation distributions are directed edges. As the previous update method effectively only updates a few correlation distributions, whilst there are roughly n^2 correlations for n cards in a deck, another strategy would be needed to densely connect the graph.
The following simple example provides a good idea: assume our priors for cards 1 and 2 and for cards 1 and 3 are well estimated and have non-zero means with the same sign. As cards 2 and 3 were never reviewed close together in time, our prior for cards 2 and 3 (and vice versa) still has zero mean.
For this correlation graph to be consistent, we would have to update our 2,3 and 3,2 prior correlation distributions towards a non-zero mean with the same sign.
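
A toy version of that consistency step could look like the following (again just a sketch with invented numbers; a real rule would also have to keep the whole correlation structure valid, e.g. positive semi-definite):

    def propagate_consistency(corr, a, b, c, learning_rate=0.5):
        """If the correlations a-b and a-c are well estimated, nudge the so-far
        uninformed b-c and c-b entries towards their product, which is roughly
        what a consistent correlation structure would imply."""
        implied = corr[a][b] * corr[a][c]
        corr[b][c] += learning_rate * (implied - corr[b][c])
        corr[c][b] += learning_rate * (implied - corr[c][b])

    # cards 1-2 and 1-3 are positively correlated, cards 2-3 still sit at the zero-mean prior
    corr = {1: {2: 0.6, 3: 0.5}, 2: {1: 0.6, 3: 0.0}, 3: {1: 0.5, 2: 0.0}}
    propagate_consistency(corr, 1, 2, 3)
    print(corr[2][3], corr[3][2])  # both move towards 0.3, i.e. non-zero with the same sign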

Nicely enough, both approaches to updating the correlation distributions can also be applied to past review data, so this model would be backwards compatible.

Updating Card Estimates

Such a correlation model could be used in the following way:
If the estimate (interval/recall probability) of card A changes due to a review, all connected cards (connected meaning the correlation distribution has a non-zero mean) should also update their intervals/recall probabilities, weighted by the mean of the respective correlation distribution.
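
As a sketch of that propagation step (a hypothetical helper of mine, not Ebisu's API; I simply shift each connected card's recall-probability estimate by the correlation mean times the change):

    def propagate_review(recall_change, correlation_means, card_estimates, source_card):
        """After a review changes `source_card`'s recall-probability estimate by
        `recall_change`, push a weighted fraction of that change to every card
        whose correlation prior has a non-zero mean."""
        for target_card, corr_mean in correlation_means.get(source_card, {}).items():
            if corr_mean == 0.0:
                continue  # zero-mean prior: not (yet) considered connected
            nudged = card_estimates[target_card] + corr_mean * recall_change
            card_estimates[target_card] = min(max(nudged, 0.01), 0.99)  # clamp to (0, 1)

    # card A's recall estimate rose by 0.2 after a successful review
    estimates = {"A": 0.9, "B": 0.5, "C": 0.4}
    propagate_review(0.2, {"A": {"B": 0.5, "C": -0.3}}, estimates, "A")
    print(estimates)  # B is pulled up to 0.6, C is pushed down to 0.34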

Advantages

Once the estimates of the correlations between cards are good enough, the review load for positively correlated cards would be reduced. Furthermore, negatively correlated cards would be reviewed more often, giving the student the opportunity to address interference issues.

Question

So what do you think about this as an expert in Bayesian spaced-repetition models? Would this be feasible both mathematically and implementation-wise?
And thinking of compatibility with Ebisu: you mentioned in another issue that Ebisu is built on binary card answers (passed/failed instead of Again/Hard/Good/Easy); could non-binary correlations still be used as an input to update Ebisu's parameters?

@fasiha
Owner

fasiha commented Apr 24, 2020

Hey, I wanted to let you know I haven't forgotten about this and am thinking about this thoughtful comment!

@rasenmaeher92
Author

That's good to know! I am excited to hear your opinion whenever you find the time.

I have continued to think about this and found an issue for which I have not yet found an intuitive solution:

Time decay of correlations

In my rough sketch above I described how the correlation distributions are updated (and our estimate of the correlation hopefully improved) due to correlating review results of cards. But this model has the built-in assumption that the true correlation is stationary with respect to time.

I believe that is not true, but I am still uncertain whether this assumption would be harmful in any real-world application. It may very well be the case that the effect of correlation-induced parameter changes is much smaller than the parameter change due to reviewing. If most correlation means are in the single-digit percent range, their effect on a card's scheduling is really more of a fine-tuning.

As I don't think correlations are stationary, and modeling this decay and deriving its parameters from the reviews could be hard, I think it would be best to ignore this issue at the beginning.
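
Just to illustrate what such a decay could look like if one ever did want to model it (a simple exponential shrinkage towards the zero-mean prior; the half-life is completely made up):

    import math

    def decay_correlation(mean, variance, elapsed_days, halflife_days=90.0, prior_variance=1.0):
        """Shrink a correlation estimate back towards the zero-mean prior as time
        passes, so that old co-review evidence gradually loses its influence."""
        keep = math.exp(-math.log(2) * elapsed_days / halflife_days)
        decayed_mean = keep * mean
        # the variance relaxes back towards the prior variance as the mean decays
        decayed_variance = keep * variance + (1.0 - keep) * prior_variance
        return decayed_mean, decayed_variance

    print(decay_correlation(0.4, 0.1, elapsed_days=90.0))  # the mean halves to 0.2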

Again, I would be interested in your opinion: could the assumption of stationary correlations be harmful?

@jasonsparc

Hi! I know almost nothing about Bayesian inference, so I can't help in that regard. But I too wanted this feature. I had certain insights about this, and I thought perhaps I should share them, in case they might help.

Suppose we have two quiz items, A and B, and that item A is related to item B in some way.

For example, item A could be a certain sentence with a certain word in it, and item B is a quiz on that word alone. If you answer item A correctly, then there's a chance that the quiz for item B was effectively also answered at that moment.

That is, by quizzing on item A, there's a chance that the answer for item B would also be remembered at that instant. If, by quizzing on item A, the answer for item B was also remembered, then isn't that practically the same as quizzing on item B? In other words, the model could be: the probability that answering item A correctly also amounts to answering item B correctly. But that's just the case for strengthening; there's also the case for weakening or interference: the probability that answering item A correctly would cause item B to be answered incorrectly if quizzed on afterwards.
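
Put as a toy calculation (not actual Ebisu code; p_strengthen and p_interfere are just the two conditional probabilities I described):

    def recall_of_B_after_quizzing_A(p_recall_B, p_strengthen, p_interfere):
        """Expected recall probability of item B right after item A was quizzed:
        with probability p_strengthen the quiz on A acted like an implicit passing
        quiz on B, with probability p_interfere it actively hurt B's retrieval,
        and otherwise B is unaffected."""
        p_unaffected = 1.0 - p_strengthen - p_interfere
        return p_strengthen * 1.0 + p_interfere * 0.0 + p_unaffected * p_recall_B

    # B was at 50% recall; quizzing A strengthens it 30% of the time and interferes 10%
    print(recall_of_B_after_quizzing_A(0.5, 0.3, 0.1))  # 0.6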

Now that I have given you the general idea, let's move on to the many possible ways that item A could affect item B. There are 3 possible causes for item A to affect item B:

  1. Item A was quizzed with a passing grade, and it affected item B.
  2. Item A was quizzed with a failing grade, and it affected item B.
  3. Item A was quizzed on, and it affected item B, regardless of whether the quiz on item A passed or failed.

Now, for each of the above possible causes, there are 3 possible outcomes:

  1. Item B was strengthened. That is, quizzing on item A caused the answer for item B to be remembered at that instant.
  2. Item B was weakened. That is, quizzing on item A prevented the answer for item B from being remembered, effectively interfering with its retrieval at that moment.
  3. Item B wasn't really affected at all.

Therefore, we actually have 9 possible states regarding the relationship between quiz items A and B, and they should all be regarded as probabilistic. We could even deduce certain conclusions based on the current most probable state, e.g., if the state was cause 3, effect 1, then it could be that the question portion of item A is the one affecting item B.
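
As a plain enumeration, those 9 states could be written like this (the names are arbitrary; each state would carry its own probability):

    from itertools import product

    CAUSES = ("A_passed", "A_failed", "A_quizzed_either_way")
    EFFECTS = ("B_strengthened", "B_weakened", "B_unaffected")

    # the 9 possible (cause, effect) states of the directed relationship from A to B,
    # initialised here with a uniform probability over the states
    relation_states = {state: 1.0 / 9.0 for state in product(CAUSES, EFFECTS)}
    for state, probability in relation_states.items():
        print(state, probability)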

Notice also that the relationship described above is one-way only, i.e., quizzing on item A affects item B (and not the other way around).

Now, we could have a brand new API function: updateRelation or something similar (you guys are more than welcome to come up with a much better name). It would receive the same arguments passed to updateRecall for item A, but with a few extra arguments: the prior tuple and tnow for item B, and some kind of tuple that keeps track of the possibility that item A affects item B.

So after calling updateRecall for item A, we must also immediately call updateRelation for the possibility that item A affected item B. If item A could affect more than one item, call updateRelation for each possibly affected item.
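
To sketch what I mean, something like the following signature, which is purely hypothetical and does not exist in Ebisu today:

    def updateRelation(priorA, resultA, tnowA, priorB, tnowB, relationAB):
        """Purely hypothetical sketch, not part of Ebisu.

        priorA, resultA, tnowA: the same arguments one would pass to updateRecall for item A
        priorB, tnowB:          item B's prior tuple and its elapsed time since last review
        relationAB:             whatever tuple we use to track how item A affects item B

        Would return an updated relationAB (and possibly an adjusted prior for item B);
        the actual update rule is exactly the open question of this issue.
        """
        return relationAB  # placeholder

    # after calling updateRecall for item A, immediately call updateRelation once
    # for every item B that item A could possibly affect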

As a bonus, by designing the API this way, we could have situations where item B doesn't affect item A at all, even if item A could affect item B in some way.

Again, I know almost nothing about Bayesian inference (and my knowledge of Bayesian probability is not comprehensive), but I hope all this helps and provides some useful insights.
