Incremental update with old users but new items #700

Open
levrone1987 opened this issue Oct 24, 2023 · 4 comments

Comments

@levrone1987

I want to use model.partial_fit_items for an incremental model update. The user pool stays the same, but new items may arrive. This is the code I have:

mat = csr_matrix((test_df['amount_in_eur'], (test_df['player_index'], test_df['item_index'])))
model.partial_fit_items(test_df['item_index'], mat)

and I get the following error:

    314 """Incrementally updates item factors
    315 
    316 This method updates factors for items specified by itemids, given a
   (...)
    327     Sparse matrix containing the liked users for each item in itemids
    328 """
    329 if len(itemids) != item_users.shape[0]:
--> 330     raise ValueError("item_users must contain 1 row for every user in itemids")
    332 # recalculate factors for each item in the input
    333 item_factors = self.recalculate_item(itemids, item_users)

ValueError: item_users must contain 1 row for every user in itemids

If my assumption is correct and what I want to achieve is possible, please help me write the correct code.

@benfred
Owner

benfred commented Oct 24, 2023

It looks like you're passing a (users, items) matrix to partial_fit_items, but the method expects an (items, users) matrix.

Does it work if you create it like

mat = csr_matrix((test_df['amount_in_eur'], (test_df['item_index'], test_df['player_index'])))

?
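A minimal sketch of the two orientations, using made-up toy data (the arrays and sizes below are hypothetical, not taken from the issue). It only illustrates that swapping the row and column index arrays yields the transposed matrix, with one row per item:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy interactions: 3 users, 4 items
user_idx = np.array([0, 0, 1, 2, 2])
item_idx = np.array([0, 3, 1, 3, 2])
amount = np.array([5.0, 1.0, 2.0, 4.0, 3.0])

# (users x items): one row per user
user_items = csr_matrix((amount, (user_idx, item_idx)), shape=(3, 4))

# (items x users): one row per item
item_users = csr_matrix((amount, (item_idx, user_idx)), shape=(4, 3))

# Swapping the index arrays is the same as transposing
same = np.array_equal(user_items.T.toarray(), item_users.toarray())
print(user_items.shape, item_users.shape, same)
```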

@levrone1987
Author

Same error @benfred

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[38], line 1
----> 1 model.partial_fit_items(test_df['item_index'], mat)

File ~/miniconda3/envs/whizdom-env/lib/python3.9/site-packages/implicit/cpu/als.py:330, in AlternatingLeastSquares.partial_fit_items(self, itemids, item_users)
    314 """Incrementally updates item factors
    315 
    316 This method updates factors for items specified by itemids, given a
   (...)
    327     Sparse matrix containing the liked users for each item in itemids
    328 """
    329 if len(itemids) != item_users.shape[0]:
--> 330     raise ValueError("item_users must contain 1 row for every user in itemids")
    332 # recalculate factors for each item in the input
    333 item_factors = self.recalculate_item(itemids, item_users)

ValueError: item_users must contain 1 row for every user in itemids

Is my assumption correct: is it possible to retrain the model with some new items (so, same users as in the pretrained model, but new items)? If yes, could you please provide a small working example on how this could be done?
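One possible reading of the traceback: the check on line 329 compares len(itemids) with item_users.shape[0], so it would fail in either orientation if itemids is a per-interaction column (one entry per DataFrame row, with repeats) rather than a list of distinct item ids. The sketch below reproduces just that comparison on hypothetical toy data; passes_length_check is an illustrative helper, not part of the library:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy interactions; item 3 appears twice
user_idx = np.array([0, 0, 1, 2, 2])
item_idx = np.array([0, 3, 1, 3, 2])
amount = np.array([5.0, 1.0, 2.0, 4.0, 3.0])

# (items x users) matrix; shape is inferred as (4, 3)
item_users = csr_matrix((amount, (item_idx, user_idx)))

def passes_length_check(itemids, item_users):
    # Mirrors the condition shown on line 329 of the traceback
    return len(itemids) == item_users.shape[0]

# Raw column: 5 entries vs. 4 matrix rows, so the check fails
raw_ok = passes_length_check(item_idx, item_users)

# Distinct ids with matching rows selected: lengths agree
unique_ids = np.unique(item_idx)
unique_ok = passes_length_check(unique_ids, item_users[unique_ids])
print(raw_ok, unique_ok)
```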

@levrone1987
Author

I tried all variations, but it still does not work @benfred

@benfred
Owner

benfred commented Nov 6, 2023

The best example I have is probably in the unit test here:

implicit/tests/als_test.py

Lines 272 to 301 in cb2a66d

@pytest.mark.parametrize("use_gpu", [True, False] if HAS_CUDA else [False])
def test_incremental_retrain(use_gpu):
    likes = get_checker_board(50)

    model = AlternatingLeastSquares(factors=2, regularization=0, use_gpu=use_gpu, random_state=23)
    model.fit(likes, show_progress=False)

    ids, _ = model.recommend(0, likes[0])
    assert ids[0] == 0

    # refit the model for user 0, make them like the same thing as user 1
    model.partial_fit_users([0], likes[1])
    ids, _ = model.recommend(0, likes[1])
    assert ids[0] == 1

    # add a new user at position 100, make sure we can also use that for recommendations
    model.partial_fit_users([100], likes[1])
    ids, _ = model.recommend(100, likes[1])
    assert ids[0] == 1

    # add a new item at position 100, make sure it gets recommended for user right away
    model.partial_fit_items([100], likes[1])
    ids, _ = model.recommend(1, likes[1], N=2)
    assert set(ids) == {1, 100}

    # check to make sure we can index only a single extra item/user
    model.partial_fit_users([101], likes[1])
    model.partial_fit_items([101], likes[1])
    ids, _ = model.recommend(101, likes[1], N=3)
    assert set(ids) == {1, 100, 101}
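In the unit test, each partial_fit_items call passes a list of distinct ids together with a matrix that has one row per id. Following that pattern, here is a rough sketch of how interaction columns like the ones in the original question might be reshaped. build_partial_fit_inputs, its parameters, and the toy data are all hypothetical, and the requirement that the column count equal the number of users in the fitted model is an assumption, not something confirmed in this thread:

```python
import numpy as np
from scipy.sparse import csr_matrix

def build_partial_fit_inputs(item_idx, user_idx, values, n_users):
    """Return (itemids, item_users) with one matrix row per distinct item id.

    Hypothetical helper: n_users is assumed to be the number of users
    the model was originally fitted with.
    """
    item_idx = np.asarray(item_idx)
    itemids = np.unique(item_idx)  # distinct ids, sorted
    n_rows = int(item_idx.max()) + 1
    # (items x users) matrix; duplicate (item, user) pairs are summed
    full = csr_matrix((values, (item_idx, user_idx)), shape=(n_rows, n_users))
    # Keep only the rows for the items being updated, in itemids order
    return itemids, full[itemids]

# Toy data: two new items (100 and 102) seen by 3 existing users
itemids, item_users = build_partial_fit_inputs(
    item_idx=[100, 102, 100],
    user_idx=[0, 1, 2],
    values=[5.0, 2.0, 1.0],
    n_users=3,
)
print(itemids, item_users.shape)
```

With a fitted model, the result would then be passed as model.partial_fit_items(itemids, item_users), so that len(itemids) equals item_users.shape[0].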
