You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
If you have multiple workers/instances of Jikan API running behind a load balancer, and if the requested anime/manga is not in mongodb then there can be a race condition, where two instances of Jikan API would scrape the same datasheet from MAL resulting in duplicated items in mongodb.
This behaviour has been reported several times: #262#333
The solution I can think of is some sort of pessimistic locking through some lock ticket stored in a separate collection in mongodb, and if there is an entry for an anime/manga item there, the second worker which would try to scrape the item would not insert the scraped item. However this can cause the risk of ip ban, because while Worker A is scraping and inserting into mongodb, Worker B might get two requests for the same item and it would have to scrape twice.
Currently Jikan API works the following way:
If you have multiple workers/instances of Jikan API running behind a load balancer, and if the requested anime/manga is not in mongodb then there can be a race condition, where two instances of Jikan API would scrape the same datasheet from MAL resulting in duplicated items in mongodb.
This behaviour has been reported several times: #262 #333
The solution I can think of is some sort of pessimistic locking through some lock ticket stored in a separate collection in mongodb, and if there is an entry for an anime/manga item there, the second worker which would try to scrape the item would not insert the scraped item. However this can cause the risk of ip ban, because while Worker A is scraping and inserting into mongodb, Worker B might get two requests for the same item and it would have to scrape twice.
In case of pessimistic locking the worker which finds a locking ticket should return
423
status code.https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/423
The text was updated successfully, but these errors were encountered: