🐛 /relations bug #461

Open
cyber-sec0 opened this issue Dec 15, 2023 · 9 comments

Comments

@cyber-sec0

cyber-sec0 commented Dec 15, 2023

Environment

W11
Edge

Anything else?

It seems like https://relatedanime.com is much better at returning anime relations than the https://api.jikan.moe/v4/x/x/relations endpoint.
Apparently /relations is missing many relations that https://relatedanime.com can return and show.

For instance

If I'm on https://myanimelist.net/anime/28851/Koe_no_Katachi and if I run
const Relations = await (await fetch('https://api.jikan.moe/v4/' + location.href.split('/')[3] + '/' +
location.pathname.match(/\d+/)[0] + '/relations')).json();
Relations.data.flatMap(relation => relation.entry);

I only get the IDs 56805 and 35566 (I think/agree that 28851 should not be returned, since it's already on the page I am in)

While https://relatedanime.com/anime/28851 includes
https://myanimelist.net/manga/103574
https://myanimelist.net/manga/48621
https://myanimelist.net/anime/35566

Just another example

If I'm on https://myanimelist.net/anime/46102 and if I run the fetch /relations API on this page

I only get the IDs 136113, 50653, and 53642 (I think/agree that 46102 should not be returned, since it's already on the page I am in)

While https://relatedanime.com/anime/46102 includes https://myanimelist.net/manga/156491

Just another example

If I'm on https://myanimelist.net/anime/19 and if I run the fetch /relations API on this page

I only get the IDs 1, 1109, and 39332 (I think/agree that 19 should not be returned, since it's already on the page I am in)

While https://relatedanime.com/anime/19 includes https://myanimelist.net/manga/10968

Just another example

If I'm on https://myanimelist.net/anime/2167 and if I run the fetch /relations API on this page

I only get the IDs 2898, 6887, 136594, 1723, 4059, 6351, and 4181 (I think/agree that 2167 should not be returned, since it's already on the page I am in)

While https://relatedanime.com/anime/2167 includes
https://myanimelist.net/manga/5390
https://myanimelist.net/manga/2598
https://myanimelist.net/manga/3941
https://myanimelist.net/manga/6997
https://myanimelist.net/manga/11777

https://myanimelist.net/anime/2167 returns https://myanimelist.net/anime/4181, but https://myanimelist.net/anime/4181 does not return https://myanimelist.net/anime/2167. And instead of returning at least 7 IDs like https://myanimelist.net/anime/2167 did, https://myanimelist.net/anime/4181 only returns 3 IDs.

https://myanimelist.net/anime/1723 and https://myanimelist.net/anime/6351 both return only 2 IDs instead of at least 7.

Why is the API so inconsistent and unreliable?
Is /relations still under testing?

Hope it gets fixed soon! (The MAL API is terrible for this!)

@irfan-dahir
Contributor

irfan-dahir commented Dec 15, 2023

Jikan directly scrapes the page for the ID that's requested.

e.g. 1: https://myanimelist.net/anime/19 has only 3 relations, which Jikan returns at https://api.jikan.moe/v4/anime/19/relations
e.g. 2: https://myanimelist.net/anime/28851 has 2, which Jikan returns at https://api.jikan.moe/v4/anime/28851/relations

relatedanime seems to be storing and querying relations instead, or doing something other than just directly scraping an entry. We are not doing that at the moment, just caching the scraped relations that show for that ID.

Returning all relations would have some caveats and would require some filtering. For example, the 'sequel' or 'prequel' relation for an anime ID would be different if the franchise has multiple seasons (entries).

@cyber-sec0
Author

cyber-sec0 commented Dec 15, 2023

I don't think that 'sequel' or 'prequel' would be necessary, at least not for me right now. I just need all relation IDs and that's it. But ok...

Could it please be implemented in the future then?

I would hate having to make a lot of API requests just to store and query all relations in a franchise, since this would require me to fetch every single entry of the franchise.

When do you think that it would likely be done? Like a year from now maybe?

Also, the current Rate Limiting information on https://docs.api.jikan.moe/ seems wrong: it says that I can do 60 requests per minute, but there is also a 3-second delay between requests, and 60 / 3 = 20, not 60.

If I could at least fetch 60 entries per minute (= 1 entry per second), then I would only not be able to grab all relations of the top 13 franchises listed here https://chiaki.site/?/tools/watch_order_groups/type/large, which at least isn't a lot of franchises. But if I can only make 20 fetch requests per minute (one every 3 seconds), then I am very limited in creating a complete list of relations for a lot of franchises (62 franchises at the moment, to be specific). And that excludes all related manga entries.

@cyber-sec0
Author

Could it please be implemented in the future then?

When do you think that it would likely be done? Like a year from now maybe?

@IsraelGomes

I'm not a developer of the project, but I think it is easier to request the relation IDs with a recursive search: walk over the relations returned for the first anime and request each ID separately, one request per anime/manga, and you will get all related IDs.
I have always used this approach, and it works just fine for me. The API supports at least 60 requests per minute (1/s), which is enough even for a big franchise. In fact, at that speed you could request the whole MyAnimeList site in one day, not counting errors, of course.
For the API to provide what you asked for in one simple request, it would have to make several requests to MAL under the hood. Either way, someone has to make a lot of requests: either we make many requests to the API, or the API makes many requests to MAL.
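
A minimal sketch of the recursive walk described above, not an official client. The names `crawlFranchise` and `jikanRelations` are hypothetical, and each relation entry is assumed to carry `mal_id` and `type` fields. The fetch function is injected so the walk can be tried against fake data before touching the real API.

```javascript
// Hypothetical helper: collects every "type/id" key reachable through /relations.
// `fetchRelations(type, id)` must resolve to an array of { mal_id, type } entries.
async function crawlFranchise(startType, startId, fetchRelations, delayMs = 1000) {
  const key = (type, id) => `${type}/${id}`;
  const seen = new Set([key(startType, startId)]);
  const queue = [{ type: startType, mal_id: startId }];

  while (queue.length > 0) {
    const { type, mal_id } = queue.shift();
    const related = await fetchRelations(type, mal_id);
    for (const entry of related) {
      const k = key(entry.type, entry.mal_id);
      if (!seen.has(k)) {
        // Only entries not seen before are queued, so the walk terminates.
        seen.add(k);
        queue.push(entry);
      }
    }
    // Pause between requests to stay under 60 requests per minute.
    if (queue.length > 0 && delayMs > 0) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
  return [...seen];
}

// Adapter for the public endpoint used in the original report.
// Assumption: the JSON body has the shape { data: [{ entry: [{ mal_id, type }] }] }.
async function jikanRelations(type, id) {
  const response = await fetch(`https://api.jikan.moe/v4/${type}/${id}/relations`);
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} for ${type}/${id}`);
  }
  const body = await response.json();
  return body.data.flatMap(relation => relation.entry);
}
```

Calling `crawlFranchise('anime', 28851, jikanRelations)` would make one request per discovered entry, so the total time grows with the size of the franchise.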

@cyber-sec0
Author

cyber-sec0 commented Dec 18, 2023

@IsraelGomes

I am waiting for an official answer, but thanks. Doing a "search" is actually what I don't want to waste time on either, because Jikan will fail.

@pushrbx
Collaborator

pushrbx commented Dec 19, 2023

@cyber-sec0 I believe Jikan API's scope is to just scrape and cache the scrape result. The current code base is tailored around that idea. Your feature request implies that there should be a shift in the scope, where all scraped data is organised in a structured format and saved to a database that way. It would require significant effort imo.

Just to be clear, this is how the app works currently:
[image]

For the feature you are asking for, the whole app's complexity would have to increase, as something would need to make multiple requests to MAL to build the structured data. Roughly, I think that could be achieved in two ways:

  1. Actively: If the requested data doesn't exist in the database yet, the client's request would kick off a synchronous process to fetch and process the data and save in the database. Of course the client would have to wait for the whole process to finish before receiving a response.
  2. Passively: If the requested data doesn't exist in the database yet, the client would receive an error to try again later, and the system would kick off a background process to fetch and process the data. Additionally there would be a scheduled background task to keep the database up to date. [this is not ideal]
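
As an illustration only, the "passive" option could look roughly like the sketch below. This is not Jikan's code: `PassiveRelationsStore`, the in-memory maps standing in for a database, the 202 status, and the retry hint are all assumptions, and the scheduled refresh task is left out.

```javascript
// Hypothetical in-memory stand-in for the "passive" flow.
class PassiveRelationsStore {
  constructor(buildFranchise) {
    this.buildFranchise = buildFranchise; // async (id) => array of related IDs
    this.ready = new Map();   // id -> stored result
    this.pending = new Map(); // id -> Promise of a background job in flight
  }

  // Never waits for the build: returns stored data or a "try again later" answer.
  get(id) {
    if (this.ready.has(id)) {
      return { status: 200, data: this.ready.get(id) };
    }
    if (!this.pending.has(id)) {
      // Start at most one background job per ID.
      const job = this.buildFranchise(id)
        .then(data => { this.ready.set(id, data); })
        .catch(() => { /* stay missing so a later request retries */ })
        .finally(() => { this.pending.delete(id); });
      this.pending.set(id, job);
    }
    return { status: 202, retryAfterSeconds: 30 }; // assumed values
  }
}
```

The first caller for an ID always gets the "try again later" answer, which is the trade-off flagged as not ideal in the list above.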

@irfan-dahir I'm voting against anything like this. 😸

Just to offer a solution for your problem around

If I could at least fetch 60 entries per minute (= 1 entry per second), then I would only not be able to grab all relations of the top 13 franchises listed

You could self-host jikan api, and you would not have to deal with the rate limit.

@cyber-sec0
Author

@pushrbx

Good to know, but I don't want to and I can't self-host the jikan API.

I still don't see why a db is needed for it. Since Jikan currently scrapes all entries, it shouldn't be hard to just cache/store everything together or build some kind of relations, for which a db or texts such as 'prequel' probably wouldn't be needed.

For example, there could be a new endpoint, that when called/fetched jikan would return its ID relations just like it does now, but then in the background it would also fetch the other relations, thus self-constructing a relations list, without the user having to make more than 1 fetch request. This data could be cached or stored in any way, not necessarily a db... Also, this way it would/should stop after getting all related entries for the entire franchise.

Jikan would, by itself on the host server, make a network request for every returned related entry ID, so the user could make a single request and have everything returned. If this is possible, a reconstruction of the whole API wouldn't be necessary. The only difference is that this new /allrelations endpoint would probably be a bit more complex than /relations, since it would have to "use" /relations and fetch everything it returns a couple of times.

@irfan-dahir
Contributor

@cyber-sec0
I agree that there is definitely a way to do this and this as a feature from Jikan could be useful.
Unfortunately, we're also limited by technical constraints which we need to factor in. There's a very high chance of Jikan getting rate-limited by MAL, and we have a very narrow channel for making requests. That is why everything is designed to be cached and queried, and why, if a single consumer-side request means making a request to MAL, we ensure that only one request is made from Jikan to MAL and never more. This is done to ensure a good API experience and fair use; otherwise you'd be facing MAL's rate limit on every other request (which we'd have no control over).

The only way to do this effectively would be through the "passive" way @pushrbx mentioned and by storing relations into the DB to make them queryable. I can't say whether we'd be able to pick something like this up any time soon or even in a year as it would mean re-designing the entire API - which we just recently did. At the moment, we're focused on introducing new API endpoints and improving existing ones that would not require this sort of complexity.


Another solution (via Jikan) I can think of for you at the moment is to write a script that goes through each entry's relations by looping through MAL IDs and generates your own extended relations to use as your internal API.

Or, if you need a client-only solution, such as showing all these relations in a section of your app, show the relations you get initially and then lazy-load the rest, or add a button to fetch extended relations.

Also, the current Rate Limiting information on https://docs.api.jikan.moe/ seems wrong: it says that I can do 60 requests per minute, but there is also a 3-second delay between requests, and 60 / 3 = 20, not 60.

The way this works is that you can make 60 requests a minute. Whether this is done every second or a couple times per second is up to you. But the cap is at 60 r/m.
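
To stay under that cap from a script, a client can track its own send times. A minimal sketch, assuming only the 60 requests per minute cap stated above; `SlidingWindowLimiter` and its injectable clock are hypothetical, and any other limit the API may enforce is not modeled.

```javascript
// Hypothetical client-side limiter for a "max N requests per window" cap.
class SlidingWindowLimiter {
  constructor(maxRequests = 60, windowMs = 60000, now = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;       // injectable clock, useful for testing
    this.timestamps = []; // send times of requests still inside the window
  }

  // Milliseconds to wait before the next request may be sent (0 means send now).
  msUntilNextSlot() {
    const cutoff = this.now() - this.windowMs;
    // Forget requests that have left the window.
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.maxRequests) {
      return 0;
    }
    // The window is full: wait until the oldest request expires.
    return this.timestamps[0] + this.windowMs - this.now();
  }

  // Record that a request was just sent.
  record() {
    this.timestamps.push(this.now());
  }
}
```

With this shape, a burst of several requests in one second is allowed as long as no more than 60 fall inside any one-minute window.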

@cyber-sec0
Author

cyber-sec0 commented Dec 21, 2023

@irfan-dahir
"There's a very high chance of Jikan getting rate-limited by MAL"
How about, instead of having Jikan make multiple consecutive /relations requests to MAL, it does the following:
1. Scrapes one MAL ID entry page for its relations
2. Does the same for all MAL ID entries
3. Temporarily caches the /relations for all MAL ID entries
(Up to here, this is exactly what the API currently does, correct?)

4. Creates a new /allrelations endpoint (it shouldn't be hard to just create a new URL ending in /allrelations)
5. When ID/allrelations is called, it first does the same thing ID/relations does (this is exactly what the API currently does anyway), so on the server side Jikan treats a GET call to ID/allrelations as if it were a GET call to ID/relations

(The new, harder steps are below)
6. Jikan then temporarily caches all new relation IDs returned by the first GET ID/relations
7. On the Jikan server, it keeps making GET calls to ID/relations using the already cached ID/relations (so no new network requests would be made directly to MAL)
8. Only steps 6 and 7 are repeated until all IDs for the franchise have been grabbed
9. It then returns all the relations to the user that called GET /allrelations (so basically multiple /relations results together)

For step 8, for example: I know that in JS there are many ways of adding a URL string to an array and then checking whether the array already has that string, so a similar idea could be used here. If a duplicate URL string is found, it means that that cached ID/relations has already been through our loop, so we don't have to GET it again. This could be a way to exit the loop after all IDs for the franchise have been grabbed.

I suppose that as long as the current /relations endpoint has an increased cache time, it could be implemented without a major change in the whole API or a db-making endpoint.
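
The duplicate check in step 8 can be sketched with a `Set` instead of an array. Everything here is hypothetical: `collectFromCache` is an illustrative name, and `cache` is assumed to map a "type/id" key to the relation keys already scraped for that entry. The sketch also reports which entries are absent from the cache, since those could not be expanded without a new request.

```javascript
// Hypothetical cache-only walk: no network requests are made here.
function collectFromCache(startKey, cache) {
  const seen = new Set([startKey]);
  const uncached = [];
  const stack = [startKey];

  while (stack.length > 0) {
    const current = stack.pop();
    const related = cache.get(current);
    if (related === undefined) {
      // Not scraped yet: expanding it would need a fresh request.
      uncached.push(current);
      continue;
    }
    for (const next of related) {
      if (!seen.has(next)) { // the duplicate check from step 8
        seen.add(next);
        stack.push(next);
      }
    }
  }
  // The loop exits once every reachable key has been visited.
  return { ids: [...seen], uncached };
}
```

An empty `uncached` list means the whole franchise was already in the cache; otherwise the result is only as complete as the cache itself.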

If the API supports 60 requests per minute, why is there a 3 sec delay per request?
There aren't 3*60=180 seconds in a minute...

Either it has to be updated to show the text "1 second per request", or "20 requests per minute"

I tried the code below just to test that, and 60 requests at one per second worked, so I think that the text should be updated to say "1 second per request".

Since 60 requests per minute works, I suppose that I will be ok even if this isn't implemented within a year from now... But it would be nice to have a /allrelations endpoint.
The issue is that currently there is no way to know how many manga and anime entries a franchise has. No website, API, or anything else has that info. Although chiaki.site has all anime entries for a franchise, it does not display any manga entries, so maybe even 60 requests wouldn't be enough for a few franchises if we include the manga entries.

Also, another problem I have is that I do have other scripts that use Jikan, so if they run on the page, and then I run another script that attempts to make 60 requests per minute to Jikan, it will likely fail, since my browser/IP already used Jikan a couple of times just a few secs ago...

If it can be implemented just using logic and a longer caching time, without a major API change or db that would be great, I still believe that it's feasible to add it without major changes or a db...

// Sends one /moreinfo request per second for the current MAL page, 60 in total.
let count = 0;
const intervalId = setInterval(() => {
  count++;
  if (count > 60) {
    clearInterval(intervalId); // stop after the 60th request
    return;
  }
  fetch('https://api.jikan.moe/v4/' + location.href.split('/')[3] + '/' + location.pathname.match(/\d+/)[0] + '/moreinfo')
    .then(response => {
      if (response.status === 200) {
        console.log('Response is 200');
      }
    })
    .catch(error => console.error(error));
}, 1000);

4 participants