Better Data Formatting #544

irfan-dahir · 2024-02-01T13:23:06Z

This is a collection of points based on the feedback received over time and my general thoughts. Better formatting of returned data can help improve developer experience.

Most of these would be breaking, backward-incompatible changes to the response schema, so we'd have to target major versions of both the parser and the REST API.

Enumification of known values

Some properties are currently returned from MAL as-is, even though they only take a limited set of known values. It would be a good idea to define constants for them at the parser level. This would make validation easier to handle in both the parser and the REST API.

Some properties that come to mind from Anime/Manga type resources are:

type
source
status
rating

Note: These can be nullable.
Note 2: These could have their own listing endpoints, like Anime Genres, for users who would rather not hard-code these values and want to keep them dynamic client-side.
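
To make the idea concrete, here is a minimal sketch of what parser-level constants and validation could look like. The names `AnimeType` and `normalizeEnum` are hypothetical, and the listed values are only an illustrative subset, not the full set MAL returns.

```javascript
// Hypothetical constants; the listed values are illustrative, not exhaustive.
const AnimeType = Object.freeze({
  TV: "TV",
  MOVIE: "Movie",
  OVA: "OVA",
  SPECIAL: "Special",
});

// Returns the matching known value, or null when the raw value is
// missing or not recognised (these properties can be nullable).
function normalizeEnum(enumObject, rawValue) {
  if (rawValue == null) return null;
  const wanted = String(rawValue).trim().toLowerCase();
  const match = Object.values(enumObject).find(
    (value) => value.toLowerCase() === wanted
  );
  return match ?? null;
}

console.log(normalizeEnum(AnimeType, "movie"));
console.log(normalizeEnum(AnimeType, "Unknown format"));
```

The same list could back a listing endpoint through `Object.values(AnimeType)`, so clients would not need to hard-code it.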

Duration to Seconds

Currently, duration is returned as a string. We should also return it in seconds so developers can format it easily.
Feedback received here: https://discord.com/channels/460491088004907029/462992340718583814/1102217962577997964

Proposed Schema:

"duration": {
    "seconds": 5400,
    "string": "1h 30m"
}
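
A rough sketch of how the conversion might work, assuming the string uses the "1h 30m" style shown in the proposed schema. `parseDuration` is a hypothetical name, and strings worded differently would need their own pattern.

```javascript
// Assumed input format: "<n>h <n>m <n>s" with any part optional, e.g. "1h 30m".
// Strings that do not follow it yield seconds: null rather than a guess.
const UNIT_SECONDS = { h: 3600, m: 60, s: 1 };

function parseDuration(text) {
  if (typeof text !== "string") return { seconds: null, string: null };
  let seconds = 0;
  let matched = false;
  for (const [, amount, unit] of text.matchAll(/(\d+)\s*([hms])\b/g)) {
    seconds += Number(amount) * UNIT_SECONDS[unit];
    matched = true;
  }
  return { seconds: matched ? seconds : null, string: text };
}

console.log(parseDuration("1h 30m")); // { seconds: 5400, string: '1h 30m' }
```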

Date Props to not estimate

Related issue: #486
Currently, if the date range receives something like "2024", it will assume the starting date is "1 January, 2024". It would be better to keep those unknown prop values as null.

Current Schema:

    "aired": {
      "from": "2024-01-01T00:00:00+00:00",
      "to": null,
      "prop": {
        "from": {
          "day": 1,
          "month": 1,
          "year": 2024
        },
        "to": {
          "day": null,
          "month": null,
          "year": null
        }
      },
      "string": "2024 to ?"
    },

Proposed Schema:

    "aired": {
      "from": "2024-01-01T00:00:00+00:00", // we "estimate" here
      "to": null,
      "prop": {
        "from": {
          "day": null, // returns null here
          "month": null, // returns null here
          "year": 2024
        },
        "to": {
          "day": null,
          "month": null,
          "year": null
        }
      },
      "string": "2024 to ?"
    },
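
As an illustration of the "don't estimate" rule, the sketch below only fills the parts of `prop` that are actually present in the string. The accepted input shapes ("2024", "Apr 2024", "Apr 3, 2024", "?") and the function names are assumptions made for this example.

```javascript
const MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
                "jul", "aug", "sep", "oct", "nov", "dec"];

// Fills only the parts that appear in the text; everything else stays null.
function parseDateProp(text) {
  const prop = { day: null, month: null, year: null };
  if (typeof text !== "string") return prop;
  const match = text.trim().match(/^(?:([A-Za-z]{3})\s+)?(?:(\d{1,2}),\s*)?(\d{4})$/);
  if (!match) return prop;
  const [, monthName, day, year] = match;
  prop.year = Number(year);
  if (monthName) {
    const index = MONTHS.indexOf(monthName.toLowerCase());
    prop.month = index === -1 ? null : index + 1;
  }
  // A day is only meaningful when the month is known too.
  if (day && prop.month !== null) prop.day = Number(day);
  return prop;
}

// Splits a range string such as "2024 to ?" into its two sides.
function parseDateRange(text) {
  const [from = "", to = ""] = text.split(" to ");
  return { from: parseDateProp(from), to: parseDateProp(to) };
}

console.log(parseDateRange("2024 to ?"));
```

For "2024 to ?" this yields a `from` with only `year` set and a `to` that is entirely null, matching the proposed `prop` object.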

Opening/Ending Themes

Currently, we just return an array of strings without any further parsing. However, these strings contain metadata that we could parse and return separately, such as the episode range in which each OP/ED was played.
Related issue: #534

Current Schema:

      "openings": [
        "1: \"We Are! (ウィーアー!)\" by Hiroshi Kitadani (きただにひろし) (eps 1-47,1000, 1089-)"
      ]

Proposed Schema:

      "openings": [
        {
          "titles": [
            {
              "type": "English",
              "title": "We Are!"
            },
            {
              "type": "Japanese",
              "title": "ウィーアー!"
            }
          ],
          "author": {
            "name": [
              {
                "type": "English",
                "title": "Hiroshi Kitadani"
              },
              {
                "type": "Japanese",
                "title": "きただにひろし"
              }
            ]
          },
          "episodes": ["1-47", "1000", "1089-"]
        }
      ]
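
A minimal sketch of how one theme string could be split into that shape. It only assumes the single layout shown in the current schema example; `THEME_PATTERN`, `localized` and `parseTheme` are hypothetical names, and real data very likely has variations this pattern does not cover.

```javascript
// Assumed layout (taken from the single example above):
//   N: "Title (Native title)" by Artist (Native artist) (eps a-b,c, d-)
// Strings that deviate from it return null instead of a partial guess.
const THEME_PATTERN =
  /^(?:\d+:\s*)?"(.+?)(?:\s*\(([^()]+)\))?"\s+by\s+(.+?)(?:\s*\((?!eps\s)([^()]+)\))?(?:\s*\(eps\s+([^()]+)\))?$/;

// Builds the [{ type, title }] list used by the proposed schema.
function localized(english, japanese) {
  const entries = [{ type: "English", title: english }];
  if (japanese) entries.push({ type: "Japanese", title: japanese });
  return entries;
}

function parseTheme(text) {
  const match = THEME_PATTERN.exec(text.trim());
  if (!match) return null;
  const [, title, titleJp, artist, artistJp, eps] = match;
  return {
    titles: localized(title, titleJp),
    author: { name: localized(artist, artistJp) },
    episodes: eps ? eps.split(",").map((part) => part.trim()) : [],
  };
}

const example =
  '1: "We Are! (ウィーアー!)" by Hiroshi Kitadani (きただにひろし) (eps 1-47,1000, 1089-)';
console.log(JSON.stringify(parseTheme(example), null, 2));
```

Returning null for unrecognised strings is a deliberate choice here: it lets the caller fall back to the raw string rather than ship half-parsed data.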

The episode data comes in three forms:

Ranges (e.g. 1-47, "from episode 1 through 47")
Specific episode mentions (e.g. 1000)
Ongoing ranges (e.g. 1089-, "episode 1089 and onwards")
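
If clients would rather not parse these tokens themselves, the three forms could be normalised into numeric bounds. This is only a sketch: `parseEpisodeToken`, `playsInEpisode` and the `{ from, to }` shape are hypothetical, with `to: null` standing for an ongoing range.

```javascript
// Turns "1-47", "1000", or "1089-" into { from, to }.
// `to: null` marks an ongoing range; anything else returns null.
function parseEpisodeToken(token) {
  const match = String(token).trim().match(/^(\d+)(?:(-)(\d+)?)?$/);
  if (!match) return null;
  const from = Number(match[1]);
  if (!match[2]) return { from, to: from }; // single episode
  return { from, to: match[3] ? Number(match[3]) : null };
}

// True when the given episode number falls inside any parsed range.
function playsInEpisode(ranges, episode) {
  return ranges.some(
    (range) => episode >= range.from && (range.to === null || episode <= range.to)
  );
}

const ranges = ["1-47", "1000", "1089-"].map(parseEpisodeToken);
console.log(ranges);
console.log(playsInEpisode(ranges, 1100)); // true
```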

Furthermore, with each theme as an object, we can attach additional data that is now available as well, such as preview URLs for these OP/ED themes (#534) and any attached music videos.

Returning null on placeholder URLs

Related issue: #488

cc: @pushrbx

What else is there? If anyone else has any suggestions, let's discuss it below.


rizzzigit · 2024-02-05T14:56:33Z

I'm curious whether we could discuss representing all array data in the form of annotated arrays, like this:

{
  "props": [ "mal_id", "title", "score", "episodes", "year" ],
  "data": [
    [ 1, "Cowboy Bebop", 8.75, 26, 1998 ],
    [ 5, "Cowboy Bebop: Tengoku no Tobira", 8.38, 1, null ],
    [ 6, "Trigun", 8.22, 26, 1998 ],
    [ 7, "Witch Hunter Robin", 7.24, 26, 2002 ],
    [ 8, "Bouken Ou Beet", 6.93, 52, 2004 ]
  ]
}

The benefit I can think of is reduced bandwidth usage: since all entries are homogeneous, property names do not need to be repeated. I haven't researched the computational cost of generating this output compared to the current schema, but accessing it could theoretically be faster than key-value pairs.
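
One quick way to sanity-check the size claim is to serialise the same rows both ways and compare byte counts. The helpers `toAnnotated` and `jsonBytes` are hypothetical, and this only measures uncompressed JSON; with HTTP compression enabled the gap would likely be smaller.

```javascript
// Sample rows copied from the example above, in key-value form.
const objects = [
  { mal_id: 1, title: "Cowboy Bebop", score: 8.75, episodes: 26, year: 1998 },
  { mal_id: 5, title: "Cowboy Bebop: Tengoku no Tobira", score: 8.38, episodes: 1, year: null },
  { mal_id: 6, title: "Trigun", score: 8.22, episodes: 26, year: 1998 },
  { mal_id: 7, title: "Witch Hunter Robin", score: 7.24, episodes: 26, year: 2002 },
  { mal_id: 8, title: "Bouken Ou Beet", score: 6.93, episodes: 52, year: 2004 },
];

// Hypothetical helper: converts key-value records to the annotated form.
function toAnnotated(records) {
  const props = Object.keys(records[0] ?? {});
  const data = records.map((record) => props.map((name) => record[name] ?? null));
  return { props, data };
}

// Size in bytes of the UTF-8 encoded JSON text.
function jsonBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

const keyValueBytes = jsonBytes(objects);
const annotatedBytes = jsonBytes(toAnnotated(objects));
console.log({ keyValueBytes, annotatedBytes });
```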

In terms of accessing data, users can look up the index of each property once before iterating through the data. Assuming the spelling is correct and the property is defined in the documentation, the lookup should not fail or return -1.

const result = await fetch("http://api.jikan.moe/anime").then((response) => response.json())

// Look up each column index once, before iterating.
const titleIndex = result.props.indexOf('title')
const idIndex = result.props.indexOf('mal_id')

// Rows are stored under result.data.
for (let i = 0; i < result.data.length; i++) {
    console.log(`MAL ID: ${result.data[i][idIndex]}`)
    console.log(`Title: ${result.data[i][titleIndex]}`)
}
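
For users who prefer the current key-value ergonomics, a small client-side helper could rebuild objects from the annotated form instead of tracking indices by hand. `rowsToObjects` is a hypothetical name, and the sample response below is a trimmed copy of the example above.

```javascript
// Rebuilds one object per row by pairing each value with its property name.
function rowsToObjects({ props, data }) {
  return data.map((row) =>
    Object.fromEntries(props.map((name, index) => [name, row[index]]))
  );
}

const sample = {
  props: ["mal_id", "title", "score", "episodes", "year"],
  data: [
    [1, "Cowboy Bebop", 8.75, 26, 1998],
    [5, "Cowboy Bebop: Tengoku no Tobira", 8.38, 1, null],
  ],
};

for (const anime of rowsToObjects(sample)) {
  console.log(`MAL ID: ${anime.mal_id}, Title: ${anime.title}`);
}
```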

If this schema is paired with the ability to request only some properties, in a specified order, users should also have no trouble accessing the data.
For instance, if the user requests /anime?props=title,title_jp,mal_id the API should return something like this:

{
  "props": [ "title",  "title_jp", "mal_id" ],
  "data": [
    [ "Cowboy Bebop: Tengoku no Tobira", "カウボーイビバップ 天国の扉", 5 ]
  ]
}
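
On the producing side, the projection could be as simple as the sketch below. `projectProps` is a hypothetical helper, and returning null for a requested property that a record lacks is an assumption; the API could just as well reject unknown names.

```javascript
// Hypothetical server-side helper: builds the annotated response for a
// comma-separated `props` query value, preserving the requested order.
function projectProps(records, propsParam) {
  const props = propsParam.split(",").map((name) => name.trim()).filter(Boolean);
  const data = records.map((record) => props.map((name) => record[name] ?? null));
  return { props, data };
}

const records = [
  { mal_id: 5, title: "Cowboy Bebop: Tengoku no Tobira", title_jp: "カウボーイビバップ 天国の扉" },
];

console.log(JSON.stringify(projectProps(records, "title,title_jp,mal_id")));
```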

Users can assume the positions of each property based on their request, just like this:

const result = await fetch("http://api.jikan.moe/anime?props=title,title_jp,mal_id").then((response) => response.json())

for (let i = 0; i < result.data.length; i++) {
    console.log(`MAL ID: ${result.data[i][2]}`)
    console.log(`Title: ${result.data[i][0]}`)
    console.log(`Title (JP): ${result.data[i][1]}`)
}

That's all I have for now. Any feedback on this is very much appreciated.

irfan-dahir added enhancement refactoring/speeding up discussion schema labels Feb 1, 2024

irfan-dahir added this to the 5.0.0 milestone Feb 1, 2024

irfan-dahir pinned this issue Feb 1, 2024
