Add support for displaying Community Notes #1023

Open
wants to merge 1 commit into
base: guest_accounts

Aaron1011

Demo

I tested this with tweet http://localhost:8080/elonmusk/status/1692558493255942213#m (with reply http://localhost:8080/teslaownersSV/status/1692597508109684813#m, which still shows the Community Note on the parent tweet in the conversation).

With this pull request:

Main Tweet: [image: nitter_community_note_main]
Reply: [image: nitter_community_note_reply]

On Twitter.com:

Main Tweet: [image: twitter_community_note_main]
Reply: [image: twitter_community_note_reply]

Retrieving the Community Note

Unfortunately, none of the (undocumented) https://api.twitter.com endpoints used by Nitter seem to include Community Notes contents. However, when withBirdwatchNotes: true is set, the graphTweet (https://api.twitter.com/graphql/q94uRCEn65LZThakYcPT6g/TweetDetail) API includes a has_birdwatch_notes field, which is true when a Community Note is present on a tweet. This works both for the 'main' selected tweet and for other tweets in the same conversation.

When viewing a tweet while not signed in on twitter.com, the browser makes a call to https://twitter.com/i/api/graphql/DJS3BdhUhcaEpZ7B7irJDg/TweetResultByRestId. This endpoint actually contains Community Notes for the given tweet (if they exist), under a "birdwatch_pivot" key:

"birdwatch_pivot": {
    "destinationUrl": "https://twitter.com/i/birdwatch/n/1692711122338533397",
    "footer": {
        "text": "Context is written by people who use X, and appears when rated helpful by others.  Find out more.",
        "entities": [
            {
                "fromIndex": 83,
                "toIndex": 96,
                "ref": {
                    "type": "TimelineUrl",
                    "url": "https://twitter.com/i/flow/join-birdwatch",
                    "urlType": "ExternalUrl"
                }
            }
        ]
    },
    "note": {
        "rest_id": "1692711122338533397"
    },
    "subtitle": {
        "text": "Blocking is a basic safety feature that allows basic protection for victims of abuse and stalking. Removing this feature would compromise the safety of many people on social media.\nthehotline.org/resources/stal…",
        "entities": [
            {
                "fromIndex": 181,
                "toIndex": 211,
                "ref": {
                    "type": "TimelineUrl",
                    "url": "https://t.co/0DtCJWEbkG",
                    "urlType": "ExternalUrl"
                }
            }
        ]
    },
    "title": "Readers added context they thought people might want to know",
    "shorttitle": "Readers added context",
    "iconType": "BirdwatchV1Icon"
}

However, this endpoint doesn't work with normal OAuth tokens (probably because it's a twitter.com endpoint instead of api.twitter.com). Instead, it uses a Bearer token with a hardcoded value, and passes in a "guest token" in an x-guest-token header. I grabbed this token directly from the Network dev tool in my browser, but it should also be possible to get it from one of the account-creation scripts floating around on the issue tracker.

Passing in these values, we get a response containing the above JSON object (along with other tweet data that we don't need), which is everything required to render a Community Note.
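As a rough illustration of how little of that object is needed, here is a minimal Nim sketch that pulls the four displayed fields out of a "birdwatch_pivot" node. The CommunityNote field names follow the type added in this PR; the proc name parseCommunityNote and the abbreviated sample strings are assumptions made for the example, not the PR's actual code.

```nim
import std/json

type
  CommunityNote = object
    title, subtitle, footer, url: string

proc parseCommunityNote(pivot: JsonNode): CommunityNote =
  ## Hypothetical helper: reads the displayed fields of "birdwatch_pivot".
  ## `{}` returns nil for a missing key and getStr maps nil to "",
  ## so an absent field simply becomes an empty string.
  CommunityNote(
    title: pivot{"title"}.getStr,
    subtitle: pivot{"subtitle", "text"}.getStr,
    footer: pivot{"footer", "text"}.getStr,
    url: pivot{"destinationUrl"}.getStr)

when isMainModule:
  # Sample trimmed from the response above; the texts are abbreviated.
  let pivot = parseJson("""
  {
    "destinationUrl": "https://twitter.com/i/birdwatch/n/1692711122338533397",
    "footer": {"text": "Context is written by people who use X."},
    "subtitle": {"text": "Blocking is a basic safety feature."},
    "title": "Readers added context they thought people might want to know"
  }""")
  let note = parseCommunityNote(pivot)
  echo note.title
  echo note.url
```

The entities arrays are ignored here on purpose; turning them into links is a separate step.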

Implementation

  • Each object in guest_accounts.json now requires a guest_token field (alongside the oauth_token and oauth_token_secret). This guest token will be passed in as the x-guest-token header only when retrieving a Community Note.
  • When we parse a tweet in parseGraphTweet, we check if has_birdwatch_notes is true. If it is, we then invoke the TweetResultByRestId endpoint described above in order to get the Community Note. Since the overwhelming majority of tweets do not have a Community Note, we should call this endpoint fairly infrequently (unless Nitter users like to view tweets with Community Notes on them).
  • Before rendering the Community Note, we need to process any entities in it. So far, I've only seen one kind - a link replacement. This is used to make 'Find out more' into a link, and whenever links are embedded into the actual text of the Community Note. The format is similar to other entities, so I was able to reuse some of the code.
  • The note is rendered using styles from a new community-note.scss file. I'm not a web designer, so I just copy-pasted card.scss, removed the ellipses logic, and renamed/removed classes. The result isn't perfect, but it includes all of the elements of a Community Note that display on twitter.com (the link to the note feedback pages, any external links, and the 'Find out more' in the footer).
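The link-replacement step can be sketched roughly as follows. In the sample response, fromIndex/toIndex behave like 0-based, end-exclusive character offsets (83 to 96 covers exactly "Find out more", and the "…" in the subtitle counts as one position), so this sketch indexes by Unicode code point rather than by byte. The names NoteLink, parseLinks and linkify are hypothetical, the HTML output is only for illustration, and escaping of the text and URL is left out.

```nim
import std/[algorithm, json, unicode]

type
  NoteLink = object
    fromIndex, toIndex: int  # 0-based, end-exclusive, counted in code points
    url: string

proc parseLinks(section: JsonNode): seq[NoteLink] =
  ## Collects the entities of a "footer" or "subtitle" object.
  for e in section{"entities"}.getElems:
    result.add NoteLink(
      fromIndex: e{"fromIndex"}.getInt,
      toIndex: e{"toIndex"}.getInt,
      url: e{"ref", "url"}.getStr)

proc linkify(text: string; links: seq[NoteLink]): string =
  ## Wraps each entity range in an anchor tag (no HTML escaping here).
  let runes = text.toRunes
  var pos = 0
  for link in links.sortedByIt(it.fromIndex):
    # Skip ranges that overlap an earlier one or fall outside the text.
    if link.fromIndex < pos or link.toIndex < link.fromIndex or
        link.toIndex > runes.len:
      continue
    result.add $runes[pos ..< link.fromIndex]
    result.add "<a href=\"" & link.url & "\">"
    result.add $runes[link.fromIndex ..< link.toIndex]
    result.add "</a>"
    pos = link.toIndex
  result.add $runes[pos ..< runes.len]

when isMainModule:
  let footer = parseJson("""
  {
    "text": "Context is written by people who use X, and appears when rated helpful by others.  Find out more.",
    "entities": [{"fromIndex": 83, "toIndex": 96,
                  "ref": {"url": "https://twitter.com/i/flow/join-birdwatch"}}]
  }""")
  echo linkify(footer{"text"}.getStr, parseLinks(footer))
```

Whether the offsets are really code points or UTF-16 units cannot be told from this one sample (the two agree for these strings), so that choice is an assumption worth checking against notes containing emoji.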

Other notes

I ended up making the request to the TweetDetail from within parseTweet, which requires making lots of parse functions async. This doesn't seem great, but the only alternatives I came up with seemed worse:

  • We could have the initial parsing just mark tweets with notes (by looking at "has_birdwatch_notes"), and do a second pass to actually fetch them. Unfortunately, there are lots of calls to parseGraphTweet, all of which would need an extra call afterwards.
  • We could store a Future directly, and only await it later from some already async function. Unfortunately, this breaks the flatty serialization in redis_cache, since a Future can't be serialized. I didn't see a clear way to make this work with flatty.

I didn't include the Community Notes icon - it's an inline SVG on Twitter.com, and I felt somewhat nervous about copyright issues.

This is my first time using Nim, so please let me know if any of this could be done in a more idiomatic way.

@zedeus
Owner

zedeus commented Sep 18, 2023

Thank you! I really appreciate all the effort you put into this.

> However, this endpoint doesn't work with normal OAuth tokens (probably because it's a twitter.com endpoint instead of api.twitter.com)

It should work by simply changing twitter.com/i/api to api.twitter.com. With that said, there's a better way to get this data which has no rate limit and doesn't use any authorization:
https://cdn.syndication.twimg.com/tweet-result?id=1692558493255942213&lang=en&token=0
token doesn't matter, it just seems to be a caching key so it can be left as 0. id is the tweet ID, and no headers are required.
It's also served by a CDN so it's very fast.
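A minimal Nim sketch of calling that endpoint, assuming only what is stated above (the id, lang and token query parameters, and no headers). The proc names syndicationUrl and fetchSyndicationTweet are hypothetical, and the shape of the returned JSON is not described in this thread, so the sketch returns it unparsed beyond parseJson.

```nim
import std/[asyncdispatch, httpclient, json, strutils, uri]

const syndicationBase = "https://cdn.syndication.twimg.com/tweet-result"

proc syndicationUrl(id: string): string =
  ## Builds the request URL; token is left as "0" as suggested above.
  if id.len == 0 or not id.allCharsInSet(Digits):
    raise newException(ValueError, "tweet id must be numeric: " & id)
  syndicationBase & "?" & encodeQuery({"id": id, "lang": "en", "token": "0"})

proc fetchSyndicationTweet(id: string): Future[JsonNode] {.async.} =
  ## Fetches and parses the response. HTTPS requires compiling with -d:ssl.
  let client = newAsyncHttpClient()
  try:
    result = parseJson(await client.getContent(syndicationUrl(id)))
  finally:
    client.close()

when isMainModule:
  echo syndicationUrl("1692558493255942213")
```

Only the URL builder runs in the main block; the fetch proc is left uncalled so the example does not depend on network access.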

> I ended up making the request to the TweetDetail from within parseTweet, which requires making lots of parse functions async.

This is indeed a pain in the butt, but I would definitely go with a different solution to keep it simple and less intrusive. Similar logic is done elsewhere by splitting up the API calls, and only fetching what is needed. In this case the communityNote option can be an empty tweet (just some Tweet()), which can then be checked from the /status/ route, performing another API request if tweet.communityNote.isSome(), and then replacing it.

Similar-ish behavior:

nitter/src/parser.nim

Lines 308 to 309 in d7ca353

if parsed.retweet.isSome:
  parsed.retweet = some parseLegacyTweet(tweet{"retweeted_status"})

More similar from the profile/timeline route (the little photo gallery on profile pages only gets fetched if it's enabled in user preferences, and the user isn't accessing the Media tab):

rail =
  skipIf(skipRail or query.kind == media, @[]):
    getCachedPhotoRail(name)

Alternatively, you could extend this logic:

nitter/src/api.nim

Lines 110 to 113 in d7ca353

proc getTweet*(id: string; after=""): Future[Conversation] {.async.} =
  result = await getGraphTweet(id)
  if after.len > 0:
    result.replies = await getReplies(id, after)

by adding something like this:

if not result.tweet.isNil and result.tweet.communityNote.isSome():
  result.tweet.communityNote = await getCommunityNote(id)

This wouldn't cover replies or other tweets in the thread, but I'd consider that a bit overkill anyway.
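To make the placeholder idea concrete, here is a small self-contained sketch in which the parser stays synchronous and only marks a tweet as having a note, while the async API layer fills it in afterwards. Tweet and Conversation are cut-down stand-ins for Nitter's real types, communityNote is shown as an Option[string], and parseTweetStub and the body of getCommunityNote are hypothetical stubs that replace the real parsing and network code.

```nim
import std/[asyncdispatch, options]

type
  Tweet = ref object
    id: string
    communityNote: Option[string]
  Conversation = ref object
    tweet: Tweet

proc parseTweetStub(id: string; hasBirdwatchNotes: bool): Tweet =
  ## Stand-in for the synchronous parser: it never does I/O and only
  ## leaves an empty placeholder when has_birdwatch_notes is true.
  result = Tweet(id: id)
  if hasBirdwatchNotes:
    result.communityNote = some("")

proc getCommunityNote(id: string): Future[Option[string]] {.async.} =
  ## Stand-in for the real request; returns canned text.
  return some("note text for " & id)

proc getTweet(id: string; hasBirdwatchNotes: bool): Future[Conversation] {.async.} =
  result = Conversation(tweet: parseTweetStub(id, hasBirdwatchNotes))
  # Only the tweet being viewed gets a second request, and only if flagged.
  if not result.tweet.isNil and result.tweet.communityNote.isSome:
    result.tweet.communityNote = await getCommunityNote(id)

when isMainModule:
  let conv = waitFor getTweet("1692558493255942213", true)
  echo conv.tweet.communityNote.get
```

With this split the parse functions do not need to become async, at the cost of not covering replies or other tweets in the thread.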

As a last note, I'd definitely want to put this behind a user preference, but I can take care of that if you're not feeling too adventurous.

Comment on lines +215 to +219
CommunityNote* = object
  title*: string
  subtitle*: string
  footer*: string
  url*: string
Owner

It seems to me that the only important thing that changes is subtitle; the rest can just be hardcoded, turning this into just communityNote*: Option[string].

@Aaron1011
Author

Aaron1011 commented Sep 18, 2023

> It should work by simply changing twitter.com/i/api to api.twitter.com

Unfortunately, that didn't work in my local testing. The api.twitter.com endpoints never seem to contain birdwatch_pivot.

> With that said, there's a better way to get this data which has no rate limit and doesn't use any authorization:

Oh, that's very nice! I'll adjust the PR to use that instead.

> This wouldn't cover replies or other tweets in the thread, but I'd consider that a bit overkill anyway.

That was the main reason I went with the async approach - I think it would be really unfortunate to be unable to see Community Notes if you're linked to a reply to the original tweet.

@zedeus
Owner

zedeus commented Sep 18, 2023

It would still be best to process the thread and replies separately instead of stuffing it into the parser code. Getting the community note for the tweet being viewed seems good enough as a first step.

@Victor239

Something else to consider is showing this content in Nitter's RSS feeds as well.


3 participants