
Using AI to improve Community Notes #163

Open
JayThibs opened this issue Nov 14, 2023 · 14 comments

Comments

@JayThibs

In order to improve the speed at which important community notes get added, and to help community noters write better notes, I'm curious whether people have put effort into using a mix of AI (language models) and simpler methods. I'd like to help with this if we can make it work economically.

For example, you could have a scaffolding approach that looks for specific words, then feeds the tweet into an embedding model to measure semantic similarity to contentious issues, and finally into an LLM that ranks how important it is for the tweet to get a community note and provides additional context (through a web search and the LLM's internal knowledge) to help the community noter. I think there's a way to make this economically viable for companies.
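
As a rough sketch of what such a triage pipeline could look like (the keyword list, topic list, similarity threshold, and prompt below are placeholders I'm inventing, and it assumes the OpenAI Python client):

```python
# Hypothetical three-stage triage: keyword filter -> embedding similarity -> LLM ranking.
# All lists, thresholds, and prompts are placeholders, not a worked-out design.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SENSITIVE_TERMS = {"vaccine", "election", "war", "cure"}                      # placeholder
CONTENTIOUS_TOPICS = ["disputed health claims", "claims of election fraud"]   # placeholder

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=[text])
    return np.array(resp.data[0].embedding)

TOPIC_VECTORS = [embed(t) for t in CONTENTIOUS_TOPICS]

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def needs_review(tweet: str, sim_threshold: float = 0.8) -> bool:
    # Stage 1: cheap keyword check
    if not any(term in tweet.lower() for term in SENSITIVE_TERMS):
        return False
    # Stage 2: embedding similarity to known contentious topics
    vec = embed(tweet)
    return max(cosine(vec, t) for t in TOPIC_VECTORS) >= sim_threshold

def rank_for_note(tweet: str) -> str:
    # Stage 3: ask an LLM how much the tweet would benefit from a Community Note
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "On a scale of 1-10, how much would this tweet benefit from a "
                       f"Community Note, and what context would help a note writer?\n\n{tweet}",
        }],
    )
    return resp.choices[0].message.content
```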

Yes, companies: I want Community Notes to expand beyond X. Let's figure out how to connect it to YouTube. Why haven't other social media websites picked it up yet? If they care about truth, this would be a considerable step forward. Notes like “this video is funded by x nation” or “this video talks about health info; go here to learn more” are simply not good enough. We need to improve the state of truth-seeking on the internet.

Not just that: as an AI Safety researcher, this is particularly important to me. Don't forget that we train language models on the internet! The more truthful your dataset is, the more truthful the models will be. Let's revamp the internet for truthfulness, and we'll subsequently improve truthfulness in our AI systems!

@JayThibs
Author

JayThibs commented Nov 14, 2023

I used GPT-4 to come up with a little estimate of the cost:

To estimate the cost of using GPT-3.5-turbo to verify whether a tweet contains misinformation, we need to consider both input and output token costs and the average token count per tweet analysis.

Average Token Count per Tweet Analysis:

A tweet can be up to 280 characters long. Considering that an average English word is around 4.5 characters plus a space, this translates to about 280 / 5.5 ≈ 51 words per tweet.
Tokens are not equivalent to words; a token can be a word, a part of a word, or punctuation. On average, we can estimate around 1.5 tokens per word, considering that longer words might be split into multiple tokens and punctuation also counts as tokens.
Therefore, each tweet can be estimated to involve about 51 * 1.5 ≈ 77 tokens.

Input Cost Calculation:
If you analyze 10,000 tweets per day, and each tweet is approximately 77 tokens, the total daily input tokens are 10,000 * 77 = 770,000 tokens.
The cost for input usage is $0.0030 per 1,000 tokens. So, the daily input cost is 770,000 / 1,000 * $0.0030 = $2.31.

Output Token Estimation:
The output for each tweet verification might vary, but let's assume an average of around 50 tokens per response (this is a rough estimate, as the model might provide a brief or detailed analysis depending on the complexity of the tweet).
For 10,000 tweets, the total output tokens would be 10,000 * 50 = 500,000 tokens.

Output Cost Calculation:
The cost for output usage is $0.0060 per 1,000 tokens. So, the daily output cost is 500,000 / 1,000 * $0.0060 = $3.00.

Total Daily Cost:
Adding both input and output costs, the total daily cost would be $2.31 (input) + $3.00 (output) = $5.31.

Monthly Cost Estimation:
If this operation runs every day for a month (30 days), the monthly cost would be $5.31 * 30 = $159.30 (per month).
This is a rough estimate based on average values for token length and response size. The actual cost may vary depending on the exact length of each tweet and the verbosity of the model's responses.
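
For reference, here's the same back-of-the-envelope arithmetic as a small script (the token counts and per-1K prices are the assumptions from the estimate above, not a statement about current OpenAI pricing):

```python
# Back-of-the-envelope daily/monthly cost using the assumptions above.
TWEETS_PER_DAY = 10_000
INPUT_TOKENS_PER_TWEET = 77    # ~51 words * ~1.5 tokens/word
OUTPUT_TOKENS_PER_TWEET = 50   # assumed average response length
INPUT_PRICE_PER_1K = 0.0030    # $ per 1K input tokens (assumed)
OUTPUT_PRICE_PER_1K = 0.0060   # $ per 1K output tokens (assumed)

input_cost = TWEETS_PER_DAY * INPUT_TOKENS_PER_TWEET / 1000 * INPUT_PRICE_PER_1K
output_cost = TWEETS_PER_DAY * OUTPUT_TOKENS_PER_TWEET / 1000 * OUTPUT_PRICE_PER_1K
daily_cost = input_cost + output_cost
print(f"daily: ${daily_cost:.2f}, monthly (30 days): ${daily_cost * 30:.2f}")
# daily: $5.31, monthly (30 days): $159.30
```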

This seems like a reasonable cost to aim for. For big social media companies, this is nothing (even if we increase it to $1000/month). I picked 10k tweets per day because we could use other, cheaper methods to filter down to about that many tweets per day. For example, only send in tweets that have over 100 likes (or more), and rank how important it is to send a tweet to 3.5-turbo or 4 based on a score that takes into account a list of sensitive words and embedding scores. I'm sure there's additional stuff we could add here. Clip anything with a low score and max out at x number of tweets per day.
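
A minimal sketch of that pre-filtering step (the 100-like cutoff, the weights, and the score threshold are arbitrary placeholders):

```python
# Hypothetical pre-filter: score tweets cheaply, keep only the top N per day for the LLM.
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    likes: int
    embedding_score: float  # similarity to contentious topics from the embedding stage
    sensitive_hits: int     # number of sensitive-word matches

def cheap_score(t: Tweet) -> float:
    if t.likes < 100:       # ignore low-engagement tweets entirely
        return 0.0
    # Weighted mix of embedding similarity and keyword hits (weights are made up)
    return 0.6 * t.embedding_score + 0.4 * min(t.sensitive_hits, 5) / 5

def select_for_llm(tweets: list[Tweet], daily_cap: int = 10_000,
                   min_score: float = 0.3) -> list[Tweet]:
    scored = sorted(tweets, key=cheap_score, reverse=True)
    kept = [t for t in scored if cheap_score(t) >= min_score]  # clip low scores
    return kept[:daily_cap]                                    # cap at N tweets per day
```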

You could add gpt-4-vision for tweets that contain images or gpt-4 for the y number of most-liked tweets.

Also, it would be good to make it even easier for people to write great community notes. I'm sure GPT-4 + search and other models can help with this.

(Of course, the companies could probably save even more money by just running a fine-tuned model, and maybe fine-tuned embeddings, on their own GPU.)

@JayThibs
Author

Ok, some additional things to try:

  • MuMiN: it has a dataset and leaderboard for identifying misinformation tweets (it even has a multimodal part), plus a tutorial on how to use it; we could train a model on this (see the loading sketch after this list). The MuMiN dataset is a challenging benchmark for automatic misinformation detection models. It is structured as a heterogeneous graph and features 21,565,018 tweets and 1,986,354 users, belonging to 26,048 Twitter threads, discussing 12,914 fact-checked claims from 115 fact-checking organisations in 41 different languages, spanning a decade.
  • Could probably find more relevant stuff here: Papers with Code - Misinformation
  • Misinfo Baselines and related paper
  • Repository for fake news detection datasets
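
Here's a minimal sketch of loading MuMiN, based on my reading of their tutorial; the exact argument names and export methods may differ from the current `mumin` package, so treat this as an outline rather than working code:

```python
# Rough outline of compiling MuMiN and exporting it for a misinformation classifier.
from mumin import MuminDataset

# Tweets are rehydrated from Twitter, so a Twitter API bearer token is required.
dataset = MuminDataset(twitter_bearer_token="<YOUR_BEARER_TOKEN>", size="small")
dataset.compile()  # downloads and builds the heterogeneous graph locally

# The compiled dataset can then be exported to a graph library (e.g. DGL) to train
# a classifier on the claim/tweet misinformation labels.
graph = dataset.to_dgl()
print(graph)
```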

@JayThibs
Author

JayThibs commented Nov 15, 2023

Alright, so I started working on a repo for this.

[Screenshot: work-in-progress repo, 2023-11-15]

@luckybear97

luckybear97 commented Nov 15, 2023

I don't think AI is needed for Community Notes. Note prioritization is already implemented on the X/Twitter backend, and adding another layer of AI verification would slow it down dramatically.

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against X/Twitter's goals. In addition, using GPT (OpenAI) while Grok is part of X simply doesn't make sense.

@JayThibs
Author

@luckybear97

I don't think AI is needed for Community Notes. Note prioritization is already implemented on the X/Twitter backend, and adding another layer of AI verification would slow it down dramatically.

I'm not familiar with the codebase, so I don't really know how the algorithm works. My guess is that AI could improve the ranking if used in addition to the existing system? I'd be surprised if the algorithm's efficiency can't be improved. If not, there's still the community notes assistant that could help note writers.

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against X/Twitter's goals.

You can use custom LLMs (like one of the fine-tuned open source models) or optimize your system prompt / instructions accordingly to fit your use-case. I don't think this is really an issue.

In addition, using GPT (OpenAI) while Grok is part of X simply doesn't make sense.

Grok has no API. They could use an internal Grok model if they like; I wouldn't mind. But OpenAI/Anthropic have APIs, so I started with them.

But honestly, Grok's personality is not very well-suited to help with this IMO.

@armchairancap

I don't think AI is needed for Community Notes.

Yes. It's a form of mission creep and it would add complexity and cost without adding any value, especially considering that AI is heavily regulated and each jurisdiction may have its own official or approved AI that's biased to promote the official lies.

@TheApproach

TheApproach commented Dec 9, 2023

IMO, Community Notes remaining a source of genuine human feedback, and minimizing generated feedback, is ideal.
At least until fully cognizant & sentient AI people show up in the community, I suppose.

These are the notes of the community, and also an especially valuable piece of public discourse that many AIs will undoubtedly look at. I think getting AI involved here would lead to nonsense artifacts, similar to tampering with the control samples in an experiment, making the results less useful.

@xcsf6

xcsf6 commented Dec 13, 2023

I don't think AI is needed for Community Notes. Note prioritization is already implemented on the X/Twitter backend, and adding another layer of AI verification would slow it down dramatically.

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against X/Twitter's goals. In addition, using GPT (OpenAI) while Grok is part of X simply doesn't make sense.

Are you claiming that RLHF and other alignment data, with their "AI Safety" guidelines meant to remove bias, actually increase bias, and that X/Twitter's goal is to reflect the "raw data's biases" in Community Notes?

@armchairancap

Are you claiming that RLHF and other alignment data, with their "AI Safety" guidelines meant to remove bias, actually increase bias, and that X/Twitter's goal is to reflect the "raw data's biases" in Community Notes?

Yes, I do.

You completely ignore the self-evident fact that governments are often the biggest and most systematic spreaders of dis- and misinformation, that the AI "laws" X would have to follow would be different in every jurisdiction, and that "safe" AI models can be wrong, i.e. rigged by the state.

In your view, North Korea's state-approved AI should be able to apply its "AI Safety guidelines" to my Community Note exposing that state's lies.

In addition to being impractical and impossible to implement, it is preposterous to think that Community Notes should be subject to various types of state-mandated AI censorship.

Why even bother having humans in the loop?

@tactipus

lol

@xcsf6

xcsf6 commented Jan 21, 2024

@armchairancap @luckybear97

Please don't argue with a "straw man."

I never talked about governments in my previous comment.

You completely ignore the self-evident fact that governments are often the biggest and most systematic spreaders of dis- and misinformation, that the AI "laws" X would have to follow would be different in every jurisdiction, and that "safe" AI models can be wrong, i.e. rigged by the state.

In your view, North Korea's state-approved AI should be able to apply its "AI Safety guidelines" to my Community Note exposing that state's lies.

In addition to being impractical and impossible to implement, it is preposterous to think that Community Notes should be subject to various types of state-mandated AI censorship.

To say the least, the free speech clause of the First Amendment to the US Constitution protects corporations like X, Meta, Google, etc. against government law enforcement.

Thus, LLMs created by X in the U.S. would be interpreted as X's corporate speech.

Why even bother having humans in the loop?

Which subset of humans is in the loop?

Because X could select Community Notes contributors in an arbitrary and closed manner, X could induce "selection biases". However, this is still considered "X's free speech right to be biased."

If X's goal were truly bias-free contributor selection, it would only need to use identifiers like phone numbers or biometrics, but instead X uses "violation history under X's moderation rules".

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against X/Twitter's goals.

I don't think the current contributor selection mechanism is free from the biases mentioned above.

@xcsf6

xcsf6 commented Jan 21, 2024

@armchairancap

BTW, if you are an anarcho-capitalist, you have NO free speech rights against Microsoft on this GitHub platform, because of MS's absolute private ownership of the computing clusters.

@luckybear97

@armchairancap @luckybear97

Please don't argue with a "straw man."

I never talked about governments in my previous comment.

You completely ignore the self-evident fact that governments are often the biggest and most systematic spreaders of dis- and misinformation, that the AI "laws" X would have to follow would be different in every jurisdiction, and that "safe" AI models can be wrong, i.e. rigged by the state.

In your view, North Korea's state-approved AI should be able to apply its "AI Safety guidelines" to my Community Note exposing that state's lies.

In addition to being impractical and impossible to implement, it is preposterous to think that Community Notes should be subject to various types of state-mandated AI censorship.

To say the least, the free speech clause of the First Amendment to the US Constitution protects corporations like X, Meta, Google, etc. against government law enforcement.

Thus, LLMs created by X in the U.S. would be interpreted as X's corporate speech.

Why even bother having humans in the loop?

Which subset of humans is in the loop?

Because X could select Community Notes contributors in an arbitrary and closed manner, X could induce "selection biases". However, this is still considered "X's free speech right to be biased."

If X's goal were truly bias-free contributor selection, it would only need to use identifiers like phone numbers or biometrics, but instead X uses "violation history under X's moderation rules".

Furthermore, LLMs are hardly free from biases: their data is selectively "handpicked" according to their "AI Safety" guidelines, which goes against X/Twitter's goals.

Perhaps you're confused. I never claimed "X" is free from bias; it's just that "free speech" is their goal, and using an LLM to selectively rate or process notes would terribly undermine it.

I don't think the current contributor selection mechanism is free from the biases mentioned above.

I don't know what their criteria are for becoming a contributor, and I never claimed it's free from biases. They are a private, for-profit company that has every right to decide whatever rules they want, although I do agree it's best to be transparent about it.

@xcsf6

xcsf6 commented Jan 21, 2024

Perhaps you're confused. I never claimed "X" is free from bias; it's just that "free speech" is their goal, and using an LLM to selectively rate or process notes would terribly undermine it.

I never addressed what your claim is, but @armchairancap said "You completely ignore the self-evident fact...", which refers to something I never said.
