AI automation for newsletter #1384

Open
ElhamAryanpur opened this issue Jun 8, 2023 · 46 comments

@ElhamAryanpur
Contributor

Hello!

After some conversation with Ozkriff and looking at how much work goes into editing the newsletter each month, I was wondering if it'd be a good idea to start using some automation tools.

For editorial work, something like ChatGPT/GPT-4 can assist a lot. What I had in mind was a bot that runs on each pull request, checks the content that was added, and feeds it to the AI for editing and auditing. It would return either a fixed version or a list of things to fix or update.

For example, one of my PRs had too much repetition and extra information, and the title wasn't good. In such a case the bot could fix all of that given the newsletter guidelines, or notify me to do it the way required.

Cost-wise, since the newsletter is a monthly release, I don't think we would exceed a dollar even in the busiest month given how cheap the API is. And since each section is small, the context length of GPT-3.5 won't be a problem either.
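
The cost estimate above can be checked with a little arithmetic. This is only a sketch: the section count, tokens per section, and price per 1,000 tokens are hypothetical inputs, not figures from this thread, and real pricing would need to be looked up.

```rust
/// Rough monthly API cost in US dollars.
/// All three inputs are hypothetical placeholders, not measured values.
fn estimate_monthly_cost(sections: u32, tokens_per_section: u32, usd_per_1k_tokens: f64) -> f64 {
    // Total tokens sent and received across every section in one month.
    let total_tokens = f64::from(sections) * f64::from(tokens_per_section);
    total_tokens / 1000.0 * usd_per_1k_tokens
}
```

For instance, 60 sections of 1,500 tokens each at an assumed $0.002 per 1,000 tokens comes to about $0.18.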

One other thing I remembered is that it can also assist in writing for projects that were announced but had no one to write about them; for example, we had Rusty Jam #3. It could write a section for such a project without taking time off the editors, and the result can just be reviewed for fixes and added.

These are just my suggestions, so I'm not sure how appealing they are. I have some experience with OpenAI, so I can assist in implementing it.

@erlend-sh
Member

I would love to see this experimented with.

The AI-assistance track is well worth exploring, but it's worth noting that this type of automation could also work with a much more basic feed aggregator that simply asks projects to list their update feeds (blog rss, mastodon, github releases etc.) and it'd create summaries for projects by simply linking out to their updates for the past month.
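
A minimal sketch of the link-only aggregator described above, assuming the feed items have already been fetched and parsed. The `FeedItem` struct and its field names are illustrative, not part of any existing tool.

```rust
use std::collections::BTreeMap;

/// One entry pulled from a project's feed (RSS, Mastodon, GitHub releases, ...).
/// Field names are illustrative only.
struct FeedItem {
    project: String,
    title: String,
    url: String,
    /// ISO 8601 date, e.g. "2024-04-04"; these compare correctly as plain strings.
    date: String,
}

/// Group items published in the inclusive range `[from, to]` by project and
/// render one Markdown section per project that only links out to the updates.
fn render_link_digest(items: &[FeedItem], from: &str, to: &str) -> String {
    // BTreeMap keeps the projects in alphabetical order.
    let mut by_project: BTreeMap<&str, Vec<&FeedItem>> = BTreeMap::new();
    for item in items {
        if item.date.as_str() >= from && item.date.as_str() <= to {
            by_project.entry(item.project.as_str()).or_default().push(item);
        }
    }
    let mut out = String::new();
    for (project, updates) in by_project {
        out.push_str(&format!("### {project}\n\n"));
        for u in updates {
            out.push_str(&format!("- [{}]({}) ({})\n", u.title, u.url, u.date));
        }
        out.push('\n');
    }
    out
}
```

No summarizing happens here; each project section is just a list of links for the chosen month.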

@ElhamAryanpur
Contributor Author

@erlend-sh that is actually a really really awesome idea!!!

@Vrixyz
Collaborator

Vrixyz commented Nov 23, 2023

This Week in Rust has an interesting bot which might be worth investigating: https://github.com/extrawurst/twir-bot

@janhohenheim
Collaborator

janhohenheim commented Apr 11, 2024

@ElhamAryanpur if you're still up for implementing something like this, I'd be very up for reviewing it and getting it merged :)
I also see some great time saving potential on the editing side and would gladly pay for the API access, since it would be really cheap.

The feature I'd like to see the most would be a short automated summary for content no one has written anything for yet. Maybe it's already enough to feed the raw HTML to GPT and ask it for a summary? I also know that there are services that do this kind of thing for you using GPT like https://notegpt.io/web-summary, idk if they're better than just entering our own prompt though.

@ElhamAryanpur
Contributor Author

@janhohenheim absolutely! Since then I've invested a lot of time in LLM-based software on my own side, and I can say it's a better time than ever to do something like this.

We can have four approaches:

  1. Using a custom model hosted on a VPS or similar. This provides full privacy and can be reused by other Rust-based newsletters and publications, or even for social media moderation. Periodically or on a trigger, the hosted model fetches changelogs and similar updates, and writes all the summaries and reports itself through a RAG architecture.

  2. Using a custom or stock model hosted locally by a maintainer. Same as above, except that it is very cost-effective and pretty much free, with no need for API access anywhere. Models like Mistral 7B v0.2 have over 32k context length, are about 4.5GB in model file size (GGUF), and can run on any modern computer, so pretty much anyone can use them. We can even use fine-tuned versions such as Hermes or Dolphin Mistral for better results.

  3. Fine-tuning a model with a cloud provider such as OpenAI, Claude, Google, ... A bit expensive and at the mercy of the cloud provider, but it can have the benefits of the first option.

  4. Using a stock model from a cloud provider such as OpenAI GPT-4, Claude, Gemini, ... Cheaper than the third option but with the same risks. An issue with these two options is degradation of the models over time as more guardrails are introduced, and they can put a dent in your bank account if prices suddenly change. They can also block your access on a whim.

Personally I think the second option would be the best to start with. RAG helps with automatic search of changelogs and with summary writing. Pull requests too, though that is perhaps a bit more difficult to automate locally than with GitHub Actions 😅.

Let me know which option you think is nicer and I can begin.

@erlend-sh
Member

I also recommend checking out https://spiderwebai.xyz/ by @j-mendez

@janhohenheim
Collaborator

janhohenheim commented Apr 11, 2024

@ElhamAryanpur great to hear! Since the newsletter has historically struggled with maintainer burden, I am more inclined to option 4. You know this stuff better than me though, so if you think that option 2 would be really really good for us, I'm ready to rent a cheap server on DigitalOcean and give you access.
Also, what do you think about the service @erlend-sh mentioned?

@ElhamAryanpur
Contributor Author

I also recommend checking out https://spiderwebai.xyz/ by @j-mendez

Yeah, they're using RAG too, most likely with LangChain I assume.

@ElhamAryanpur
Contributor Author

@ElhamAryanpur great to hear! Since the newsletter has historically struggled with maintainer burden, I am more inclined to option 4. You know this stuff better than me though, so if you think that option 2 would be really really good for us, I'm ready to rent a cheap server on DigitalOcean and give you access.
Also, what do you think about the service @erlend-sh mentioned?

That is very true, I wrote a section about my work there in the past and it shocked me how much work the maintainers did every month...

Yeah, we can start locally for development and get some early testing on the newsletter. If the results are great, we can then move to hosting or keep it local. I just don't wish to burden you with paying for servers or an API 😅 I'm trying to get a solution that anyone can use and contribute to instead of hurting your wallet, especially at this stage.

@janhohenheim
Collaborator

janhohenheim commented Apr 11, 2024

@ElhamAryanpur alright then! Do you need anything from me to start? How do you want to organize yourself? If you create a repo with a readme on how to run the model, I can ensure it runs on my machine in the background (or on a machine I rented anyway to host a Minecraft server, hehe)

@ElhamAryanpur
Contributor Author

For sure, it'll probably be a repo. I'm not sure about much else yet, will keep you updated here. Thank you!

@j-mendez

@ElhamAryanpur great to hear! Since the newsletter has historically struggled with maintainer burden, I am more inclined to option 4. You know this stuff better than me though, so if you think that option 2 would be really really good for us, I'm ready to rent a cheap server on DigitalOcean and give you access.
Also, what do you think about the service @erlend-sh mentioned?

That is very true, I have wrote a section about my work there in the past and it shocked me how much work the maintainers did every month...

Yeah we can start locally for development, get some early testing on the newsletter, if the results were great, we can then move to hosting or keep it locally. I just don't wish to burden you for paying the servers or API 😅 trying to get a solution that anyone can use and contribute instead of hurting your wallet, especially at this stage

Hi! If the bandwidth is minimal and simply a page or two (it would take a lot of requests to get to $1), we also do not pad the cost for GPT from OpenAI. The dashboard is at a very early stage and being actively improved. The service is more fleshed out from an API perspective atm. I recommend testing a basic prompt on the GPT playground to see how it works off a small set of HTML, then using the GPT configuration to extract what is needed etc. Lmk if you have any questions. Thanks @erlend-sh!

@ElhamAryanpur
Contributor Author

Hm? We did talk about it in the options listed

@j-mendez

j-mendez commented Apr 11, 2024

Hm? We did talk about it in the options listed

If you create an account I can add a dollar to the account to experiment. The service goal is pretty much putting this project on a server to scale https://github.com/spider-rs/spider.

@ElhamAryanpur
Contributor Author

Hm? We did talk about it in the options listed

If you create an account I can add a dollar to the account to experiment. The service goal is pretty much putting this project on a server to scale https://github.com/spider-rs/spider. We are in the middle of making a dashboard that is like the supabase dashboard to view all of the data from the crawls etc, should be out by next week.

Oh, the issue isn't that it can't be done through OpenAI, we're just exploring different options. I'm looking into making it run locally to keep charges from ever occurring. It won't just be a page or two of review: it'll also crawl the changelogs and releases of different projects and compile them too, so we're looking at a lot of tokens being used. But yeah, the code should be usable with any service, including OpenAI, in the future. For now I'm keeping things simple during development.

@iolivia
Contributor

iolivia commented Apr 13, 2024

Hey folks 👋

I noticed last weekend we have not been publishing any newsletters recently, stumbled upon this and the other discussion about maintenance burden, and I wanted to try out an experiment to see if we can improve this. I sort of reached very similar conclusions to the ideas in this thread that more automation is needed to scan stuff, some AI to summarise stuff (or this needs to be done by a human in the meantime), and in general something that can ease the maintenance burden, for example having a basic script that can prepare a draft that needs to be edited, rather than fully created.

Take a look at my experiment here - https://github.com/iolivia/newsletter-bot

Current things it can do:

  • Filter the updates by a given time range
  • Fetch GitHub releases for engine and library updates - these sections are half automated with this approach: the release notes are there but they need to be summarised somehow, and sometimes you need to follow the links to blog posts with release notes
  • Fetch GitHub issues for generating requests for contributions - this section is 💯 automated with this approach
  • Fetch Reddit threads for open discussions - this section is 💯 automated with this approach
  • Generate basic markdown

Here is an example output of the local script:

GITHUB_TOKEN=github_pat_<token> cargo run -- 2024-04-01 2024-04-13
Args 2024-04-01 - 2024-04-13
Rust-SDL2/rust-sdl2
bevyengine/bevy
Found release: ✅ v0.13.2 2024-04-04 21:01:55 UTC
Found release: ❌ v0.13.1 2024-03-18 22:38:27 UTC
Found release: ❌ v0.13.0 2024-02-17 19:32:58 UTC
Found release: ❌ v0.12.1 2023-11-30 01:23:10 UTC

Rust-SDL2/rust-sdl2 - 1 Beginner Open Issues - ✅
bevyengine/bevy - 99 Beginner Open Issues - ✅
PistonDevelopers/piston - 0 Beginner Open Issues - ❌
not-fl3/macroquad - 1 Beginner Open Issues - ✅
ggez/ggez - 0 Beginner Open Issues - ❌
nannou-org/nannou - 0 Beginner Open Issues - ❌
jeremyletang/rust-sfml - 1 Beginner Open Issues - ✅

Found top post: Spell Casting system short devlog (written in Rust)
Found top post: This Month in Rust GameDev: Call for Submissions!
Found top post: We're still not game, but progress continues.
Found top post: banging my head against the wall (someone help me think about data structures)
Found top post: Working on a casting system with the first spell (in Rust)

And an example of the markdown file it produces here.
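
The release filtering visible in the output above (✅ inside the requested range, ❌ outside it) can be sketched as follows. This is not the code from newsletter-bot; the `Release` struct and function names are assumptions for illustration, and dates are compared as plain `YYYY-MM-DD` strings to avoid any external crate.

```rust
/// A release as listed by a hosting API; fields are illustrative.
struct Release {
    tag: String,
    /// Timestamp starting with an ISO 8601 date, e.g. "2024-04-04 21:01:55 UTC".
    published: String,
}

/// True when the release date falls inside the inclusive range `[from, to]`.
/// Dates are "YYYY-MM-DD" strings, which sort chronologically as text.
fn in_range(release: &Release, from: &str, to: &str) -> bool {
    match release.published.get(..10) {
        Some(day) => day >= from && day <= to,
        None => false, // malformed timestamp: treat as out of range
    }
}

/// Render one log line per release, marking whether it is in range.
fn report(releases: &[Release], from: &str, to: &str) -> Vec<String> {
    releases
        .iter()
        .map(|r| {
            let mark = if in_range(r, from, to) { "✅" } else { "❌" };
            format!("Found release: {mark} {} {}", r.tag, r.published)
        })
        .collect()
}
```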

Let me know what you think about this, maybe this is a good starting point 😄

@janhohenheim
Collaborator

@iolivia wooooah, that's cool! I'll take a closer look once I have time :)

@ElhamAryanpur
Contributor Author

@iolivia amazing work! I can help with the AI part for summary text, will open a PR

@janhohenheim
Collaborator

janhohenheim commented Apr 15, 2024

@iolivia I checked out the repo, and it looks really nice! Good work!
I'll drop you a PR later adding some sources.

One thing of note is that right now, the bot is a bit too good. Many of the news items provided are, in my opinion, not significant enough to be included in the newsletter. Removing them by hand is trivial though :) Other than that, we could ignore all posts below a certain amount of upvotes / hearts / retweets etc. and all crate updates that only change the patch version.
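
Both filters suggested above are easy to sketch. The version parser below only understands plain `major.minor.patch` tags with an optional leading `v`, and `min_score` is a hypothetical threshold the editors would have to tune; neither comes from the existing bot.

```rust
/// Parse "v1.2.3" or "1.2.3" into (major, minor, patch); None if malformed.
fn parse_version(tag: &str) -> Option<(u64, u64, u64)> {
    let mut parts = tag.trim_start_matches('v').split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// True when only the patch component changed between two releases.
fn is_patch_only(previous: &str, current: &str) -> bool {
    match (parse_version(previous), parse_version(current)) {
        (Some((a, b, p)), Some((c, d, q))) => a == c && b == d && p != q,
        _ => false, // unknown format: keep the update and let a human decide
    }
}

/// True when a post has enough reactions to be worth listing.
/// `min_score` is a hypothetical threshold to be tuned by the editors.
fn is_popular(score: u32, min_score: u32) -> bool {
    score >= min_score
}
```

Tags that do not parse, such as pre-release suffixes, are deliberately kept rather than dropped, so an editor still sees them.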

Another thing I'm wondering is how to use the bot in practice. Running it at the beginning of the newsletter (the 3rd of the month) seems useless, since it would only aggregate news of the last 3 days. Running it in the middle is a bit arbitrary and will miss quite a few cool updates. Maybe we could add a GitHub Action to run it right at the freeze period to add all news no one has written about yet? If we want this completely automated, we should add the output of the bot to the newsletter only if the newsletter does not already include that content.
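
The "only add what the newsletter does not already include" check could start out as simply as the sketch below. Matching on the URL is an assumption; a hand-written section that links somewhere else for the same news would not be detected.

```rust
/// Keep only the candidate URLs that the current newsletter draft does not mention yet.
/// Trailing slashes are ignored so that ".../post" and ".../post/" count as the same link.
fn missing_from_draft<'a>(draft_markdown: &str, candidate_urls: &[&'a str]) -> Vec<&'a str> {
    candidate_urls
        .iter()
        .copied()
        .filter(|url| !draft_markdown.contains(url.trim_end_matches('/')))
        .collect()
}
```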

Another nice thing would be Discord integration like in TWIR, but that's very much optional.

For the moment, the bot is definitely good enough to be used manually. Again, great work!

@ElhamAryanpur
Contributor Author

The Discord integration can be added through webhooks, I have done a few projects with it so I can assist with that. And a solution for the news can be:

  1. as you said, a CI job that periodically checks for news and keeps the items with high upvotes and hearts
  2. store them in a "to be summarized" section, or somewhere similar to gather them, skipping duplicates
  3. when we are near the newsletter date, we can then check those sections and use AI or a human to summarize them. This should solve both issues of not being too early or too late.
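
The three steps above amount to a small queue that periodic runs feed and that is emptied near the release date. A minimal in-memory sketch, with hypothetical names and a hypothetical `min_score` threshold (a real version would persist the queue between runs):

```rust
use std::collections::BTreeMap;

/// News items gathered by periodic runs, waiting to be summarized.
/// Keyed by URL so that repeated runs do not store duplicates.
#[derive(Default)]
struct PendingNews {
    items: BTreeMap<String, String>, // url -> title
}

impl PendingNews {
    /// Store an item if it is popular enough and not already queued.
    /// Returns true when the item was newly added.
    fn offer(&mut self, url: &str, title: &str, score: u32, min_score: u32) -> bool {
        if score < min_score || self.items.contains_key(url) {
            return false;
        }
        self.items.insert(url.to_string(), title.to_string());
        true
    }

    /// Hand everything over for summarizing (by a person or a model) and empty the queue.
    fn drain_for_summary(&mut self) -> Vec<(String, String)> {
        std::mem::take(&mut self.items).into_iter().collect()
    }
}
```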

I have pushed a PR using a LLaMA library for summaries, and if we use the model I recommend, Dolphin Mistral 7B v0.2 GGUF, we should be fine with pretty much any size of gathered release notes, as the model supports up to 32k tokens of context (to compare, ChatGPT at launch had only 2k context length). The model needs ~4.1GB VRAM, so pretty much anyone can run it too. Hence gathering the news periodically and summarizing it at the end should be OK.
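
If the gathered notes ever did exceed the context window, they could be split into batches first. The sketch below uses a rough "four characters per token" rule of thumb, which is an assumption and not the model's real tokenizer, so a safety margin below the actual limit would be sensible.

```rust
/// Very rough token estimate: about four characters per token.
/// This ratio is a rule of thumb for English text, not the model's real tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Greedily pack release notes into batches that stay under `budget` tokens.
/// A single note larger than the budget still gets its own batch.
fn batch_by_budget<'a>(notes: &[&'a str], budget: usize) -> Vec<Vec<&'a str>> {
    let mut batches: Vec<Vec<&'a str>> = Vec::new();
    let mut current: Vec<&'a str> = Vec::new();
    let mut used = 0;
    for &note in notes {
        let cost = estimate_tokens(note);
        // Start a new batch when adding this note would exceed the budget.
        if !current.is_empty() && used + cost > budget {
            batches.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push(note);
        used += cost;
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```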

What do you guys think?

@janhohenheim
Collaborator

janhohenheim commented Apr 15, 2024

@ElhamAryanpur sounds great! Couldn't we summarize them at the point they get gathered? For example, the bot could aggregate news every 3 days, add them to the GH issue and write a generated summary into the current newsletter markdown file.

@iolivia would you be available for implementing that part? Or would you like some help?

@ElhamAryanpur
Contributor Author

It is possible, just that the model could be too large for GitHub Actions to run, and locally it has no batching support yet, so it could take some time to summarize everything. Also, summarizing at the end can help the bot have a complete picture of all the development and make a better summary. But we absolutely can do the every-3-days run too. We can set the bot up as a cron job on a server somewhere.

@janhohenheim
Collaborator

@ElhamAryanpur I've got a fedora server ready to run it :)

@ElhamAryanpur
Contributor Author

hell yeah!

@iolivia
Contributor

iolivia commented Apr 21, 2024

So happy to see all the progress, thanks so much everyone for the awesome contributions already! 🔥

One thing of note is that right now, the bot is a bit too good. Many of the news provided are, in my opinion, not significant enough to be included in the newsletter. Removing them by hand is trivial though :)

Agreed, this was my observation as well! I tried to experiment with removing releases with notes less than x characters, but then you miss all the major releases that have a link to a blog post for notes. Maybe another idea is to create a mini-section for minor releases at the end of each section with mostly a link to the repo and the release version and a one liner, this could help discover repos that are active.
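
The mini-section idea above could render like this. The heading text, the `MinorRelease` struct, and the assumption that every repository lives on GitHub are all illustrative choices, not something the bot does today.

```rust
/// A release considered too small for a full write-up; fields are illustrative.
struct MinorRelease {
    repo: String,    // e.g. "owner/name"
    version: String, // e.g. "v0.13.2"
    one_liner: String,
}

/// Render a compact list: one line per release, linking to the repository.
fn render_minor_releases(releases: &[MinorRelease]) -> String {
    let mut out = String::from("#### Minor releases\n\n");
    for r in releases {
        out.push_str(&format!(
            "- [{repo}](https://github.com/{repo}) {version}: {summary}\n",
            repo = r.repo,
            version = r.version,
            summary = r.one_liner
        ));
    }
    out
}
```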

Another thing I'm wondering is how to use the bot in practice.

No strong feelings on this tbh, but trying to keep it simple the options I see:

  1. someone runs it locally when it's time to generate the newsletter and pushes a PR to the newsletter repo with the md file - this is easy enough to do tbh
  2. we add a trigger in the CI to run the bot on the 1st of the month that would run it and prepare the draft for the previous month
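
For the second option, a job that fires on the 1st of the month needs the date range of the previous month to pass to the bot. A standard-library-only sketch, with hypothetical function names, that produces the same `YYYY-MM-DD` format as the command-line arguments shown earlier:

```rust
/// Number of days in a month of the Gregorian calendar.
fn days_in_month(year: i32, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if (year % 4 == 0 && year % 100 != 0) || year % 400 == 0 => 29,
        2 => 28,
        _ => panic!("month must be in 1..=12"),
    }
}

/// Given the year and month a scheduled job runs in, return the first and
/// last day of the previous month as "YYYY-MM-DD" strings.
fn previous_month_range(run_year: i32, run_month: u32) -> (String, String) {
    // January rolls back to December of the previous year.
    let (year, month) = if run_month == 1 {
        (run_year - 1, 12)
    } else {
        (run_year, run_month - 1)
    };
    let last = days_in_month(year, month);
    (
        format!("{year:04}-{month:02}-01"),
        format!("{year:04}-{month:02}-{last:02}"),
    )
}
```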

@janhohenheim
Collaborator

@iolivia since the newsletter died last time because of maintainer burden, I'm wary of adding anything that adds any friction to the process, so I'll be automating as much as possible. I'll try to add a script for all of this after this newsletter so your bot is integrated in the next cycle 🚀

@ElhamAryanpur
Contributor Author

@janhohenheim typesafe 🔥 blazingly fast 🔥 automation to the moon.

@LPGhatguy
Contributor

If future newsletter entries are going to be AI edited or generated, I'd like to request that no content that I work on for the Rust community be included in the newsletter.

I appreciate the good intentions of everyone involved in this effort.

The point raised by @17cupsofcoffee in #1417 (comment) is very salient and captures my sentiment well.

I like the format from This Week in Graphics. The author is funded via Patreon and writes very short bullet point summaries.

I'm interested in helping push for a grant from the Rust Foundation to fund a writer or editor to help with the newsletter as an alternative to involving AI!

@17cupsofcoffee
Collaborator

17cupsofcoffee commented Apr 26, 2024

I think @LPGhatguy's comment raises a good point that hasn't been made in these threads already - using AI in the production of the newsletter will discourage some people from reading/contributing[1] (there are a lot of people who aren't massive fans of this tech - in creative spaces, especially!), and I hope this is weighed up versus any potential benefits of using it.

Footnotes

  1. If I'm totally honest, it's kind of sapped my motivation to get involved again. The appeal of the newsletter to me is that it's a curated view of all the cool stuff that's going on in the community - the idea of padding it out with LLM-generated text feels like it runs contrary to that, and it bums me out a little.

@erlend-sh
Member

erlend-sh commented Apr 26, 2024

using AI in the production of the newsletter will discourage some people from reading/contributing

That's fair. It cuts both ways though.

Conversely, I got to a point where I dreaded having to make PRs for the newsletters because I was already writing posts for my project's blog, mastodon, discord etc., which meant my marketing-energy was already spent. As much as I loved the newsletter it also felt like a burden sometimes, since I knew I had all these updates that I should share but didn't, as I simply didn't have Yet Another Post left in me.

Also, due to the immense workload of manual curation, the alternative we've implicitly opted for during the past several months has been no newsletter.

I'm on the record as an AI critic, but that doesn't mean I think it should be unilaterally shunned as a technology, especially not for one of the very few things it's actually good for, namely text summarization/consolidation. I could get behind an objection against proprietary, cloud-based AI, but I really don't have many qualms about the self-hosted variety, in particular when the final publishing is still subject to human review.

The newsletter-bot does already do a pretty good job without any AI assistance though. If it was possible for people to opt out of the AI treatment for their projects' updates, might that be an acceptable compromise?

@janhohenheim
Collaborator

janhohenheim commented Apr 26, 2024

Definitely valid points! @LPGhatguy @17cupsofcoffee I appreciate the tone of your feedback; it's obvious that your criticisms were written in good faith :)
I'll clarify some things.
I hate padding stuff out, I hate unnecessary text, and I hate content spewed out by AI just for content's sake.
My goal is to have as much of the newsletter curated as possible until only some entries remain to be filled out. These are already hand-picked by a human. At no point should an LLM decide which content goes in.
The point here is to avoid a situation where the newsletter is mostly done, but no one has time to tackle some missing summaries. While AI certainly has its limits and many issues (like making stuff up, racism involved in training data, stealing work, etc.), it is notably good and reliable for generating simple summaries. These are things that for me as an editor are mostly busywork. Keep in mind that this is after someone has already decided that a project is worth writing about, and now all that is missing is one or two sentences about it that no one managed to write before the release date.
At no point is user data used to train an LLM. The LLM simply generates a short summary based on training data that already exists today and only considers the context of the writing style we already use in the rest of the newsletter. Again, the rest of the newsletter is not part of the training data.
I can of course respect anyone's wishes to not be involved in any part of the newsletter creation process. I also want to highlight that such opt-outs are usually included for people to say "I don't want to have my data included in the LLM" and not for "I don't want a product that uses an LLM to support my content" (correct me if I misunderstand your stance)
@LPGhatguy @17cupsofcoffee would it be alright for you if there was a disallow-list in the repo where people can put their links and names in so that the AI won't generate a summary for anything they've created? We can have a little link in the newsletter making people aware of this possibility. Or would you want to have your content also no longer be summarized by humans?

@janhohenheim
Collaborator

janhohenheim commented Apr 26, 2024

PS: as @erlend-sh notes, the current plan is to have everything self-hosted and fully under our control. The bot using the tech is open sourced as well.
Also, if there is major community backlash after introducing AI generated summaries for some posts, I will be the first to want to have them gone again.

@ElhamAryanpur
Contributor Author

I'd like to add that the summary in my opinion is not the only thing it can be helpful with. Originally the idea was to help the maintainers with the already submitted entries. Such as spell checks, grammar check, rephrasing things, better title, following guideline, ... Those were things almost completely done by the maintainers, which is a huge burden especially when there are a lot of entries.

Automations of such things aren't there to replace the original and final human authors, but rather to assist in cleanups.

@LPGhatguy
Contributor

Automation and a self-hosted bot is fine with me. If the newsletter is going to have any AI-generated content though, I'd prefer that my content not be present at all.

If the group is interested in paying a part time editor to be involved and do this work, I think that is an acceptable alternative to me.

@iolivia
Contributor

iolivia commented Apr 27, 2024

Hey folks!

Just to chime in here as the author of the newsletter bot, my intent is to automate the gathering of release notes, social media posts etc. to help with the content gathering, very similar to TWIR. After this initial gathering there needs to be an editing process, and the AI summarisation feature was just an idea mentioned in this thread on how to help with shortening lengthy release notes when they are not relevant.

For me the whole point of the newsletter is to help build the community around Rust game development and showcase the amazing work of so many people. I don't think it's worth sacrificing this mission for shortening some text which a human can do within minutes.

So while I understand the appeal of using AI and the fun factor, given the feedback I would focus the project on gathering content and making it clear a human-centric editing process still needs to happen.

Thoughts?

@janhohenheim
Collaborator

janhohenheim commented Apr 27, 2024

@iolivia creating a summary only takes a few minutes in a vacuum. As @erlend-sh mentioned, most of us do this in the spare few hours we have and the mental burden of "this one more thing I need to do on time" is not to be underestimated. I don't want to reveal personal information, but I know of at least one maintainer that greatly struggled mentally with this. As @ElhamAryanpur noted, getting rid of formatting and spelling stuff is just great for making sure that maintainer burden stays low, which is historically the one thing the newsletter struggled most with.
Paying someone to do our busywork seems to me like an undesirable move when there are good tools existing for this. After all, we also automated the newsletter creation/setup instead of paying someone to copy over templates.

Based on the feedback, I think the most productive way forward is this:

  • Set up summary generation (and maybe already automated spell checking etc.) for the next cycle (May) in such a way that it only generates stuff for points we did not finish in time
  • Go over the generated stuff by hand
  • Have a general "Do you want to have your content excluded from the newsletter? Add your name or website here" section in the newsletter connected to a file on GitHub. This allows people to stay out of it for whatever reason they have.
  • Set up a lint that blocks PRs in general if they reference people or links from said list.
  • Hold some kind of poll later on (maybe in 3 months?) to ask people how they feel about the general format of the newsletter. I really want to avoid any unnecessary padding in the newsletter, and this should help us tweak the process if we failed in this regard.
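
The opt-out lint from the list above could look roughly like this. The file format (one name or URL per line, lines starting with `#` as comments) is an assumption, since no such file is specified in this thread, and plain substring matching will produce false positives for short names.

```rust
/// Parse an opt-out file: one name or URL per line, lines starting with `#` are comments.
/// The file format is an assumption; no such file is specified in this thread.
fn parse_opt_out(file: &str) -> Vec<String> {
    file.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(str::to_lowercase)
        .collect()
}

/// Return (line number, entry) for each line of the PR text that mentions an opted-out entry.
/// A CI job could fail the PR when this list is not empty.
fn find_violations(pr_text: &str, opt_out: &[String]) -> Vec<(usize, String)> {
    let mut hits = Vec::new();
    for (index, line) in pr_text.lines().enumerate() {
        let lowered = line.to_lowercase();
        for entry in opt_out {
            if lowered.contains(entry.as_str()) {
                hits.push((index + 1, entry.clone())); // 1-based line numbers
            }
        }
    }
    hits
}
```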

Would that be okay for you, @iolivia and @ElhamAryanpur? Since this process leaves 95% of the content to humans and we never automatically include anything a bot writes, I'd happily count it as a human-centric workflow with some tools to help the humans do their thing.

@LPGhatguy
Contributor

Paying someone to do our busywork seems to me like an undesirable move when there are good tools existing for this. After all, we also automated the newsletter creation/setup instead of paying someone to copy over templates.

I don't think this is a fair characterization. There are lots of people who are paid full time to be editors and I don't buy into the idea that AI tools should replace their labor. In a time where many people, especially in the games industry, are concerned about companies buying into AI at the expense of humans, AI is unarguably the wrong direction.

If no one is invested enough to summarize a piece of content to include in the newsletter, is it worth including that content at all, or can it be a footnote? Does every small piece of news that happens in the community need a featured section, or can those be editorialized to just a few to reduce maintainer burden?

To be clear, I'm not excited about pulling my content or contributions from the newsletter. Including AI generated summaries in a gamedev-focused newsletter would be very disappointing and I'm certain I'm not alone in that sentiment.

@j-mendez

It is a good idea to mark AI content being generated so users know it is not from a human. There are lots of opportunities to use AI without it being tied to curation. If it is being used to curate, send a PR that can be linted and then merged manually, or auto-merged if the checks and balances are good.

@janhohenheim
Collaborator

janhohenheim commented Apr 28, 2024

@j-mendez the line there is very fuzzy. When I'm editing stuff I have GitHub copilot running in the background helping me out. An AI might go over the entire document later to fix typos or weird sentences. Generated summaries are reviewed and, if needed, changed by humans. But as a minimum step we could add a little asterisk next to summaries that have initially been generated. What do you think?
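
The asterisk idea is small enough to sketch directly. The `Origin` enum is hypothetical, and the asterisk is backslash-escaped so that Markdown does not read it as the start of emphasis.

```rust
/// How a summary first came into existence.
enum Origin {
    Human,
    Generated, // drafted by a model, then reviewed by an editor
}

/// Render a summary line, appending an escaped asterisk when it was initially generated.
fn render_summary(text: &str, origin: Origin) -> String {
    match origin {
        Origin::Human => text.to_string(),
        Origin::Generated => format!("{text} \\*"),
    }
}
```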

@LPGhatguy I definitely agree with the sentiment that we should be very careful about automating people's jobs away. I still sense that you are coming from a position of good faith, and I appreciate how you phrase your concerns, so I hope I don't sound too harsh when I say this.
There is a time and a place to discuss the ideological problems pertaining to AI. This newsletter issue is not the right place or the hill to die on. We are not taking anyone's job, as the alternative is to simply remove content. We are not automating creative work; writing the summary is just busywork for us. We are not padding the content, as it is not an issue of "is it interesting enough to write a summary about" but instead "do the maintainers happen to not have too many plates to spin in their life right now"
Everything will always stay human-curated as long as I have any say. We are talking about generating one or two sentences per item we already agree on being important in some way to the community. If the generated stuff is bad or boring, we will omit it from the newsletter.
None of the (very valid!) criticisms of AI apply to this case. I feel like we took precautions to make sure we are using the technology in a responsible way.
I'm open to change the workflow if we were not diligent enough. I am not open to avoiding an entire technology out of principle.

Again, I hope I am not overly harsh (please tell me if I come off otherwise). I appreciate your input on this and understand your reasoning. We will also respect your wish to not be included in the newsletter from now on.

@LPGhatguy
Contributor

There is a time and a place to discuss the ideological problems pertaining AI. This newsletter issue is not the right place or the hill to die on.

I am a (somewhat) prominent member of the Rust gamedev community. I'm building a commercial game using Rust. I'm both personally and professionally invested in the perception of our community. The newsletter is one of the main communication channels that goes out from our community and into the public sphere. It's also published under an officially-labeled working group of the Rust project.

If there is a "hill to die on," I think the involvement of AI generated content in this context is a great cue to step in and make sure my voice and the sentiment of my peers in the games industry is heard. If there is a better public space to raise these issues, please let me know.

We are not taking anyone's job, as the alternative is to simply remove content. We are not automating creative work; writing the summary is just busywork for us.

Frankly, I do not think that you or others in this thread have seriously entertained the proposed alternatives. The options are not "AI" or "no content" and calling this labor "busywork" is reductionist. This comes across to me as a human problem that the group is trying to solve with technology.

How many hours of work is this per newsletter?
How much would it cost to hire an editor to do this work?
What channels does the Gamedev WG have to seek funding from The Rust Foundation or other entities?
How much visibility does this GitHub issue have to the people who have a stake in the community?

Alternatively, why die on the hill of involving AI in a project struggling with maintainer burden and community engagement? The risk of creating bad press is fairly high and the upside is low.

I'd like to reiterate that I am personally volunteering to negotiate with The Rust Foundation or potential corporate sponsors like Embark to secure funding for this role. If those avenues aren't successful, I am also willing to hire and manage a writer or community manager that could assist with this work.

@janhohenheim
Collaborator

janhohenheim commented Apr 28, 2024

@LPGhatguy I indeed didn't realize you volunteered for this, thanks. I'm completely fine with someone being paid to do this, I just don't personally think it is worth doing so in this case.
I'd like to hear other maintainers' opinions here. If the consensus is that AI is a no-go, I'll accept that. Pinging @ozkriff and @AngelOnFira in particular.
Edit: For what it’s worth, my personal recommendation is to let @LPGhatguy try to get funding or hire someone on their own. There’s no harm in trying, and I think the Rust Foundation knows best what they want to use their money for.
If this proves unsuccessful, I’d like to proceed with the AI.
If that is also not possible, I suggest adopting a policy of "throw everything away that didn't get edited in time" to make sure people don’t burn out again.

@ElhamAryanpur
Contributor Author

@iolivia creating a summary only takes a few minutes in a vacuum. As @erlend-sh mentioned, most of us do this in the spare few hours we have and the mental burden of "this one more thing I need to do on time" is not to be underestimated. I don't want to reveal personal information, but I know of at least one maintainer who greatly struggled mentally with this. As @ElhamAryanpur noted, getting rid of formatting and spelling stuff is just great for making sure that maintainer burden stays low, which is historically the one thing the newsletter struggled most with. Paying someone to do our busywork seems to me like an undesirable move when there are good tools existing for this. After all, we also automated the newsletter creation/setup instead of paying someone to copy over templates.

Based on the feedback, I think the most productive way forward is this:

  • Set up summary generation (and maybe already automated spell checking etc.) for the next cycle (May) in such a way that it only generates stuff for points we did not finish in time
  • Go over the generated stuff by hand
  • Have a general "Do you want to have your content excluded from the newsletter? Add your name or website here" section in the newsletter connected to a file on GitHub. This allows people to stay out of it for whatever reason they have.
  • Set up a lint that blocks PRs in general if they reference people or links from said list.
  • Hold some kind of poll later on (maybe in 3 months?) to ask people how they feel about the general format of the newsletter. I really want to avoid any unnecessary padding in the newsletter, and this should help us tweak the process if we failed in this regard.

Would that be okay for you, @iolivia and @ElhamAryanpur? Since this process leaves 95% of the content to humans and we never automatically include anything a bot writes, I'd happily count it as a human-centric workflow with some tools to help the humans do their thing.

I am up for it; we can implement those in the newsletter-bot.
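
To make the opt-out lint from the list above a bit more concrete, here is a rough sketch of what such a check might look like. Nothing here is existing newsletter-bot code: the opt-out file format (one name or URL per line, lines starting with `#` treated as comments) and the function names `load_optout_entries`, `find_optout_violations`, and `lint` are all assumptions made for illustration.

```python
def load_optout_entries(text: str) -> list[str]:
    """Parse a hypothetical opt-out file: one name or URL per line.

    Blank lines and lines starting with '#' are ignored. Entries are
    lowercased so the later comparison is case-insensitive.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            entries.append(line.lower())
    return entries


def find_optout_violations(markdown: str, entries: list[str]) -> list[tuple[int, str]]:
    """Return (line number, entry) for every opt-out entry found in the text.

    This is plain substring matching, so short names can produce false
    positives; a real lint would likely need stricter matching rules.
    """
    violations = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        lowered = line.lower()
        for entry in entries:
            if entry in lowered:
                violations.append((lineno, entry))
    return violations


def lint(markdown: str, optout_text: str) -> int:
    """Print each violation and return a CI-style exit code (0 = pass, 1 = fail)."""
    violations = find_optout_violations(markdown, load_optout_entries(optout_text))
    for lineno, entry in violations:
        print(f"line {lineno}: references opted-out entry {entry!r}")
    return 1 if violations else 0
```

A CI step could read the changed newsletter file and the opt-out file, call `lint`, and pass the result to `sys.exit` so that a non-zero code blocks the PR. How the bot would actually wire this in is left open here.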

@junkmail22
Contributor

Hello yes: If LLMs are used to generate content for this newsletter, I will boycott it and encourage others to do the same.

I definitely won't ever be submitting to the newsletter again if this goes through.

@janhohenheim
Collaborator

@junkmail22 can we please have a discussion without resorting to these tactics? I am open to talking everything through, but to say "Don't use this technology or I will try to sabotage the project" is just not an okay way to converse in an open source environment. Neither would it be if I said "Accept AI or I will quit being a maintainer".
If you simply want to make sure your disapproval is weighed in, you are welcome to vote in the currently running survey.

@junkmail22
Contributor

junkmail22 commented May 3, 2024

@janhohenheim I've voiced my disapproval on the survey, as well as in other places.

Please understand - this isn't an underhanded tactic or anything. It is simply the case that I don't want to have anything to do with a project that uses LLM technology. My disapproval of this is very strong, and I am voicing that disapproval. Even if I didn't voice my feelings in this thread, I would still boycott the newsletter if LLMs were used to generate articles.

Frankly, I think this is a very good place to voice my disapproval, instead of just an anonymous survey.

@janhohenheim
Collaborator

janhohenheim commented May 3, 2024

@junkmail22 that's all fine, please do voice your disapproval, I want this exchange to happen.
The only problem I have is with the notion of "I will encourage others to boycott the newsletter", which reads as a bit of a threat / attempt at holding the project hostage. I can tell from your answer that you have good intentions and are arguing in good faith, so I hope you understand my feedback. I am sorry if my previous comment was not specific enough, my bad.


9 participants