---
title: "twitter_archive"
format: gfm
---

## Keeping up with the R community

### There are many sources

-   Newsletters

-   Mastodon?

-   Meetups

## Twitter has been important

The #rstats hashtag opens up a world of R content. It is also useful to follow specific individuals and the people they follow.

Grabbing content can be a challenge. My process has been:

-   Save web pages referenced in tweets to Evernote using its handy Web Clipper plugin
-   Save images to my photo library and work from there
-   Selectively "like" tweets
-   Extract relevant text with an OCR tool such as [TRex](https://github.com/amebalabs/TRex)

### Getting the Twitter Archive

Request a download here: <https://twitter.com/settings/download_your_data>

It can take a few days for the archive to be prepared. You'll get a direct message on Twitter when it's ready to download.

### Use the built-in HTML interface

![](images/image-885175351.png)

### Using R to access the data from your Twitter Archive

This demonstration looks at how to read `like.js` and `following.js` into R, and at some possible uses for them.

```{r}
#| label: libraries
library(tidyverse)
library(here)
library(jsonlite)
library(gt)
library(knitr)
```

Each archive file starts with a JavaScript assignment (for example `window.YTD.following.part0 = `) that you need to skip before the rest will parse as JSON. The prefix length differs from file to file, so you have to work out how many characters to skip for each one.

```{r}
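#| label: read-the-files

# "window.YTD.following.part0 = " is 29 characters, so the JSON starts at 30
following <- read_file(here("data", "following.js")) |>
  str_sub(start = 30) |>
  fromJSON()

str(following[[1]])

# "window.YTD.like.part0 = " is 24 characters, so the JSON starts at 25
like <- read_file(here("data", "like.js")) |>
  str_sub(start = 25) |>
  fromJSON()

str(like[[1]])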
```
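
Counting characters by hand is fragile. As a rough sketch of an alternative, one could drop everything before the first `[`, on the assumption that every archive file is an assignment followed by a JSON array. The helper name `read_archive_js()` and the sample file contents below are made up for illustration; they are not part of the archive or of any package.

```{r}
#| label: sketch-strip-prefix
library(jsonlite)

# Hypothetical helper: keep everything from the first "[" onward,
# assuming the file looks like `window.YTD.<name>.part0 = [ ... ]`
read_archive_js <- function(path) {
  txt <- paste(readLines(path, warn = FALSE), collapse = "\n")
  start <- regexpr("[", txt, fixed = TRUE)
  if (start < 1) stop("No JSON array found in ", path)
  fromJSON(substring(txt, start))
}

# Tiny made-up file that mimics the layout of following.js
tmp <- tempfile(fileext = ".js")
writeLines(
  'window.YTD.following.part0 = [ { "following" : { "accountId" : "1", "userLink" : "https://twitter.com/intent/user?user_id=1" } } ]',
  tmp
)

demo <- read_archive_js(tmp)
str(demo)
```

With the real archive, the call would look like `read_archive_js(here("data", "following.js"))`.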

The `flatten()` function from `jsonlite` is handy for pulling data frames out of nested JSON structures.

```{r}
# Namespace the call: purrr (loaded with tidyverse) also exports flatten()
following_df <- as_tibble(jsonlite::flatten(following$following))
glimpse(following_df)

like_df <- as_tibble(jsonlite::flatten(like$like))
glimpse(like_df)
```
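
To make the effect of flattening concrete, here is a small sketch on made-up data shaped like `like.js`. The tweet IDs and text are invented; only `fromJSON()` and `jsonlite::flatten()` are real functions.

```{r}
#| label: sketch-flatten
library(jsonlite)

# Two invented records, each wrapped in a "like" object as in the archive
nested <- fromJSON(
  '[{"like": {"tweetId": "1", "fullText": "hello #rstats"}},
    {"like": {"tweetId": "2", "fullText": "second tweet"}}]'
)

names(nested)   # one column, "like", which is itself a data frame

# flatten() lifts the nested columns to the top level, prefixing the parent name
flat <- jsonlite::flatten(nested)
names(flat)     # "like.tweetId" "like.fullText"
```

Pulling out the inner data frame with `nested$like` gives the same columns without the `like.` prefix.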

`kable()` from `knitr` is a handy way to have a look at text that would otherwise get truncated. (`gt` is another option.)

```{r}
# Of the first 50 likes, keep those that contain a link and mention rstats
sample_table <- like_df[1:50, 2:3] |>
  filter(str_detect(fullText, "http") & str_detect(fullText, "rstat"))

kable(sample_table, "simple")
```

```{r}
like_df[1:50, 2:3] |>
  filter(str_detect(fullText, "http") &
           str_detect(fullText, "rstat")) |>
  # Put the URL on its own line, followed by the wrapped tweet text
  mutate(fullText = paste0(expandedUrl, "\n", str_wrap(fullText, 100))) |>
  pull(fullText) |>
  cat(sep = "\n\n")
```
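
As one more possible use, here is a minimal sketch that counts hashtags in liked tweets. It runs on a made-up stand-in called `toy_likes` rather than the real `like_df`, and it assumes the tweet text lives in a `fullText` column as above.

```{r}
#| label: sketch-hashtag-counts
library(dplyr)
library(stringr)
library(tidyr)

# Invented tweets standing in for like_df
toy_likes <- tibble(
  fullText = c(
    "New post on #rstats and #dataviz https://t.co/abc",
    "Slides from my talk #rstats",
    "No tags here"
  )
)

hashtag_counts <- toy_likes |>
  # One list element per tweet, holding every hashtag found in it
  mutate(hashtag = str_extract_all(str_to_lower(fullText), "#\\w+")) |>
  # One row per hashtag; tweets without hashtags drop out
  unnest(hashtag) |>
  count(hashtag, sort = TRUE)

hashtag_counts
```

Swapping `toy_likes` for `like_df` should give a quick view of which topics show up most in your own likes.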


## Other resources

-   [Albert Rapp - How to collect dataviz from Twitter into your note-taking system](https://albert-rapp.de/posts/09_get_twitter_posts_into_your_notetaking_system/09_get_twitter_posts_into_your_notetaking_system.html)

-   [Python Twitter archive parser and other resources](https://github.com/timhutton/twitter-archive-parser)

-   How to [download your Twitter archive](https://support.twitter.com/articles/20170160)

-   [A visual analysis around a Twitter hashtag](https://blog.ouseful.info/2012/02/06/visualising-activity-round-a-twitter-hashtag-or-search-term-using-r/)

-   [Find your Twitter pals on Mastodon](https://fedifinder.glitch.me/#)