Introduction

This repository contains code, additional results, and anonymized annotation data, for our recently accepted CSCW'22 paper. If you use this, please cite our paper (currently only the Arxiv version...)

@article{pathakMethodAnalyzeMultiple2021,
  title = {A {{Method}} to {{Analyze Multiple Social Identities}} in {{Twitter Bios}}},
  author = {Pathak, Arjunil and Madani, Navid and Joseph, Kenneth},
  year = {2021},
  month = jul,
  language = {en}
}

There are three directories here, described below.

In addition, we include two other files:

Our final document describing the annotation task in Annotation Task Description.docx
Results for our Demographic and Twitter Status analysis for more identities, in image_more_demographics.pdf

analysis

This directory contains the R code we used to perform the reliability and validity analyses, along with data on the pre-existing lists of identities, behaviors, and modifiers we used for that analysis

annotated data

This directory contains only datasets that we do not believe can be used to easily re-identify individuals within the datasets we used, but still provide details on our qualitative analyses. The main file of interest is labels_identity.xlsx, which contains results from our 3 rounds of coding discussed in the paper. The other files are just intermediary files used for generating that annotation task.

method

This directory contains code to run our method for extracting personal identifiers (see 01_gen_identifiers.ipynb and the associated utility code in identifiers_from_bio.py) and the code to run our cluster analysis 02_cluster_analysis.ipynb).

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
analysis		analysis
annotated_data		annotated_data
method		method
.gitignore		.gitignore
Annotation Task Description.docx		Annotation Task Description.docx
Readme.md		Readme.md
image_more_demographics.pdf		image_more_demographics.pdf
paper_draft.pdf		paper_draft.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

analysis

analysis

annotated_data

annotated_data

method

method

.gitignore

.gitignore

Annotation Task Description.docx

Annotation Task Description.docx

Readme.md

Readme.md

image_more_demographics.pdf

image_more_demographics.pdf

paper_draft.pdf

paper_draft.pdf

Repository files navigation

Introduction

analysis

annotated data

method

About

Releases

Packages

Languages

kennyjoseph/twitter_personal_identifiers

Folders and files

Latest commit

History

Repository files navigation

Introduction

analysis

annotated data

method

About

Resources

Stars

Watchers

Forks

Languages