Data from the College Art Association's dissertation list and the scripts used to harvest and analyze it.


nancyum/caa


CAA Dissertation Project

This project examines the College Art Association (CAA) dissertation roster, which has been published since 1963, first in print and later online only. Taken together, the roster's entries form a collective profile of recent PhDs, showing how the field of art history has changed over the past sixty years. The roster is now published by caa.reviews, with entries beginning in 2002, and is updated yearly.

Ken Chiu of Binghamton University wrote the script, caa.py, which was used to scrape the data for completed dissertations from 2002 to 2018 in caa.reviews. The script uses the Beautiful Soup Python library for scraping. If you have any questions about using the script, or would like help modifying it, please contact Ken (kchiu@binghamton.edu) or Nancy (nancyum@binghamton.edu).
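To give a rough sense of what harvesting entries from an HTML page involves, here is a minimal sketch. It is not caa.py. It swaps in Python's standard-library `html.parser` in place of Beautiful Soup so that it runs without extra installs. The markup, the class names (`dissertation`, `author`, `title`, `school`, `year`), and the sample entries are all invented for illustration and do not reflect the actual structure of caa.reviews.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real caa.reviews pages are structured differently.
SAMPLE_HTML = """
<div class="dissertation">
  <span class="author">Doe, Jane</span>
  <span class="title">An Invented Title</span>
  <span class="school">Example University</span>
  <span class="year">2015</span>
</div>
<div class="dissertation">
  <span class="author">Roe, Richard</span>
  <span class="title">A Second Invented Title</span>
  <span class="school">Sample College</span>
  <span class="year">2003</span>
</div>
"""


class EntryParser(HTMLParser):
    """Collect one dict per <div class="dissertation"> block."""

    FIELDS = {"author", "title", "school", "year"}  # assumed field names

    def __init__(self):
        super().__init__()
        self.entries = []
        self._current = None  # entry being built, or None outside a block
        self._field = None    # field whose text is being read, or None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "dissertation":
            self._current = {}
        elif tag == "span" and cls in self.FIELDS and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.entries.append(self._current)
            self._current = None


def parse_entries(html):
    """Return a list of entry dicts found in the given HTML string."""
    parser = EntryParser()
    parser.feed(html)
    parser.close()
    return parser.entries


entries = parse_entries(SAMPLE_HTML)
for entry in entries:
    print(entry["year"], entry["author"], "|", entry["title"])
```

With Beautiful Soup, the same idea would normally be written with its tag-search methods in place of a hand-written parser class. The script in this repository remains the reference for how the data was actually collected.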

Nancy Um ran this script on July 22, 2020, which generated caa.csv. Some entries failed to populate due to formatting errors, and these failed entries were saved separately. Nancy then cleaned caa.csv with OpenRefine, which identified a few more failed entries. She gathered all of the failed entries into a new file, cleaned it, and combined it with the entries in caa.csv. The resulting file, caaTOTAL_OR.csv, contains all of the entries from 2002 to 2018, including those that were harvested computationally and those that had to be entered by hand.
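The split-and-recombine workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual cleaning was done in OpenRefine and by hand, and the column names (`author`, `title`, `school`, `year`), the helper functions, and the sample rows below are hypothetical, not taken from caa.csv.

```python
import csv
import io

# Assumed column names; the real caa.csv may use different ones.
FIELDS = ["author", "title", "school", "year"]


def split_entries(rows):
    """Separate rows with every field filled in from rows missing a field."""
    ok, failed = [], []
    for row in rows:
        complete = all((row.get(field) or "").strip() for field in FIELDS)
        (ok if complete else failed).append(row)
    return ok, failed


def combine(ok_rows, repaired_rows):
    """Merge the clean rows with hand-repaired rows, sorted by year and author."""
    return sorted(ok_rows + repaired_rows, key=lambda r: (r["year"], r["author"]))


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data: one complete entry and one that failed to populate.
harvested = [
    {"author": "Doe, Jane", "title": "An Invented Title",
     "school": "Example University", "year": "2015"},
    {"author": "Roe, Richard", "title": "",
     "school": "Sample College", "year": ""},
]

ok, failed = split_entries(harvested)

# Stand-in for manual repair: fill in the missing fields by hand.
repaired = [dict(failed[0], title="A Second Invented Title", year="2003")]

total = combine(ok, repaired)
print(to_csv(total))
```

Keeping the failed rows in their own list mirrors the separate file of failed entries, so nothing is silently dropped before the final merge.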

The R Markdown file, caa.Rmd, includes the scripts that were used to process the data. They rely on the tidyverse suite of packages, along with the tokenizers and tidytext packages. Plots were generated using ggplot2. The file subjects.csv includes the coded categories, based on CAA's standard breakdown, that were used to generate figures 10, 11, and 12 of the article. These materials are intended to be paired with the article, Nancy Um, "What Do We Know about the Future of Art History? Let’s Start by Looking at Its Past, Sixty Years of Dissertations," published as a special essay in caa.reviews, August 18, 2020, http://www.caareviews.org/reviews/3797#.X0E0RC2ZO3I.
