nypl-spacetime/city-directories

New York City Directories

Join the chat at https://gitter.im/nypl-spacetime/city-directories

The New York Public Library has digitized its collection of city directories, and the resulting high-resolution images can be browsed and downloaded on our Digital Collections.

As part of the NYPL's NYC Space/Time Directory project, and in collaboration with Data Services at New York University's Bobst Library, we are using optical character recognition (OCR) to turn the city directories into a searchable atlas of historical New York City.

Meetups

Two meetups have been organized about the digitized city directories:

List of digitized city directories

See DIRECTORIES.md for a table of the city directories we are processing and extracting text from. To just browse and download the scanned books, visit Digital Collections.

Data

Processed city directory data will soon be published on the NYC Space/Time Directory homepage!

hOCR files

hOCR files will soon be published in our data repository!

Anatomy of an hOCR file name:

Example:

1849.00030.28.56749967.e10e9aa0-5291-0134-79ba-00505686a51c.processed.hocr
  • 1849 ⟶ year of directory
  • 00030 ⟶ page number of original print page
  • 28 ⟶ image number of sequentially downloaded images from single item (i.e. a directory)
  • 56749967 ⟶ NYPL asset's individual image ID
  • e10e9aa0-5291-0134-79ba-00505686a51c ⟶ NYPL UUID for individual image
  • processed ⟶ image was preprocessed by ImageMagick textcleaner
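The naming scheme above can be split back into its parts with a small parser. The following is a minimal sketch, not code from this repository: the names `HocrName` and `parse_hocr_name` are hypothetical, and it assumes that the `processed` segment is optional (absent when an image was not preprocessed) and that the fields are always separated by dots in the order shown.

```python
import re
from typing import NamedTuple, Optional


class HocrName(NamedTuple):
    """Fields of an hOCR file name (hypothetical helper type)."""
    year: int        # year of directory
    page: int        # page number of original print page
    sequence: int    # image number within the downloaded item
    image_id: str    # NYPL asset's individual image ID
    uuid: str        # NYPL UUID for individual image
    processed: bool  # True if the name carries the "processed" segment


# Assumption: "processed" may be missing; everything else is required.
_PATTERN = re.compile(
    r"^(?P<year>\d{4})\."
    r"(?P<page>\d+)\."
    r"(?P<sequence>\d+)\."
    r"(?P<image_id>\d+)\."
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"(?P<processed>\.processed)?\.hocr$"
)


def parse_hocr_name(name: str) -> Optional[HocrName]:
    """Return the parsed fields, or None if the name does not match."""
    m = _PATTERN.match(name)
    if m is None:
        return None
    return HocrName(
        year=int(m.group("year")),
        page=int(m.group("page")),          # leading zeros are dropped
        sequence=int(m.group("sequence")),
        image_id=m.group("image_id"),
        uuid=m.group("uuid"),
        processed=m.group("processed") is not None,
    )


example = "1849.00030.28.56749967.e10e9aa0-5291-0134-79ba-00505686a51c.processed.hocr"
print(parse_hocr_name(example))
```

Page and sequence numbers are converted to integers so that files sort numerically; the image ID and UUID are kept as strings because they are identifiers rather than quantities.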

Open source libraries
