OpenArabicPE/newspaper_al-quds


title: newspapers_al-quds: read me
author: Till Grallert
date: 2018-03-28 10:57:56 +0300
ORCID: orcid.org/0000-0002-5739-8094

This repository contains bibliographic metadata for the newspaper al-Quds published by Jirjī Ḥabīb Ḥanāniyā in Jerusalem between 1908 and 1914. The Center for Palestine Studies at Columbia University scanned issues 1 to 391 and put them online. Currently these issues can only be accessed through their issue number and nested sub pages. I therefore produced machine-actionable bibliographic metadata including volume and issue numbers, as well as dates in all three calendars mentioned in the paper's masthead.

NOTE: As of late 2021, the facsimiles can no longer be reached. They were originally hosted on a Google Drive, and all links are now broken.

some technical details

This repository contains a single TEI XML file containing one <biblStruct> for each issue. This file is produced through automatic iteration making use of this code and manual validation against the digital facsimiles.
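As a rough illustration of how a file with one <biblStruct> per issue might be consumed, the following Python sketch lists issue numbers and dates using only the standard library. The TEI namespace is the standard one, but the element layout inside each <biblStruct> and the dates in `SAMPLE` are placeholder assumptions for this example and are not copied from the repository's file; `list_issues` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample. The layout and the dates are placeholders for
# illustration only; they are NOT taken from the repository's TEI file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><listBibl>
    <biblStruct>
      <monogr>
        <title level="j">al-Quds</title>
        <imprint><date when="1908-01-01"/></imprint>
        <biblScope unit="issue" from="1" to="1"/>
      </monogr>
    </biblStruct>
    <biblStruct>
      <monogr>
        <title level="j">al-Quds</title>
        <imprint><date when="1908-01-08"/></imprint>
        <biblScope unit="issue" from="2" to="2"/>
      </monogr>
    </biblStruct>
  </listBibl></body></text>
</TEI>"""


def list_issues(xml_text):
    """Return one dict per <biblStruct> with its issue number and date."""
    root = ET.fromstring(xml_text)
    issues = []
    for bibl in root.iterfind(".//tei:biblStruct", TEI_NS):
        scope = bibl.find(".//tei:biblScope[@unit='issue']", TEI_NS)
        date = bibl.find(".//tei:imprint/tei:date", TEI_NS)
        issues.append({
            # Missing elements are reported as None instead of raising.
            "issue": int(scope.get("from")) if scope is not None else None,
            "date": date.get("when") if date is not None else None,
        })
    return issues


if __name__ == "__main__":
    for entry in list_issues(SAMPLE):
        print(entry["issue"], entry["date"])
```

If the real file encodes issue numbers or dates differently, only the two XPath expressions need to change.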

The TEI is then automatically converted to MODS XML for integration into reference management software such as Zotero.
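To give an idea of what such a TEI-to-MODS mapping involves, here is a minimal Python sketch. It is not the conversion this repository uses; it only shows one plausible mapping of a single <biblStruct> onto the MODS elements titleInfo, originInfo and part. The input layout, the placeholder date, and the function name `bibl_to_mods` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
MODS = "http://www.loc.gov/mods/v3"

# Hypothetical input: layout and date are placeholders, not real data.
SAMPLE_BIBL = """<biblStruct xmlns="http://www.tei-c.org/ns/1.0">
  <monogr>
    <title level="j">al-Quds</title>
    <imprint><date when="1908-01-01"/></imprint>
    <biblScope unit="issue" from="1" to="1"/>
  </monogr>
</biblStruct>"""


def bibl_to_mods(bibl):
    """Map one TEI <biblStruct> element to a minimal MODS record."""
    ns = {"tei": TEI}
    title = bibl.findtext(".//tei:monogr/tei:title", default="", namespaces=ns)
    date = bibl.find(".//tei:imprint/tei:date", ns)
    scope = bibl.find(".//tei:biblScope[@unit='issue']", ns)

    mods = ET.Element(f"{{{MODS}}}mods")

    # Title of the periodical.
    title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS}}}title").text = title

    # Publication date, carried over from the TEI @when attribute.
    if date is not None and date.get("when"):
        origin = ET.SubElement(mods, f"{{{MODS}}}originInfo")
        issued = ET.SubElement(origin, f"{{{MODS}}}dateIssued", encoding="w3cdtf")
        issued.text = date.get("when")

    # Issue number as a <part><detail type="issue"> entry.
    if scope is not None and scope.get("from"):
        part = ET.SubElement(mods, f"{{{MODS}}}part")
        detail = ET.SubElement(part, f"{{{MODS}}}detail", type="issue")
        ET.SubElement(detail, f"{{{MODS}}}number").text = scope.get("from")

    return mods


if __name__ == "__main__":
    ET.register_namespace("mods", MODS)
    record = bibl_to_mods(ET.fromstring(SAMPLE_BIBL))
    print(ET.tostring(record, encoding="unicode"))
```

A full conversion would also need to carry over the additional calendar dates and the volume numbers, which this sketch leaves out.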

notes on the digital facsimiles

Since the publication schedule of al-Quds was rather irregular, I had to check a large number of facsimiles for their publication dates in order to adjust the input parameters for the algorithm generating the metadata. In doing so, I came across a large number of missing issues, sub-pages that display only "Hello world", and incomplete scans. I have listed these errors below. Note that the list of files with missing pages will inevitably grow, since I have not gone through every individual issue (and might never do so).

  • errors:
    • Missing scans (some of these pages show "Hello world"):
      • The purported scan of #142 is in fact a duplicate of #141
      • No file displayed for #168
      • #170
      • #254
      • #373
      • #377
      • #265 has not been scanned
      • #345 has not been scanned
      • #360 has not been scanned
      • #372 has not been scanned
    • Cut-off scans with illegible columns:
    • Missing pages:
      • page 4 is missing from #154
      • page 3 is missing from #224
      • page 3 is missing from #336
    • URLs with different patterns:
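The error list above can also be kept in machine-readable form, which makes it easy to check an issue number before looking for its facsimile. The Python sketch below transcribes the entries listed above; the names `MISSING_SCANS`, `MISSING_PAGES` and `scan_status` are hypothetical, and the bare entries #170, #254, #373 and #377 are read here as missing scans, which is an interpretation of the list. Since the list is known to be incomplete, an issue that is not flagged may still have problems.

```python
# Range of issues that were scanned and put online, per this read me.
FIRST_ISSUE, LAST_ISSUE = 1, 391

# Issues without a usable scan (#142 is a duplicate of #141).
MISSING_SCANS = {142, 168, 170, 254, 265, 345, 360, 372, 373, 377}

# Issues whose scans lack individual pages: issue number -> missing pages.
MISSING_PAGES = {154: [4], 224: [3], 336: [3]}


def scan_status(issue):
    """Classify an issue number against the known error list."""
    if not FIRST_ISSUE <= issue <= LAST_ISSUE:
        raise ValueError(f"issue {issue} is outside the scanned range")
    if issue in MISSING_SCANS:
        return "missing"
    if issue in MISSING_PAGES:
        return "incomplete"
    # The error list is not exhaustive, so this is not a guarantee.
    return "no known problem"


if __name__ == "__main__":
    for number in (141, 142, 154):
        print(number, scan_status(number))
```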

About

Bibliographic metadata as TEI and MODS xml for Jirjī Ḥabīb Ḥanāniyā's newspaper al-Quds (القدس) from Jerusalem, 1908-1914
