STEPBible Data Repository CC BY 4.0

Data created initially by Tyndale House Cambridge, now curated by www.STEPBible.org
(The code for www.STEPBible.org is also on an open licence)

This licence allows...

This public licence allows you to:

  • Include any part of STEPBible-Data in any software or publications without requesting permission  
    (Though we'd love to hear from you about your project when you make it available.)
  • Make changes to the data and record the differences
    You can make corrections or report possible errors to be checked at STEPBibleATgmail.com
    Any changes made to data should be recorded and made available to subsequent users.
  • Refer others to this repository as the source of the data.
    Updates or corrections are easier to implement when the data is distributed from a single source.
    You are welcome to make a mirror, so long as it is kept up-to-date and has a link back here.

And you should:

STEPBible is...

A Charitable Incorporated Organisation registered in the UK #1193950 run by Bible scholars and computer enthusiasts, as well as members who help to decide priorities.
The datasets are based on work by scholars at Tyndale House - an international Biblical Studies research institute in Cambridge, UK (see www.TyndaleHouse.com)

The repository aims to provide reliable and freely usable data for studying the Bible without any denominational or doctrinal bias. Much of the data is based on other publicly licenced sources, and has been compared with non-public sources so that differences can be checked by Tyndale scholars. Corrections and proposed updates are welcomed - please send them to STEPBibleATgmail.com for checking.

Datasets available

The data is available as downloadable tab-separated text files (see notes on the data format below). The following datasets are already posted:

  • Bible modules for OSIS Sword software
    Bibles in the same format as Crosswire modules, which can be used in any Sword-compatible software.

  • TTESV - Translators Tags for ESV
    Tags for Greek & Hebrew Extended Strongs (compatible with original Strongs) for the translated text of the ESV.

  • TOTHT - Translators OT Hebrew Tagged text
    The Leningrad codex based on Westminster via OpenScriptures, with full morphological and semantic tags for all words, prefixes and suffixes. Semantic tags use the extended Strongs linked to BDB by OS, which is backwardly compatible with simple Strongs tags and includes all affixes (as defined in TBESH). Morphological tags are from ETCBC converted to the format of OS (similar to Westminster) with different morphology for Ketiv/Qere when needed.

  • TAGNT - Translators Amalgamated Greek NT in Sheets or data
    Greek text that includes all the words in NA27/28, TR and other major editions (SBLGNT, Treg, Byz, WH, THGNT). Each word is marked with the editions that contain it, positional variants, and meaning variants. All words and meaning variants are tagged lexically (disambiguated Strong linked to LSJ) and morphologically (Robinson based on Tauber with missing details) plus context-sensitive translations. Punctuation is based on THGNT with spellings from NA28 or other editions for words not in NA27/28.

  • TBESH - Translators Brief lexicon of Extended Strongs for Hebrew in Sheets or data
    Abridged BDB linked to extended Strongs (compatible with OpenScriptures and backwardly compatible with original Strongs)

  • TBESG - Translators Brief lexicon of Extended Strongs for Greek in Sheets or data
    Brief definitions for all Greek Bible words (NT, LXX, Apoc, & variants) using corrected Abbott-Smith when available, completed with other similar definitions. Backwardly compatible with original Strongs.

  • TFLSJ - Translators Formatted full LSJ Bible lexicon up to G5624 in Sheets and Extra entries or data
    Full LSJ entries for all Bible words (NT, LXX, Apoc & variants), formatted for easy reading (all bibliographic data hidden as hover-text) linked to extended Strongs (backwardly compatible with original Strongs).

  • TIPNR - Translators Individualised Proper Names with all References in Sheets or data
    Every name in the Bible, linked to all Hebrew & Greek forms of that name and separated into individual people & places. Each form of the names for each individual includes exhaustive refs for where that individual is named with data of their spouses, siblings and offspring or the places' geolocation (based on OpenBible).

  • TVTMS - Translators Versification Traditions with Methodology for Standardisation: Eng+Heb+Lat+Grk+Others in Sheets or data
    All the versification differences in the OT traditional texts in Hebrew, Latin and Greek, and NT early versification, compared with English standard (defined by NRSV which is virtually identical to KJV). Bible translations have an almost infinite variety of versifications because they may follow (for example) Latin in several sections, Hebrew in a few and English most of the time. The Methodology provides simple rules for every section, such as "if this chapter has 29 verses, it is using Greek versification". Using this, a whole Bible can be reversified according to English or traditional Hebrew or Greek or Latin versification, or compared with Bibles using that versification.

  • TEHMC - Translators Expansion of Hebrew Morphology Codes
    Hebrew morphology codes with expanded explanations in terms of parsing, meaning and example. The codes are based on OpenScriptures, which is similar to the Westminster code system used in BibleWorks and other commercial software. They include extra codes which occur in STEPBible data and which distinguish sequential perfectives, gentilics, gender/location for personal pronouns, and non-Jussive/Cohortative as well as Jussive/Cohortative & possibly-Jussive/Cohortative forms.

  • TEGMC - Translators Expansion of Greek Morphology Codes
    Greek morphology codes with expanded explanations in terms of parsing, meaning and example. The codes are based on Robinson, developed for the Majority text and used in most open-source texts. They include extra codes which occur in STEPBible data and which distinguish persons in possessive and reflexive pronouns, 2nd forms of verbs, and deponent forms from ambiguous passive/middle.

Datasets coming

The following datasets are still being finished and/or being checked. If you see data that you need which isn't yet available, please contact us and perhaps you can become part of the checking process.

  • TOTGT - Translators OT Greek Tagged text
    LXX text with later Ecclesiastical variants. The base text is Rahlfs with variants from the Apostolic Bible (based on Sixtine, Aldine and Complutensian texts). Both have been tagged to LSJ (compatible with extended Strongs) and most of the morphology has been tagged (based on CCAT) but variant tagging needs completing.

  • TFBDB - Translators Formatted full BDB lexicon
    Full BDB formatted for easy reading (all bibliographic data hidden as hover-text) linked to extended Strongs (compatible with OpenScriptures and backwardly compatible with original Strongs)

  • TOTMM - Translators OT Manuscripts and Meanings
    Translation, Hebrew form and witnesses for each variant that affects the meaning of the text, as determined by Barthélemy's UBS committee. Also, alternate meanings found in standard translations. Shown as alternate renderings of a base text (ESV 2011).

  • TNTMM - Translators NT Manuscripts and Meanings
    Translation, Greek form and witnesses up to 400 AD for each variant that affects the meaning of the text, as determined by the UBS apparatus. Also, alternate meanings found in standard translations. Shown as alternate renderings of a base text (ESV 2011).

Data format

Data is in plain unicode text (UTF-8) with fields separated by tabs, so that they can be loaded into any text editor or spreadsheet.
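
As a minimal sketch (not part of the repository), a file with one record per line could be read in Python as shown here. The function name `read_tsv` and the sample rows are hypothetical placeholders, not real STEPBible data. Quoting is switched off on the assumption that quotation marks in the data are literal characters, not field delimiters.

```python
import csv
import io


def read_tsv(stream):
    """Yield each non-empty line of a tab-separated stream as a list of fields."""
    # QUOTE_NONE: treat quotation marks as ordinary characters (an assumption
    # about the data, since the files are plain tab-separated text).
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # csv.reader returns an empty list for a blank line
            yield row


# Hypothetical sample standing in for a downloaded data file.
sample = io.StringIO('id1\ta "quoted" gloss\tx\nid2\tsecond\ty\n')
rows = list(read_tsv(sample))
```

For a file on disk, the same function can be given `open(path, encoding="utf-8", newline="")`, where `path` is wherever the file was saved.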

  • To open in a spreadsheet (e.g. Excel): In Github, click on the file, then "Download", then Save (Ctrl+S) to your drive. In Excel, "Browse" for it using "All Files" (not "All Excel Files") and open it. When asked, select "Unicode UTF8", "Delimited", "Tab", "General".

  • By default, datasets are one-line records, so a Record ends with a NewLine, and each line has identical fields.

  • Some datasets have multi-line records. Records are separated by a line starting with "$". The first line is a Header with fields that apply to each subsequent subRecord line. SubRecord lines all start with a tab.
    For example, in the ProperNames dataset, the first line is a header with information about the type (individual, place, title etc) and other data. These details apply to each of the subsequent subRecords which contain fields for the specific tag, Hebrew/Greek, translation, and the list of references. So the Header effectively contains fields which belong to each of its subRecords and would be identical for each of them if they were included on each line.

  • Hebrew glyphs are separated and normalised in the order:
    consonant; sin/shin dot; dagesh; vowel; metheg/raphe; accents

    • Glyphs NOT used for Hebrew include:
      װ ױ ײ ﭏ ײַ שׁ שׂ שּׁ שּׂ אַ אָ אּ בּ גּ דּ הּ וּ זּ טּ יּ ךּ כּ לּ מּ נּ סּ ףּ פּ צּ קּ רּ שּ תּ וֹ בֿ כֿ פֿ ﬠ ﬡ ﬢ ﬣ ﬤ ﬥ ﬦ ﬧ ﬨ
  • Greek glyphs are normalised to include only:
    ; · . , ᾽ ά ά ὰ ᾷ ᾷ ἀ ά ᾴ ᾄ ᾅ Ἆ Ἅ Ἃ ᾍ Ἀ Ἀ ἁ Ἁ ἄ ἄ Ἄ Ἄ ἅ ἂ ἂ ἅ ἃ ἃ ᾶ ᾳ ἆ ἆ έ έ ὲ ἐ έ Ἓ Ἐ Ἐ ἑ Ἑ ἔ Ἔ ἒ ἕ ἕ Ἒ Ἕ Ἕ ἓ ἓ ή ή ὴ ῇ ῇ ἠ ή ῇ ᾑ Ἥ Ἣ Ἠ Ἠ ἡ Ἡ ἤ ἤ Ἤ Ἤ ἢ ἢ ἥ ἥ Ἢ Ἢ ἣ ἣ ᾖ ᾖ ᾗ ᾗ ᾗ ῆ ῃ ῄ ῄ ἦ ἦ Ἦ Ἦ ἧ ἧ ᾐ ᾐ ᾑ ᾔ ᾔ ί ί ὶ ϊ ΐ ΐ ΐ ῒ ῒ ἰ ἲ ί Ἰ Ἰ ἱ Ἱ ἴ ἴ Ἴ Ἴ ἵ ἵ Ἵ Ἵ ἳ ἳ ῖ ἶ ἶ ἷ ἷ ό ό ὸ ὀ ό Ὀ Ὀ ὁ Ὁ ὄ ὄ Ὄ Ὄ ὅ ὅ ὂ ὂ Ὅ ὃ ὃ Ὃ Ὃ Ὃ ῥ ῤ Ῥ ̔Ρ ύ ύ Ύ ὺ ϋ ΰ ΰ ΰ ῢ ῢ ὐ ὑ ύ Ὕ Ὗ Ὑ ὔ ὔ ὒ ὒ ὕ ὕ ὓ ὓ ῦ ὖ ὖ ὗ ὗ ώ ώ ὼ ῷ ῷ ὠ ὣ Ὢ ᾯ Ὠ ὡ Ὡ ὤ ὤ Ὤ ὢ ὢ ὥ ὥ Ὥ Ὥ ᾦ ᾧ ᾧ Ὧ ᾯ ᾯ ῶ ῳ ῴ ῴ ὦ ὦ Ὦ ὧ ὧ ὧ ᾠ ᾠ ς

  • Glyphs NOT used for Greek include:
    ; ' ᾿ ` ῾ ’ ‘ ‛ ′ ΄ ʹ̛̀́̓̒̓̔̕ ʹ ʻ ʼ ʽ ʾ ʿ ˈ ˊ ˋ ' ` ´ o ά ά ὰ ᾷ ἀ Ἀ ἁ Ἁ ἄ Ἄ ἅ ἂ ἃ ᾶ ᾳ ἆ έ έ ὲ ἐ Ἐ ἑ Ἑ ἔ Ἔ ἕ Ἕ ἓ ή ῇ ή ὴ ῇ ἠ Ἠ ἡ Ἡ ἤ Ἤ ἢ ἥ Ἢ ἣ ᾗ ῆ ῃ ῄ ἦ Ἦ ᾖ ἧ ᾐ ᾑ ᾔ ί i ί ὶ ϊ ΐ ῒ ἰ Ἰ ἱ Ἱ ἴ Ἴ ἵ Ἵ ἳ ῖ ἶ ἷ ό ό ὸ ὀ Ὀ ὁ Ὁ ὄ Ὄ ὅ ὂ Ὅ ὃ Ὃ ῥ Ῥ ύ ύ ὺ ϋ ΰ ῢ ὐ ὑ Ὑ ὔ ὕ ὒ ὓ ῦ ὖ ὗ ώ ὼ ῷ ὠ ὡ Ὡ ὤ Ὤ ὢ ὥ Ὥ ᾦ ᾧ ᾯ ῶ ῳ ῴ ὦ Ὦ ὧ Ὧ ᾠ ϛ

  • Bible reference abbreviations are based on UBS with slightly different formatting:
    References are e.g. Gen.1.10-12; 1Ki.2.4,5; Phm.2; Job.1.3--2.4;
    OT: Gen, Exo, Lev, Num, Deu, Jos, Jdg, Rut, 1Sa, 2Sa, 1Ki, 2Ki, 1Ch, 2Ch, Ezr, Neh, Est, Job, Psa, Pro, Ecc, Sng, Isa, Jer, Lam, Ezk, Dan, Hos, Jol, Amo, Oba, Jon, Mic, Nam, Hab, Zep, Hag, Zec, Mal,
    Apoc: Tob, Jdt, EsG, Wis, Sir, Bar, LJe, S3Y, Sus, Bel, 1Ma, 2Ma, 3Ma, 4Ma, 1Es, 2Es, Man, Ps2, Oda, PsS, Alternate MSS: JsA, JdB, TbS, SsT, DnT, BlT,
    NT: Mat, Mrk, Luk, Jhn, Act, Rom, 1Co, 2Co, Gal, Eph, Php, Col, 1Th, 2Th, 1Ti, 2Ti, Tit, Phm, Heb, Jas, 1Pe, 2Pe, 1Jn, 2Jn, 3Jn, Jud, Rev
    (OT+NT are all based on the first 3 characters, except: Jdg, Sng, Ezk, Jol, Nam, Mrk, Jhn, Php, Phm, Jas, 1Jn, 2Jn, 3Jn)
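
Two of the conventions above, multi-line records and reference lists, can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's own tooling: the function names, the dictionary layout and the sample lines are hypothetical. It assumes the "$" line carries no data to keep and that any non-tab line after the header can be stored separately. The reference splitter only separates the book code from the rest and checks it against the abbreviations listed above; it does not interpret chapter and verse ranges.

```python
BOOK_CODES = set(
    # OT
    "Gen Exo Lev Num Deu Jos Jdg Rut 1Sa 2Sa 1Ki 2Ki 1Ch 2Ch Ezr Neh Est Job "
    "Psa Pro Ecc Sng Isa Jer Lam Ezk Dan Hos Jol Amo Oba Jon Mic Nam Hab Zep "
    "Hag Zec Mal "
    # Apoc and alternate MSS
    "Tob Jdt EsG Wis Sir Bar LJe S3Y Sus Bel 1Ma 2Ma 3Ma 4Ma 1Es 2Es Man Ps2 "
    "Oda PsS JsA JdB TbS SsT DnT BlT "
    # NT
    "Mat Mrk Luk Jhn Act Rom 1Co 2Co Gal Eph Php Col 1Th 2Th 1Ti 2Ti Tit Phm "
    "Heb Jas 1Pe 2Pe 1Jn 2Jn 3Jn Jud Rev".split()
)


def parse_multiline_records(lines):
    """Group lines into records: header fields plus tab-initial subRecords."""
    records = []
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("$"):
            current = None  # separator: the next content line is a new header
            continue
        if not line.strip():
            continue  # skip blank lines
        if current is None:
            current = {"header": line.split("\t"), "subrecords": [], "extra": []}
            records.append(current)
        elif line.startswith("\t"):
            # Drop the leading tab, then split the remaining fields.
            current["subrecords"].append(line[1:].split("\t"))
        else:
            current["extra"].append(line)  # assumption: keep, do not interpret
    return records


def split_references(text):
    """Split 'Gen.1.10-12; Phm.2;' into (book, rest) pairs, validating the book."""
    pairs = []
    for ref in text.split(";"):
        ref = ref.strip()
        if not ref:
            continue
        book, _, rest = ref.partition(".")
        if book not in BOOK_CODES:
            raise ValueError(f"unknown book code: {book!r}")
        pairs.append((book, rest))
    return pairs


# Hypothetical sample lines, not taken from any real dataset.
sample = [
    "$========== record 1\n",
    "name-A\tindividual\n",
    "\ttag1\tform1\tGen.1.10-12; Phm.2;\n",
    "\ttag2\tform2\tJob.1.3--2.4;\n",
    "$========== record 2\n",
    "name-B\tplace\n",
    "\ttag3\tform3\t1Ki.2.4,5\n",
]
records = parse_multiline_records(sample)
refs = split_references("Gen.1.10-12; 1Ki.2.4,5; Phm.2; Job.1.3--2.4;")
```

Keeping the header once per record, with subRecords nested under it, mirrors the layout described above; a caller who prefers flat rows can prepend the header fields to each subRecord.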

Error reporting

Please report all errors at Feedback@STEPBible.org. See Current reported errors.
