Badly formatted data in TTESV #40

Open
dlee opened this issue Mar 13, 2021 · 3 comments

Comments

@dlee

dlee commented Mar 13, 2021

Some lines in TTESV do not conform to the specified format. I couldn't figure out how to fix them, but most share one pattern: the word index ends in +00 and is followed by a long list of Strong's numbers.

Some examples:

$Num 1:43	02=<06485>	05=<04294>	07=<05321>	53+00=<07969>+<02572>+<00505>+<00702>+<03967>	
$Num 2:30	02=<06635>	04=<06485>	53+00=<07969>+<02572>+<00505>+<00702>+<03967>	
$Num 4:44	02=<06485>	04=<04940>	3+00=<07969>+<00505>+<03967>	
$Num 26:47	01=<00428>	04=<04940>	07=<01121>	09=<00836>	13=<06485>	53+00=<07969>+<02572>+<00505>+<00702>+<03967>	
$Num 26:62	03=<06485>	23+00=<07969>+<06242>+<00505>	05=<03605>	06=<02145>	09=<02320>	10=<01121>	12=<04605>	13=<03588>	16=<03808>	17=<06485>	18=<08432>	20=<01121>	22=<03478>	23=<03588>	26=<03808>	27=<05159>	28=<05414>	31=<08432>	33=<01121>	35=<03478>	
$Jdg 15:11	3+00=<07969>+<00505>	02=<00376>	04=<03063>	05+06=<03381>	09=<05585>	12=<05553>	14=<05862>	16=<00559>	18=<08123>	22=<03045>	25=<06430>	27=<04910>	30=<04100>	33=<02088>	37=<06213>	42=<00559>	47=<06213>	53=<06213>	
$Jdg 16:27	03=<01004>	05=<04390>	07=<00582>	09=<00802>	10=<03605>	12=<05633>	15=<06430>	17=<08033>	21=<01406>	3+00=<07969>+<00505>	25=<00376>	27=<00802>	29=<07200>	32=<08123>	33=<07832>	
$1Ki 4:32	03=<01696>	3+00=<07969>+<00505>	04=<04912>	07=<07892>	1+05=<00505>+<02568>	
$1Ki 5:16	01=<00905>	02+03=<08010>	3+00=<07969>+<00505>+<07969>+<03967>	04=<08269>	05=<05324>	06=<00834>	08=<05921>	10=<04399>	13=<07287>	16=<05971>	18+19=<06213>	21=<04399>	
$1Ch 12:27	02=<05057>	03=<03077>	08=<00175>	3+00.	<07969>+<00505>+<07651>+<03967>	
$1Ch 12:29	03=<01121>+<01144>	05=<00251>	07=<07586>	3+00=<07969>+<00505>	11=<04768>	16=<08104>	18=<04931>	21=<01004>	23=<07586>	
$1Ch 29:4	3+00=<07969>+<00505>	01=<03603>	03=<02091>	06=<02091>	08=<00211>	7+00=<07651>+<00505>	10=<03603>	12=<02212>	13=<03701>	15=<02902>	17=<07023>	20=<01004>	
$2Ch 2:2	02=<08010>	03=<05608>	70+00=<07657>+<00505>	04=<00376>	06+07=<05449>	80+00=<08084>+<00505>+<00376>	10=<02672>	13+14=<02022>	3+00=<07969>+<00505>+<08337>+<03967>	17=<05329>	
$2Ch 2:17	02=<08010>	03=<05608>	04=<03605>	06+07=<00582>+<01616>	08=<00834>	12=<00776>	14=<03478>	15=<00310>	17=<05610>	21=<01732>	23=<00001>	25=<05608>	29=<04672>	153+00=<03967>+<02572>+<00505>+<07969>+<00505>+<08337>+<03967>	
$2Ch 2:18	01=<07657>	02=<00505>	06=<06213>	08+09=<05449>	80+00=<08084>+<00505>	11=<02672>	14+15=<02022>	3+00=<07969>+<00505>+<08337>+<03967>	18=<05329>	22=<05971>	23=<05647>	
$2Ch 4:5	02=<05672>	05=<02947>	08=<08193>	10=<04639>	13=<08193>	16=<03563>	19=<06525>	22=<07799>	24=<02388>+<03557>	3+00=<07969>+<00505>	25=<01324>	
$2Ch 25:13	03=<01121>	06=<01416>	08=<00558>	10=<07725>	14=<01980>	18=<04421>	19=<06584>	21=<05892>	23=<03063>	25=<08111>	27+28=<01032>	30+31=<05221>	3+00=<07969>+<00505>	36=<00962>	37=<07227>	38=<00961>	
$2Ch 29:33	03+04=<06944>	600=<08337>+<03967>	06=<01241>	3+00=<07969>+<00505>	08=<06629>	
$2Ch 35:7	02=<02977>	03=<07311>	06=<01121>	07=<05971>	09+10=<06453>	12=<03605>	15=<04672>	16=<03532>	18=<01121>	19=<05795>	20=<04480>	22=<06629>	25=<04557>	30+00=<07970>+<00505>	3+00=<07969>+<00505>	28=<01241>	29=<00428>	31=<04480>	33+34=<04428>	35=<07399>	
$Job 1:3	02=<04735>	7+00=<07651>+<00505>	03=<06629>	3+00=<07969>+<00505>	04=<01581>	500=<02568>+<03967>	05=<06776>	07=<01241>	500=<02568>+<03967>	09+10=<00860>	12=<03966>	13=<07227>	14=<05657>	18=<00376>	21=<01419>	23=<03605>	25=<01121>	28=<06924>	

There's also this line that has a word index of 601:

$Num 26:51	04=<06485>	07=<01121>	09=<03478>	601+30=<08337>+<03967>+<00505>+<00505>+<07651>+<03967>+<07970>

There's also a line that has an invalid Strong's number (0100419):

$2Sa 15:17	03=<04428>	04+05=<03318>	07=<03605>	09=<05971>	10=<07272>	14=<05975>	17=<04801>	18=<01004>	 <0100419+04801>

I think the last entry is supposed to be 19=<01004+04801>
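
A rough way to flag lines like these automatically is sketched below in Python. It is not based on the official TTESV format description. It assumes, from the examples quoted above, that fields are tab-separated and that a well-formed field is one or more two-digit word indices joined by `+`, then `=`, then one or more five-digit Strong's numbers in angle brackets joined by `+`. The names `FIELD_RE` and `find_bad_fields` are made up for this sketch.

```python
import re

# Assumed shape of a well-formed field, inferred only from the examples in
# this issue: two-digit word indices joined by "+", then "=", then five-digit
# Strong's numbers in angle brackets joined by "+".
FIELD_RE = re.compile(r"^\d{2}(?:\+\d{2})*=<\d{5}>(?:\+<\d{5}>)*$")


def find_bad_fields(line):
    """Return the tab-separated fields of one TTESV line that look malformed."""
    # The first field is the verse reference (e.g. "$Num 4:44"); skip it.
    _ref, *fields = line.rstrip("\t\n").split("\t")
    bad = []
    for field in fields:
        index = field.split("=", 1)[0]
        # Treat a "00" word index as suspect even when the field otherwise
        # matches the pattern (e.g. "53+00=...").
        if not FIELD_RE.match(field) or "00" in index.split("+"):
            bad.append(field)
    return bad
```

Running `find_bad_fields` over every line of the file and printing the lines with a non-empty result should list the `+00` entries, the `601+30` entry, and the `<0100419+04801>` entry. It will also report fields such as `600=...` and `1+05=...`, which may or may not be errors depending on the actual specification.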

@DavidIB
Contributor

DavidIB commented Mar 13, 2021

Thanks for taking time to point these out.
This dataset is due for a complete revamp. In the future I plan to link to the individual tagged words in the ESV, i.e. I'll avoid copyright issues by not including the untagged words in between.
The dataset is also being updated to tagging that includes all Hebrew prefixes and suffixes.
This means, in the short term, I won't be fixing these issues. Sorry!

@dlee
Author

dlee commented Mar 13, 2021

Thank you for the update. Do you have an estimated timeline for the updated dataset?

@DavidIB
Contributor

DavidIB commented Mar 13, 2021 via email
