theWord import: <WT*> tags not implemented #78

Open
paul1149 opened this issue Mar 6, 2023 · 4 comments

paul1149 commented Mar 6, 2023

I'm wondering whether my command line is deficient. I have a tagged NT whose first line looks like this:

<GR>Βίβλος<Gr><EN> Book <En><WG976><WTN-NSF l="βίβλος">

Using BMC SQLite Edition 0.0.8, I use the command:

java -jar BMC.jar TheWord sblgnt.nt MyBibleZone sblgnt

The output on my Android phone, where MyBible is installed, is:
<wt>Βίβλος<WG976><WTN-NSF l="βίβλος">

  • It shows the tag <wt>,
  • it doesn't show the English "Book" (which is good, I don't want it),
  • it shows the Strong's number as a clickable tag, which is perfect,
  • it shows the entire morphology tag in plain text, which is not good.
  • Clicking the Strong's number brings up the definition, but it lacks the morphology parsing I expect.

So something is wrong with how the tags are being processed. Can I correct this?

Thanks very much.

schierlm (Owner) commented Mar 6, 2023

Hello Paul,

thank you for this bug report.

Indeed, morphology import from TheWord is not implemented:

When I implemented it in 2015, I ran into some problems and decided to release first without that support. I don't remember exactly what the problems were, but I assume they came from the intermediary format requiring both start and end markers for Strong's numbers and morphology, while TheWord has only the end markers. Then I must have forgotten about it. The <Gr> and <En> tags are not implemented either :-(

Export of morphology tags to TheWord would work fine, though, as would exporting to MyBible.Zone.

I will keep this bug open to remind me that there is still an open issue.

To correct it yourself (apart from implementing the missing code, if you know Java), I think the only viable option is to first convert to Diffable format, then use some regular expressions to fix up the tags, and then export to MyBible.Zone. Alternatively, see whether you can find your input file in a format other than TheWord.

[MyBible.Zone comes with SBLGNT, but from your description I assume that the module is more like MorphGNT, which includes morphology information. I am not aware of any MorphGNT modules for MyBible.Zone]

schierlm changed the title from "theWord -> MyBibleZone: morphology not picked up as tags" to "theWord import: <WT*> tags not implemented" on Mar 6, 2023

paul1149 (Author) commented Mar 6, 2023

Thanks schierlm, I greatly appreciate that explanation. I haven't yet found a MorphGNT module that I could convert to MyBible.Zone. If that proves impossible, can you point me to what the required input format would look like? I might be able to regex it into compliance.

Thanks much. This is a great project!

schierlm (Owner) commented Mar 6, 2023

In Diffable format, a word tagged with both a Strong's number and morphology would look like this:

<grammar strong="976" rmac="N-NSF">Βίβλος</>

I guess the format is pretty self-explanatory if you convert something to it and have a look yourself.
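As a rough illustration of the regex fix-up discussed in this thread, here is a minimal Python sketch that rewrites the sample theWord line from the first comment into the Diffable-style `<grammar>` element. It is only a sketch under assumptions: the function name `to_diffable` and the pattern are made up for illustration, the pattern is derived from that single sample line, and it assumes the raw theWord tags are still present as plain text in whatever file you run it on. It drops the English gloss and the `l="..."` lemma attribute.

```python
import re

# Hypothetical pattern, derived only from the one sample line in this thread.
WORD_RE = re.compile(
    r'<GR>(?P<word>[^<]*)<Gr>'          # Greek word wrapped in <GR>...<Gr>
    r'(?:<EN>[^<]*<En>)?'               # optional English gloss (dropped)
    r'<WG(?P<strong>\d+)>'              # Strong's number, e.g. <WG976>
    r'(?:<WT(?P<rmac>[^ >]+)[^>]*>)?'   # optional morphology; attributes ignored
)

def to_diffable(line: str) -> str:
    """Rewrite theWord-style tagged words as Diffable-style <grammar> elements."""
    def repl(m: re.Match) -> str:
        attrs = f'strong="{m.group("strong")}"'
        if m.group("rmac"):
            attrs += f' rmac="{m.group("rmac")}"'
        return f'<grammar {attrs}>{m.group("word")}</>'
    # Text that does not match the pattern is left untouched.
    return WORD_RE.sub(repl, line)

sample = '<GR>Βίβλος<Gr><EN> Book <En><WG976><WTN-NSF l="βίβλος">'
print(to_diffable(sample))
```

For the sample line this prints `<grammar strong="976" rmac="N-NSF">Βίβλος</>`, matching the example above. Real modules may contain tag variants this pattern does not cover, so check the result against an actual Diffable export before relying on it.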

If you prefer some XML with schema, you can use RoundtripXML whose XSD you can find here:
https://github.com/schierlm/BibleMultiConverter/blob/master/biblemulticonverter-schemas/src/main/resources/RoundtripXML.xsd

paul1149 (Author) commented Mar 7, 2023

Thanks much. I will look into this!
