Create hocr parser #1

Open
dcloud opened this issue Feb 23, 2014 · 6 comments
@dcloud
Contributor

dcloud commented Feb 23, 2014

Found an existing one on GitHub, but it didn't work. See if we can quickly write one of our own.

@dcloud dcloud self-assigned this Feb 23, 2014
@dcloud
Contributor Author

dcloud commented Feb 23, 2014

Looking at https://gist.github.com/dcloud/9173113, fwiw

@dcloud
Contributor Author

dcloud commented Feb 24, 2014

Basics done in 9036cdc.

@dcloud dcloud closed this as completed Feb 24, 2014
@jsfenfen

Note that spans for words are sometimes ocrx_word and sometimes just ocr_word -- in other words, the x is sometimes missing.
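
For illustration, here is a minimal sketch of how a parser could tolerate both class names. It uses only Python's standard-library `html.parser`; the names `WORD_CLASSES`, `parse_bbox`, `WordSpanParser`, and `extract_words` are hypothetical and are not taken from the gist or from commit 9036cdc. It assumes word spans carry a `title` attribute with a `bbox x0 y0 x1 y1` property, which may not hold for every OCR engine.

```python
from html.parser import HTMLParser

# Assumption: both spellings appear in real hOCR output, so accept either.
WORD_CLASSES = {"ocrx_word", "ocr_word"}


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    for prop in (title or "").split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class WordSpanParser(HTMLParser):
    """Collect (text, bbox) for every word-level span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0    # span nesting depth inside the current word span
        self._text = []
        self._bbox = None

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            # A span nested inside a word span: track it so the matching
            # end tag does not close the word early.
            self._depth += 1
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if classes & WORD_CLASSES:
            self._depth = 1
            self._text = []
            self._bbox = parse_bbox(attrs.get("title"))

    def handle_endtag(self, tag):
        if tag != "span" or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.words.append(("".join(self._text).strip(), self._bbox))

    def handle_data(self, data):
        # Text inside inline tags such as <strong> is still captured.
        if self._depth:
            self._text.append(data)


def extract_words(hocr):
    """Parse an hOCR string and return a list of (text, bbox) tuples."""
    parser = WordSpanParser()
    parser.feed(hocr)
    parser.close()
    return parser.words
```

Matching on a set of class names rather than a single string means a further engine-specific variant could be added to `WORD_CLASSES` without touching the parser logic.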

@dcloud
Contributor Author

dcloud commented Feb 24, 2014

Ah, I wasn't sure about that. Reopening. Do you know the difference (what the x means)?

@dcloud dcloud reopened this Feb 24, 2014
@jsfenfen

I dunno. I'm not sure whether it's intentional or a bug. But since I ran into this, I've gotten more skeptical about how tight the spec is...


@jsfenfen

Yeah, so fwiw ocrx_word might not be a formal part of the spec -- this doc https://docs.google.com/document/d/1QQnIQtvdAC_8n92-LhwPcjtAUFwBlzE8EWnKAxlgVf0 -- I'm not totally sure of how authoritative it is -- describes it as being part of the 'engine-specific markup'. Which gives me pause...

