[Feature request] Can you add webpages text highlighting to Read Aloud #358

Open
17 tasks
darvon123 opened this issue Nov 11, 2023 · 3 comments

Comments

@darvon123

darvon123 commented Nov 11, 2023

The feature I'd like for Read Aloud is to have the text highlighted on the webpage as it is being read to me. Instead of showing it in a small box in the top left-hand corner, why not highlight the text being read on the page itself? This would make it easier to follow along and focus on the content of the webpage.


Holy moly, I just realized how offensive this might sound, but I mean no harm. I'm just laying out the possible issues and milestones you guys might face. I'm not a great programmer, just a beginner at this stuff, and the first thing that comes to mind is to think up obstacles that could be faced and milestones that must be met to continue.


- Possible obstacles that could be stopping you guys [I'm guessing, and might be wrong]:

  • Is the reason it can't be done that Read Aloud loads the whole webpage into the speech synthesizer?

    • Possible solution: feed the synthesizer sentence by sentence [paragraph tags] [1] and fetch the text as it's needed, stopping when there is no more.
  • Alternatively, is the reason that you need a language tokenizer to tell the paragraphs, sentences, and words on a webpage apart?

    • Possible solution: use a language tokenizer library that someone else has already made, or build one yourself.
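
The "sentence by sentence" idea above could be sketched roughly like this. It is a minimal illustration, not Read Aloud's actual code: `SentenceChunk`, `splitSentences`, and `pushTrimmed` are hypothetical names, and the rule "a sentence ends at `.`, `!`, or `?` followed by whitespace" is a naive assumption that will mis-split abbreviations such as "e.g.".

```typescript
// One sentence plus its position in the original text, so a highlighter
// could later find the same range on the page.
interface SentenceChunk {
  text: string;
  start: number; // index of the first character in the source text
  end: number; // index one past the last character
}

// Adds source[start, end) to the list after trimming surrounding whitespace.
function pushTrimmed(source: string, start: number, end: number, out: SentenceChunk[]): void {
  while (start < end && /\s/.test(source.charAt(start))) start++;
  while (end > start && /\s/.test(source.charAt(end - 1))) end--;
  if (start < end) out.push({ text: source.slice(start, end), start, end });
}

// Naive splitter (an assumption, see above): a sentence ends at ".", "!" or "?"
// when that character is followed by whitespace or by the end of the text.
function splitSentences(source: string): SentenceChunk[] {
  const chunks: SentenceChunk[] = [];
  let start = 0;
  for (let i = 0; i < source.length; i++) {
    const isTerminator = ".!?".includes(source.charAt(i));
    const atBoundary = i + 1 === source.length || /\s/.test(source.charAt(i + 1));
    if (isTerminator && atBoundary) {
      pushTrimmed(source, start, i + 1, chunks);
      start = i + 1;
    }
  }
  pushTrimmed(source, start, source.length, chunks); // trailing text without a terminator
  return chunks;
}
```

Each chunk could then be handed to the synthesizer one at a time, with `start` and `end` kept around for highlighting.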

Milestones [Phase 1]

  • Be able to highlight the whole webpage
  • Be able to highlight a single paragraph on the webpage
  • Be able to highlight a single sentence on the webpage
  • Be able to highlight a single word on the webpage

Milestones [Phase 2]

  • Paragraph section goals
    • Be able to highlight the first paragraph [1]
    • Be able to find the next paragraph to highlight and read
    • Be able to keep track of any newly added paragraphs or changes to the webpage and respond accordingly
    • Add a "custom marker" at the beginning of the first and current paragraphs to show the user what has been read so far
  • Sentence section goals, a.k.a. [the parts-of-speech goals]
    • Figure out what counts as a word. I think it's just a string of letters separated by whitespace, but I might be wrong.
    • Figure out what counts as a sentence. I think it's just a series of "words" ended by a period or a closing meta-tag [2], but I might be wrong about that too.
    • Be able to highlight the first word of the first paragraph
    • Be able to find the next word in the paragraph to highlight
    • Come up with a percentage-based system that measures how much the speech synthesizer has left to read, and use it to pick the current word to highlight
    • [Bug type] Build an ignore-metadata system/[function] for the different types of Unicode characters that might cause the whole system to crash
    • [Bug type] Build an ignore-meta-tag system/[function] for cases where graphical meta-tags might interfere with the highlighting
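
Two of the Phase 2 goals, "what counts as a word" and the percentage-based system, can be sketched together. This is only an illustration under stated assumptions: `WordSpan`, `splitWords`, and `wordAtProgress` are hypothetical names, a word is taken to be any run of non-whitespace characters, and speech is assumed to advance at a constant rate per character (real voices do not, which is one source of desyncing).

```typescript
// A word plus its character range inside the paragraph.
interface WordSpan {
  text: string;
  start: number;
  end: number; // one past the last character
}

// Treats a "word" as a run of non-whitespace characters.
function splitWords(paragraph: string): WordSpan[] {
  const words: WordSpan[] = [];
  const pattern = /\S+/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(paragraph)) !== null) {
    words.push({ text: match[0], start: match.index, end: match.index + match[0].length });
  }
  return words;
}

// Estimates the index of the word being spoken from a progress fraction in
// [0, 1]. Assumes a constant speaking rate per character. Returns -1 when
// there is nothing to highlight.
function wordAtProgress(words: WordSpan[], totalLength: number, progress: number): number {
  if (words.length === 0 || totalLength <= 0) return -1;
  const clamped = Math.min(Math.max(progress, 0), 1);
  const offset = clamped * totalLength;
  const index = words.findIndex((w) => offset < w.end);
  return index === -1 ? words.length - 1 : index;
}
```

For example, `wordAtProgress(splitWords(text), text.length, 0.5)` gives the word that sits at the halfway point of `text`.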

Milestones [Phase 3]

Here's the thing about this phase: it will require a UI overhaul that I can't visualize yet, not until I have a firm grasp of CSS.
Sorry.


Footnotes

  1. I don't just mean paragraphs. I'm also including <div>, <section>, and <body>. [2]

  2. A meta-tag is just an HTML/XML tag (e.g. "</p>").

@ken107
Owner

ken107 commented Nov 11, 2023

We try to break the text into paragraphs rather than sentences, because the latency of cloud voices can make the delay between sentences too long. On the other hand, some native voices, like "Google US English", require us to break the text into chunks no longer than 15 seconds because of a limitation imposed by the voice engine. In other words, the chunk size varies depending on the voice, and we just have to accept that.
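
A rough sketch of voice-dependent chunking is below. It is not the extension's real code: `VoiceProfile`, `chunkParagraph`, and `ASSUMED_CHARS_PER_SECOND` are hypothetical, and converting the 15-second limit into a character budget with a fixed speaking rate is purely an assumption for illustration.

```typescript
// Hypothetical description of a voice. maxChunkSeconds is left undefined
// when the engine imposes no limit on chunk duration.
interface VoiceProfile {
  name: string;
  maxChunkSeconds?: number;
}

// Assumed average speaking rate, used only to turn seconds into characters.
const ASSUMED_CHARS_PER_SECOND = 15;

// Returns the whole paragraph for unlimited voices; otherwise packs words
// into chunks that stay within the estimated character budget. A single
// word longer than the budget is kept as its own oversized chunk.
// Note: whitespace is normalized, so offsets no longer match the source text.
function chunkParagraph(paragraph: string, voice: VoiceProfile): string[] {
  if (voice.maxChunkSeconds === undefined) return [paragraph];
  const maxChars = Math.floor(voice.maxChunkSeconds * ASSUMED_CHARS_PER_SECOND);
  const chunks: string[] = [];
  let current = "";
  for (const word of paragraph.split(/\s+/).filter((w) => w.length > 0)) {
    const candidate = current === "" ? word : current + " " + word;
    if (candidate.length > maxChars && current !== "") {
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

With `{ name: "Google US English", maxChunkSeconds: 15 }` this sketch would cap chunks at 225 characters, while a voice without a limit gets the paragraph unchanged.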

So we need to highlight the text on the page that corresponds to the current chunk being read. The challenge is indeed that the boundaries of the chunk, i.e. its start/end indices, may not align precisely with DOM element boundaries. The start or end index may fall in the middle of a span. This means we have to break that span into two spans just for the purpose of text highlighting. This is slightly tricky to do, as our algorithm has to work in both light mode and dark mode, and on a wide variety of websites that use varying markup and styling.

For this reason I have not invested the time in this, though I've seen that a few other extensions have been able to do it fairly successfully.
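
The boundary problem can be illustrated without a DOM by modelling the page as a list of text-node strings. This is a sketch only: `NodeSlice` and `mapChunkToNodes` are hypothetical names, and the chunk indices are assumed to count characters across the concatenated node texts.

```typescript
// The part of one text node that falls inside the chunk being read.
interface NodeSlice {
  nodeIndex: number; // position of the text node in document order
  from: number; // offset inside the node where the highlight starts
  to: number; // offset inside the node where the highlight ends
  splitsNode: boolean; // true when only part of the node is covered
}

// Maps a chunk's [chunkStart, chunkEnd) range, measured over the concatenated
// node texts, onto per-node ranges. Nodes with splitsNode === true are the
// ones that would have to be broken up before wrapping the highlight.
function mapChunkToNodes(nodeTexts: string[], chunkStart: number, chunkEnd: number): NodeSlice[] {
  const slices: NodeSlice[] = [];
  let nodeStart = 0;
  nodeTexts.forEach((text, nodeIndex) => {
    const nodeEnd = nodeStart + text.length;
    const from = Math.max(chunkStart, nodeStart);
    const to = Math.min(chunkEnd, nodeEnd);
    if (from < to) {
      slices.push({
        nodeIndex,
        from: from - nodeStart,
        to: to - nodeStart,
        splitsNode: from > nodeStart || to < nodeEnd,
      });
    }
    nodeStart = nodeEnd;
  });
  return slices;
}
```

In a browser, the per-node offsets could then be applied with the standard `Text.splitText()` method or a `Range`, though handling every site's markup is the hard part described above.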

@darvon123
Author

I'll try exploring the source code of the extensions you mention to see what makes them tick. Maybe I can find some functions they may have used to accomplish this.

The errors I've encountered with text highlighters in the past:

  1. column shifting
  2. metadata bleed [1]
  3. poor highlighter timing [2]
  4. desyncing issues
  5. bad CSS priority leading to washed-out or hard-to-read highlighted text

The other extensions: first glance

At first glance, the extensions you mention seem to have "solved" the problem of highlighting text. On closer examination, though, it seems to me that they sidestep the issue altogether by reconstructing the site's main content in their own style while keeping most of the site's formatting "the same". The main culprits I found doing this are "Speechify" and "Natural Reader". It looks perfect at a glance, but look a little closer and it feels off in some way. I find it leads to some jarring formatting errors on text-heavy sites that I visit frequently, like some text having a slightly smaller or larger font size than usual. It's easy to miss if you're not looking for it. It does allow for some cool visual aid features, though, like a dyslexia font changer and a clear reader mode.


Edge: first glance

I think the best approach is to follow how Edge highlights text: change only what needs to be changed and leave everything else alone. This does seem to cause a little column shifting occasionally.

It seems to me that Edge tends to keep it simple: it highlights the paragraph it is currently reading in light blue and the individual words in yellow. The words aren't completely in sync with what's being said, but that's kind of okay, since audio syncing is easy to ignore if you're not looking for it. Still, Edge's approach is imperfect and suffers from desyncing, especially over large spans of text.


I do believe, though, that most of the internet's webpages are pretty standardized, with much of the content being very simple HTML documents with very little CSS standing in the way.

If you're worried about CSS priority fights, why not place the highlighter under a custom tag and give it a high priority, like Edge does?

Besides, nobody expects perfection from a free open-source extension.
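
As a rough idea of what "a custom tag with a high priority" could mean, here is a small sketch that builds the CSS rule text. Everything in it is an assumption for illustration: `buildHighlightRule` is a hypothetical helper, the tag name `read-aloud-mark` and the colors are made up, and it is not a claim about how Edge implements its highlighter.

```typescript
// Builds one CSS rule for a custom highlight tag. "!important" raises these
// declarations above normal site styles, but a site rule that also uses
// "!important" with higher specificity can still win.
function buildHighlightRule(tagName: string, background: string, textColor: string): string {
  const declarations = [
    `background-color: ${background} !important`,
    `color: ${textColor} !important`,
    // Keep the wrapper inline so it does not add boxes that shift the layout.
    "display: inline !important",
  ];
  return `${tagName} { ${declarations.join("; ")}; }`;
}

// Hypothetical tag name and colors, chosen only for this example.
const exampleRule = buildHighlightRule("read-aloud-mark", "#fff59d", "#000000");
```

Setting both the background and the text color is one way to avoid the washed-out highlight problem on dark-mode pages.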

Footnotes

  1. This is what I came across when trying to make my own highlighter. [It's basically what happens when your highlighter function clips a span element, so the highlight bleeds into your main content. This also leads to poor highlighter timing and desyncing.]

  2. When your highlighter function is delayed by metadata or by a connection timeout/issue.

@nhan000

nhan000 commented Nov 14, 2023

If this ever gets implemented, please have an opt-out option. I like how this extension allows me to interact with the text (highlight, copy) without it changing the reading position like with other extensions and the Edge built-in Read Aloud feature. I'm willing to trade that for the text highlighting capability.

The best thing for my use would be to have 2 modes:

  • Highlight the text, but don't change the reading location when I interact with the text.
  • Regular mode: change the reading location when I click a word (as with other extensions).
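
The opt-out and the two modes could be modelled as a small settings object. This is a hypothetical sketch, not an existing Read Aloud option: `ClickMode`, `ReaderSettings`, and `nextReadingPosition` are invented names, and reading positions are assumed to be word indices.

```typescript
// The two requested modes.
type ClickMode = "highlight-only" | "click-to-seek";

// Hypothetical user settings. highlightEnabled is the opt-out switch.
interface ReaderSettings {
  highlightEnabled: boolean;
  clickMode: ClickMode;
}

// Decides where reading continues after the user clicks a word.
// With highlighting off, or in "highlight-only" mode, the position stays put,
// so selecting and copying text never interrupts playback.
function nextReadingPosition(settings: ReaderSettings, currentIndex: number, clickedIndex: number): number {
  if (!settings.highlightEnabled) return currentIndex;
  return settings.clickMode === "click-to-seek" ? clickedIndex : currentIndex;
}
```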
