This repository has been archived by the owner on Feb 28, 2023. It is now read-only.

Text added to cards may be incomplete #31

Open
Mincka opened this issue Aug 16, 2017 · 0 comments

Mincka commented Aug 16, 2017

When a link is shared and the user adds text of their own, that added text may be missing from the log.

In the following generated sample, "This is a test." is not included.

  <p class="TweetTextSize  js-tweet-text tweet-text" lang="" data-aria-label-part="0">How I lost my 25-year battle against corporate claptrap <a href="https://t.co/gIrbtXuRSv" rel="nofollow noopener" dir="ltr" data-expanded-url="https://www.ft.com/lucycolumn" class="twitter-timeline-link" target="_blank" title="https://www.ft.com/lucycolumn" >
        <span class="tco-ellipsis"/>
        <span class="invisible">https://www.</span>
        <span class="js-display-url">ft.com/lucycolumn</span>
        <span class="invisible"/>
        <span class="tco-ellipsis">
            <span class="invisible">&nbsp;</span>
        </span>
    </a> This is a test.</p>

This is because only the text node before the <a> element is extracted from the element returned by cssselect; the text that follows the closing </a> is dropped. A workaround could be to use text_content():

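def _parse_dm_text(self, element):
    # Select the tweet paragraph, then collect the text of all its
    # descendants, including the text that follows the <a> element.
    text_tweet = element.cssselect("p.tweet-text")[0]
    dm_text = text_tweet.text_content()
    return DirectMessageText(dm_text)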
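
The difference between the first text node and the full text content can be reproduced in isolation. The following is only a minimal sketch, not the project's code: it assumes the standard library's xml.etree.ElementTree in place of lxml (both follow the same text/tail model) and joins itertext() as a stand-in for lxml's text_content(). The sample markup is a shortened, hypothetical version of the one above.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of the tweet markup from this issue.
SAMPLE = (
    '<p class="tweet-text">How I lost my battle '
    '<a href="https://t.co/gIrbtXuRSv">'
    '<span class="js-display-url">ft.com/lucycolumn</span>'
    '</a> This is a test.</p>'
)

p = ET.fromstring(SAMPLE)

# .text only holds the text before the first child element.
first_node = p.text

# The text after </a> is stored as the "tail" of the <a> element.
link_tail = p.find("a").tail

# itertext() walks every text and tail node in document order
# (used here as a stand-in for lxml's text_content()).
full_text = "".join(p.itertext())

print(repr(first_node))
print(repr(link_tail))
print(repr(full_text))
```

The user's added text lives in the tail of the link element, which is why reading only the paragraph's first text node loses it.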

The output would be:
[2017-08-16 13:37:49] <Julien Ehrhart> [Card-summary_large_image] https://www.ft.com/lucycolumn How I lost my 25-year battle against corporate claptrap https://www.ft.com/lucycolumn This is a test.

Two issues here:

  1. The link appears twice (once during the parsing of the card, once during the parsing of the text) -> Acceptable
  2. Emojis are not part of the element's text content, so they are stripped from the output -> Not acceptable
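
One possible direction for the second issue is sketched below. It rests on an assumption that is not confirmed in this issue: that each emoji is rendered as an <img> element carrying the character in its alt attribute. The helper name text_with_emojis, the "Emoji" class, and the sample markup are all hypothetical, and the standard library's xml.etree.ElementTree is used in place of lxml to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET


def text_with_emojis(element):
    """Collect text in document order, substituting the alt text of <img>.

    Hypothetical helper: assumes emojis are <img> elements whose alt
    attribute holds the emoji character.
    """
    parts = []
    if element.tag == "img":
        parts.append(element.get("alt", ""))
    if element.text:
        parts.append(element.text)
    for child in element:
        parts.append(text_with_emojis(child))
        # The tail is the text that follows the child's closing tag.
        if child.tail:
            parts.append(child.tail)
    return "".join(parts)


# Hypothetical markup: an emoji image between two pieces of text.
SAMPLE = (
    '<p class="tweet-text">Nice column '
    '<img class="Emoji" alt="\U0001F44D" src="thumbs-up.png"/>'
    ' This is a test.</p>'
)

p = ET.fromstring(SAMPLE)
plain = "".join(p.itertext())       # the emoji is lost here
with_emojis = text_with_emojis(p)   # the emoji is kept via its alt text

print(plain)
print(with_emojis)
```

If the assumption about the markup holds, the same traversal could be adapted to lxml elements, since they expose the same tag, text, and tail attributes.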