Commit
chore: Improve email parsing using email trimmer gem (#3611)
1 parent 009abc1, commit 44486fc
Showing 9 changed files with 1,232 additions and 32 deletions.
@@ -0,0 +1,31 @@
class HtmlParser
  def self.parse_reply(raw_body)
    new(raw_body).filtered_text
  end

  attr_reader :raw_body

  def initialize(raw_body)
    @raw_body = raw_body
  end

  def document
    @document ||= Nokogiri::HTML(raw_body)
  end

  def filter_replies!
    document.xpath('//blockquote').each { |n| n.replace('> ') }
    document.xpath('//table').each(&:remove)
  end

  def filtered_html
    @filtered_html ||= begin
      filter_replies!
      document.inner_html
    end
  end

  def filtered_text
    @filtered_text ||= Html2Text.convert(filtered_html)
  end
end
@@ -0,0 +1,47 @@
MIME-Version: 1.0
Date: Thu, 19 Aug 2021 14:14:31 +0530
References: <CAFkiBVxGoURoqdkY-O_25F-8b41kb-GWBc6hh4Djd5ynwOikXA@mail.gmail.com> <0100017b5d8efc70-c7f18809-aa55-48f6-91fd-b626092ed8b3-000000@email.amazonses.com>
In-Reply-To: <0100017b5d8efc70-c7f18809-aa55-48f6-91fd-b626092ed8b3-000000@email.amazonses.com>
Message-ID: <CAFkiBVwJjO_k_e-LpiKi7MAQAKbHX5nkEPcf0y1R=bjcEHogMg@mail.gmail.com>
Subject: Re: Checking mail forwarding to cw inbox
From: Sony Mathew <sony@chatwoot.com>
To: Tejaswini <reply+6bdc3f4d-0bec-4515-a284-5d916fdde489@example.com>
Content-Type: multipart/alternative; boundary="0000000000004af64505c9e58f03"

--0000000000004af64505c9e58f03
Content-Type: text/plain; charset="UTF-8"
Yes, I am providing you step how to reproduce this issue
On Thu, Aug 19, 2021 at 2:07 PM Tejaswini from Email sender test <
tejaswini@chatwoot.com> wrote:
> Any update on this?
>
>
--
* Sony Mathew*
Software developer
*Mob:9999999999

--0000000000004af64505c9e58f03
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Yes, I am providing you step how to reproduce this issue</=
div><br><div class=3D"gmail_quote"><div dir=3D"ltr" class=3D"gmail_attr">On=
Thu, Aug 19, 2021 at 2:07 PM Tejaswini from Email sender test &l=
t;<a href=3D"mailto:tejaswini@chatwoot.com">tejaswini@chatwoot.com</a>> wrot=
e:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0=
.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <p>
</p><p>Any update on this?</p>

<p></p>
</blockquote></div><br clear=3D"all"><div><br></div>-- <br><div dir=3D"ltr"=
class=3D"gmail_signature"><div dir=3D"ltr"><div><div dir=3D"ltr"><div><div=
><b>Sony Mathew.</b><br></div><span style=3D"font-family:"times ne=
w roman",serif"><span></span><span></span>Software developer</span><br=
></div><b>Mob:9999999999</b></div></div></div></div>

--0000000000004af64505c9e58f03--
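The HTML part of this fixture is declared as `Content-Transfer-Encoding: quoted-printable`, so `=3D` stands for a literal `=` and a trailing `=` marks a soft line break. The following is a minimal sketch of how such a body could be decoded with Ruby's standard library before it reaches an HTML parser. The helper name `decode_quoted_printable` is hypothetical and not part of this commit; in the application itself the decoding is presumably handled by the mail-handling layer.

```ruby
# Hypothetical helper (not from the commit): decode a quoted-printable body.
def decode_quoted_printable(body, charset = 'UTF-8')
  # The 'M' directive of String#unpack1 turns "=XX" escapes back into bytes
  # and drops "=\n" soft line breaks. The result is binary, so it is tagged
  # with the charset declared in the part's Content-Type header.
  body.unpack1('M').force_encoding(charset)
end

# First encoded line pair of the fixture's HTML part.
encoded = "<div dir=3D\"ltr\">Yes, I am providing you step how to reproduce this issue</=\ndiv><br>"

decoded = decode_quoted_printable(encoded)
puts decoded
```

The decoded string has the `dir="ltr"` attribute restored and the `</div>` tag rejoined across the soft break, which is the form an HTML parser needs to see.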