Novelbin #2355

Open
SirGryphin opened this issue Apr 22, 2024 · 2 comments
Comments

@SirGryphin
Contributor

This happens on all novelbin sites: novelbin.me, novelbin.com, novelbin.net, and any of the multiple forks of this site. I've now noticed that when you actually open a chapter, the books are all hosted on novelbin.novel-online.org.

The Problem

It's the fancy text. If you don't know what that is, a quick Google search will show you what I mean.

Fancy symbols such as '𝓐𝓑𝓒' are Unicode characters. Unicode has a large number of characters, and some of them (mathematical alphanumerics, Latin variants, etc.) are used as fancy text symbols because they look like styled versions of ordinary letters.

The site has a feature where, if you scrape it using lncrawler or another tool, it inserts this text as a watermark at random places throughout the chapters. You can see it on the site if you're quick enough: when you click on a chapter, watch for a quick flash of text that disappears while the page is loading. I don't know how it works, but it's annoying.

Solution so far

I have to use find and replace in Sigil to fix " and ', plus a few other characters, and then run the regex [^\x00-\x7F]+ to find any remaining non-ASCII characters and remove them.
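For anyone who wants to script the same cleanup instead of doing it by hand in Sigil, here is a minimal Python sketch. It assumes the quote problem is curly quotes that should become plain " and ' (that part is a guess), and the name `clean_text` is made up for illustration. Note that stripping everything outside ASCII will also remove legitimate characters such as accented letters.

```python
import re

# Assumption: the quotes that need replacing are the curly variants.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})

# Same regex as above: any run of characters outside the ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")


def clean_text(text: str) -> str:
    """Replace curly quotes, then drop every remaining non-ASCII run."""
    text = text.translate(QUOTE_MAP)
    return NON_ASCII.sub("", text)


if __name__ == "__main__":
    print(clean_text("\u201cHello\u201d \U0001D4D0\U0001D4D1\U0001D4D2world"))
```

The quotes are replaced first so that they survive the final strip; anything else outside ASCII is removed.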

Help

I wish there was a way to remove these before the EPUB is made, or to block the site from adding them. I'm not even sure this is something anyone can fix. It's just that novelbin always has the latest chapters, and I don't know if there is a better site with clean chapters.

@CryZFix
Contributor

CryZFix commented Apr 22, 2024

Can’t we use a «cleaner» for this?

Similar to self.cleaner.bad_css.update([".thumbnail"])?

Or prepare a similar tool for this.
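A rough sketch of what such a tool might look like, as a standalone example and not the project's actual cleaner API. The class `FancyTextCleaner` and its `bad_text_regex` attribute are hypothetical names, and it assumes the watermark is written with characters from the Unicode Mathematical Alphanumeric Symbols block (U+1D400 to U+1D7FF), which may not cover every fancy font the site uses.

```python
import re

# Assumption: the watermark only uses the Mathematical Alphanumeric
# Symbols block (U+1D400..U+1D7FF). Other fancy fonts need more ranges.
FANCY_RUN = re.compile("[\U0001D400-\U0001D7FF]+")


class FancyTextCleaner:
    """Hypothetical cleaner that removes runs of fancy Unicode letters."""

    def __init__(self):
        # A set of compiled patterns, so more can be added per source.
        self.bad_text_regex = {FANCY_RUN}

    def clean_text(self, text: str) -> str:
        for pattern in self.bad_text_regex:
            text = pattern.sub("", text)
        # Collapse the gaps left behind by removed watermark words.
        return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    cleaner = FancyTextCleaner()
    print(cleaner.clean_text("He ran. \U0001D4E1\U0001D4EE\U0001D4EA\U0001D4ED She followed."))
```

Unlike a blanket non-ASCII strip, this leaves accented letters and ordinary punctuation alone and only drops the styled characters.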

@zGadli
Contributor

zGadli commented Apr 22, 2024

@CryZFix I think using cleaner should fix this.
