Will it be better to detect text first then gen the mask? #59

Open
xulihang opened this issue Sep 5, 2019 · 3 comments
Labels
Enhancement Performance optimization, Edit typos, good things etc...

Comments

@xulihang

xulihang commented Sep 5, 2019

This tool works great for removing text, but many non-text parts get recognized as text.

I think it might work better if the text regions were detected first.
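A minimal sketch of that idea, assuming a text detector that returns axis-aligned boxes and a segmentation step that returns a binary mask. Both inputs and the function name `restrict_mask_to_boxes` are hypothetical stand-ins for illustration, not SZMC's actual API:

```python
def restrict_mask_to_boxes(mask, boxes):
    """Keep mask pixels only where they fall inside a detected text box.

    mask  : list of rows of 0/1 values (hypothetical segmentation output)
    boxes : list of (x0, y0, x1, y1) half-open pixel ranges
            (hypothetical detector output)
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    filtered = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clamp each box to the image so out-of-range boxes are harmless.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                filtered[y][x] = mask[y][x]
    return filtered
```

Anything the segmentation marks outside the detected boxes is dropped, which is how the false positives on non-text parts would be suppressed.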

image

@xulihang xulihang changed the title Will it be better to detect text first then gen the mask Will it be better to detect text first then gen the mask? Sep 5, 2019
@zhangyanbo2007

Hello, let's talk. Add me on WeChat: 15821444815

@zhangyanbo2007

Do you mean doing text object detection first, and then text semantic segmentation?

@KUR-creative
Owner

KUR-creative commented Sep 10, 2019

That would be Mask R-CNN. I think that will work out well.

I'm working on improving performance.
But lately I've been too busy with other things, so I can't handle it right now...

@KUR-creative
Owner

KUR-creative commented Sep 19, 2019

For developers and researchers, I plan to release part of the data used to train SZMC later on.

Like Kaggle, I want to create a website where users submit their code and programs and have them evaluated against SZMC data.
Or we could just use Kaggle. (Though I would prefer a website dedicated to SZMC.)
With such a public place, more people, not just me, would be able to do research and share results to improve SZMC's performance. #24

But I don't know if I can do this alone. I have never built a website...
I can't handle this right now either, but I definitely want to achieve it someday.

@yu45020

yu45020 commented Sep 19, 2019

@KUR-creative
@xulihang

Based on my previous experience, splitting the task into two steps is better. You may find my project from last year useful, though I haven't had time to update it yet ><.

Mask R-CNN is overkill. MobileNet V2 is sufficient for detecting text locations, but depth-wise separable convolution was not well optimized last year, so it was even slower than a modified Xception.

My idea is to first detect text locations and generate masks, then apply image inpainting to fix the holes. If we have both the text-free images and the original images, it's very easy to generate text masks for training.
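As a rough sketch of how such masks could be generated from an image pair: mark every pixel whose grayscale value differs between the original page and its text-free version. The function name, the plain list-of-rows image format, and the `threshold` default are assumptions made for illustration, not taken from any existing project:

```python
def text_mask_from_pair(original, cleaned, threshold=30):
    """Mark pixels that differ between a page and its text-free version.

    Both images are grayscale, given as lists of rows of ints in 0..255,
    and must have the same size. `threshold` is a hypothetical value that
    ignores small brightness differences between the two scans.
    """
    if len(original) != len(cleaned):
        raise ValueError("images must have the same height")
    mask = []
    for row_o, row_c in zip(original, cleaned):
        if len(row_o) != len(row_c):
            raise ValueError("images must have the same width")
        mask.append([1 if abs(o - c) > threshold else 0
                     for o, c in zip(row_o, row_c)])
    return mask
```

A larger threshold tolerates more brightness drift between the two versions, at the cost of missing faint or anti-aliased text edges.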

The problem is that differences in image brightness and non-random text removal create a lot of noise. Different translation groups prefer to erase or translate different sets of words, so I was considering generating manga-like text on text-free images for training.
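One way to picture the synthetic approach: stamp a small binary glyph bitmap onto a text-free page, so the exact mask comes for free and no image differencing is needed. This is only a sketch under stated assumptions; the function names, the bitmap glyph format, and the list-of-rows image format are all hypothetical, and real manga-like text would need proper font rendering:

```python
import random


def stamp_glyph(page, glyph, top, left, ink=0):
    """Draw `glyph` onto a copy of `page`; return (new_page, mask).

    page  : grayscale image as a list of rows of ints (0..255)
    glyph : binary bitmap as a list of rows of 0/1 (hypothetical format)
    """
    new_page = [row[:] for row in page]          # leave the input untouched
    mask = [[0] * len(row) for row in page]
    for dy, glyph_row in enumerate(glyph):
        for dx, on in enumerate(glyph_row):
            y, x = top + dy, left + dx
            if on and 0 <= y < len(page) and 0 <= x < len(page[y]):
                new_page[y][x] = ink
                mask[y][x] = 1
    return new_page, mask


def random_training_pair(page, glyph, rng=None):
    """Stamp the glyph at a random position that keeps it inside the page."""
    rng = rng or random.Random()
    max_top = max(0, len(page) - len(glyph))
    max_left = max(0, len(page[0]) - len(glyph[0]))
    return stamp_glyph(page, glyph,
                       rng.randint(0, max_top), rng.randint(0, max_left))
```

Because the mask is produced by the same step that draws the text, it is not affected by brightness differences or by which words a translation group chose to erase.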

Image inpainting is relatively more difficult because it depends heavily on the first step. Although we can generate random masks and train the model to fill them, it is more efficient to train it to fill only text-shaped holes, because manga text is placed in regular patterns. It's rare to see text on someone's face.

@KUR-creative KUR-creative added the Enhancement Performance optimization, Edit typos, good things etc... label Sep 20, 2019