The text is not recognized from a png #182

Closed
FlorinMax opened this issue Jun 19, 2015 · 21 comments
@FlorinMax

I have this image, but Tesseract doesn't recognize the text in it:

[image: banner2]

The output after running Tesseract is:

Ammmz e um

Bzndmary Pbffiularamr
ugsmmm gmmm
Rzfiaume P3yMuiR:6aua
Stams Pay hefare 20‘ arnsrzz

FlorinMax changed the title from "The text isn ot recognized from a png" to "The text is not recognized from a png" on Jun 19, 2015
@charlesw
Owner

Not sure if you're already doing some preprocessing, but this might help. Issue #115 describes some techniques that might be useful. You can also enable the tessedit_write_images option (fixed by issue #160) to see exactly what image is being fed into Tesseract (Tesseract does some preprocessing itself). Finally, specific to your example, I'd do at least the following as a starter:

  1. Resize image to be 300dpi
  2. Set the region of interest to only the fields component\panel (assumes layout is fixed)
  3. Consider updating the word and pattern dictionaries to support the possible form field values (see https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality#Dictionaries,_word_lists,_and_patterns).

@FlorinMax
Author

I tried to follow your steps:
I resized the image, cropped it (to a small part of it), applied a grayscale conversion and set the variables. I cannot set tessedit_write_images to true, though: my method fails with "Failed to retrieve value for tessedit_write_images". So I'm posting the code; maybe something is wrong in it.

The cropped image:
[image: spscale]
After that, this is the result, which is still not good enough:
Amount
Beneficwary
Dzscnmmn
Refierenoz
scams

This is my code:
    public void CropImage()
    {
        Bitmap image = new Bitmap(localPath);
        // var rect = new Rectangle(130,10,125,70);
        var rect = new Rectangle(60,10,70,70);
        Bitmap imageCrop = image.Clone(rect, image.PixelFormat);
        imageCrop.SetResolution(300, 300);
        Graphics g = Graphics.FromImage(imageCrop);
        g.DrawImage(imageCrop, 0, 0);
        g.Dispose();
        imageCrop.Save(localPathCrop);
    }

    public void GrayScaleImage()
    {
        Bitmap c = new Bitmap(localPathCrop);
        Bitmap d;
        int rgb;

        for (int y = 0; y < c.Height; y++)
            for (int x = 0; x < c.Width; x++)
            {
                Color pixelColor = c.GetPixel(x, y);
                rgb = (int)((pixelColor.R + pixelColor.G + pixelColor.B)/3);
                c.SetPixel(x, y, Color.FromArgb(rgb, rgb, rgb));
            }
        d = c;
        d.Save(localPathGrayScale);
    }

    public void readOCR()
    {
        var pathToLangFolder = @"D:\Automation Tests\OCRTest\Tesseract-OCR";

        using (var engine = new TesseractEngine(pathToLangFolder, "eng", EngineMode.Default))
        {
            engine.SetVariable("load_system_dawg", false);
            engine.SetVariable("load_freq_dawg", false);
            engine.SetVariable("tessedit_write_imag", true);
            bool result ;

            if (engine.TryGetBoolVariable("tessedit_write_imag", out result))
            {
                Assert.AreEqual(false, result, "The values are not equal");
            }
            else
            {
                Assert.Fail("Failed to retrieve value for '{0}'.", "tessedit_write_imag");
            }

            using (Bitmap image = new Bitmap(localPathGrayScale))
            {
                using (var pix = PixConverter.ToPix(image))
                {
                    using (var page = engine.Process(pix))
                    {
                        Console.WriteLine(page.GetMeanConfidence() + " : " + page.GetText());
                    }
                }
            }
        }
    }

@charlesw
Owner

Regarding the grayscale conversion, the usual formula is 0.2126 * R + 0.7152 * G + 0.0722 * B (https://en.wikipedia.org/wiki/Grayscale); Pix exposes grayscale conversion through Pix.ConvertRGBToGray (set all parameters to 0 to use the default values defined by leptonica). I wouldn't bother with this, though, unless you're doing further processing that requires a grayscale image (like running it through a custom thresholder\binarization algorithm). Note that Tesseract already does this itself, and it's generally considered good enough for most cases.
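
For comparison, here is a minimal sketch of the weighted formula quoted above next to the plain average used in the GrayScaleImage method. The class and method names are made up for illustration and are not part of the wrapper; this only shows the per-pixel arithmetic.

```csharp
using System;

public static class Grayscale
{
    // Weighted sum using the coefficients quoted above (0.2126, 0.7152, 0.0722).
    public static byte Luminance(byte r, byte g, byte b)
    {
        double y = 0.2126 * r + 0.7152 * g + 0.0722 * b;
        return (byte)Math.Round(y);
    }

    // Unweighted mean of the three channels, as in the GrayScaleImage method.
    public static byte Average(byte r, byte g, byte b)
    {
        return (byte)((r + g + b) / 3);
    }
}
```

The two can differ noticeably on saturated colours: a pure green pixel comes out much brighter with the weighted formula than with the average.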

I'm also pretty sure you're not correctly resizing the image to 300 dpi in your crop function. If you check the output, I think you'll find that it's actually the same size. What you'll need to do is use the source resolution and work out a scaling factor from that. So, assuming the source is 70 dpi (typical screen resolution), something like the following should work:

public static Bitmap ResizeImage(Bitmap src, Single targetResolution)
{
        if(targetResolution <= 0.0f) throw new ArgumentOutOfRangeException ("targetResolution", "The target resolution must be greater than zero.");

        if(src.HorizontalResolution <= 0.0f) throw new ArgumentOutOfRangeException ("src", "The src image doesn't specify a horizontal resolution.");

        if(src.VerticalResolution <= 0.0f) throw new ArgumentOutOfRangeException("src", "The src image doesn't specify a vertical resolution.");

        Single horizontalScale = targetResolution / src.HorizontalResolution;
        Single verticalScale = targetResolution / src.VerticalResolution;

        Bitmap result = new Bitmap(src.Width * horizontalScale , src.Height * verticalScale);
        b.SetResolution(targetResolution, targetResolution )
        using (Graphics g = Graphics.FromImage((Image)b))
        {
            g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
            g.DrawImage(src, 0, 0, result .Width  , result.Height);
        }
        return b;
}
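
The scale-factor arithmetic can also be checked on its own, without touching any bitmap. The sketch below is only an illustration: DpiScaling and ScaledSize are hypothetical names, and rounding instead of truncating is my own choice, so that floating-point error cannot drop a pixel.

```csharp
using System;

public static class DpiScaling
{
    // Pixel dimensions an image needs in order to reach targetDpi,
    // given its current size and resolution.
    public static (int Width, int Height) ScaledSize(int width, int height, float sourceDpi, float targetDpi)
    {
        if (sourceDpi <= 0f) throw new ArgumentOutOfRangeException(nameof(sourceDpi));
        if (targetDpi <= 0f) throw new ArgumentOutOfRangeException(nameof(targetDpi));

        float scale = targetDpi / sourceDpi;
        // Round to the nearest pixel rather than truncating.
        return ((int)Math.Round(width * scale), (int)Math.Round(height * scale));
    }
}
```

With the 70 dpi assumption, the 70 x 70 crop from the posted code would become 300 x 300 pixels at 300 dpi.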

Finally, the code below will always fail, as you've just set tessedit_write_images to true (also note it's tessedit_write_images, not tessedit_write_imag):

engine.SetVariable("tessedit_write_images", true);
if (engine.TryGetBoolVariable("tessedit_write_images", out result))
{
    Assert.AreEqual(false, result, "The values are not equal");
}

What you probably want is something like this:

engine.SetVariable("tessedit_write_images", true);
if (engine.TryGetBoolVariable("tessedit_write_images", out result))
{
    Assert.AreEqual(true, result, "The variable 'tessedit_write_images' should be enabled.");
}

@charlesw
Owner

Note I've created an issue, #183, to support resizing\scaling Pix instances, as it's probably a common operation. No promises that it'll be implemented anytime soon.

@FlorinMax
Author

Regarding the ResizeImage method, I think that instead of b.SetResolution(targetResolution, targetResolution)
we should have src.SetResolution(targetResolution, targetResolution)? I also encountered an error on Bitmap result = new Bitmap(src.Width * horizontalScale , src.Height * verticalScale);
Argument 1 and 2: cannot convert from float to string

@charlesw
Owner

Oops, sorry, that should have been result.SetResolution(targetResolution, targetResolution) and Bitmap result = new Bitmap((int)(src.Width * horizontalScale), (int)(src.Height * verticalScale)). The method should also use result wherever it uses b.

@FlorinMax
Author

After I made your changes, this is the result, but it is still not what I expected:
Amount
Beneficxary
Descnvhun
Rekreno:
Status

@FlorinMax
Author

And this is the CropImage method after your changes
    public void CropImage()
    {
        Bitmap image = new Bitmap(localPath);
        var rect = new Rectangle(60,10,70,70);
        Bitmap imageCrop = image.Clone(rect, image.PixelFormat);
        ResizeImage(imageCrop, 300);
        imageCrop.Save(localPathCrop);
    }

@charlesw
Owner

charlesw commented Jun 25, 2015 via email

@FlorinMax
Author

Until the issue is fixed, do you have any idea how to work around this?

@FlorinMax
Author

You know what's funny? Using the same Tesseract library in Java, it works fine. I don't have to crop the image, just scale it.

@charlesw
Owner

What happens if you use the tesseract command line tool?

@FlorinMax
Author

I have some updates. After some digging, I found the problem: our method for resizing the image is not doing what we expect. Basically, the method doesn't resize the image; it draws it with the same resolution.
So, instead of:
//Bitmap result = new Bitmap((int)(src.Width * horizontalScale) , (int)(src.Height * verticalScale));
//result.SetResolution(targetResolution, targetResolution);

I added:
int width = (int)(src.Width * horizontalScale);
int height = (int)(src.Height * verticalScale);
Bitmap result = new Bitmap(src, width, height);

After this, the image has a higher resolution:
Dimensions 1334 x 375
Width 1334 pixels
Height 375 pixels
Bit depth 32
and I get all the text from the image.

As I said in the previous comments, in Java, using AffineTransform, I get an image with a better resolution:
Dimensions 640 x 180
Width 640
Height 180 pixels
Bit depth 24

When I try to obtain the same size in VS, the text is not recognized completely, so I have to use the maximum target resolution.

In conclusion, Tesseract was not the problem; our resize method was, and I think it is still not fully optimized.

@charlesw
Owner

Okay, I'll see if I can find some time this weekend to expose the resize
functionality offered by leptonica. That should solve these kinds of issues.

@charlesw
Owner

I've added a new Scale method to Pix which should work better for the use case. Can you get the latest source code and try it out? You can build a NuGet package by double clicking the build.bat file.

@FlorinMax
Author

Where can I find the build.bat file? Do I have to uninstall the Tesseract OCR package from NuGet Packages and reinstall it?

@FlorinMax
Author

Anyway... the behavior of the library is very strange. Sometimes it recognizes all the characters and numbers, sometimes not. It has some difficulty recognizing numbers, so I have to play with (increase/decrease) the target resolution to get the text from the image. I saw that the date is the hardest part of the image to recognize.
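
The trial and error over target resolutions could probably be automated along these lines. This is only a sketch under assumptions: ResolutionSweep and PickBest are hypothetical names, and the score callback stands in for "resize to this resolution, run OCR, return page.GetMeanConfidence()". A high mean confidence does not guarantee that the text is correct, so the result would still need checking.

```csharp
using System;
using System.Collections.Generic;

public static class ResolutionSweep
{
    // Tries each candidate resolution and returns the one with the highest score.
    // `score` is a hypothetical callback; in practice it would resize the image
    // to the given resolution, run OCR and return the mean confidence.
    public static float PickBest(IEnumerable<float> candidates, Func<float, float> score)
    {
        bool found = false;
        float best = 0f;
        float bestScore = 0f;
        foreach (float dpi in candidates)
        {
            float s = score(dpi);
            if (!found || s > bestScore)
            {
                found = true;
                best = dpi;
                bestScore = s;
            }
        }
        if (!found) throw new ArgumentException("No candidate resolutions were supplied.", nameof(candidates));
        return best;
    }
}
```

A call such as ResolutionSweep.PickBest(new[] { 150f, 300f, 600f }, RunOcrAt) would then return whichever resolution scored best, where RunOcrAt is whatever method wraps the resize and OCR steps.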

@charlesw
Owner

charlesw commented Jun 30, 2015 via email

@FlorinMax
Author

Sorry, but I am not so familiar with this. Maybe you can give me more details...

@FlorinMax
Author

In Tesseract-master, I found a build.bat file... is this the one?

@charlesw
Owner

Yes, however you'll need to change the branch to develop. Master only
contains released code.
