Missing testdata files for unittests #13

Open
Shreeshrii opened this issue Jan 21, 2019 · 9 comments

Comments

@Shreeshrii
Contributor

Shreeshrii commented Jan 21, 2019

testdata/lstm_training.txt is required for building training data for lstm_test

https://github.com/tesseract-ocr/tesseract/blob/master/unittest/lstm_test.cc#L6

// Generating the training data:
// If the format of the lstmf (ImageData) file changes, the training data will
// have to be regenerated as follows:

// ./tesseract/text2image --xsize=800 --font=Arial
// --text=tesseract/testdata/lstm_training.txt --leading=32
// --outputbase=tesseract/testdata/lstm_training.arial
// ./tesseract tesseract/testdata/lstm_training.arial.tif
// tesseract/testdata/lstm_training.arial lstm.train
// --pageseg_mode=6
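The two regeneration steps quoted above can be sketched as argument lists, for example to pass to `subprocess.run`. This is only a sketch under assumptions: the helper name `regeneration_commands` and the `tesseract` root directory are hypothetical, and the page segmentation mode is written as `--psm 6` (the spelling used by current `tesseract` releases) rather than the `--pageseg_mode=6` shown in the source comment.

```python
from pathlib import PurePosixPath
from typing import List


def regeneration_commands(root: str = "tesseract") -> List[List[str]]:
    """Build the two command lines described in the lstm_test.cc comment.

    Nothing is executed here; the caller decides whether to run the commands.
    The root directory layout is an assumption taken from the quoted comment.
    """
    base_dir = PurePosixPath(root)
    testdata = base_dir / "testdata"
    outputbase = testdata / "lstm_training.arial"

    # Step 1: render the training text to an image with text2image.
    render = [
        str(base_dir / "text2image"),
        "--xsize=800",
        "--font=Arial",
        f"--text={testdata / 'lstm_training.txt'}",
        "--leading=32",
        f"--outputbase={outputbase}",
    ]

    # Step 2: run tesseract with the lstm.train config to write the lstmf file.
    # Assumption: --psm 6 replaces the --pageseg_mode=6 of the original comment.
    train = [
        str(base_dir / "tesseract"),
        f"{outputbase}.tif",
        str(outputbase),
        "--psm",
        "6",
        "lstm.train",
    ]
    return [render, train]
```

Both commands still depend on `testdata/lstm_training.txt`, which is the file reported missing here.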

@Shreeshrii Shreeshrii changed the title Missing file: testdata/lstm_training.txt Missing testdata files for unittests Jan 27, 2019
@Shreeshrii
Contributor Author

Shreeshrii commented Jan 27, 2019

0146_281.3B.tif
line6.tiff
5318c4b679264.jpg
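A minimal sketch for checking which of the files named in this issue are still absent from a local checkout. It assumes all four files belong directly in one `testdata` directory, which the issue does not confirm; the helper name `missing_testdata` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

# File names as listed in this issue; their exact location is an assumption.
REQUIRED_FILES = [
    "lstm_training.txt",
    "0146_281.3B.tif",
    "line6.tiff",
    "5318c4b679264.jpg",
]


def missing_testdata(testdata_dir: str,
                     required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required file names that are not present in testdata_dir."""
    root = Path(testdata_dir)
    return [name for name in required if not (root / name).is_file()]
```

Calling `missing_testdata("testdata")` from the repository root returns the names that still need a replacement file; an empty list means every listed file is present.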

@stweil
Contributor

stweil commented Jan 28, 2019

Cc'ing @jbreiden.

@Shreeshrii
Contributor Author

@stweil Do we still need more images/testdata from Google?

@stweil
Contributor

stweil commented Jul 12, 2019

I'm afraid that we have to find our own solutions without waiting for Google. They cannot provide all images and test data because some might be copyrighted. Therefore it is important to find free replacement images and data. We have nearly all images needed for the unit tests (equationdetect_test still needs an image).

@AndersonMartins1

If you are looking for solutions to find free replacement images and data for use in unit testing, there are several options you can consider:

Free Image Banks: There are several free image banks available on the internet, where you can find high-quality, public domain images to use in your tests. Some examples include Unsplash, Pixabay and Pexels.

Test Data Databases: In addition to images, you may need test data for your unit tests. There are databases of test data freely available on the web that can be used to create realistic test scenarios. Search for open datasets related to your application domain.

Creating Images and Test Data: If you are unable to find suitable images or test data, consider creating your own. You can create simple images using free image editing tools like GIMP or Paint.NET, and generate test data using random data generation libraries in Python like Faker.

Community Resources: Don't underestimate the power of community. Search forums, discussion groups, and online communities related to your application domain. Many times, other developers are willing to share images and test data that they have created or found.

Creative Commons Licenses: When searching for free replacement images and data, be sure to check usage licenses. Many free resources are available under Creative Commons licenses, which may have specific attribution requirements or commercial use restrictions.

@stweil
Contributor

stweil commented Mar 10, 2024

Thanks, but this issue is not about finding any image. It is about finding very specific images for a very specific task which is part of the unittests.

@AndersonMartins1

To resolve this issue, you can follow these steps:

Clearly identify which specific images are required for the test cases in question.

Make sure these images are available somewhere accessible for testing. This could be in an internal image repository, a cloud storage server, or another accessible location.

If images are not available, you may need to create or purchase the necessary images and ensure they are stored in a suitable location.

After ensuring that the required images are available, you can update your unit tests to reference these specific images when running your tests.

Be sure to clearly document the image requirements for each test case so future developers know which images are needed and where to find them.

Rerun your unit tests to ensure that the images are being used correctly and that the tests are passing as expected.

By following these steps, you should be able to solve the problem of finding the specific images needed for the test cases in your unit tests.

@stweil
Contributor

stweil commented Mar 10, 2024

I am sorry to say this, but your comments (and your pull requests) are not helpful. They sound like the output of an AI chatbot. If you want to help, you should read this issue carefully (it lists the missing images), look into the test code where these images are used, and try to activate that code with replacement images.

@AndersonMartins1

Ok
