How to dynamically set the ocr.extra.commands option based on some logic? #58

Open
DEEPAK-KESWANI opened this issue Oct 28, 2018 · 2 comments

Comments

@DEEPAK-KESWANI

Hi,

I want to enable auto rotation only for the first version of a document and disable it for all later versions.

I'm using the OCRmyPDF tool with the following property in the alfresco-global.properties file.

# OCRmyPDF
ocr.extra.commands=--verbose 1 --force-ocr -l eng --output-type pdf

The above setting should apply to all versions of a document except 1.0.

I'm trying to enable auto rotation for version 1.0 by adding the code below to the properties Map, but the new value is not being picked up. Any help is appreciated.

For version 1.0, the ocr.extra.commands property should be:
ocr.extra.commands=--verbose 1 --force-ocr -l eng --output-type pdf --rotate-pages --rotate-pages-threshold 1

Thanks.

### OCRTransformWorker.java
public final void transform(ContentReader reader, ContentWriter writer, TransformationOptions options)
			throws Exception {

		File sourceFile = null;
		File targetFile = null;
		try {

			String sourceMimetype = getMimetype(reader);
			String sourceExtension = mimetypeService.getExtension(sourceMimetype);
			sourceFile = TempFileProvider.createTempFile(getClass().getSimpleName() + "_source_",
					"." + sourceExtension);
			reader.getContent(sourceFile);

			String path = sourceFile.getAbsolutePath();
			String targetPath = path.substring(0, path.toLowerCase().lastIndexOf(".")) + "_ocr.pdf";

			Map<String, String> properties = new HashMap<String, String>(1);

			properties.put(VAR_SOURCE, sourceFile.getAbsolutePath());
			properties.put(VAR_TARGET, targetPath);
                        
			// Custom code STARTS: enable auto rotation.
			// OCRExtractAction.java passes non-null options for version 1.0 and null for later versions.
			if (options != null) {
				properties.put("ocr.extra.commands",
						"--verbose 1 --force-ocr -l eng --output-type pdf --rotate-pages --rotate-pages-threshold 1");
			}
			// Custom code ENDS: enable auto rotation.

			RuntimeExec.ExecutionResult result = obtainExecuter(properties);

			if (verbose) {
				logger.info("EXIT VALUE: " + result.getExitValue());
				logger.info("STDOUT: " + result.getStdOut());
				logger.info("STDERR: " + result.getStdErr());
			}

			if (result.getExitValue() == 143) {
				logger.warn(result.getStdErr());
			} else if (result.getExitValue() != 0 && result.getStdErr() != null && result.getStdErr().length() > 0) {
				throw new ContentIOException("Failed to perform OCR transformation: \n" + result);
			}

			targetFile = new File(targetPath);
			writer.putContent(targetFile);

		} catch (Throwable t) {
			throw new RuntimeException(t);
		} finally {

		}
	}

@angelborroy-ks
Contributor

It would probably be better to declare a new OCR transformer bean with your extra options, similar to

https://github.com/keensoft/alfresco-simple-ocr/blob/master/simple-ocr-repo/src/main/resources/alfresco/module/simple-ocr-repo/context/service-context.xml#L16

and then inject this new bean into your TransformWorker at

https://github.com/keensoft/alfresco-simple-ocr/blob/master/simple-ocr-repo/src/main/resources/alfresco/module/simple-ocr-repo/context/service-context.xml#L6

Then just decide in your Java code which of the two beans to use.
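
The selection step could look roughly like the sketch below. It is a standalone illustration, not code from this project: `OcrExecuter`, `FixedArgsExecuter` and `select` are hypothetical names that stand in for the two injected command beans, and the `ocrmypdf` executable name is assumed. Only the argument strings are taken from the issue above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OcrExecuterSelection {

    /** Hypothetical stand-in for a configured command bean injected into the worker. */
    public interface OcrExecuter {
        List<String> buildCommand(String source, String target);
    }

    /** Extra arguments are fixed at construction time, like a bean's configured properties. */
    public static final class FixedArgsExecuter implements OcrExecuter {
        private final List<String> extraArgs;

        public FixedArgsExecuter(String extraArgs) {
            this.extraArgs = Arrays.asList(extraArgs.trim().split("\\s+"));
        }

        @Override
        public List<String> buildCommand(String source, String target) {
            List<String> command = new ArrayList<>();
            command.add("ocrmypdf"); // assumed executable name
            command.addAll(extraArgs);
            command.add(source);
            command.add(target);
            return command;
        }
    }

    // Argument strings copied from the issue text.
    public static final String BASE_ARGS = "--verbose 1 --force-ocr -l eng --output-type pdf";
    public static final String ROTATE_ARGS = BASE_ARGS + " --rotate-pages --rotate-pages-threshold 1";

    // In the real module these would be two separately declared beans.
    public static final OcrExecuter DEFAULT_EXECUTER = new FixedArgsExecuter(BASE_ARGS);
    public static final OcrExecuter ROTATING_EXECUTER = new FixedArgsExecuter(ROTATE_ARGS);

    /** Picks the rotating executer only for version label "1.0"; anything else gets the default. */
    public static OcrExecuter select(String versionLabel) {
        return "1.0".equals(versionLabel) ? ROTATING_EXECUTER : DEFAULT_EXECUTER;
    }

    public static void main(String[] args) {
        System.out.println(select("1.0").buildCommand("in.pdf", "in_ocr.pdf"));
        System.out.println(select("1.1").buildCommand("in.pdf", "in_ocr.pdf"));
    }
}
```

The point of the design is that each executer carries its own fixed arguments, so the worker only has to choose between two ready-made objects instead of rewriting a configured command at runtime.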

@DEEPAK-KESWANI
Author

Thanks a lot for your quick response and inputs. It worked for me. :)
