[BUG]"Error opening data file /usr/share/eng.traineddata" error, regardless of TESSDATA_PREFIX #1412
Comments
I came across this bug while facing the same issue. I noticed there is a wget for the traineddata in the ccextractor/linux/build-static.sh file,
though the path appears to have changed. All the traineddata files are still available at https://github.com/tesseract-ocr/tessdata.git, so I downloaded eng.traineddata from GitHub and copied it to the tesseract tessdata dir.
This then allowed me to run ccextractor against a file with burned-in subs (no need to set TESSDATA_PREFIX), and it (mostly) worked: it ran, at least, and generated an SRT file. I think I just need to play around with some thresholds to get more accurate OCR.
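For anyone trying the same workaround, here is a rough sketch of a check for where eng.traineddata actually sits before running ccextractor. The helper names (find_traineddata, candidate_dirs) are made up for illustration, and the directory list is only an assumption drawn from paths mentioned in this thread (/usr/local/share/tessdata and the /usr/share from the error message); it is not CCExtractor's real lookup order.

```python
import os
from pathlib import Path
from typing import Iterable, List, Mapping, Optional


def find_traineddata(lang: str, candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first existing <lang>.traineddata among the candidate directories."""
    for directory in candidates:
        path = Path(directory) / f"{lang}.traineddata"
        if path.is_file():
            return path
    return None


def candidate_dirs(env: Mapping[str, str]) -> List[Path]:
    """Build a list of directories worth checking (hypothetical, not CCExtractor's logic)."""
    dirs: List[Path] = []
    prefix = env.get("TESSDATA_PREFIX")
    if prefix:
        # Check both the prefix itself and a tessdata subdirectory under it.
        dirs += [Path(prefix), Path(prefix) / "tessdata"]
    # Paths taken from this thread; adjust for your own machine.
    dirs += [Path("/usr/local/share/tessdata"), Path("/usr/share")]
    return dirs


if __name__ == "__main__":
    found = find_traineddata("eng", candidate_dirs(os.environ))
    print(found if found else "eng.traineddata not found in any candidate directory")
```

If nothing is found, copying eng.traineddata into one of the listed directories is the workaround described above.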
btw - if you prefix ccextractor with
|
At least on Mac, with tesseract 5 installed via brew, the tessdata directory /usr/local/share/tessdata is never found. Minor changes in CMakeLists.txt are also required to build on macOS Big Sur. |
Could you please check if this is still an issue on the latest master? Should have been fixed by #1479 |
Can't test for hardsub
|
FYI, for my use cases (with these options: -DWITH_OCR=ON -DWITHOUT_RUST=ON), it is OK. I still needed to apply minor CMake modifications to be able to build it on macOS Big Sur (see tesseract_5_mac.patch.zip above) |
@prateekmedia could you look into this? Seems like the |
Two environment variables need to be set; see the Linux CI in
build_ocr_hardsubx
…On Thu, 16 Mar, 2023, 04:38 Punit Lodha, ***@***.***> wrote:
Can't test for hardsub
~/ccextractor-master/linux % RUST_BACKTRACE=full ./build_hardsubx
Running pre-build script...
Obtaining Git commit
Git command not present, trying folder approach
Storing variables in file
Commit: Unknown
Date: 2023-03-15
Stored all in compile_info_real.h
Done.
Trying to compile...
Checking for cargo...
rustc >= MSRV(1.54.0)
Building rust files...
Compiling rusty_ffmpeg v0.10.0+ffmpeg.5.1
Compiling ccx_rust v0.1.0 (/home/me/ccextractor-master/src/rust)
error: failed to run custom build command for `rusty_ffmpeg v0.10.0+ffmpeg.5.1`
Caused by:
process didn't exit successfully: `/home/me/ccextractor-master/src/rust/../../linux/rust/release/build/rusty_ffmpeg-0ee1e7c47bb1286e/build-script-build` (exit status: 101)
--- stderr
thread 'main' panicked at 'No linking method set!', /home/me/.cargo/registry/src/github.com-1ecc6299db9ec823/rusty_ffmpeg-0.10.0+ffmpeg.5.1/build.rs:343:13
stack backtrace:
0: 0x5603810c832d - std::backtrace_rs::backtrace::libunwind::trace::h8217d0a8f3fd2f41
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x5603810c832d - std::backtrace_rs::backtrace::trace_unsynchronized::h308103876b3af410
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x5603810c832d - std::sys_common::backtrace::_print_fmt::hc208018c6153605e
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:66:5
3: 0x5603810c832d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hf89a7ed694dfb585
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:45:22
4: 0x5603810edecc - core::fmt::write::h21038c1382fe4264
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/fmt/mod.rs:1197:17
5: 0x5603810c4ba1 - std::io::Write::write_fmt::h7dbb1c9a3c254aef
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/io/mod.rs:1672:15
6: 0x5603810c9c05 - std::sys_common::backtrace::_print::h4e8889719c9ddeb8
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:48:5
7: 0x5603810c9c05 - std::sys_common::backtrace::print::h1506fe2cb3022667
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:35:9
8: 0x5603810c9c05 - std::panicking::default_hook::{{closure}}::hd9d7ce2a8a782440
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:295:22
9: 0x5603810c9926 - std::panicking::default_hook::h5b16ec25444b1b5d
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:314:9
10: 0x5603810ca196 - std::panicking::rust_panic_with_hook::hb0138cb6e6fea3e4
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:698:17
11: 0x5603810ca049 - std::panicking::begin_panic_handler::{{closure}}::h4cb67095557cd1aa
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:586:13
12: 0x5603810c87e4 - std::sys_common::backtrace::__rust_end_short_backtrace::h2bfcac279dcdc911
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/sys_common/backtrace.rs:138:18
13: 0x5603810c9db9 - rust_begin_unwind
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:584:5
14: 0x560380bddf13 - core::panicking::panic_fmt::h1de71520faaa17d3
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:142:14
15: 0x560380be17fa - build_script_build::static_linking::h5176694b9f3d639f
16: 0x560380be207f - build_script_build::main::h1d1e33981c90d847
17: 0x560380be7783 - core::ops::function::FnOnce::call_once::hd5fa772c5bfe5459
18: 0x560380be2259 - std::sys_common::backtrace::__rust_begin_short_backtrace::ha9a647e66b4a3dc7
19: 0x560380be7479 - std::rt::lang_start::{{closure}}::h824bfacb332716dd
20: 0x5603810c082e - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h4937aaa125c8d4b2
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ops/function.rs:280:13
21: 0x5603810c082e - std::panicking::try::do_call::h6f5c70e8b0a34f92
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:492:40
22: 0x5603810c082e - std::panicking::try::h68766ba264ecf2e2
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:456:19
23: 0x5603810c082e - std::panic::catch_unwind::hc36033d2f9cc04af
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panic.rs:137:14
24: 0x5603810c082e - std::rt::lang_start_internal::{{closure}}::h78c037f4a1a28ded
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/rt.rs:128:48
25: 0x5603810c082e - std::panicking::try::do_call::he6e1fffda4c750ee
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:492:40
26: 0x5603810c082e - std::panicking::try::h48a77ddbb2f4c87a
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:456:19
27: 0x5603810c082e - std::panic::catch_unwind::hfa809b06a550a9e7
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panic.rs:137:14
28: 0x5603810c082e - std::rt::lang_start_internal::h4db69ed48eaca005
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/rt.rs:128:20
29: 0x560380be7461 - std::rt::lang_start::h0fe959b208925438
30: 0x560380be2143 - main
31: 0x7fefa92e3790 - <unknown>
32: 0x7fefa92e384a - __libc_start_main
33: 0x560380bde205 - _start
34: 0x0 - <unknown>
warning: build failed, waiting for other jobs to finish...
Failed.
@prateekmedia <https://github.com/prateekmedia> could you look into this?
Seems like the build_hardsubx script is broken by adding rusty_ffmpeg
which is a dependency of rsmpeg
|
@rezad1393 can you try adding the env variables, FFMPEG_INCLUDE_DIR and FFMPEG_PKG_CONFIG_PATH and then trying again?
or to whatever the correct path for your machine is |
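A quick preflight along these lines can show whether the two variables are visible to the build before re-running build_hardsubx. This is only a sketch: the function names are hypothetical, and treating each variable as a single existing directory is an assumption (FFMPEG_PKG_CONFIG_PATH in particular might hold something else on your setup).

```python
import os
from pathlib import Path
from typing import List, Mapping

# The two variable names come from the comment above.
REQUIRED_VARS = ("FFMPEG_INCLUDE_DIR", "FFMPEG_PKG_CONFIG_PATH")


def missing_ffmpeg_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def nonexistent_dirs(env: Mapping[str, str]) -> List[str]:
    """Return the names of variables that are set but do not name an existing directory.

    Assumption: each variable holds one directory path.
    """
    return [
        name
        for name in REQUIRED_VARS
        if env.get(name) and not Path(env[name]).is_dir()
    ]


if __name__ == "__main__":
    for name in missing_ffmpeg_vars(os.environ):
        print(f"{name} is not set")
    for name in nonexistent_dirs(os.environ):
        print(f"{name} points at a missing directory: {os.environ[name]}")
```

If it prints nothing, both variables are set and point at existing directories, so the "No linking method set!" panic would need another explanation.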
@PunitLodha @cfsmp3 when do you think we could see a new release? |
When we can merge all the pending PRs I guess. |
To get the version of CCExtractor, you can use --version.
CCExtractor version: CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
In raising this issue, I confirm the following:
Necessary information
Video links
anything really
Additional information
Whatever I set TESSDATA_PREFIX to, ccextractor still reports the same error with the same path.
Setting TESSDATA_PREFIX does affect tesseract itself, so I know the variable is not the problem,
but CCExtractor seems to look at a hardcoded path.
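One way to make that observation reproducible is to pull the path out of the error line and compare it with whatever TESSDATA_PREFIX was set to. The sketch below assumes the error text has the form shown in the issue title; path_from_error and honours_prefix are hypothetical helper names, not part of CCExtractor or tesseract.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Matches the message from the issue title; stops at whitespace or a quote.
ERROR_RE = re.compile(r"Error opening data file ([^\s\"']+)")


def path_from_error(line: str) -> Optional[PurePosixPath]:
    """Extract the data file path from an 'Error opening data file ...' line."""
    match = ERROR_RE.search(line)
    return PurePosixPath(match.group(1)) if match else None


def honours_prefix(error_line: str, prefix: str) -> Optional[bool]:
    """True if the reported path lies under prefix, False if not, None if no path was found."""
    path = path_from_error(error_line)
    if path is None:
        return None
    return PurePosixPath(prefix) in path.parents


if __name__ == "__main__":
    line = "Error opening data file /usr/share/eng.traineddata"
    # /home/me/tessdata is a placeholder for whatever TESSDATA_PREFIX was set to.
    print(honours_prefix(line, "/home/me/tessdata"))
```

A False result for every prefix you try is consistent with the path being fixed at build time rather than read from the environment.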