Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

makescene crashes when importing PNG files #368

Closed
MaxDidIt opened this issue Sep 4, 2017 · 30 comments
Closed

makescene crashes when importing PNG files #368

MaxDidIt opened this issue Sep 4, 2017 · 30 comments

Comments

@MaxDidIt
Copy link

MaxDidIt commented Sep 4, 2017

I have an issue with the makescene command I can't resolve myself.

I have cloned and build the repository according to the instructions in the Readme.md under Ubuntu 16.04 64 bit. The build compiles without errors and all the apps are there.

However, if I run the makescene command with the parameters -i <IMAGE_FOLDER> <SCENE_FOLDER>, it will only create the scene folder and then crashes with the message "Ungültiger Maschinenbefehl (Speicherabzug geschrieben)" (I run the German version of Ubuntu)

I have added output messages to the code to see how far the command gets executed and apparently, the crash occurs in the image_io.cc file, in the load_png_file function, around the lines 311-314, where the pointers are set up.

I don't know how to fix it, though. The values for headers.height and headers.channels seem to make sense.

@simonfuhrmann
Copy link
Owner

simonfuhrmann commented Sep 5, 2017

A crash in image_io is relatively unlikely. Can you try to write a test program and load the image manually?

#include "mve/image_io.h"
int main() {
  mve::image::load_png_file("your path");
  return 0;
}

Is there anything else special about the image? Is it an 8 or 16 bit PNG?

@MaxDidIt
Copy link
Author

MaxDidIt commented Sep 6, 2017

Okay, I have compiled the test program and try to load a PNG that is in the same folder as the test app. I get the same result, the program crashes with "Ungültiger Maschinenbefehl (Speicherabzug geschrieben)".

However, if I change the function call to "load_tiff_file" and load a TIFF file instead, it seems to run fine. Maybe the load_png_file function is broken specifically?

@MaxDidIt
Copy link
Author

MaxDidIt commented Sep 6, 2017

Not sure if this is of any help, but I have changed the code back to load_png_file and used strace to get the system logs related to this app. These are the last few lines of that log before it crashes:

14:00:46.197536 open("./frame_0001.png", O_RDONLY) = 3
14:00:46.197561 fstat(3, {st_mode=S_IFREG|0664, st_size=407653, ...}) = 0
14:00:46.197582 read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\5\0\0\0\2\320\10\2\0\0\0@\37J"..., 4096) = 4096
14:00:46.197651 mmap(NULL, 2768896, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fcadc163000
14:00:46.198666 --- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x40b69d} ---
14:00:46.337334 +++ killed by SIGILL (core dumped) +++

@simonfuhrmann
Copy link
Owner

Do you mind sending me a link to that image?

@flanggut
Copy link
Collaborator

flanggut commented Sep 6, 2017

My guess is the image just has a png extension but isn’t actually a png... I’ve seen similar issues where a “.png” file was actually jpeg encoded.

@MaxDidIt
Copy link
Author

MaxDidIt commented Sep 6, 2017

http://maxdid.it/gamejam/img/frame_0001.png

I used ffmpeg to export the frames of a video into image files.

I thought it might be the wrong file format as well, but I have used the convert command to convert it from png to tiff and back, with no effect.

@simonfuhrmann
Copy link
Owner

The file loads fine here.

open("frame_0001.png", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0640, st_size=691915, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5395d8a000
read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\5\0\0\0\2\320\10\2\0\0\0@\37J"..., 4096) = 4096
mmap(NULL, 2768896, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5393b87000
read(3, "\7\0?)\363\207i\27\30x'\356\330\214Xg\7\260\334W\23G\371\323\31 \323\2S\273\353\\"..., 4096) = 4096
...

@simonfuhrmann
Copy link
Owner

According to your strace, your program is killed by SIGILL (not KILL). I believe this means that your processor or OS is attempting to execute an invalid instruction. I don't know what to make of this, but I believe it's your system, or your libpng library that is somehow messed up. Can you try reinstalling/upgrading libpng?

@MaxDidIt
Copy link
Author

MaxDidIt commented Sep 7, 2017

So, I have uninstalled the libpng version I installed via apt-get install and instead downloaded the libpng repository directly, compiled it an installed that.

Unfortunately, it has the same effect. Maybe it really is my processor? I currently use a AMD Phenom II X4 965 processor, which isn't the newest model, I guess. Still, I don't think I ever had any similar problems before.

Anyways, since you guys can't reproduce the problem, I guess I will close the issue.

@MaxDidIt MaxDidIt closed this as completed Sep 7, 2017
@simonfuhrmann
Copy link
Owner

I sorry to hear it still doesn't work. I don't really think it's your hardware... but I am out of ideas. Maybe there are compile-time flags to disable certain acceleration features for libpng? At this point I'm just guessing without really knowing what's causing the issue.

@timlgy
Copy link

timlgy commented Nov 1, 2017

I know why makescene crashes when importing image files after many experiments. It's not the reason of libjpeg or libpng, the actual reason is openMP. You could fix this bug with these codes in makescene.cc:
845 Line
mve::ImageBase::Ptr image;
#pragma omp parallel for ordered schedule(dynamic,1)
for (std::size_t i = 0; i < dir.size(); ++i)
....
std::string exif;
#pragma omp critical
image = load_any_image(afname, &exif);
...

@simonfuhrmann
Copy link
Owner

But that change doesn't make a lot of sense. You're storing the image in the image variable that is shared across all threads, and although loading the image is serialized now, you will overwrite the image due to race conditions and use the wrong data after loading the image...

Have you tried just putting the #pragma omp critical before image loading?

@andre-schulz
Copy link
Collaborator

@timlgy, can you please tell us a bit more about your system? Which CPU, operating system and compiler are you using?

@timlgy
Copy link

timlgy commented Nov 3, 2017

It works for me on Ubuntu 14.04.5 LTS, 16U32G and compiler is gcc 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) P.S. 'wrong data' what is that? prebundle.sfm?

@timlgy
Copy link

timlgy commented Nov 3, 2017

according to this article #pragma omp critical
The omp critical directive identifies a section of code that must be executed by a single thread at a time.
So I thought maybe 'load_any_image' is dangerous when putting this code inside mve::ImageBase::Ptr imagel=oad_any_image(afname, &exif);

@simonfuhrmann
Copy link
Owner

load_any_image just uses libpng in the background which is thread-safe when properly used, which I hope we do. By "wrong data", I mean that, in the above code, image could be overwritten with another image before it's used and thus you end up with the wrong image data in your view.

Now just to get a feeling for this issue, who is actually affected by these crashes?

@timlgy
Copy link

timlgy commented Nov 8, 2017

When I load only one image(any type jpg tif or png) it is OK, but when I load two images(any type) it crashes with a high probability.

@simonfuhrmann
Copy link
Owner

simonfuhrmann commented Nov 9, 2017

I don't know what causes the issue. Loading of images should be possible in parallel, and I have never seen issues. Is this crash maybe related to your available memory, if you load really large images? I can imagine that loading 32 images (in case you have that many cores) in parallel on a 2GB RAM machine will cause issues.

In the meantime, just put a line #pragma omp critical before load_any_image, which should resolve the issue.

@andre-schulz
Copy link
Collaborator

@timlgy, can you post a backtrace of the crash?
Also, what do you mean with "16U32G"?

@timlgy
Copy link

timlgy commented Dec 14, 2017

@andre-schulz 16U32G means CPU 16 multiple threads and 32GB Mem

@stiv-yakovenko
Copy link

I observe same crash on Windows 7 64x with lastest mve build by instruction:

crash

@stiv-yakovenko
Copy link

Tried to ''put a line #pragma omp critical before load_any_image", but this doesn't help, in this case I can't build. How can we solve this? What is purpose of using OpenMP here?

image

@stiv-yakovenko
Copy link

Tried to import just one image: I also get same crash.

@andre-schulz
Copy link
Collaborator

Hi @stiv-yakovenko,
thanks for the information. Is it possible for you to post the png file that's causing the problem?
Also, which CPU are you using?

@stiv-yakovenko
Copy link

stiv-yakovenko commented Mar 26, 2018

0.zip

@stiv-yakovenko
Copy link

cpuz

@andre-schulz
Copy link
Collaborator

andre-schulz commented Mar 26, 2018

The problem seems to be that makescene was compiled in Debug mode but linked to libpng in Release mode; probably a problem with linking against different runtime libraries (/MD vs. /MDd). This is a problem with the way the current build system works. I have to investigate this further.
Can you confirm that makescene works in RelWithDebInfo mode?

If you need to run makescene in Debug mode, you can compile the 3rdparty libraries in Debug mode by adding Debug to CMAKE_CONFIGURATION_TYPES in the 3rdparty CMakeLists.txt. Then, recompile and copy the debug versions of the libraries by hand to the location of the Debug version of makescene. I'll try to find out if I can somehow fix/automate this.

@stiv-yakovenko
Copy link

I confirm ReleaseWithDeb version works.

@CaptainEven
Copy link

@andre-schulz Hi, thanks for your analysis about debug mode. I have re-compile the 3rdparty libraries in Debug mode and add tiffd.dll, zlibd.dll to the Debug version of makescene. The program still crashed in

png_read_image(png, &row_pointers[0]);

image

@CaptainEven
Copy link

@andre-schulz It's OK now! After a complete compiling of debug after modifying CMAKE_CONFIGURATION_TYPES in CMakeLists.txt of the 3rdparty libraries and modifying input lib configuration of MVE.sln, it's resolved for the jpg loading problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants