Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Processable image validation seems not working in all cases #199

Open
floganz opened this issue Sep 8, 2023 · 4 comments
Open

Processable image validation seems not working in all cases #199

floganz opened this issue Sep 8, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@floganz
Copy link

floganz commented Sep 8, 2023

Hi,

I've noticed that processable_image validation works kind of strange. It's possible to create any invalid image file (like create .txt file and then change extension to .png) and upload it even if that validation enabled in the model.

Here is test application I've created to show the issue.
There's both instruction how to reproduce it locally via interface and test that indicated the same behaviour. Sample images also added to that repo.

I'm using Vips for image analysis and it looks like all needed for vips libs installed.
Running vips --vips-config on my machine next.

Output

enable debug: false
enable deprecated: true
enable modules: true
enable cplusplus: true
enable RAD load/save: true
enable Analyze7 load/save: true
enable PPM load/save: true
enable GIF load: true
use fftw for FFTs: true
accelerate loops with ORC: true
ICC profile support with lcms: true
zlib: true
text rendering with pangocairo: true
font file support with fontconfig: true
EXIF metadata support with libexif: true
JPEG load/save with libjpeg: true
JXL load/save with libjxl: true (dynamic module: true)
JPEG2000 load/save with OpenJPEG: true
PNG load/save with libspng: true
PNG load/save with libpng: false
selected quantisation package: imagequant
TIFF load/save with libtiff: true
image pyramid save with libgsf: true
HEIC/AVIF load/save with libheif: true (dynamic module: true)
WebP load/save with libwebp: true
PDF load with PDFium: false
PDF load with poppler-glib: true (dynamic module: true)
SVG load with librsvg: true
EXR load with OpenEXR: true
OpenSlide load: true (dynamic module: true)
Matlab load with libmatio: true
NIfTI load/save with niftiio: false
FITS load/save with cfitsio: true
GIF save with cgif: true
selected Magick package: MagickCore (dynamic module: true)
Magick API version: magick7
Magick load: true
Magick save: true

@Mth0158 Mth0158 added the bug Something isn't working label Nov 12, 2023
@Mth0158
Copy link
Collaborator

Mth0158 commented Nov 12, 2023

Thanks @floganz for this issue.
It's a bit late, but I'll try to investigate it in the coming days. Do you have any update on that issue btw?

@floganz
Copy link
Author

floganz commented Nov 17, 2023

I've debugged gem a bit and discovered track trace for this issue and possible solution. However I'm lacking context why it was made in that way, maybe you know why. @Mth0158

So, basically for processable_image is responsible ProcessableImageValidator class which delegated decision of file being image or not to another class Metadata.new(file).valid?, e.g. here.

Inside Metadata defined #valid? which seems OK.

def valid?
  read_image
  true
rescue InvalidImageError
  false
end

But #read_image has suspicious behaviour

def read_image
  # different ways to build a path to file
  ...
  image = new_image_from_path(tempfile.path) # or other way to get path to file

  raise InvalidImageError unless valid_image?(image)
  yield image if block_given?
rescue LoadError, NameError
  logger.info "Skipping image analysis because the mini_magick or ruby-vips gem isn't installed"
  {}
rescue exception_class => error
  logger.error "Skipping image analysis due to an #{exception_class.name.split('::').map(&:downcase).join(' ').capitalize} error: #{error.message}"
  {}
ensure
  image = nil
end

So here we have two rescues: rescue LoadError, NameError and rescue exception_class => error. Both will just skip the check, if those are triggered we're not checking for file being opened and silently return nothing to #valid? which in the end returns true in these cases.

What I don't get why if we could not read the file at all we return true? My understanding that if file cannot be read it's should invalid unless opposite is confirmed.

So, my guess on fixing it is to modify rescues in the #read_image so that they throw InvalidImageError together with those prints in the log.

@Mth0158
Copy link
Collaborator

Mth0158 commented Nov 21, 2023

Hi @floganz
Thanks for the investigation, it helps a lot. I'll have a look at it in the following days :)
@igorkasyanchuk do you have more context for this issue? It's a bit suspicious that we return true with valid? when skipping analysis?

@igorkasyanchuk
Copy link
Owner

Hi @floganz @Mth0158 to be honest I don't remember, maybe it was not me who wrote it 🤣 . I would only suggest starting with specs that fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants