Validate CSVw schema throws error #486

Open
quadrophobiac opened this issue Jun 26, 2017 · 3 comments

Comments

@quadrophobiac
Collaborator

Steps to Reproduce (for problems)

  1. Add spec/fixtures/valid-cotw.csv dataset to Octopub file upload
  2. Attach spec/fixtures/schemas/csv-on-the-web-schema.json for schema
  3. Click upload
  4. Octopub will hang

Current Behaviour (for problems)

The logs indicate that the problem occurs in the method get_file_for_validation_from_file: https://github.com/theodi/octopub/blob/master/app/models/dataset_file.rb#L156-L158
This raises the exception WARN: TypeError: no implicit conversion of StringIO into String. It's not clear whether this exception is what causes the service to hang.

Your Environment

  • macOS Sierra, accessing Octopub through the Chrome browser
@quadrophobiac
Collaborator Author

I'm running into problems reproducing this locally: the same steps fail much earlier, within the create-schema flow, with the following error:

method=POST path=/datasets format=*/* controller=DatasetsController action=create status=500 
error='ActiveModel::UnknownAttributeError: unknown attribute 'owner_username' for DatasetFileSchema.' duration=487.74 view=0.00 db=0.00
ActiveModel::UnknownAttributeError - unknown attribute 'owner_username' for DatasetFileSchema.:

This appears to relate to the following part of dataset_file_schema.rb:

validates_presence_of :owner_username, message: 'Please select an owner for the schema'

@quadrophobiac
Collaborator Author

A similar error is encountered if you follow these steps:

  1. Add spec/fixtures/valid-cotw.csv dataset to Octopub file upload
  2. Upload without adding schema
  3. Add spec/fixtures/schemas/csv-on-the-web-schema.json through form at https://octopub.io/dataset_file_schemas/new
  4. Edit dataset added in Step 1
  5. Select schema added in Step 3
  6. Octopub will hang

In this sequence of steps the same method hangs but a different error is thrown:
NoMethodError: undefined method 'tempfile' for nil:NilClass

@caiwilliamson
Collaborator

I reproduced the issue with the original instructions. The issue starts in dataset_file.rb, in validate_schema_cotw, at the line tempfile = get_file_for_validation_from_file, which calls the following method:

def get_file_for_validation_from_file
  File.new(file.tempfile)
end

File.new(file.tempfile) tries to open a file from a StringIO, which doesn't work because File.new expects a path rather than an in-memory IO object.
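
The failure can be reproduced outside Octopub. The following is a minimal sketch, not code from the project: the CSV content and the helper name open_as_file are made up for illustration, and it only assumes that file.tempfile holds a StringIO, as the logged error suggests.

```ruby
require "stringio"

# Stand-in for the in-memory upload; the CSV content is invented.
io = StringIO.new("id,name\n1,example\n")

# Hypothetical helper: File.new expects a path (or an integer file
# descriptor), so handing it a StringIO fails during the implicit
# conversion to String. The error message is returned for inspection.
def open_as_file(source)
  File.new(source)
rescue TypeError => e
  e.message
end

result = open_as_file(io)
puts result
```

The printed message should match the TypeError reported in the logs.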

Furthermore, later in validate_schema_cotw, some code tries to get the path from the tempfile, which doesn't work since the contents are never saved to disk.

The solution is to create the tempfile explicitly, by doing something like:

def self.file_from_url_with_storage_key(file, storage_key)
  Rails.logger.info "DatasetFile: In file_from_url_with_storage_key"

  fs_file = FileStorageService.get_string_io(storage_key)
  tempfile = Tempfile.new
  tempfile.write fs_file
  tempfile.rewind
  ActionDispatch::Http::UploadedFile.new filename: File.basename(file),
                                         # content_type: 'text/csv',
                                         tempfile: tempfile
end

def get_file_for_validation_from_file
  file.tempfile
end
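
As a standalone illustration of the same idea, here is a minimal sketch of copying an in-memory IO into a Tempfile so that both its contents and a real path are available. The helper name string_io_to_tempfile and the sample CSV are hypothetical, and it assumes FileStorageService.get_string_io returns a StringIO, as its name suggests. Under that assumption it may be worth writing the result of read rather than the StringIO object itself, since IO#write converts non-string arguments with to_s.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper: copy an in-memory IO into a Tempfile so that
# callers can use both the contents and a real filesystem path.
def string_io_to_tempfile(io, basename = "dataset")
  tempfile = Tempfile.new([basename, ".csv"])
  tempfile.binmode
  # Write the IO's contents; passing the StringIO object itself to
  # write would store its default to_s representation, not the data.
  tempfile.write(io.read)
  tempfile.rewind
  tempfile
end

# Invented sample data standing in for the stored CSV.
csv = StringIO.new("id,name\n1,example\n")
tempfile = string_io_to_tempfile(csv)

contents = tempfile.read
path_exists = File.exist?(tempfile.path)
```

Because the Tempfile is backed by a file on disk, tempfile.path can be handed to code that needs a path later on.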

This resolves the original issue. But now we see the error:

NoMethodError: undefined method 'validate_header' for nil:NilClass

This is thrown by the following line in validate_schema_cotw:

validation = Csvlint::Validator.new(tempfile, {}, schema)

This issue appears to lie in the CSV Lint gem itself, in the way that it handles CSVw files, since the supplied arguments are correct. The stack trace points to csvlint/csvw/table_group.rb:27:in 'validate_header'. So that's as far as I can get with this one, I'm afraid.
