
Big Set Blows Up inputs #38

Open
forgotpw1 opened this issue Apr 9, 2012 · 1 comment
@forgotpw1

I have hit a problem with a big set of inputs (500+). I am building a file-compression service: an action that zips a set of inputs.

When I try to merge the inputs, it appears that upstream, in the process step, the inputs have "blown up." The process step just saves each file to S3 and should return the new path to the inputs Array.

Specifically, in merge the inputs should be an Array, but instead they are coming through as a String.

The error message looks like this:

Worker #18890: {:pid=>18890, :id=>370, :time=>0.007631538, :status=>"failed", :output=>"{\"output\":\"undefined method `each' for #<String:0x00000002995078>\"}"}

Anyone else ever hit this?

Could this be due to a text field in the database filling up with too many characters? I switched it to a longtext field, but that didn't do the trick.

Is there some other memory issue with filling an array?
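For what it's worth, the error itself doesn't look like a memory problem: it's exactly what you get when an Array was serialized to JSON somewhere and handed back as a raw String without being parsed again. A minimal reproduction (the S3 URLs are made up):

```ruby
require 'json'

# Simulate merge receiving a JSON-encoded array as a raw String
# instead of a parsed Array (hypothetical reproduction of the bug).
raw_input = ["s3://bucket/a.pdf", "s3://bucket/b.pdf"].to_json
puts raw_input.class   # String, not Array

begin
  raw_input.each { |url| puts url }
rescue NoMethodError => e
  puts e.message       # undefined method `each' for an instance of String
end

# Parsing it first restores the Array:
parsed = JSON.parse(raw_input)
puts parsed.class      # Array
```

On Ruby 1.9+ String no longer responds to each (it was removed along with line iteration), which matches the `undefined method 'each' for #<String:...>` in the worker log.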

Here's my action. It errors in the block with input.each.

require 'zip/zip'
require 'zip/zipfilesystem'
require 'fileutils'
require 'rest-client'
require 'json'

class ScanZipper < CloudCrowd::Action

  # Save each downloaded file to S3 and return its new path.
  def process
    save(file_name)
  end

  # Archive the saved files into a single zip.
  def merge
    puts input.class
    puts input
    name = options['last_name']
    date = Time.now.strftime("%Y%m%d")
    url = options["point"]
    scan_id = options["scan_id"]
    files_to_remove = []

    zip_file_name = "#{name}#{date}.zip"
    Zip::ZipFile.open(zip_file_name, Zip::ZipFile::CREATE) do |zip|
      input.each do |batch_url|
        batch_path = File.basename(batch_url)
        file = download(batch_url, batch_path)
        puts batch_path
        zip.add batch_path, file
        files_to_remove << file
      end
    end

    zip_path = save(zip_file_name)

    files_to_remove.each { |f| File.delete f }

    zip_path
  end
end
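If the payload really is arriving as a raw JSON string for large sets, one defensive workaround is to normalize it at the top of merge. This is only a sketch, not documented CloudCrowd behavior, and `normalize_input` is a hypothetical helper; it assumes a String input is a JSON-encoded array of URLs:

```ruby
require 'json'

# Hypothetical guard: whatever merge receives, hand back an Array.
# Assumes a String input is a JSON-encoded array of URLs.
def normalize_input(input)
  input.is_a?(String) ? JSON.parse(input) : input
end

urls = normalize_input('["s3://bucket/a.pdf","s3://bucket/b.pdf"]')
puts urls.length   # 2
```

This doesn't explain why small sets arrive parsed and big ones don't, but it would at least keep the worker from failing while the root cause is tracked down.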
@forgotpw1
Author

I'm still experiencing this. My guess is that some limit is being hit. I was able to avoid it temporarily by not using S3 authentication, but with a big set (700+ inputs) it exploded again. I think all the characters in the keys and signatures are giving the input object more than it can handle.

This is pretty much a show stopper and makes it really hard to use this in production.

I'm optimistic though that someone out there knows what's going on here.

What is the overall lifecycle of input?

Is input really a JSON object?

Is there a memory limit in Thin or in Ruby for JSON objects? Could another JSON library solve this?

Everything works great when the input set is small, but large sets aren't working, and it's really hard to pin down what triggers it. It's as if the server instance can't handle the size of the input array.
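A back-of-the-envelope size check supports the signed-URL theory (the URL shape below is hypothetical). MySQL's TEXT type caps at 65,535 bytes, and 700 signed S3 URLs serialize to well past that, so the original text column would have silently truncated the JSON; longtext lifts that cap, but anything else in the chain with a similar limit would break the same way:

```ruby
require 'json'

# Rough size estimate: a signed S3 URL carrying AWSAccessKeyId,
# Expires, and Signature query params runs well over 100 bytes,
# so 700 of them serialize to roughly 100 KB of JSON.
signed_url = "https://bucket.s3.amazonaws.com/scans/file-0001.pdf" \
             "?AWSAccessKeyId=AKIAEXAMPLEEXAMPLE&Expires=1334000000" \
             "&Signature=abcdefghijklmnopqrstuvwxyz12"
payload = Array.new(700) { signed_url }.to_json
puts payload.bytesize   # roughly 100 KB, far beyond TEXT's 65,535 bytes
```

Unsigned URLs are much shorter, which would explain why disabling S3 authentication pushed the breaking point up from 500 to 700 inputs.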

Any insight into this would be helpful.
