assigning multiple files per worker in parallel_import #49

Open
thricedotted opened this issue Apr 17, 2015 · 3 comments
Comments

@thricedotted

(NB: Not sure if this is an upstream bug or upstream intended behavior, but I was directed here for this issue!)

My data is spread across more files than there are workers, and I had trouble using parallel_import to upload them to Myria. See the following queries:

https://rest.myria.cs.washington.edu:1776/query/query-70837 -- in this query, I've assigned five files to three workers, and it fails with: edu.washington.escience.myria.DbException: Query #70837.0 failed: ErrorCode: 0, SQLState: 42P07, Msg: ERROR: relation "public:adhoc:supertinyngramtest" already exists

https://rest.myria.cs.washington.edu:1776/query/query-70838 -- exactly the same query, except each file is assigned to a unique worker. This one runs successfully.

Both queries have "argOverwriteTable": true, since at first I thought I was double-ingesting -- however, an earlier query where this was false also failed.

@BrandonHaynes
Member

Yep -- there must currently be a one-to-one correspondence between input URL and worker. This makes me sad, and will hopefully be fixed soon.

When there are more URLs than workers, it's just a matter of unioning the extra sources on each worker prior to the DbInsert operator, so that each worker performs a single insert. A workaround is to generate the JSON plan by hand and do this manually. Yuck.
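To make the grouping step concrete, here is a minimal sketch of how one might decide which URLs end up on which worker before hand-writing the plan. It is not Myria code: the function names, the round-robin policy, and the example URLs and worker IDs are all assumptions for illustration. It only computes the assignment; the actual JSON plan (with the sources for each worker unioned ahead of DbInsert) would still need to be written by hand.

```python
def assign_urls_to_workers(urls, workers):
    """Distribute URLs over workers round-robin (hypothetical helper).

    Returns a dict mapping each worker ID to the list of URLs it should
    read. A worker with more than one URL needs its sources unioned
    before the insert.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for i, url in enumerate(urls):
        assignment[workers[i % len(workers)]].append(url)
    return assignment


def workers_needing_union(assignment):
    """List the workers that were given more than one source URL."""
    return [worker for worker, urls in assignment.items() if len(urls) > 1]


# Example: five files on three workers, as in the failing query.
# The URLs and worker IDs below are placeholders, not real endpoints.
example_urls = ["http://example.org/part-%d.csv" % i for i in range(5)]
example_workers = [1, 2, 3]
example_assignment = assign_urls_to_workers(example_urls, example_workers)
```

With five files and three workers, workers 1 and 2 each receive two URLs and worker 3 receives one, so only the first two would need a union in front of their insert.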

@thricedotted
Author

Darn... makes me sad too :( I'm getting around it by combining the extra files for now. Thanks!

@BrandonHaynes
Member

Going to keep the issue open so I don't forget to fix this soon.

@BrandonHaynes BrandonHaynes reopened this Apr 17, 2015