Terribly slow for big files #6

Open
berarma opened this issue Sep 13, 2017 · 7 comments

Comments

@berarma

berarma commented Sep 13, 2017

I'm trying to use this script on a 70 GB compressed file and it's terribly slow. It decompresses the whole source file again and again for every table, and it runs sed over the whole file every time.
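To make the cost concrete, this is roughly the pattern being described (a hypothetical sketch, not the actual script code; the dump file name is a placeholder): the dump is decompressed and scanned once per table, so total work grows with dump size multiplied by the number of tables.

```sh
DUMP=dump.sql.gz   # placeholder file name

# List the tables by decompressing and scanning the whole dump once...
TABLES=$(zcat "$DUMP" | sed -n 's/^-- Table structure for table `\(.*\)`$/\1/p')

# ...then decompress and scan the whole dump again for every single table.
# (The next table's header line leaks into each output file; this is only
# an illustration of the repeated full-file passes.)
for TABLE in $TABLES; do
    zcat "$DUMP" |
        sed -n "/^-- Table structure for table \`$TABLE\`/,/^-- Table structure for table/p" \
        > "$TABLE.sql"
done
```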

@kedarvj
Owner

kedarvj commented Sep 13, 2017

This is one of the items I really plan to work on.

@bendathierrycom

It just worked well on a 3.3 GB backup. Not fast, but it worked, and sometimes that is what matters most, especially when it comes to data.

@n9yty

n9yty commented Feb 3, 2020

I tried it on a 20 GB file. It is a big file, which is exactly why I wanted to split it, but it was very slow. I think that is because it uses sed to step through the entire file for each table it extracts, at least in the use case I am running. To speed it up, it would instead have to capture the header/environment statements and save them, then write out each table as it comes across it in one pass through the file. But that is a non-trivial reworking of the way the script processes things.
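For reference, a minimal one-pass sketch along those lines (an assumption about how it could be done, not how the script currently works; the dump file name is a placeholder):

```sh
zcat dump.sql.gz | awk '
    /^-- Table structure for table / {
        split($0, parts, "`")           # parts[2] holds the table name
        if (out) close(out)             # finish the previous table file
        out = parts[2] ".sql"
        printf "%s", header > out       # start the new file with the saved header
    }
    out { print >> out; next }          # inside a table section: append the line to its file
    { header = header $0 "\n" }         # before the first table: collect header/SET lines
'
```

The dump footer ends up in the last table's file, but every table lands in its own file after a single decompression pass.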

@kedarvj
Owner

kedarvj commented Feb 3, 2020

I still haven't had a chance to change the script logic so that it extracts all tables anyway and writes each one to a file if it passes the filter. Until that happens, I guess it is best to extract all the tables and choose the ones you need. Thank you.

@kedarvj kedarvj closed this as completed Feb 3, 2020
@kedarvj
Owner

kedarvj commented Feb 3, 2020

(mistakenly closed)

@kedarvj kedarvj reopened this Feb 3, 2020
@berarma
Author

berarma commented Feb 3, 2020

I solved this some time ago by switching to mydumper/myloader. It's much faster and writes each table to its own file. It might work for others.
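For anyone following the same route, this is roughly how mydumper/myloader is invoked (the database name and paths are placeholders; check `mydumper --help` on your version for the exact option names):

```sh
# Dump: each table is written to its own file under the output directory,
# so no splitting step is needed afterwards.
mydumper --database=mydb --outputdir=/backups/mydb --threads=4 --compress

# Restore from that directory:
myloader --directory=/backups/mydb --database=mydb --threads=4 --overwrite-tables
```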

@Lusitaniae

It worked OK, but it was very slow.

Taking the full database dump, about 1 GB compressed (8 GB uncompressed), took less than 30 minutes.

Splitting the uncompressed dump into separate uncompressed databases with this tool easily took 10 hours or more.
