-
Notifications
You must be signed in to change notification settings - Fork 341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make tests run faster #517
Comments
For the change of zip loading, the code that is called a lot is the load function :
Since this load function calls new PizZip(content), if new PizZip is slow, then by changing PizZip to something faster would make the tests run faster. It would be interesting to see, for a test run, how much time is spent running new Pizzip(content) here. |
If the measurements make it clear that PizZip is slow, it would make sense to write basically :
and the MockedZip would "behave" exactly as the PizZip library, except that underneath, it uses a folder which is already unzipped. |
The two ideas I had were just a starting point, probably by using a profiler we could find which spots cause the slowlessness. |
You can now to use |
Benchmark Resultsmethod for this process.hrtime();
let loadFunctionCallFrequency = 0;
let totalCallTimeInNanoSeconds = 0;
function load(name, content, obj) {
let start = process.hrtime();
loadFunctionCallFrequency++;
const zip = new PizZip(content);
obj[name] = new Docxtemplater();
obj[name].loadZip(zip);
obj[name].loadedName = name;
obj[name].loadedContent = content;
start = process.hrtime(start);
totalCallTimeInNanoSeconds += start[1];
console.log(
`In calling all ${loadFunctionCallFrequency} times load function total time taken in seconds:`,
totalCallTimeInNanoSeconds / 1e9
);
console.log(
"Total number of load function calls:",
loadFunctionCallFrequency
);
return obj[name];
} Benchmark Results Code Block1 My Environment:
|
Can you do the same measurements with Here I suspect that you are running all tests (even the ones we know take more than 2 or 3 seconds). When I run Can you do the same measurement just for |
Benchmark Results
Average time taken IndividualTimesInSeconds: Code block for this test is exactly same as this reply Benchmark Results Code Block1
By this what I am able to understand is two things. 2.1 Do benchmarking for load function just with let loadFunctionCallFrequency = 0;
let totalCallTimeInNanoSeconds = 0;
function load(name, content, obj) {
let start = process.hrtime();
loadFunctionCallFrequency++;
const zip = new PizZip(content);
start = process.hrtime(start);
totalCallTimeInNanoSeconds += start[1];
console.log(
`In calling all ${loadFunctionCallFrequency} times load function with only PizZip total time taken in seconds:`,
totalCallTimeInNanoSeconds / 1e9
);
console.log(
"Total number of load function calls with only PizZip:",
loadFunctionCallFrequency
);
// obj[name] = new Docxtemplater();
// obj[name].loadZip(zip);
// obj[name].loadedName = name;
// obj[name].loadedContent = content;
return obj[name];
} Benchmark Results Code Block2 2.2 And case two that I should benchmark const PizZip = require("pizzip");
const fs = require("fs");
const path = require("path");
//Load the docx file as a binary
const content = fs.readFileSync(
path.resolve(__dirname, "../../examples/cyrillic.docx"),
"binary"
);
let totalCallTimeInNanoSeconds = 0;
let start = process.hrtime();
const zip = new PizZip(content);
start = process.hrtime(start);
totalCallTimeInNanoSeconds = start[1];
console.log(
`In calling PizZip time taken in seconds:`,
totalCallTimeInNanoSeconds / 1e9
); Benchmark Results Code Block3 |
I meant measuring the time of
|
Ok, so it means 0.26 seconds of improvement for a total run time of about 2 seconds. I think in this case, it doesn't make much sense to try to optimize it, since it won't be possible to do an improvement of more than 10%. Can you try to profile the test code to see if there are some bottlenecks that could be made faster ? Else, I think it would maybe be faster by parallelizing, I'm not sure how this is doable with mocha. |
For my computer the test with this command |
I was just giving an initial try for parallel testing idea and I installed mocha-parallel-tests npm module and tried running our test cases but it is taking even more time as compared to |
When you run The command run by
If you use mocha-parallel-tests, you should use the same env to be able to compare. |
Yes I am using the same env to compare both.
|
After using nodejs inspector with this command
As you can see a lot of functions which are taking more time are associated with PizZip and I guess if we can moc PizZip as discussed by you here maybe we can increase the test speed. |
Yes, I think you're right, lets give it a go then. |
I will start looking into it how to write a mock for PizZip. Please do share any suggestions if you feel so. |
Hello @shadab14meb346 , the idea is to mock PizZip, meaning that it should behave in the same way as the PizZip library but be as fast as possible. For example, one should be able to do :
|
Does this make things more clear ? |
Hi. Ya, sure thanks. it is helping me.
2. Now I am trying to use the step one const PizZip = require("pizzip");
const path = require("path");
const zippedFilePath = path.join(__dirname, "/testZip.zip");
const zip = new PizZip(zippedFilePath);
zip.file(/./).map(({ name }) => {
// this should print all filenames of the faked zip
console.log(name);
}); But this is giving me the below error
|
I think it is because when using a real pizzip instance, you need to use the file contents, not the file path. IE :
|
It got worked but in the place of const zippedFilePath = path.join(__dirname, "/testZip.zip");
const content = fs.readFileSync(zippedFilePath);
const zip = new PizZip(content); |
I tried to understand but I am facing a little difficulty so can you please help me In the Idea 1 (cached zip) as you mentioned
Can you please point me out the folder and line number where the unzipping of docx file is happening for the test cases. var zip = new MockedPizzip(filepath);
zip.file(/./).map(({ name }) => {
// this should print all filenames of the faked zip
console.log(name);
});
const content = zip.files["word/document.xml"].asText();
// This should log the content of word/document.xml
console.log(content);
const newContent = "foobar";
// This should change the content of word/document.xml
zip.file("word/document.xml", newContent, { createFolders: true }); |
The zip creation is done here :
The usage of the zip is done at several places, for example :
In all these cases, the zip instance created in |
@shadab14meb346 are you alright ? I think I will tackle this issue soon if I don't get a response in the next few days. |
I am very willing to have fast tests. This is because if you have tests that take less than a second to run, it is possible to have them run in watch mode, and save your changes and see immediately that your tests are still passing.
With the time however, the tests are running in a bit more than 1 second in watch mode (on my computer, it takes 1940ms on average).
I think this could be improved upon.
First things I would like to do is to measure things. There are separate steps for running the tests :
Since 1. is probably cached by the "page cache" : https://en.wikipedia.org/wiki/Page_cache
And I expect that 3. should be fast, I'm guessing that 2 && 4 could be improved upon.
Measuring is really the first step, the ideas should not be implemented before doing proper measuring and reporting the results in this issue.
Idea 1 (cached zip)
One of my ideas would be that for tests, we don't use Pizzip to unzip the docx for each test run (which takes time), but instead, we could create a mock that would have a "cached" version of a zip, which would be stored as a folder. For example, if you have in :
examples/foo.docx
, a docx document is a zipfile which you can unzip. If you unzip that, you get a folder like this :We could store this folder instead of foo.docx, and by creating a good mock of Pizzip, we could avoid to unzip the docx for each test run but instead read the files that we want from disk.
Idea 2 (parallel testing)
As far as I can tell, mocha currently runs each test suite sequentially (on one CPU only). Since most computers nowadays have at least 4 or 8 CPUs, it would make sense to run the tests in multiple processes.
The text was updated successfully, but these errors were encountered: