Endurance Test! #23

Closed
shahrokhDaijavad opened this issue Apr 29, 2024 · 6 comments

Assignees: shivdeep-singh-ibm
Labels: fixed (Marks an issue as fixed in the dev branch), question (Further information is requested)

Comments

@shahrokhDaijavad
Member

As we discussed in issue #388 in the internal repo, we want to test whether the framework has a memory leak. To do this, we use the noop transform and a large dataset (such as test set 3, which has 1500 zip files): 1) run ingest2parquet on it first, and 2) run the noop transform while monitoring the laptop's memory usage to see whether it reaches a flat plateau or not.
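
For illustration (not part of the original comment), a minimal sketch of that monitoring loop, assuming `psutil` is available; `run_noop_transform` and the dataset path are hypothetical placeholders for however the transform is actually launched:

```python
# Sketch: sample system-wide memory while repeatedly running a workload,
# to see whether usage reaches a flat plateau (no leak) or keeps growing.
# Assumes `psutil` is installed; `run_noop_transform` is a hypothetical
# placeholder for the real noop transform invocation.
import psutil


def run_noop_transform(input_dir: str) -> None:
    """Placeholder for the real noop transform invocation."""
    ...


def used_memory_gb() -> float:
    return psutil.virtual_memory().used / (1024 ** 3)


readings = []
for i in range(10):
    run_noop_transform("test-set-3/")  # e.g. the ~1500-zip-file dataset
    readings.append(used_memory_gb())
    print(f"iteration {i}: memory in use ≈ {readings[-1]:.1f} GB")

# If the readings stop growing after the first few iterations, memory has
# plateaued and there is no obvious leak.
```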

shahrokhDaijavad added the question label on Apr 29, 2024
@shivdeep-singh-ibm
Collaborator

I tested with a set of 1483 files on a machine with 32 GB of memory and 4 CPUs, and used the tracemalloc library to check for a memory leak. I ran 10 iterations and observed that memory usage peaked around 4 GB. There were no obvious signs of a memory leak in this test.
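
For reference, here is roughly what such an iteration-by-iteration check looks like with Python's tracemalloc module (an illustrative sketch only; `do_one_iteration` is a hypothetical stand-in for one pass of the transform over the file set):

```python
# Sketch: take tracemalloc snapshots across iterations and compare them to a
# baseline; allocations that keep growing with the iteration count are leak
# candidates. `do_one_iteration` is a hypothetical stand-in for one pass of
# the transform over the 1483-file set.
import tracemalloc


def do_one_iteration() -> None:
    """Placeholder for one run of the transform over the file set."""
    ...


tracemalloc.start()
baseline = tracemalloc.take_snapshot()

for i in range(10):
    do_one_iteration()
    snapshot = tracemalloc.take_snapshot()
    top_growth = snapshot.compare_to(baseline, "lineno")[:5]
    print(f"--- iteration {i}: largest allocation growth since baseline ---")
    for stat in top_growth:
        print(stat)

# A flat profile after the first iteration suggests no leak; steady growth
# across all ten iterations would point to one.
```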

@shivdeep-singh-ibm
Collaborator

shivdeep-singh-ibm commented May 1, 2024

I ran another test of 1483 files on a podman VM with different memory configurations. The results are below.
It appears that around 4 GB of available memory was needed to run successfully on these 1483 files.

| CPUs | Total Memory | Memory Used by Ray | Transform | Files Processed Successfully | Total Files | Job Status |
| ---- | ------------ | ------------------ | --------- | ---------------------------- | ----------- | ---------- |
| 4    | 8 GB         | 4.2 GB             | NOOP      | 1483                         | 1483        | Passed     |
| 4    | 6 GB         | 3 GB               | NOOP      | 910                          | 1483        | Crashed    |
| 4    | 4 GB         | 2 GB               | NOOP      | 504                          | 1483        | Crashed    |
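
Based on those numbers, one could add a simple pre-flight check before launching the job (an illustrative sketch, not part of the framework; the 4 GB threshold is just the figure observed above):

```python
# Sketch: sanity-check available memory against the ~4 GB that the table
# above suggests the 1483-file NOOP run needs. Assumes `psutil` is installed;
# the threshold is taken from the observations above, not from the framework.
import psutil

REQUIRED_GB = 4.0  # approximate requirement observed for the 1483-file run

available_gb = psutil.virtual_memory().available / (1024 ** 3)
if available_gb < REQUIRED_GB:
    print(f"Warning: only {available_gb:.1f} GB available; the run may crash "
          f"before all files are processed.")
else:
    print(f"{available_gb:.1f} GB available; this should be sufficient.")
```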

shahrokhDaijavad added the fixed label on May 1, 2024
@blublinsky
Collaborator

blublinsky commented May 17, 2024

@shahrokhDaijavad, @shivdeep-singh-ibm This is an important piece of info not only for us but also for potential users. Can we please:

  1. Bring all of the results together in a separate document inside the project.
  2. Once 1 is complete, close this issue.

shivdeep-singh-ibm self-assigned this on May 17, 2024
@shahrokhDaijavad
Member Author

I agree, @blublinsky. I will create a Markdown file called memorytest under doc with this information and link to it from the mac.md file.

@blublinsky
Collaborator

@shahrokhDaijavad great. Should it be called memory or endurance?

@shahrokhDaijavad
Member Author

@blublinsky It's a combination of a memory-leak test (memory usage peaks and flattens around 4 GB, i.e., no leak) and an endurance test, which shows that with smaller memory (4 GB or 6 GB total), it is still possible to process roughly 500 or 900 files successfully before the job crashes. I will explain this in the readme.
