Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate increasing bulk publish limit of 10 activities #1408

Closed
emmajclegg opened this issue Mar 23, 2024 · 13 comments
Closed

Investigate increasing bulk publish limit of 10 activities #1408

emmajclegg opened this issue Mar 23, 2024 · 13 comments
Assignees
Labels
Enhancement Estimated ODS Issue initiated by ODS

Comments

@emmajclegg
Copy link
Collaborator

To check initially - what would the impact be of increasing the bulk publish limit from 10 to e.g. 50 or 100 activities? (100 is the maximum number of activities we recommend for Publisher, so it seems a reasonable upper limit)
 
While users who are manually editing individual activities may never need to bulk publish, there are some users importing higher activity volumes where having to publish in batches of 10 must be limiting.
 
It's also not completely clear currently how many activities are being selected with the button below, when the user has activities across multiple pages.

image

Happy to discuss what's proportionate here.

@emmajclegg emmajclegg added ODS Issue initiated by ODS Enhancement labels Mar 23, 2024
@praweshsth praweshsth assigned PG-Momik and unassigned praweshsth Apr 3, 2024
@PG-Momik
Copy link
Collaborator

PG-Momik commented Apr 3, 2024

@emmajclegg @dan-odsc

We're currently unable state the exact number as the limit of bulk publish. The limitation is relative and we believe it is mostly relative to the number of nodes and the depth of nodes of an activity rather than the number of activities.

To increase the bulk publish limit, we'll need to test the max number of activities we can bulk publish without getting a system level error (like queue timeout, or memory error). To test this, we've come up with a test plan that looks something like this:

  1. Temporarily remove the pagination limit of 10, to show 100 activities in the activities listing page.
  2. Bulk publish all 100 activities via the UI as a normal person would, rather than testing via script. (To track workflow time rather than just tracking individual execution of modules involved in bulk publish).
  3. We plan to test this with, 100 activities with 50 results and 50 transactions for each activity. The goal is to check if we face any kind of system level limitation.
  4. We'll decrease the number of activities we test if we get errors when testing 100 activities. We'll then repeat steps 2 to 4 until we no longer face issue and everything looks okay.

Upon finding the limit, we can then look into performing other improvements or workflow change/reconsideration.


cc: @praweshsth @sarinasindurakar

@praweshsth praweshsth added the Discussion A query or seeking clarification on parts of the spec. label Apr 9, 2024
@praweshsth praweshsth assigned emmajclegg and unassigned PG-Momik Apr 10, 2024
@emmajclegg
Copy link
Collaborator Author

@PG-Momik @praweshsth - the above approach sounds sensible to me as a way to start testing limits. Feel free to estimate work if applicable or let me know if you want to discuss further?

@emmajclegg emmajclegg added Need Estimation and removed Discussion A query or seeking clarification on parts of the spec. labels Apr 15, 2024
@PG-Momik
Copy link
Collaborator

PG-Momik commented Apr 23, 2024

@emmajclegg here's a quick expansion on the things we want to test (Testables) and how we will be testing (Testing process/ Task) on the limit of bulk publish. Estimation for this issue have been made accordingly.

Testables

  • Can 100 activities be displayed on the activities listing page
  • Can 100 activities be selected deselected without lag in UI
  • How long will it take to verify core elements.
  • Can bottom popup handle 100 activities title in UI?
  • Does IATI validator have some kind of rate limit that will prevent us from requesting 100 activity validation.
  • How long will it take for validation of 100 activities to be completed.
  • What is the difference in validation process when activities have errors vs activities have no errors.
  • Can 100 activties be queued for publish.
  • Does Aws have some kind of rate limit that will prevent us from uploading 100 files.
  • Will the queue timeout when trying to publish.
  • How long will the publish process take.

Testing process/ Task

  • Make activity with 100% completeness, with 50 result and 50 transactions.
  • Write script to create 100 activities with 50 results and 50 transactions each with no validation error.
  • Write script to create 100 activities with 50 results and 50 transactions each with some validation errors.
  • Remove limit of 10 items sent from backend. (send 100 activities to frontend).
  • Remove limit of 10 items being shown in front-end.
  • Test if page is rendered with 100 activities.
  • Test if 100 activities core elements are verified.
  • Test/time how long validation process takes to complete IATI validation.
  • Test if any error occurs after clicking ''Proceed".
  • Test how long it will take for publish process to complete after proceed is clicked.

Important

  • We'll do test runs for 100 activities with validation errors, then with no validation errors, and check if we get any errors during the entire bulk publish process.
  • If we get any errors, we'll look at the errors and make possible changes. If the changes do not work, we'll lower the number of activities being published by 10 in next run. We'll continue to repeat the processes till we reach a conclusion.

cc: @praweshsth @Sanilblank @sarinasindurakar

@emmajclegg
Copy link
Collaborator Author

Thanks @PG-Momik, @praweshsth - sounds like a good plan so I'm happy for you to start working on this.

To note, the design work on #1423 is higher priority if that can be worked on this week. It's affecting more end users than this bulk publishing limit.

@praweshsth praweshsth assigned Sanilblank and unassigned emmajclegg Apr 24, 2024
@Sanilblank
Copy link
Collaborator

@emmajclegg Hello Emma, just wanted to give you an update regarding this issue. We have started testing the bulk publish process.

Initially we created an activity with 100% completeness which included 50 transactions, 50 results where each result had 50 indicators and each indicator had 25 periods each. With this configuration, the XML file created reached over 62MB for a single activity (not the merged file with multiple activities). It seems that IATI validator does not allow files over 60MB to be validated and hence, we received an error when trying to validate a single activity. We realized that we had input too much data in a single activity and no user would have so much data in a single activity.

So, we updated the data by reducing the no of periods of each indicator to 10 and also reduced the total no of indicator for each result to 10 instead of the previously placed 25 and 50 respectively. We did not change the no of transactions or the no of results. With these new changes, the size of the XML file for an activity reduced to a little over 16MB which can be validated using IATI validator and so we will be moving forward with the aforementioned tests with the data we have input in the activity.

However, unlike initially stated, we will start testing with a small no of activities and then move on for larger number instead of going the opposite direction i.e we will test initially for 5 activities, then for 10 and slowing increament the no of activities.

cc. @praweshsth @sarinasindurakar @PG-Momik

@Sanilblank
Copy link
Collaborator

@emmajclegg We moved ahead as stated above but found that when publishing even a single activity (with the large amounts of data), it was taking a considerably large amount of time.
I have been looking into the code and found places which required some minor changes involving eager loading. After making the said changes, the system seemed to be able to publish the activities.
We have completed the test for 1, 2, 5 acitivities, however, when trying for 10 activities, we are encountering an issue related to horizon (which is used for running background tasks).
We are currently looking into this issue and do not yet have a concrete answer about why the issue has been seen.

cc. @praweshsth @PG-Momik @sarinasindurakar

@emmajclegg
Copy link
Collaborator Author

Ok thanks for the update @Sanilblank - it sounds like you're still troubleshooting the issue found.

Let me know once you reach an activity limit - i.e. the point where a decision's needed on whether to invest development time on being able to bulk publish more activities. We can discuss together what's worth doing from that point.

@Sanilblank
Copy link
Collaborator

Sanilblank commented May 16, 2024

@emmajclegg We made a small adjustment to the timeout variable in the code for the bulk publishing activities job. With this change the system was able to perform the tasks of bulk publishing 25 activities at one time. Upon trying for 35 or more, we encountered an error and so we stopped the process here.
Also, we have decided to make the changes mentioned above and in this part to the production branch as it would help in making the existing publishing feature faster. The changes have already been made but are yet to be deployed.
We have created a sheet file and recorded the findings of the bulk publishing testing there. Should i share the file here? or would you prefer it being sent via email?

cc. @praweshsth @PG-Momik @sarinasindurakar

@emmajclegg
Copy link
Collaborator Author

emmajclegg commented May 17, 2024

Thanks @Sanilblank

Also, we have decided to make the changes mentioned above and in this part to the production branch as it would help in making the existing publishing feature faster. The changes have already been made but are yet to be deployed.

Do you need any feedback here, or will push changes to production when ready?

We have created a sheet file and recorded the findings of the bulk publishing testing there. Should i share the file here? or would you prefer it being sent via email?

If it includes no identifiable information on individual users or publishers then feel free to share it here, attached to the issue.


I'll have a look at the findings from the bulk publishing testing and will likely suggest we pause work on it here, to gather more evidence from users on whether raising the activity publishing limit is worth additional work. To inform that, please share any ideas you have on what development work would be needed to increase the bulk publishing limit at this point.

@Sanilblank
Copy link
Collaborator

Sanilblank commented May 20, 2024

@emmajclegg I have attached the link to the spreadsheet [here].(https://docs.google.com/spreadsheets/d/1s31ptnV_L3Qwog56QEzhpYiBQ7nspCLnpyIwnby1OiI/edit?usp=sharing)

The time has been recorded in seconds instead of minutes.
The highlighted text 'Registry validator job for activity' represents the total time taken for the validation for a certain number of activities. Each validation process is treated as a separate job by the system and based on the no of workers responsible for carrying out the job, the entire validation process actually finishes faster than the total time recorded as the jobs are carried out parallelly. Hope that makes sense.
The 'Bulk publish activities' part is treated as a single job so the time shown here is the exact time taken for the completion.

Do you need any feedback here, or will push changes to production when ready?

Basically the code is not changed all that much, we have just performed an eager loading process which tries to reduce the no of queries done to the database and improves performance. Our QA team will test these changes to ensure that no problems arise. I don't think we will need any other feedback here as not much has changed.

We are performing the testing from another branch. So, if you do not have any confusions, you can close this issue.

cc. @praweshsth @PG-Momik @sarinasindurakar

@emmajclegg
Copy link
Collaborator Author

Ok, sounds good @Sanilblank .

To check I understand the spreadsheet testing results:

  • this was testing activities with 100% completeness, 50 transactions and 50 results (where each result had 10 indicators and each indicator had 10 periods)?
  • apart from the "Registry validator job for activity" step (which happens in parallel when there's more than 1 activity), I can total seconds across rows to get the total time taken to validate and publish?

So it took 116 seconds (approx 2 minutes) to publish 1 activity:

Image

@Sanilblank
Copy link
Collaborator

@emmajclegg

his was testing activities with 100% completeness, 50 transactions and 50 results (where each result had 10 indicators and each indicator had 10 periods)?

Yes, this is correct.

apart from the "Registry validator job for activity" step (which happens in parallel when there's more than 1 activity), I can total seconds across rows to get the total time taken to validate and publish?
So it took 116 seconds (approx 2 minutes) to publish 1 activity

Not exactly, the ones in red represent the overall total time for the registry validation job and bulk publishing job respectively.
You can just use the time given by these red texts to know the total time required for the completion of the jobs. The other ones in black are just some parts of the overall job (the ones present in red) which require some time to complete and so were recorded to get extra information of how much time the execution of the parts were taking.
For the registry validator job for multiple activities, they are handled in parallel as you have mentioned and so the actual total for more than one activity is less than what is shown in the spreadsheet.
For the image you have uploaded (publishing for one activity), the total time required for validation part is 39 sec and for publishing part is 33 sec to total will be 72 sec, a little more than 1 minute.
Hope this will make things clear.

cc. @praweshsth @PG-Momik @sarinasindurakar

@emmajclegg
Copy link
Collaborator Author

Thanks @Sanilblank - that makes sense.

I'll close this issue as advised.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Estimated ODS Issue initiated by ODS
Projects
None yet
Development

No branches or pull requests

4 participants