Optimize for thousands of subscribers #195

Open
Keffr3n opened this issue Nov 25, 2018 · 25 comments

@Keffr3n

Keffr3n commented Nov 25, 2018

Hello,
If I flush() more than 200 notifications at a time, the push fails. Also, each batch of 200 messages takes over 10 seconds to send to the endpoints, so 20k subscribers would need ~20 minutes.

Can this be optimized? What if I filter Chrome endpoints and Firefox endpoints so that each batch has the same endpoint?

Did anyone manage to send more than 200 notifications at a time?

I am using PHP 7.2

@ozgurhangisi

Hi,

Sending bulk web pushes heavily uses network and CPU.

Be sure to use the latest version of the library (the 4.x releases).
Be sure to increase the memory limit to at least 500 MB: ini_set('memory_limit', '500M');
Be sure to decrease the timeout to 6 seconds and disable SSL verification: $push = new WebPush($authenticationData, [], 6, ['verify' => false]); — the sketch below combines all three.
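Putting those three tips together looks roughly like this ($authenticationData stands for your VAPID configuration; the constructor arguments follow the library's WebPush signature):

use Minishlink\WebPush\WebPush;

ini_set('memory_limit', '500M'); // headroom for encrypting a large batch

$push = new WebPush(
    $authenticationData,  // e.g. ['VAPID' => ['subject' => ..., 'publicKey' => ..., 'privateKey' => ...]]
    [],                   // default options applied to every notification
    6,                    // Guzzle timeout in seconds (the library default is 20)
    ['verify' => false]   // skip TLS peer verification to save handshake time
);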

I think you can send 500 web pushes in 5-7 seconds by doing the simple things above. Of course there are many more things you can do (like storing and reusing local keys), but you have to dive deep into the code to make advanced optimizations.

BTW, we send around 1 million web pushes per minute using this library (and lots of servers), but we did many optimizations to get there.

@t1gor
Contributor

t1gor commented Nov 26, 2018

BTW, we send around 1 million web pushes per minute using this library (and lots of servers), but we did many optimizations to get there.

Could you please share those? At least a list of the things you did.

@ozgurhangisi

ozgurhangisi commented Nov 26, 2018

Hi,

We use 45 servers (AWS t2.micro, 1 GB RAM) to send web pushes. We send 500 web pushes in one flush() call, and it takes 1 second to send those 500 on one server (if there is no network issue).

    • Creating the local keys and shared secret takes ~90% of the total processing time. So when we get a new subscription we store the local keys and shared secret in the DB, and we rotate them daily via a cron job.
    • We changed the subscription object to carry the local keys and shared secret. If these fields are empty, we create them the way this library does; if the subscription object has them, we use them directly.
    • We also cache the VAPID headers per endpoint URL, so in one flush() call (500 web pushes) we create only a few VAPID headers, one per endpoint origin, instead of 500. (A sketch of this cache follows the list.)
    • We use the beanstalkd queue system (a DB queue is too slow for large operations).
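The VAPID caching works because the signed VAPID JWT is bound to the push service origin (the JWT aud claim) and an expiry, not to an individual subscriber, so one header can serve a whole batch. A minimal sketch of such a cache; the helper name and the $createHeaders callback are illustrative, not part of the library:

// One signed VAPID header set per push-service origin, reused for every
// subscription on that origin until the JWT expires.
function vapidHeadersFor(string $endpoint, callable $createHeaders, array &$cache): array
{
    $origin = parse_url($endpoint, PHP_URL_SCHEME) . '://' . parse_url($endpoint, PHP_URL_HOST);
    if (!isset($cache[$origin])) {
        $cache[$origin] = $createHeaders($origin); // the expensive part: signing a JWT
    }
    return $cache[$origin];
}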

As far as I know, reusing the same local keys and shared secret is not recommended by the protocol, so we rotate them daily for each subscription. A sketch of such a rotation job follows.
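A rotation job along those lines might look like the sketch below. The table and column names are made up for illustration, openssl_pkey_derive() requires PHP 7.3+, and the sketch assumes the subscriber's P-256 public key is stored as PEM:

// Daily cron: pre-generate a fresh P-256 local key pair per subscription
// and derive the ECDH shared secret once, so the send path can skip both.
$pdo = new PDO('mysql:host=localhost;dbname=push', 'user', 'secret');

foreach ($pdo->query('SELECT id, subscriber_public_key_pem FROM subscriptions') as $row) {
    $key = openssl_pkey_new([
        'curve_name'       => 'prime256v1', // the curve Web Push encryption uses
        'private_key_type' => OPENSSL_KEYTYPE_EC,
    ]);
    openssl_pkey_export($key, $localPrivatePem);

    // ECDH: local private key + subscriber public key -> shared secret
    $sharedSecret = openssl_pkey_derive($row['subscriber_public_key_pem'], $localPrivatePem);

    $stmt = $pdo->prepare(
        'UPDATE subscriptions
            SET local_private_key = :priv, shared_secret = :secret, rotated_at = NOW()
          WHERE id = :id'
    );
    $stmt->execute([
        ':priv'   => $localPrivatePem,
        ':secret' => base64_encode($sharedSecret),
        ':id'     => $row['id'],
    ]);
}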

These are the main things we did to make the sending process faster. I hope it helps. Please let me know if you need any extra information.

@barmax

barmax commented Dec 5, 2018

    • Creating the local keys and shared secret takes ~90% of the total processing time. So when we get a new subscription we store the local keys and shared secret in the DB, and we rotate them daily via a cron job.
    • We changed the subscription object to carry the local keys and shared secret. If these fields are empty, we create them the way this library does; if the subscription object has them, we use them directly.

Thanks for the advice. I now store the local public key and shared secret in the DB.
Before, sending 1k pushes took 76 s; now it takes 47 s.
Do you have more optimization advice?

@ozgurhangisi

Great 👍 but 47 seconds is still too long. I think there must be some other factor on your server. Can you check the things below:

  • Some Intel CPUs have hardware support for encryption (Intel AES New Instructions, AES-NI). Check whether your server supports AES-NI.
  • Measure which step takes how long. (It's been a long time, but as I remember there are functions like prepare, encrypt, and flush.) Timing each function is the fastest way to find the bottleneck; see the timing sketch after this list.
  • Payload size also matters, because a bigger payload makes the encryption slower. We generally use t for title, b for body, etc. as JSON payload keys. We also don't send the full image path; we send only the file name and add the domain and path in the service worker.
  • Network speed is another important factor. You probably fetch a lot of data from the DB, send the web pushes, and delete expired tokens. That is three major network operations; separate them and measure how long each one takes.
  • Did you decrease the Guzzle timeout to 6 seconds and set SSL verify to false? ($push = new WebPush($authenticationData, [], 6, ['verify' => false]);) The library's default timeout is 20 s, and any request that has a problem makes you wait the full 20 seconds. Try decreasing it; we use 6 s for 500 pushes, but you should find your own optimum.
  • If it's easy for you, create a new AWS account for testing. We use t2.micro servers with 1 GB RAM and can send 500 pushes in one second. (It's around $10 a month, and free for 12 months for new users; you can spin up a server, test your sending, and shut it down.)
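A crude way to time the phases, per the second bullet (method names follow the library's v4 API, where sendNotification() only queues and flush() encrypts and sends; for a finer split between prepare, encrypt, and the actual requests you would add similar timers inside the library):

$t0 = microtime(true);
foreach ($subscriptions as $subscription) {
    $webPush->sendNotification($subscription, $payload); // queues only
}
$t1 = microtime(true);
$webPush->flush(); // encryption + network both happen in here (v4)
$t2 = microtime(true);
printf("queue: %.2fs, flush: %.2fs\n", $t1 - $t0, $t2 - $t1);

And to make the payload-size bullet concrete, a compact payload might look like this (the one-letter keys are the convention described above; the CDN prefix the service worker prepends is an assumption):

$payload = json_encode([
    't' => 'Sale ends tonight',   // title
    'b' => 'Up to 50% off shoes', // body
    'i' => 'promo.jpg',           // file name only; the service worker
                                  // prepends e.g. https://cdn.example.com/img/
]);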

Please let me know if this advice works for you.

@ozgurhangisi

And one last thing: try disabling automatic padding if it's enabled.
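With this library that switch is setAutomaticPadding() on the WebPush instance. Note that padding exists to hide the real payload length from observers, so disabling it trades a privacy property for speed:

$webPush = new WebPush($authenticationData);
$webPush->setAutomaticPadding(false); // skip the padding work before encryption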

@barmax

barmax commented Dec 5, 2018

@ozgurhangisi, thanks for the answer. I will try it.

@ozgurhangisi

No, I don't have any CPU issue. I prefer to use many small servers instead of one powerful server. I use cloud servers, and a powerful server would cost $400-500 a month; for that price I can get 40 small servers. If you use one server and there is a problem on it, you cannot send web pushes at all; with many servers you don't have that problem.

@t1gor
Contributor

t1gor commented Dec 6, 2018

@Keffr3n answering your questions:

Can this be optimized?

Yes, see @ozgurhangisi's comments above. But that would require changing the lib source code, I guess.

What if I filter Chrome endpoints and Firefox endpoints so that each batch has the same endpoint?

I don't think that would be much help here 😟

Did anyone manage to send more than 200 notifications at a time?

Yes, @ozgurhangisi claims to be sending about 500 per batch on a small (AWS t2.micro, 1 GB RAM) server.

Also, my five cents:

  • be sure not to include any DB/cache/network overhead in the processing loop; otherwise the measured sending time becomes inaccurate. E.g. you can probably avoid using a heavy ORM tool just to query the subscriptions from a DB table.
  • make sure you've got OpenSSL and the openssl PHP extension installed; that speeds up the payload encryption. Reference: Key Creation Improvement #147. A quick sanity check is sketched below.
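A quick sanity check for that second bullet (the error message is just illustrative):

if (!extension_loaded('openssl')) {
    // Without the openssl extension the payload encryption cannot use the
    // fast native primitives (or may not work at all).
    throw new RuntimeException('openssl PHP extension is required for fast payload encryption');
}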

Are your questions answered? Can we close this issue?

@ozgurhangisi

Yes, it does support multiple consumers. We use a few producers and 45 consumers on beanstalkd. (Sometimes we make critical changes and the old and new systems have to run at the same time; we use up to 90 consumers at those points. I don't know beanstalkd's limit, but it supports at least 90 consumers, and that's enough for us.) As far as I know it follows a FIFO rule: it doesn't give you a random item, it gives you the first waiting item in the queue.
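A minimal sketch of that producer/consumer split on beanstalkd, using the pheanstalk client (v3-style fluent calls shown; the tube name and batch format are illustrative):

use Pheanstalk\Pheanstalk;

$queue = new Pheanstalk('127.0.0.1');

// Producer: enqueue batches of ~500 subscription ids per job.
$queue->useTube('webpush')->put(json_encode($subscriptionIdBatch));

// Consumer (one or more per server, kicked off by cron):
// reserve() hands out the oldest waiting job first (FIFO).
$job = $queue->watch('webpush')->reserve();
$ids = json_decode($job->getData(), true);
// ... load those subscriptions, queue them and flush() with WebPush ...
$queue->delete($job); // acknowledge only after a successful send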

@ozgurhangisi

We use crontab; each cron job runs every minute and checks the queue. All of our servers are PHP producers and PHP consumers.

We developed a special cron class so we can disable/enable or slow down each server. We also record the last cron status, last start date, last end date, memory usage, and CPU usage in the DB and watch them. If something is wrong, the system sends us an automated email.

@ozgurhangisi

Yes, we get all the critical data from the cron class.

@ozgurhangisi

ozgurhangisi commented Dec 17, 2018

I haven't used v5 yet. We read the expired endpoints from the results, so we have to consume all of the results to see which endpoints are expired. So in both versions you end up consuming all the results, and I don't think it's a performance improvement. (Maybe it allows more efficient memory usage, but we send at most 500 pushes per flush(), so we don't need that improvement.)

We use a heavily modified version of this library, so it's not easy for us to move to the new version. BTW, this is just my opinion; I don't have any performance results. :)

@ozgurhangisi

I haven't tried anything except this library. Maybe @t1gor can help you understand the problem.

@t1gor
Contributor

t1gor commented Dec 17, 2018

@fpilee

Version 5 sends the message when I'm iterating over the result... Is this a performance improvement?

In my opinion, there are 2 answers here:

  1. Yes: the \Generator object was introduced as an attempt to save some memory while iterating over large sets of messages.

  2. As a side effect, which was not intentional BTW, flush() won't actually do anything until you start iterating over the results. @Minishlink, this looks like a bug.

What if I don't want to iterate over the results?

You probably want to iterate over the results anyway, to clear the expired subscriptions, as mentioned above.

My messages will not be sent?

That is something I need to confirm. The way we use the lib in our project, v4 vs. v5 didn't change much, except for the results format, as we always check the results. If flush() really is lazy until iterated, a simple workaround like the one below should hot-fix it while we're working on a proper solution.

$flushResults = $webpush->flush();
// do not save or process, just loop
iterator_to_array($flushResults);

@t1gor
Contributor

t1gor commented Dec 18, 2018

is v5 slower (right now) than v4?

Do you have a benchmark? I don't think it should make much difference, as only the place where the requests are actually sent changed, not the payload size or anything like that.

@ozgurhangisi

It's not the data that you get with the subscription; it's the data that the library creates when you send web pushes.

If you have thousands of subscribers and want to send web pushes very fast, I recommend sending them without a payload and fetching the payload in the service worker. That way you can send 1,000 web pushes in under a second.

We send web pushes with payloads, but we spent about 6 months on it: we developed our own encryption library and did many other things to reach this speed, and we use many servers, queue systems, etc.

Sending web pushes with a payload to many users consumes a lot of CPU, RAM, and network resources, because the library has to encrypt the payload for each subscriber and has to send to each subscriber one by one; there is no bulk sending option for web push with payload.

If you need to send your web pushes very fast, send them without a payload (sketched below) or use a service provider.
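A sketch of the no-payload approach on the PHP side (method names per the library; on the browser side, the service worker's push handler would fetch the message body itself from an endpoint you host):

foreach ($subscriptions as $subscription) {
    // No payload argument: nothing to encrypt, so this is dramatically cheaper.
    $webPush->sendNotification($subscription);
}
foreach ($webPush->flush() as $report) {
    // inspect the reports, drop expired subscriptions, etc.
}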

javiermarinros pushed a commit to javiermarinros/web-push-php that referenced this issue Jan 30, 2019
Minishlink pushed a commit that referenced this issue Feb 23, 2019

* Implemented VAPID header caching

As @ozgurhangisi suggested in #195

* Docs updated
@vzhivkov

vzhivkov commented Mar 1, 2019

Hello, after Minishlink's update things are way faster. However, to make them even faster I tried starting a few processes asynchronously. If I send all the notifications (about 30,000 now) from one process, it takes about 10 minutes with no problems. If I try to send them in parallel processes, some of them get this error:

PHP Fatal error: Uncaught Error: Class 'GuzzleHttp\Psr7\Response' not found in /home/www/client2/web/tools/composer-php72/vendor/guzzlehttp/guzzle/src/Handler/EasyHandle.php:76
Stack trace:
#0 /home/www/client2/web/tools/composer-php72/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php(545): GuzzleHttp\Handler\EasyHandle->createResponse()
#1 [internal function]: GuzzleHttp\Handler\CurlFactory->GuzzleHttp\Handler\{closure}(Resource id #3086, '\r\n')
#2 /home/www/client2/web/tools/composer-php72/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(108): curl_multi_exec(Resource id #3089, 999)
#3 /home/www/client2/web/tools/composer-php72/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(125): GuzzleHttp\Handler\CurlMultiHandler->tick()
#4 /home/www/client2/web/tools/composer-php72/vendor/guzzlehttp/promises/src/Promise.php(246): GuzzleHttp\Handler\CurlMultiHandler->execute(true)
#5 /home/www/client2/web/tools/composer-php72/vendor/guzzl in /home/www/client2/web/tools/composer-php72/vendor/guzzlehttp/guzzle/src/Handler/EasyHandle.php on line 76

@oktayla

oktayla commented Mar 5, 2019

We use crontab; each cron job runs every minute and checks the queue. All of our servers are PHP producers and PHP consumers.

We developed a special cron class so we can disable/enable or slow down each server. We also record the last cron status, last start date, last end date, memory usage, and CPU usage in the DB and watch them. If something is wrong, the system sends us an automated email.

Do you want to share it with us? 😄

@kanaldro

kanaldro commented Oct 9, 2019

Hello @ozgurhangisi

You mentioned that "creating local keys and shared secret takes ~90% of the total process."
Did you modify the Encryption class so it draws the local keys and shared secret from the DB, or am I missing some parameter to pass those in from the WebPush object?

Thanks!

@tarekalmslmany

Hello,
If I rotate the local key daily, is it safe to use the same local key for all subscribers, or should I use a different local key for each subscriber?

@salimbuet09

salimbuet09 commented Jan 4, 2022

Hi, I am new to push.
Can anyone explain this statement in more detail: "creating local keys and shared secret takes ~90% of the total process"?

Thanks

@agron2017

Just the very first command for this project,

  • composer require minishlink/web-push

brings in 10 other libraries. If you want to optimize anything, we should start by reducing the number of libraries the autoloader has to go through every time your website gets a page hit.

drwxr-xr-x 12 agron agron 4096 Nov 26 23:45 .
drwxr-xr-x  4 agron agron 4096 Nov 26 23:45 ..
-rw-r--r--  1 agron agron  771 Nov 26 23:45 autoload.php
drwxr-xr-x  3 agron agron 4096 Nov 26 23:45 brick
drwxr-xr-x  2 agron agron 4096 Nov 26 23:45 composer
drwxr-xr-x  5 agron agron 4096 Nov 26 23:45 guzzlehttp
drwxr-xr-x  3 agron agron 4096 Nov 26 23:45 minishlink
drwxr-xr-x  3 agron agron 4096 Nov 26 23:45 paragonie
drwxr-xr-x  5 agron agron 4096 Nov 26 23:45 psr
drwxr-xr-x  3 agron agron 4096 Nov 26 23:45 ralouphie
drwxr-xr-x  4 agron agron 4096 Nov 26 23:45 spomky-labs
drwxr-xr-x  3 agron agron 4096 Nov 26 23:45 symfony
drwxr-xr-x  7 agron agron 4096 Nov 26 23:45 web-token

@t1gor
Contributor

t1gor commented Nov 27, 2023

@agron2017 if you're trying to look smarter, it did not work. This lib does not run on page hits; at least it should not.

@agron2017

@agron2017 if you're trying to look smarter, it did not work. This lib does not run on page hits; at least it should not.

And your response with a personal insult is somehow smarter, you think, Igor?

Whatever. I don't have to listen to your insults. I'll fork it and clone it and you'll never see my dumb ass in here again.
