Server Grinding under medium load


(Tbaatar) #1

Hi,

We’ve just started running some campaigns via Snapchat, and traffic to the website is roughly 80-200 concurrent connections. Above 60 concurrent connections the load on the server climbs sharply and CPU usage shoots up to 80-90%. I’m not sure if this is related to an issue we experienced yesterday, when the server went down for 5 minutes; the high CPU load started this morning.

Logs from the server seem to suggest it has something to do with Postgres. Are there any best practices for maintaining Postgres (9.4), and are there any specific tools within Matrix admin we can run to clear any stuck jobs related to the DB and cache tables?

I did see quite a few posts on the forum, but most go back to 2009-2011, so I’m wondering if they’re still valid guides to follow, especially the Postgres commands?

Many thanks.


(Tbaatar) #2

I performed a quick load test on our production server (Matrix 5.4.3.1 running Apache and PHP 5.6) and it seems to just fall over at around 100 concurrent users per second.

I did a similar test on a new Matrix build on a lower-spec server with 2GB RAM and Matrix 5.5.4.2 running Apache and PHP 7.3. It handles 100 concurrent users with similar load, so I don’t think cache tables or logs getting too big are what is causing the overall issues.

What do you guys do to optimize the DB, PHP and Apache to get more concurrent users? Are there any best practices?

Thanks.


(Nic Hubbard) #3

How are you doing the 100 concurrent user test?


(Tbaatar) #4

SnapChat ad :slight_smile:
Also load tested with artillery.io

With normal traffic of 50-60 visitors, load and CPU are running quite high, at around 70-80%.
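For anyone wanting to reproduce this kind of fixed-concurrency test without artillery.io, a minimal Python sketch follows. It is not the test that was run above: the helper names `fetch_once` and `run_load_test` and the default timeout are made up for illustration, and it only uses the standard library.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_once(url, timeout=10.0):
    """Fetch url once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def run_load_test(fetch, concurrency, total_requests):
    """Call fetch() total_requests times using `concurrency` worker threads.

    Returns a summary dict; any call that raises is counted as a failure.
    """
    def timed_call(_):
        try:
            return fetch()
        except Exception:
            return None  # timeout, HTTP error status, refused connection, etc.

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(r for r in results if r is not None)
    summary = {
        "requests": total_requests,
        "failures": total_requests - len(latencies),
        "median_s": None,
        "max_s": None,
    }
    if latencies:
        # Upper median when the number of successful requests is even.
        summary["median_s"] = latencies[len(latencies) // 2]
        summary["max_s"] = latencies[-1]
    return summary
```

A run might look like `run_load_test(lambda: fetch_once("http://staging.example/"), concurrency=100, total_requests=1000)`, where the URL is a placeholder. Pointing it at a staging copy rather than production avoids adding to the load being investigated.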


(Aleks Bochniak) #5

Can you share more information about your server infrastructure?

One suggestion. Upgrade your web server stack to the latest versions of everything.

—> PHP 5.6 is a dog. I’m not surprised to see PHP 7.3 performing much better.


(Tbaatar) #6

We are on Debian 8, Apache, Postgres 9.4 and PHP 5.6. Server-wise, we are using the AWS large, which is 100GB with 8GB RAM and 4 CPUs.
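With the web server and database sharing 8GB of RAM, one common rule of thumb is to cap the number of Apache worker processes so they fit in memory without swapping. The sketch below shows the arithmetic only. It assumes Apache runs the prefork MPM with mod_php (the thread does not confirm this), and the reserved and per-process figures are made-up placeholders that would need to be measured on the real server.

```python
def estimate_max_workers(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many Apache worker processes fit in RAM.

    total_ram_mb:   physical memory on the host
    reserved_mb:    memory set aside for Postgres, the OS and caches (assumed)
    per_process_mb: average resident size of one Apache/PHP process (assumed)
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 0
    return available_mb // per_process_mb


# 8GB host, 3GB reserved and 60MB per process are illustrative guesses.
print(estimate_max_workers(8192, 3072, 60))  # prints 85
```

The result is a starting point for the `MaxRequestWorkers` directive in Apache 2.4, not a recommendation for this particular server.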

As you have mentioned, we will need to upgrade to a modern stack and potentially separate the DB.

Even with this, I’m keen to learn what sort of maintenance and housekeeping should be performed or checked to keep the system running as efficiently as possible.
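One housekeeping check that applies to any Postgres database, including 9.4, is looking for tables that have accumulated many dead rows and may not be getting vacuumed often enough. The sketch below is a generic illustration, not a Matrix-specific procedure: the query reads the standard `pg_stat_user_tables` view, while the function name and both thresholds are arbitrary assumptions. Running the query needs a database driver, which is left out here.

```python
# Query against the standard pg_stat_user_tables statistics view.
DEAD_TUPLE_QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
"""


def tables_needing_vacuum(rows, min_dead=10000, max_dead_ratio=0.2):
    """Return names of tables whose dead-row share exceeds both thresholds.

    rows: iterable of (relname, n_live_tup, n_dead_tup, last_autovacuum)
          tuples, as returned by DEAD_TUPLE_QUERY.
    The default thresholds are illustrative guesses, not tuned values.
    """
    flagged = []
    for relname, n_live, n_dead, _last_autovacuum in rows:
        total = n_live + n_dead
        if n_dead >= min_dead and total > 0 and n_dead / total > max_dead_ratio:
            flagged.append(relname)
    return flagged
```

Tables that show up in the result are candidates for a manual `VACUUM (VERBOSE, ANALYZE) tablename;` and for a review of the autovacuum settings.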


(Aleks Bochniak) #7

The first thing is to keep up to date with the supported system requirements of the Matrix application stack. Make sure you regularly patch and update your Matrix system, stack and host OS. Debian 8 is quite old.

If you’re using Matrix 5.5, then consider upgrading to a Red Hat based OS for your host (e.g. CentOS 7), nginx instead of Apache, and PHP 7.3. I would also look at upgrading Postgres to the latest supported version for your host’s OS.

Honestly, there’s no quick fix for your performance issue as the best ‘fix’ would be to upgrade your stack.


#8

Debian 8 is EOL; you should upgrade your OS at the very least.


(Tbaatar) #9

Thanks for the info.

We last had a Matrix upgrade done by Squiz in early 2018. For 2019 we waited for Matrix 6, but since this is delayed we will need to upgrade as soon as possible.

The concern here is the stack update, the software update and the cost. We are currently waiting for a quote from Matrix.

If we were to upgrade it ourselves, would it be a simple matter of backing up the database and restoring it into the new system? Or is it more involved, requiring Squiz?

The other issue is the wait time for Squiz to upgrade, which is usually 3 months.

One of the things we have also found with the Snapchat ad is that it was set up to load the page on view, which meant it was hitting the server every time the ad appeared on someone’s wall, producing 100-200 concurrent hits. Since fixing the ad implementation it is now 20-40 hits. The server is still running at 40-60% load.