[Intermediate +] Simple Tips to Speed Up Matrix - by Richard Hulse


(Nic Hubbard) #1

Just reposting a buried post by Richard Hulse that really should be in this tips section:


OK. Are you all comfortable? Then we'll begin.



This is based on our experience - your mileage may vary.



There are two components to this: build time in Matrix, and the time to send pages over the wire to the browser.



Part 1: Fixing broken stuff in Matrix.



[list=1]

[*][b]1. Make sure the sq_cache table is cleared of old entries every night.[/b]
    [*]The cache entry table expands without bounds and is never cleaned up by Matrix.
    [*]On older systems (been in production a while), or systems with high content churn this table can get very large.
    [*]As it gets larger it takes longer to add new rows, and to find and update existing rows.
    [*]One symptom of this is the system feeling slower over time, even though you have not changed anything.
    [*]We delete entries older than the default cache time every night.
    [*]As a side note the file system is never cleaned up either. We clean up entries older than 1 day every night.
    [*]IMHO all production systems should take these steps.
    [*]Note that you cannot clear the cache in systems that use memcache.
    [*][b]2. Stop sq_messages being written to the DB.[/b]
    [*]On a busy system with lots of editing, this results in a huge table. Unless you really, really need it, don't log to the DB.
    [*]This info is also written to the file system, and we've only had to use it twice in 5 years, so there is little inconvenience in turning DB logging off.
    [*]We found that bloated sq_cache and sq_message caused huge problems on our live system when it was under high load.
    [*][b]3. Beware of plain asset listings[/b]
    [*]On Version 3.24.x the asset listing loads the entire dataset into memory. If you have one of these pointed at an asset with many children, then the list is going to take a long time to run.
    [*]For example, an asset listing pointed at a folder with 38,000 news items, ordering by publish time descending, limited to 5 items, takes 20 seconds to run on a 4-processor server with 8 GB of RAM. Using a What's New limited to 1 week ago runs in 1-2 seconds.
    [*]If you have asset listings with a lot of children, these will take a whole processor on your server for the duration of the build.
    [*]If you have that asset listing nested in a design (a very common pattern in Matrix implementations), and 4 pages with expired caches use that nesting and have to rebuild, then your 4 cores would all be tied up for 20 seconds. This would make every other event (page builds) have to wait.
    [*]On the front end you'd see fast page builds sometimes and very slow ones at others. You would eventually see blank pages served.
    [*]The way to fix this (in 3.24 - I cannot speak for later versions) is to NEVER, EVER, EVER, EVER use a plain asset listing if you can use a What's New instead - especially if you have lots of regular content being added and the date constraint will not cause you problems. The shorter the better. See bug 4225.
    [*]I do not know what a reasonable limit for an asset listing is, but personally I would limit it to 500 children on a 4-core system.
    [*][b]4. Use Squid and use it correctly.[/b]
    [*]If you do not have a replicated system, then you should have triggers to clear Squid when Matrix content is updated. The Matrix cache time should be set long, but the headers sent should be much shorter.
    [*]You would set the Matrix cache time to, say, 6 hours, but set the system to send headers of 30 minutes. This allows you to update content in 30-minute windows without the penalty of rebuilding pages that were not updated. Squid will come back to Matrix for pages every 30 minutes, but this won't matter as the Matrix cache will have them.
    [*]NOTE: If you are clearing the cache table as above, take care that you keep enough entries.
    [/list]
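The nightly cleanup described in step 1 could be scripted roughly like this. This is only a sketch: DBUSER, MATRIXDB, the install path, the sq_cache expiry column, and the 1-day cache time are all assumptions - check your own schema and default cache time before using anything like it.

```shell
#!/bin/sh
# Nightly Matrix cache cleanup (sketch). Assumes sq_cache has an
# "expires" timestamp column and a default cache time of 1 day --
# verify both against your installation first.

# Delete cache table rows older than the default cache time
psql -U DBUSER -d MATRIXDB \
  -c "DELETE FROM sq_cache WHERE expires < NOW() - INTERVAL '1 day';"

# Remove filesystem cache entries older than 1 day
find /pathto/mysource_matrix/cache -type f -mtime +1 -delete
```

Run it from cron during a quiet period so the deletes don't compete with page builds.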


    [b]Part 2: Optimise the system to reduce over-the-wire-times and page rendering.[/b]

    In most websites the time spent requesting and getting a page and all the content on it (images, css, javascript) swamps the time the server needs to build the page.

    These are the things you should be doing in any site to speed things up.

    [list=0]

    [*][b]1. Put all Javascript at the bottom of the page.[/b]
    [*]Javascript blocks the download of all other content. If it is in the head of the page, then that slows the rendering of the page.
    [*]Try to reduce the number of JS files. Put all your libraries in one file and site-specific stuff in another if you must.
    [*]DON'T deploy javascript that has not been minified!
    [*]If you use Google Analytics switch to the asynchronous version of their tracking code. This goes at the top of the page, instead of the bottom, but loads in parallel with other content.
    [*]Make sure any page load events run when the DOM is ready, not later.
    [*][b]2. Put CSS in the head of the document.[/b]
    [*]AND minify it. AND try to stick to just one file.
    [*]If you have a good understanding of CSS you can use the cascade ('C') to good effect to reduce the number of rules. Radio NZ has about 33k of CSS for 3 different brands. GZIPPED this is 9.8K.
    [*]1 and 2 combined mean that the HTML will download first, then the CSS. The browser then has enough information to start rendering the page. If you were on dial-up you'd see images slowly appear as they downloaded and filled in the img tags, and finally the javascript would download and execute on the page.
    [*][b]3. Use GZIP/Deflate on the web server.[/b]
    [*]This will reduce the content size even more. It is a myth that you do not have to minify CSS and JS because gzip/deflate will take care of it.
    [*][b]4. Set your content headers correctly.[/b]
    [*]Matrix by default does not set future headers for static content like images, css and js.
    [*]You can set a time in the future, and the browser will cache the content instead of coming back to you to get it.
    [*]This is where much of the huge time saving is made. If most of your static content can be fetched locally, that saves a lot of time.
    [*]Be aware that once you have set a far future header the browser may not fetch new content until that time is up.
    [*]In our case we version our CSS and JS files. When we update the js and css we bump the version number in the filename so that the browser will come and get the new one.
    [*]Our CSS and JS files expire in 12 months!
    [/list]
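To illustrate step 4, far-future headers are usually set at the web-server level rather than in Matrix. A sketch for Apache using mod_expires (the lifetimes and content types here are assumptions - pick your own):

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # Versioned CSS/JS can safely expire far in the future
    ExpiresByType text/css "access plus 12 months"
    ExpiresByType application/javascript "access plus 12 months"
    # Images change more often, so keep them shorter
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
</IfModule>
```

The long expiry is only safe because the filenames are versioned: bumping the version in the filename forces browsers to fetch the new file despite the cached old one.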

    When you revisit one of our pages you only have to download the HTML, any new images, and the analytics code. That is 7K of data to get the page instead of the original 127K. Obviously the first hit is slow, but once the CSS and JS are cached it reduces most page payloads by 70%.

    These are the main things we did to speed up the site. Matrix does not make some of this easy to do, but it is worth the trouble. As I said, when we did the above steps in part 2 there was a big improvement in rendering times.

    The steps in part one took place over a much longer period, mostly in response to performance problems. Our use case for Matrix made these stand out, but they will still apply to other systems, albeit the effect may not be as marked.

    The other tip (most relevant to folks in the US) is to use jQuery served from Google's CDN. This saves us bandwidth costs, and there is a chance it may already be cached by the browser or their ISP.

    GZIP on its own is probably not going to help that much. If you are seeing slow build times for uncached pages, it may even make things worse if your servers do not have any spare CPU time.
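    For completeness, compression (part 2, step 3) is typically a small web-server change. A sketch for Apache using mod_deflate (assuming Apache fronts Matrix; check which modules your server actually loads):

```apache
<IfModule mod_deflate.c>
    # Compress text-based responses; binary formats like images
    # are already compressed and are skipped here
    AddOutputFilterByType DEFLATE text/html text/css application/javascript text/javascript
</IfModule>
```

Given the CPU caveat above, it may be worth measuring build times before and after enabling it on a heavily loaded server.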

    I hope that all is of some use.

(Scott Hall) #2

    Nice, I lost my bookmark to this one a few months back, cheers!


    (Bruce Kyle) #3

    Hi all - thought I'd add some Postgres lines in here as well. We have found that even with auto-vacuum turned on, it's good practice (maybe every quarter) to shell into Postgres and perform full vacuums on all tables. Here's our procedure:


    First off - clear filesystem and DB caches (replace DBUSER/MATRIXDB with your values)



    cd /pathto/mysource_matrix

    mv cache cache.old && mkdir cache && chown -R apache.apache cache

    psql -U DBUSER -d MATRIXDB -c 'TRUNCATE sq_cache;'

    rm -rf cache.old



    Then



    su postgres

    cd /var/lib/pgsql/data/

    cp postgresql.conf postgresql.conf.backup

    nano -w postgresql.conf

    look for the entry max_fsm_pages

    change 20000 to 200000

    exit and change to the root user

    restart the database: /etc/init.d/postgresql restart



    Then



    psql MATRIXDB DBUSER (enter then type password)

    vacuum;

    vacuum analyze;

    vacuum full;



    type \q and repeat for every database (even the postgres and test DBs)



    Then undo the change to postgresql.conf above and restart Postgres.
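On systems where the vacuumdb utility is available, the per-database loop above can be collapsed into one command. A sketch (flags can vary between PostgreSQL versions, so check `vacuumdb --help` on your server):

```shell
# Full vacuum plus analyze on every database, prompting for a password.
# Roughly equivalent to running "vacuum full" and "vacuum analyze"
# in each database by hand, including the postgres and test DBs.
vacuumdb --all --full --analyze -U DBUSER -W
```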



    It's best to do this off-peak; we usually do it 7:00am-7:30am. Note, on a larger installation the 'vacuum full' will take some time and may impact end users accessing the site.



    As a real-world example/outcome: our Matrix DB might be 2.2GB - after vacuuming it's down to 500MB (of course it then starts to grow again).



    Another tip - after you have done this, load your most-visited pages/sections in a browser to put them back in the cache (saves end users doing it!).


    (D Jackson) #4

    Thanks very much for the tips. Could you point me towards where I can stop sq_messages being written to the DB?


    (Benjamin Pearson) #5

    [quote]
    Thanks very much for the tips. Could you point me towards where I can stop sq_messages being written to the DB?

    [/quote]



    Under System Configuration -> Global Messages, you can filter particular message types or the whole lot (*) by filling in the log messages to DB black list.


    (Raena Armitage) #6

    Given the age of the system that's being discussed in the first post (3.24, according to the comment about asset lists), what if anything has changed since then?


    #7

    Sweet post! I think no matter what the version is, these tips are good practice.


    There are some tracking scripts that we can't avoid having in the head of our site, and they stop the rendering of our pages. Still, this is a fantastic summary that will help our team, and we are trying to get action on some of these points.



    I'll also approach our team about the asynchronous version of Google Analytics - I spoke to the team some time ago, so maybe I can convince them to switch. Our tracking scripts do account for a heap of the load.



    Thanks again!


    (Raena Armitage) #8

    [quote]
    Sweet post! I think no matter what the version is, these tips are good practice.

    [/quote]



    That's true, although I'm aware that there have been some changes and improvements since then – for example there's now a regular Cron Manager job that looks at old cache entries. Does that mean we can intervene less/not at all?



    http://matrix.squiz.net/developer/dev-newsletters/2010/issues-272-275-april/273


    (M L Sanders) #9

    To what extent do the following make a difference to the speed of a site:

    • Number of items stored in Remap Manager
    • Number of items in Trash
    • Number of enabled/disabled Triggers



      thanks

    (Nic Hubbard) #10

    [quote]

    • Number of enabled/disabled Triggers

      [/quote]



      This greatly depends on what conditions you are using for your triggers. If every trigger was wide open and fired for every asset on the site, it would definitely slow it down. Always make sure to use a root node and asset type conditions if you can.

    (Marc Duval) #11

    We have over 63,000 remaps entries.
    How much would that be affecting performance?


    (Nic Hubbard) #12

    [quote]
    We have over 63,000 remaps entries.

    How much would that be affecting performance?

    [/quote]



    Wow, that is a lot. We make a point to clean out remaps each week so that it doesn't get out of control.



    I am not sure this would be a huge performance hit, since Matrix probably does a quick match, but the more remaps there are, the longer that match will take.