3.4 Performance and 3.6 Futures

A couple of weeks into this, and we now have a management apache/php/msm server and a postgres/slony db pair. We also have a non-management rsync'ed apache/php/msm serving read-only pages.
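(For anyone picturing the setup: the read-only server is fed by a push along these lines - hostname and paths are illustrative, not my actual layout.)

    # management server pushes the webroot out to the read-only page server
    rsync -az --delete /var/www/matrix/ ro-pageserver:/var/www/matrix/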


MSM serves circa 5-7 pps and uses lots of pg connections. A db-intensive page with an assetlist of news items barely gets to 1 pps, mainly because it's blocked by the lack of persistent db connections. Trying to switch on pconnect does sfa, while pgpool helps a bit, but not enough.
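(The two connection styles in question, for anyone following along - the connection string is a placeholder, not my real setup:)

    <?php
    // pg_connect() opens a fresh backend on every request; that per-request
    // connection cost is what throttles a db-heavy page.
    $db = pg_connect('host=127.0.0.1 dbname=matrix user=matrix');

    // pg_pconnect() keeps the backend open across requests, but under Apache
    // prefork each child holds its own connection, so the gain is limited.
    $db = pg_pconnect('host=127.0.0.1 dbname=matrix user=matrix');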



On the same h/w, a conventional apache/php app with mysql on the same db backend easily does >50 pps at 30% idle.
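(Figures like these are typically gathered with something like ApacheBench; a minimal example, with the URL and counts illustrative only:)

    ab -n 1000 -c 10 http://pageserver/news/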



Is performance a target of the 3.6 release?



We also appear to have no way, other than custom assets, of implementing the relatively simple relationships between assets that are traditionally done with a pivot table.
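(By pivot table I mean the conventional join table for a many-to-many relationship - table and column names below are illustrative only:)

    CREATE TABLE news_item_topic (
        news_item_id INTEGER NOT NULL REFERENCES news_item(id),
        topic_id     INTEGER NOT NULL REFERENCES topic(id),
        PRIMARY KEY (news_item_id, topic_id)
    );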



My take is that the level of abstraction represented by the "asset" is too far away from the data model.



In the same vein, the "asset" abstraction makes importing data and its relationships/links a non-trivial programming task, rather than one of simple scripting.



Does 3.6 target assets, their dynamic relationships to others, and the tools necessary to import and maintain them?



All in all, it's a nice tool for managing sites that present lots of documents, but the amount of serious development required to do anything dynamic is prohibitive.



The performance issues are a show-stopper!

I can't comment too much on your performance issues because I don't know your exact setup. All I can really say is that asset listings are very intensive pages and really require caching. Caching will help with most of your pages.


As for performance in general, we have been making changes, but these are mostly to do with content editing, which is where we have identified some issues.



On the issue of assets and having to do too much development, well yes, that's the idea. We are working on import/export tools at the moment, and there are plans for other tools to make asset building easier (like a GUI), but they are not in development yet. The goal is to provide users with a series of tools they can use to build systems, not to build the systems for them. For example, an image library is not just an Image Library asset; it is a combination of several other assets configured to provide that functionality. Import/export will help with this a lot.



Another issue you might have is our data storage technique, where each asset is treated in exactly the same way and has its data stored in the same locations. For our system this is also our biggest advantage: the assets themselves define their interactions, while the underlying architecture allows asset relationships to be built. This kind of setup will not be addressed in 3.6 because we see nothing wrong with it.

I'm aware that the various accelerators/caches/JITs can be used to improve matters, and that variations in configuration have a major impact on performance, but …


What I was doing was making a direct comparison between a PHP app filling in a skeleton with pages fetched from the db, and the same page emitted by MSM.



I regard an order-of-magnitude difference in performance as a major anomaly, but from your response it does not appear to feature on Squiz's radar.



The suicidal performance on db-intensive tasks (e.g. Asset Lists of NewsItems) leaves lots of room for investigation and improvement.



Sites can't just keep throwing h/w at these issues! Just adding more tightly coupled cpus to the page servers and/or the db very quickly hits h/w limits, not to mention $$$s.



1+N server clusters with replicated data and filesystems are the conventional solution, but Squiz's belief in the miracle of filers/SANs and multi-cpu db servers is a mistake that bankrupted many in the '90s boom and is still popular in enterprise intranets 8(:



The Google/Yahoo/AOL community knows this doesn't work and has devised 1+N clustered solutions that do scale.



My point is that performance DOES matter, no matter what illusions the application developer community has.



And FYI, I've spent most of my 30+ years in IT building and testing systems that do scale to the 9am login, the June 30/July 1 data load, etc., etc.

I'm not sure what part of my post gave you the impression that we don't take performance seriously, or that we just throw hardware at the problem?


What I actually said was that the things we are focusing on for 3.6 are backend content-editing improvements, because we identified problems there. We did the same for 3.4, although there were some frontend improvements too.



I mentioned caching, which may have thrown you. I'm talking about the inbuilt Cache Manager in Matrix - not additional hardware.



Matrix does A LOT of work during the content editing phase to cache content (not just the cache manager) and to create lookup tables. This is all to improve frontend performance and is why we are trying to make content editing faster.



If you look at individual assets, they do a lot to reduce the number of queries and the processing they have to do on the frontend, but they are highly dynamic tools. The asset listing does try to reduce its queries by executing its own rather than using the provided methods (it is very rare for our assets to do this), but it still has to do a lot of work and is very dependent on the DB for pulling dynamic content.



I'm not going to try to explain any more; you can ask specific questions if you really want. I'll just say that performance is an issue we always consider, during design, development, testing and post-production.

Well, just to prove I'm not a total grouch …


I re-enabled eaccelerator and pgpool for the matrix vhost and pushed things again.
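(For the record, re-enabling eAccelerator amounts to something like the php.ini excerpt below; the values are illustrative, so check the eAccelerator docs for your build:)

    extension = eaccelerator.so
    eaccelerator.enable = 1
    eaccelerator.shm_size = 32    ; shared-memory opcode cache size in MB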



The vanilla php/mysql app now does >200 pps with the page server cpu 20% idle; the same page in MSM now does 45 pps, but it's the db that's running out of cpu.

What would happen when I hit it with 4 page servers … more cpus required.



The AssetList (20 summary lines of news from 1K items) scrapes in at 5 pps, but the db server is on its knees.



Maybe this will do … if I'm being generous … and we completely rethink news items and keep them away from the db, but I suspect the metadata in the db will be just as bad.



I'll have to look at the AssetList code (something I'd hoped to avoid) and start a process offlist to solve some of these issues.

Do you have Matrix caching enabled?


As background, any sort of asset has to do a lot of work identifying URLs and the contents of pages when a user accesses it. If you have caching enabled, that work (normally DB intensive) is cached, so the next user doesn't have to wait as long. Our own internal benchmarking indicates a very large increase in pps (especially for asset listings).



If you decide to go through the code, I'd be more than happy to answer any questions you may have regarding the way things work or how they are cached.



Finally, I'm interested to know how PostgreSQL is running for you. We normally have to play around with various settings to get it to run at its best. Avi is probably the best person to talk to on specific settings, but I think the recommendation we got was to ensure PostgreSQL has large enough shared buffers to fit the DB into memory. It helped a lot during our tests.
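(Something along these lines in postgresql.conf - the sizes below are purely illustrative, not a recommendation, and note that 8.0/8.1 take these settings in 8kB pages while later releases accept memory units:)

    shared_buffers = 32768           # ~256MB in 8kB pages
    effective_cache_size = 131072    # planner hint: ~1GB of OS file cache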



It's been a while since we had PostgreSQL systems running alongside MySQL, so I can't comment on the differences between those two DBs.

Caching is ON <g>


With more shared memory allocated to the db I can push the assetlist page to 15 pps … enough for today.
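(On Linux, "more shared memory" means lifting the kernel ceiling before touching postgres - the value here is illustrative, 512MB:)

    sysctl -w kernel.shmmax=536870912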



Just for my amusement I pushed the proxy fronting our site to see how far away from the C10K figure we get … I got to 2500 pps before I chickened out; it's a production machine.



Thanks for putting up with my trying to break your code; it's all part of the trial-by-fire that I'd rather do now than with real users.



Off to see the Moody Blues!

One last thing I'll put here for anyone else reading this thread: caching doesn't do anything for site designs yet, so while the contents of the page may come out quite fast, the site design will not.


Different types of design areas also take longer to paint (each menu adds extra time, for example), so different designs will require different numbers of queries to be executed and take different times to print.



So if you want to check the time taken for a page load, you really need an almost-empty site design, because the design adds most of the time for a cached page.

Hi,
Following on from this thread.



I have two sites currently running on the same Matrix installation. One is loading pretty snappily, the other is kinda slow. On the one that's slowing down, I have a what's-new asset listing, together with another asset listing on each page, both set to randomly cycle between their respective contents. (The what's-new asset is set to the root node of my site, with about 200-odd pages in total under it.)



Would these two asset listings be the reason for the slowdown? Does caching help an asset listing that's using the random list-format selection option? What postgres settings would you recommend to improve speed?



I'm using PHPaccelerator, which seemed to help a little.



Any suggestions appreciated.



Cheers

Dale

Displaying an asset listing is very intensive. Due to its random nature, caching cannot be used for random asset listings - so you are not getting any cache improvements in this setup.
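(Illustrative only - this is not the Cache Manager's actual code - but the problem is easy to see with any URL-keyed cache: the key never varies, so a stored copy would freeze one "random" selection for every visitor until it expired.)

    <?php
    $key = md5($_SERVER['REQUEST_URI']);   // same key on every hit
    if ($html = $cache->get($key)) {       // $cache is a hypothetical store
        echo $html;                        // the randomness is gone
        exit;
    }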


So I'd say that the asset listings are almost certainly the reason why your second site is slower.



The number of assets under the root node would not slow things down. You'll find most of the slow-down comes from printing each asset. Depending on the version of Matrix you have and the keyword you are using in the display format, the asset listing will have to load all the assets on the page to display them.



There is not much you can do about this besides not using the random feature, or limiting the number of assets displayed.

Thanks for the reply.


I only have 5 assets being displayed in a table for the what's-new asset, and only one (which I'm using to show a single thumbnail per page) for the other asset listing. Both are embedded in a remote content area.



It really is quite slow with both of these on the page. It would be a shame to lose the functionality; do you think we'll see any speed improvements with asset listings in the future?



Thanks



Dale

I'm not sure the asset listing improvements we have made will speed up your system. I would love to get a log of the queries executed during one of these page loads, though.


Is there any chance you can turn on postgres logging for a small amount of time and record the queries from a page load? We could then analyse the queries and see if we can speed any of them up for you. It would also allow us to work out whether the DB is your bottleneck for that page, or whether it is PHP processing time.
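(If it helps, a couple of postgresql.conf settings along these lines will capture everything - reload the config, grab one page load, then revert, because this is very verbose:)

    log_statement = 'all'              # log every SQL statement
    log_min_duration_statement = 0     # record how long each one took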

I've changed the what's-new to be a 5-asset normal listing, with a "more" button to see the full list of what's changed. Matrix seems to be happily caching my pages now and is quite snappy, so it's definitely the random asset listings that were slowing the system to a crawl.


A caching system that had the option to automatically update the cache when a page changes would be the logical extension of the current system, and would be a great addition IMHO.
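(Something like this is what I have in mind - an entirely hypothetical hook, not Matrix code; pages_containing() and $cache are stand-ins:)

    <?php
    // When an asset is saved, drop the cached copy of every page that used it.
    function on_asset_saved($assetid, $cache)
    {
        foreach (pages_containing($assetid) as $url) {  // hypothetical lookup
            $cache->delete(md5($url));                  // invalidate stale copies
        }
    }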



Cheers

Dale

That caching system will never work for random asset listings, though; you just can't cache them. Mind you, we know of an extension to the asset listing that would help cache them, which should make them faster.


Basically, it needs to cache the way the Search Page does. I won't go into it, but it is on the cards for development.

Yes, I can see that a random page wouldn't be cacheable; I was more referring to my desire for a general page edit clearing the cache for that page. Even if it's not the default behavior, having it as an option would be excellent.


Cheers

Dale