Showcase of MSM Sites

Hey all, I was wondering if it would be worthwhile starting a thread in here that showcases some of the sites people have developed using MSM?


Would be good to see what others are doing, and what little tips and tricks people have used to achieve certain things.



If you think this is a good idea, maybe post a link to your sites, with a brief description of what features are being used?

Radio New Zealand


Overview



The site hosts three of our five brands: National Radio, Concert FM and News. The other two brands will be migrated to the system later this year.



It has audio (live, on-demand and podcasts), news (text on site and RSS) and information about programmes, such as schedules and highlights.



Matrix



We use a bunch of asset listings (nested) to aggregate news and audio content onto the pages we want.



The site makes use of a number of RNZ-built Perl and PHP scripts to parse and generate content in XML format, and Squiz PHP scripts to import this into the site.



Surprisingly (perhaps), the staff who publish news and audio do so from internal RNZ broadcast systems, which then publish automatically into Matrix.



Extensive use is made of future statuses to release content in sync with the broadcast of programmes, and to remove it from the site after a fixed period of time.



We use a custom audio asset to handle serving multiple formats from a single URL (switched via browser cookie). This asset also allows us to seamlessly generate podcast feeds (at the moment these use an XML design, as the RSS asset type does not yet allow an XSLT file to be attached to a feed). We style the feed pages so that they double as download pages.
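The cookie-switching idea can be sketched in a few lines. This is a hedged illustration in Python, not RNZ's actual Matrix/PHP code; the cookie name and format list are assumptions:

```python
# Sketch of serving multiple audio formats from a single canonical URL,
# switched by a browser cookie. Cookie name ("audio_format") and the
# format-to-file mapping are illustrative assumptions.

AUDIO_FILES = {
    "mp3": "programme.mp3",
    "ogg": "programme.ogg",
    "wma": "programme.wma",
}
DEFAULT_FORMAT = "mp3"

def pick_audio_file(cookies):
    """Choose which file to serve for the canonical audio URL."""
    fmt = cookies.get("audio_format", DEFAULT_FORMAT)
    if fmt not in AUDIO_FILES:
        fmt = DEFAULT_FORMAT  # unknown or stale cookie: fall back
    return AUDIO_FILES[fmt]
```

The benefit is that the same canonical URL can appear everywhere (pages, podcast feeds), while the format actually served tracks each listener's stored preference.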



Design



The site was designed in Wellington by Radio New Zealand, and uses CSS for layout and styling, meeting the NZ E-government guidelines.



Traffic

About 650,000 page impressions (Nielsen//NetRatings) per month. We have just passed 1 million audio-on-demand items served since the site launched last October.



Technical



On the technical side we have two Dual Opteron servers in Auckland - one for the DB and one to run the web server, and a matching pair in Wellington. All 4 servers have 8 gigs of RAM and 6 x 15,000 RPM 72 gig SCSI drives running as RAID 1+0.



The AK boxes serve the public site and the WN boxes are used for content publishing. The WN boxes are replicated to AK over a private data link.

Richard,
No Squid or caching external to Matrix?

How about Matrix's own caching?



PS: I'll add our details as soon as we have something to showcase…

I would be interested in hearing what other folks have used in terms of hardware for their respective Matrix installations. At Griffith Uni we are in the last phase of hardware determination…


Thanks,

[quote]No Squid or caching external to Matrix?
How about Matrix’s own caching?

[/quote]



RNZ has both Squid and Matrix caching running. :) Though the system copes nicely enough without the Squid cache these days.

[quote]Richard,
No Squid or caching external to Matrix?

How about Matrix’s own caching?

[/quote]



As Avi mentioned, we are using both. We found with 3.6 that Squid was very good at isolating Matrix from big traffic peaks, and it does make for a slightly snappier response when hitting the site (not to say that Matrix isn’t fast already).



If we refer to some content on-air (we can have 200,000+ people listening at any one time), this can cause some really big peaks.

[quote]Traffic
About 650,000 page impressions (Nielsen//NetRatings) per month. We have just passed 1 million audio-on-demand items served since the site launched last October.



Technical



On the technical side we have two Dual Opteron servers in Auckland - one for the DB and one to run the web server, and a matching pair in Wellington. All 4 servers have 8 gigs of RAM and 6 x 15,000 RPM 72 gig SCSI drives running as RAID 1+0.



The AK boxes serve the public site and the WN boxes are used for content publishing. The WN boxes are replicated to AK over a private data link.

[/quote]



Hi, I’m curious. 4 servers and 32 GB of RAM seems a lot to me for a site serving 650,000 page impressions per month. What do you need all of that memory (and servers) for? Is it the audio streaming?

Actually, the site is served just from two of the boxes during normal operations. The other two are used for authoring (in Wellington) and for warm fail-over.

[quote]Hi, I’m curious. 4 servers and 32 GB of RAM seems a lot to me for a site serving 650,000 page impressions per month. What do you need all of that memory (and servers) for? Is it the audio streaming?
[/quote]



Good questions and I’ll go into some detail for you as this may be helpful for others…



Mirroring



The mirroring of servers is for redundancy, so only one pair is seeing public traffic. The other is for content publishing.



We are about to increase the number of content publishers to around 20, and we wanted this load to be isolated from the public site. Nothing is more frustrating than having your editing session (not cached) slowed down by people searching (also not cached) on the site.



Also, we had a hardware problem on our old single server which occurred at the same time as a power outage in Auckland. We had power and connectivity, but they could not get a service engineer on site until the next day.



This was a very big news day for us, and not having a website for 30 hours was a major problem. We don’t want to be in that position again.



The publishing servers can be deployed as the public servers with 15 minutes’ notice if need be.



The other thing is that Radio NZ is used during national civil emergencies, so we have plans in place to make sure our infrastructure remains viable.



Specification



The reasons for going high spec were as follows:


  1. Allow for growth.



    Traffic to the site has increased tenfold since we launched only 10 months ago. I did not want to be faced with another upgrade in 10 months.



    I should add here that we do not automatically roll over servers every 3 years. We use HP servers and the first batch we installed for our broadcast operations 6 years ago have had no issues.



    Buying good quality hardware that lasts and is reliable (low maintenance costs) is a better use of public money than "other hardware strategies".


  2. Ensure fast page delivery



    Studies have shown that the speed of page loading affects users' perception of the quality and credibility of a site's content.



    Obviously these are two attributes of any public broadcaster’s "brand", and so they should be reflected on the web site as well as on air.



    A lot of New Zealanders are still on dial-up too, so we don’t want a slow server at our end to contribute to their waiting time.


  3. Handling of peaks



    We are publishing content in near real-time that reflects material broadcast on our two networks. We often refer to additional material being available on the web site.



    Word of mouth can have a big impact on traffic. A few months back we had an interview in the afternoon about Warcraft. Within a few hours that content had been accessed 10,000 times.



    We often have unique content that may be of niche (or general) interest. You have to allow for word of mouth!


  4. Performance Issues



    We do quite a lot of asset listing and nested asset lists, and this puts more load on the servers than just standard pages. Remember too that each page generates hits for style sheets and images. The figure quoted (650K) is complete pages delivered, not server hits. If we look at server hits it is in the region of 1 million per DAY.



    Streaming



    The streaming is carried out from a cluster of separate servers distributed across the internet. Three of these are in New Zealand (located at NZ Internet exchanges) and one in the USA. The servers are optimised for streaming traffic only.



    The servers are managed by Citylink and, surprisingly, also carry streaming traffic for our two biggest competitors in NZ. This may seem odd, but sharing infrastructure like this is quite common in NZ; the benefits are lower hardware costs and lower bandwidth costs, thanks to bandwidth being bought in bulk.



    The cost is significantly lower than running the streaming on the same servers.



    By optimising each part of our content delivery systems, we ensure that we have the best possible performance for the type of service.



    It also means that streaming continues even if we cannot serve pages - even if for a few minutes. Quite a lot of people listen to the streams for hours at a time, so this is quite useful. Also, during the hardware problem mentioned above, we were still able to serve a static page with links to our live streams.



    cheers

    Richard
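The hits-versus-impressions distinction above is worth making concrete. Here is a quick back-of-envelope check on the quoted figures, assuming a 30-day month and evenly spread traffic (real traffic of course peaks much higher):

```python
# Rough arithmetic on the figures quoted above: ~650K page impressions per
# month vs ~1 million server hits per DAY.
pages_per_month = 650_000
hits_per_day = 1_000_000
seconds_per_day = 24 * 60 * 60  # 86,400

# Average request rate if traffic were spread evenly (it isn't).
avg_hits_per_sec = hits_per_day / seconds_per_day

# Implied hits per page (HTML + stylesheets + images), assuming a 30-day month.
hits_per_page = hits_per_day * 30 / pages_per_month

print(round(avg_hits_per_sec, 1))  # ~11.6 requests/second on average
print(round(hits_per_page))        # ~46 hits per page impression
```

So even the "modest" monthly impression count implies a double-digit sustained request rate, before allowing for the broadcast-driven spikes Richard describes.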

Ah - thanks for the response

By the way - what sort of caching do you do? Is it via squid, do you use a hardware cache, are only parts of the page cached?

I am interested in knowing whether using a Squid server is necessary for us, or if the Matrix cache will work just fine. We peak at about 10,000 visitors per day, so it is not a huge load. Do you think that using the cache is still a good idea, even if it is just the Matrix cache?

[quote]I am interested in knowing whether using a Squid server is necessary for us, or if the Matrix cache will work just fine. We peak at about 10,000 visitors per day, so it is not a huge load. Do you think that using the cache is still a good idea, even if it is just the Matrix cache?
[/quote]



You should always, always, always have the Matrix cache enabled, at least for Public-level caching. There is absolutely no production scenario in which having the Matrix cache disabled is a good idea. :)



As for whether you need a Squid cache or not, that’s entirely up to you and your own performance testing to determine. You need to work out whether Matrix (and its internal caching) can handle peak load sufficiently on your hardware with your configuration.
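As a rough starting point for that sizing exercise, here is a back-of-envelope sketch for a site like the one described (10,000 visitors/day). Pages per visit, hits per page and the peak factor are assumed figures, not measurements; substitute your own:

```python
# Hypothetical capacity estimate for ~10,000 visitors/day. All multipliers
# below are assumptions, to be replaced with measured numbers from your logs.
visitors_per_day = 10_000
pages_per_visit = 5      # assumed
hits_per_page = 40       # assumed: page plus stylesheets and images
peak_factor = 10         # assumed ratio of peak rate to daily average

hits_per_day = visitors_per_day * pages_per_visit * hits_per_page
avg_req_per_sec = hits_per_day / 86_400
peak_req_per_sec = avg_req_per_sec * peak_factor

print(round(avg_req_per_sec))   # ~23 requests/second average
print(round(peak_req_per_sec))  # ~231 requests/second at assumed peaks
```

If the peak figure comfortably exceeds what Matrix alone sustains in your own tests, that is the point at which a Squid front end starts to pay for itself.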

[quote]By the way - what sort of caching do you do? Is it via squid, do you use a hardware cache, are only parts of the page cached?
[/quote]



Radio NZ use Squid caching on a dedicated server. Entire pages are cached, including images and CSS.
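For anyone wanting to try the same arrangement, a minimal squid.conf reverse-proxy ("accelerator") fragment looks roughly like this. The hostname and origin address are placeholders, not RNZ's actual configuration, and the syntax shown is Squid 2.6+ style:

```
# Hypothetical squid.conf fragment for a reverse (accelerator) proxy in
# front of a Matrix web server. Hostname and IP are placeholders.
http_port 80 accel defaultsite=www.example.org
cache_peer 192.0.2.10 parent 80 0 no-query originserver name=matrix
acl our_site dstdomain www.example.org
http_access allow our_site
cache_peer_access matrix allow our_site
```

Squid then answers public requests from its cache where it can, and forwards misses to the Matrix origin server.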

What is the math that can be used to figure out the number of cache buckets to use?


Also, for pages that I don't want cached, do I just set that in the page settings, or should I use the _nocache flag? I am just trying to figure out how everyone deals with caching pages that have items that might change often.

You shouldn't need to worry about tuning cache buckets on your site – it only really becomes an issue on very, very large sites (500,000+ assets).


The _nocache flag is usually just for authors/editors, so that they see the latest version of the page. If you are concerned about the currency of content after the editing process, consider setting up triggers to automatically clear the cache once a page is made live.

We have added the Matrix cache to our site, but I am still worried about performance and speed. The Squiz tech who installed our system said that our database performance was very good, but when doing a comparison, it still seems like our site, and other sites running Matrix, are slower at serving pages. Is there something that can be done about this? Or is it just my imagination?


Here is the site I used to test this.

Keep in mind that both the Squiz site and the Radio NZ sites are using Squid caching servers, so they will be faster than native Matrix.

[quote]Keep in mind that both the Squiz site and the Radio NZ sites are using Squid caching servers, so they will be faster than native Matrix.
[/quote]



So I assume you're running a reverse proxy (Squid or other?) with cache levels set, in front of your Matrix installation?

[quote]So I assume you're running a reverse proxy (Squid or other?) with cache levels set, in front of your Matrix installation?
[/quote]



Yes, we’re running Squid as a reverse caching proxy in front of www.squiz.net and matrix.squiz.net at the moment. There was a risk of being Slashdotted a few weeks ago (which didn’t happen, sadly), so we thought we should prepare!