We have just gone live internally with our new Matrix site and our staff have already managed to circumvent restrictions I placed on the simple edit interface.
I have turned off font editing in the simple edit interface, so they cannot change font colours, sizes etc. from there. However, if they type something in a Word document and then paste it into the simple edit interface, any font changes they made in Word get carried through into Matrix.
Is there any way that Matrix can be told to remove this stuff, or not accept it in the first place? Some of our staff have interesting ideas about what looks good.
Do you have HTMLTidy enabled? It should be removing font tags and cleaning MS Word tags from the content.
It is enabled, but the font stuff is getting pasted in as styled spans e.g.
Point source discharge activities
The html_tidy.inc file lives at: <mysource_home>/fudge/wysiwyg/plugins/htmltidy/
At Griffith we have used this approach to clean up HTML above and beyond what the default Matrix install does. I hope this gives you some ideas as to how to use Tidy to suit your needs.
Cheers,
Anthony :lol:
Hi Anthony,
That will do the job nicely.
Thanks
Ryan
In case anyone is looking at doing something similar in the future, here is what I have added so far.
$html = preg_replace('|<([\w]+)([^>]+?)style="([^"]+)?"([^>]+)?>|is', '<\\1\\2\\4>', $html); // Remove all inline styles
$html = preg_replace('|<([\w]+)([^>]+?)xml:lang="([^"]+)?"([^>]+)?>|is', '<\\1\\2\\4>', $html); // Remove language stuff added by MS Word
$html = preg_replace('|<([\w]+)([^>]+?)lang="([^"]+)?"([^>]+)?>|is', '<\\1\\2\\4>', $html); // Remove language stuff added by MS Word
$html = preg_replace('|(&nbsp;)+|is', '&nbsp;', $html); // Trim multiple non-breaking spaces down to just one
$html = preg_replace('|<span([^>]*)>(.+?)</span>|is', '\\2', $html); // Remove redundant spans (the tags in this pattern were lost from the post; assumed form)
$html = preg_replace('|<p>&nbsp;</p>|is', '', $html); // Remove spacer paragraphs (the tags in this pattern were lost from the post; assumed form)
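If you want to try the attribute rules outside Matrix first, here is a minimal standalone sketch. The function name strip_word_attributes() and the sample markup are made up for illustration; only the three preg_replace() patterns are taken from the rules above, and in Matrix they would sit in html_tidy.inc rather than in a separate function.

```php
<?php
// Hypothetical helper (not part of Matrix): applies the three
// attribute-stripping rules from this thread to a string of HTML.
function strip_word_attributes($html)
{
    // Remove inline style attributes
    $html = preg_replace('|<([\w]+)([^>]+?)style="([^"]+)?"([^>]+)?>|is', '<\\1\\2\\4>', $html);
    // Remove xml:lang attributes added by MS Word
    $html = preg_replace('|<([\w]+)([^>]+?)xml:lang="([^"]+)?"([^>]+)?>|is', '<\\1\\2\\4>', $html);
    // Remove lang attributes added by MS Word
    $html = preg_replace('|<([\w]+)([^>]+?)lang="([^"]+)?"([^>]+)?>|is', '<\\1\\2\\4>', $html);
    return $html;
}

// Made-up sample of what a paste from Word might look like
$pasted = '<p><span style="font-family: Arial; color: red" lang="EN-NZ">Point source discharge activities</span></p>';
echo strip_word_attributes($pasted), "\n";
```

On this sample the span loses both its style and lang attributes but the span tag itself stays (with some leftover whitespace inside it), so a later rule or Tidy pass still has to deal with the empty wrapper. Note also that each rule strips one matching attribute per tag per call.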
It is not yet leaving 100% clean code, but all the style info is being stripped out and most of the screwy Word things are removed most of the time.
The biggest issue remaining that I have spotted is that it will leave empty paragraphs around sometimes, but on the second pass through HTMLTidy it will remove them. I will get back to this issue at some point.
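One possible way to avoid relying on a second save is to repeat the cleanup until the output stops changing. This is only a sketch of that idea, not how HTMLTidy or Matrix actually work: clean_until_stable() and remove_empty_elements() are hypothetical names, and the stand-in rule just shows how removing an empty span on one pass can leave an empty paragraph that only the next pass catches.

```php
<?php
// Hypothetical helper: re-runs a cleanup function until its output
// stops changing, or until $max_passes is reached.
function clean_until_stable($html, $clean, $max_passes = 5)
{
    for ($pass = 0; $pass < $max_passes; $pass++) {
        $cleaned = call_user_func($clean, $html);
        if ($cleaned === $html) {
            break; // Nothing changed on this pass, so we are done
        }
        $html = $cleaned;
    }
    return $html;
}

// Stand-in cleanup rule for the demo: removes paragraphs and spans
// that contain nothing but whitespace. One call removes one nesting level.
function remove_empty_elements($html)
{
    return preg_replace('#<(p|span)>\s*</\1>#i', '', $html);
}

// Made-up input: the second paragraph only becomes empty after its span is removed
$messy = '<p>Keep me</p><p><span> </span></p>';
echo clean_until_stable($messy, 'remove_empty_elements'), "\n";
```

The pass limit is there so a rule that keeps changing its own output cannot loop forever.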
One thing to watch out for - different versions of Word generate different HTML.
Yeah, good call... The management of what users dump in their pages is an ongoing process that requires periodic review and adjustments (or at least that is our approach so far). Nice that Matrix allows for this management though (via HTML Tidy for example) :) I am always pleasantly surprised by this product :)
A
That is why I went for more generic rules. Even Word should have trouble getting around those. Stripping ALL inline styles from every element is a pretty big club to hit it with.
And I am also liking the flexibility, there are only a couple of things I have wanted to do that it hasn't let me yet.