Cutting and pasting Word documents using a web page & JS [intermediate+]


(Nick Cowie) #1

Do your authors write in Word and want to paste directly into Matrix?


Well, this requires a few extra steps:

1: paste the Word content into another web page (code below)

2: press a button (this is for users on IE; there is a workaround for proper browsers)

3: press the code view button in the editor to switch to code view

4: paste the cleaned text into the editor

5: press the code view button in the editor again to switch back


    
</script>
<style>
#textToClean, #cleanedText {border: 3px solid #999; margin: 1em; padding: 1em;}
</style>
</head>
<body onLoad="makeEditable()">
<p>Paste your content in the box below, then <button type="button" onClick="fixWord()" id="cleanUpButton" >press this button</button> to process the text and place the cleaned-up HTML in your clipboard.</p>
<h2>Word Text</h2>
<div id="textToClean">
<p>Paste text in here</p>
</div>
<h2>Processed Text</h2>
<div id="cleanedText">
<p>Once the text is processed you will see a preview here. You could cut and paste from here, but Internet Explorer misbehaves.</p></pre>
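
The script part of the page is cut off above, so the body of `fixWord()` is not visible. As a rough sketch only (not the original code), a regex-based clean-up along these lines could strip the most common Word markup from the pasted HTML. The function name `cleanWordHtml` and the choice of patterns are assumptions made for illustration, and regexes are a blunt tool: a pattern such as the attribute one will also touch matching text outside of tags.

```javascript
// Hypothetical helper, NOT the original fixWord() from this post.
// Takes the HTML string produced by pasting from Word and returns a leaner version.
function cleanWordHtml(html) {
  return html
    // Drop comments, including Word's <!--[if gte mso 9]> ... <![endif]--> blocks.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Drop <![if !supportLists]> / <![endif]> markers but keep what they wrap.
    .replace(/<!\[(?:if|endif)[^\]]*\]>/gi, "")
    // Drop <style>, <xml> and <script> blocks together with their contents.
    .replace(/<(style|xml|script)\b[\s\S]*?<\/\1>/gi, "")
    // Drop <meta> and <link> tags.
    .replace(/<\/?(?:meta|link)\b[^>]*>/gi, "")
    // Drop Office namespaced tags such as <o:p> or <w:WordDocument>, keeping inner text.
    .replace(/<\/?[a-z][a-z0-9]*:[^>]*>/gi, "")
    // Drop class, style and lang attributes (quoted or unquoted values).
    .replace(/\s+(?:class|style|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi, "")
    // Unwrap <span> and <font>, keeping their contents.
    .replace(/<\/?(?:span|font)\b[^>]*>/gi, "")
    // Remove paragraphs that contain only whitespace or &nbsp;.
    .replace(/<p>(?:\s|&nbsp;)*<\/p>/gi, "")
    .trim();
}

// Example input resembling what Word puts on the clipboard.
const sample =
  '<p class="MsoNormal" style="margin:0cm"><span lang="EN-AU">Hello <b>world</b>' +
  '<o:p></o:p></span></p><p class=MsoNormal><o:p>&nbsp;</o:p></p>';

console.log(cleanWordHtml(sample)); // prints: <p>Hello <b>world</b></p>
```

In a page like the one above, the result would be written into the `cleanedText` div for preview; how it reaches the clipboard depends on the browser.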

(Pw) #2

This functionality is nice, but sadly it doesn't work on IE8.


The MySource Matrix WYSIWYG editor has a "Replace text" tool with some options for cleaning Word content. When I tested it, the results were not good enough (some strange Word elements remained after cleaning).



Simple edit users just love to copy from Word into the CMS. I will probably use this functionality in some form for my simple edit users.



Thanks, nickobec


(Jim Smith) #3

Hm, sorry to ask, but why not just copy into Notepad? It only requires a tiny bit of re-styling.


Or, if possible, use Dreamweaver, which generally cleans all the bad styles out…


(Aleks Bochniak) #4

HTML tidy?


(Rhulse) #5

[quote]
HTML tidy?

[/quote]

We’ve been cleaning/parsing Word content for years.



The approach we now take is to pre-clean Word or email content before pasting it into Matrix.



We have a form with a drop down where you select the type of content (the software can format certain types of content).
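
As a rough illustration of the drop-down idea (this is not the code referred to in this post), the selected content type can simply pick a formatter function from a lookup table. The type names `generic` and `email`, and the helpers `toParagraphs` and `preformat`, are hypothetical; the real form's options are not listed here.

```javascript
// Hypothetical sketch: pick a pre-formatter based on the drop-down value.

// Escape characters that would otherwise be read as HTML.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Turn blank-line-separated plain text into <p> elements.
function toParagraphs(text) {
  return text
    .split(/\n\s*\n/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => "<p>" + escapeHtml(block).replace(/\s*\n\s*/g, " ") + "</p>")
    .join("\n");
}

// One formatter per content type (type names are assumptions).
const formatters = {
  generic: toParagraphs,
  // For email, drop quoted reply lines (those starting with ">") first.
  email: (text) =>
    toParagraphs(
      text
        .split("\n")
        .filter((line) => !line.trimStart().startsWith(">"))
        .join("\n")
    ),
};

// Fall back to the generic formatter for unknown types.
function preformat(contentType, text) {
  const known = Object.prototype.hasOwnProperty.call(formatters, contentType);
  return (known ? formatters[contentType] : formatters.generic)(text);
}

console.log(preformat("email", "Hi team,\nsee below.\n\n> old quoted line\n\nThanks & bye"));
// prints:
// <p>Hi team, see below.</p>
// <p>Thanks &amp; bye</p>
```

Keeping the formatters in a plain object means a new content type is one extra entry, and the form's drop-down values can be generated from the object's keys.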



Here is the code we use, if anyone else is interested. There is also code for doing contextual pre-formatting of known content types (from Word).



The form uses the CK Editor plugin. The latest version of this does a pretty good job of cleaning up Word HTML on its own.



This code saves us hours of time every week, and it is MIT-licensed, so feel free to grab it and use it.





Cheers,



Richard