Hi there, we are currently trying to write a standalone script that recursively scans our existing web content, forces a clean (among other things), and converts it to XHTML output.
I currently have the options below set for Tidy, but I would like to know what others I should include so that the script does not fall over on FrontPage-created pages.
Basically I want to force it to clean the content and overwrite the current files, so that when we cut and paste into Matrix, some of the crap has already been eliminated.
Any additional parameters would be appreciated.
Cheers
output-xhtml: yes
force-output: yes
word-2000: yes
indent: auto
newline: LF
clean: yes
drop-font-tags: yes
drop-proprietary-attributes: yes
fix-backslash: yes
fix-uri: yes
write-back: yes
indent-spaces: 2
wrap: 72
markup: yes
output-xml: no
input-xml: no
show-warnings: yes
numeric-entities: yes
quote-marks: yes
quote-nbsp: yes
quote-ampersand: yes
break-before-br: no
uppercase-tags: no
uppercase-attributes: no
char-encoding: latin1
new-inline-tags: cfif, cfelse, math, mroot,
  mrow, mi, mn, mo, msqrt, mfrac, msubsup, munderover,
  munder, mover, mmultiscripts, msup, msub, mtext,
  mprescripts, mtable, mtr, mtd, mth
new-blocklevel-tags: cfoutput, cfquery
new-empty-tags: cfelse
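
For the recursive part, here is a rough sketch of how the wrapper script might look in Python. It is only an illustration and rests on a few assumptions: the `tidy` binary is on the PATH, the options above are saved in a file called `tidy.conf` (a made-up name), and only `.htm`/`.html` files should be touched. The helper names `find_html_files`, `build_command` and `tidy_tree` are hypothetical. It relies on Tidy's standard `-config` and `-m` (modify in place) flags and on its exit status of 2 meaning errors were found.

```python
import os
import subprocess
import sys

# Assumption: only these extensions are treated as pages to clean.
HTML_EXTENSIONS = (".htm", ".html")


def find_html_files(root, extensions=HTML_EXTENSIONS):
    """Walk root recursively and return a sorted list of matching file paths."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def build_command(path, config="tidy.conf"):
    """Build the tidy command line for one file.

    -config loads the option file; -m rewrites the file in place
    (redundant with write-back: yes, but harmless).
    """
    return ["tidy", "-config", config, "-m", path]


def tidy_tree(root, config="tidy.conf"):
    """Run tidy on every page under root; return the files it reported errors for."""
    failed = []
    for path in find_html_files(root):
        result = subprocess.run(build_command(path, config), capture_output=True)
        # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
        if result.returncode >= 2:
            failed.append(path)
    return failed


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in tidy_tree(root):
        print("tidy reported errors:", path)
```

Because the files are overwritten, it is probably worth running this against a copy of the site first. With `force-output: yes` Tidy still writes output for files it reports errors on, so the list printed at the end is the set of pages to check by hand.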

