Htmleditor override to remove office document markup from pasted text

28 Sep 2009, 10:35 AM
99.9% of this was borrowed from VinylFox's HtmlEditor plugins so that I could use it with ExtJS 2.1. Here's his thread for ExtJS 3: http://www.extjs.com/forum/showthread.php?t=72106

This does essentially what his Word paste plugin does, but without the button to turn it on and off.

Ext.override(Ext.form.HtmlEditor, {
    cleanHtml : function(html) {
        // remove Microsoft gibberish using regex gibberish
        // (single-quoted attribute patterns stop at the next ' rather than at a ")
        var removals = [/&nbsp;/ig, /[\r\n]/g, /<(xml|style)[^>]*>.*?<\/\1>/ig, /<\/?(meta|object|span)[^>]*>/ig, /<\/?[A-Z0-9]*:[A-Z]*[^>]*>/ig,
            /(lang|class|type|href|name|title|id|clear)=\"[^\"]*\"/ig, /style=(\'\'|\"\")/ig, /<![\[-].*?-*>/g, /MsoNormal/g, /<\\?\?xml[^>]*>/g, /<\/?o:p[^>]*>/g, /<\/?v:[^>]*>/g, /<\/?o:[^>]*>/g,
            /<\/?st1:[^>]*>/g, /&nbsp;/g, /<\/?SPAN[^>]*>/g, /<\/?FONT[^>]*>/g, /<\/?STRONG[^>]*>/g, /<\/?H1[^>]*>/g, /<\/?H2[^>]*>/g, /<\/?H3[^>]*>/g, /<\/?H4[^>]*>/g, /<\/?H5[^>]*>/g, /<\/?H6[^>]*>/g,
            /<\/?P[^>]*><\/P>/g, /<!--(.*)-->/g, /<!--(.*)>/g, /<!(.*)-->/g, /<\\?\?xml[^>]*>/g, /<\/?o:p[^>]*>/g, /<\/?v:[^>]*>/g, /<\/?o:[^>]*>/g, /<\/?st1:[^>]*>/g, /style=\"[^\"]*\"/g,
            /style=\'[^\']*\'/g, /lang=\"[^\"]*\"/g, /lang=\'[^\']*\'/g, /class=\"[^\"]*\"/g, /class=\'[^\']*\'/g, /type=\"[^\"]*\"/g, /type=\'[^\']*\'/g, /href=\'#[^\']*\'/g, /href=\"#[^\"]*\"/g,
            /name=\"[^\"]*\"/g, /name=\'[^\']*\'/g, / clear=\"all\"/g, /id=\"[^\"]*\"/g, /title=\"[^\"]*\"/g, /<span[^>]*>/g, /<\/?span[^>]*>/g, /class=/g];

        // apply each removal pattern in turn
        Ext.each(removals, function(s) {
            html = html.replace(s, "");
        });

        // turn the divs into paragraphs
        html = html.replace(/<div[^>]*>/g, "<p>");
        html = html.replace(/<\/div[^>]*>/g, "</p>");
        return html;
    }
});
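
For anyone who wants to see what the cleanup does without loading ExtJS, here is a rough standalone sketch of the same idea. It uses only a handful of the patterns from the list. The function name stripOfficeMarkup and the sample markup are made up for illustration and are not part of the override or of VinylFox's plugin.

```javascript
// Hypothetical standalone helper (assumption: not part of ExtJS or the override).
// It applies a small subset of the removal patterns, then maps divs to paragraphs.
function stripOfficeMarkup(html) {
    var removals = [
        /[\r\n]/g,                                                     // line breaks
        /<\/?(meta|object|span)[^>]*>/ig,                              // wrapper tags
        /<\/?o:p[^>]*>/g,                                              // Office <o:p> tags
        /(lang|class|type|href|name|title|id|clear)=\"[^\"]*\"/ig,     // double-quoted attributes
        /style=\"[^\"]*\"/g                                            // inline styles
    ];
    for (var i = 0; i < removals.length; i++) {
        html = html.replace(removals[i], "");
    }
    // turn the divs into paragraphs
    html = html.replace(/<div[^>]*>/g, "<p>");
    html = html.replace(/<\/div[^>]*>/g, "</p>");
    return html;
}

// Made-up sample resembling markup pasted from Word.
var pasted = '<div class="MsoNormal" style="margin:0"><span lang="EN-US">Hello<o:p></o:p></span></div>\r\n';
var cleaned = stripOfficeMarkup(pasted);
console.log(cleaned); // <p>Hello</p>
```

The order matters a little: tags are stripped before attributes, and the div replacement runs last so that leftover whitespace inside the opening div tag disappears with it.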


28 Sep 2009, 10:44 AM
I think the only thing you need to make my plugin set work with 2.x is the isObject method of the Ext class. There is an override around for this.
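
A minimal sketch of what such an override might look like, assuming isObject only needs to tell plain objects apart from arrays, strings and null. This is a guess at the behaviour, not the override referred to above. ExtNs is a made-up local name that points at the real Ext object when the page has one.

```javascript
// Assumption: in a page with ExtJS 2.x loaded, window.Ext exists and is used as-is.
// Outside a browser, an empty stand-in object lets the sketch run on its own.
var ExtNs = (typeof window !== "undefined" && window.Ext) || {};

// Only add the method if the loaded Ext version does not already provide it.
if (typeof ExtNs.isObject !== "function") {
    ExtNs.isObject = function (v) {
        // true for plain objects; false for arrays, strings, numbers, null, undefined
        return !!v && Object.prototype.toString.call(v) === "[object Object]";
    };
}

console.log(ExtNs.isObject({ a: 1 })); // true
console.log(ExtNs.isObject([1, 2]));   // false
```

Load it after ext-all.js and before the plugin files so the method exists by the time the plugins are defined.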