Hack 97 Remove Direct Formatting with XSLT

Strip out non-style-based formatting from Word documents.

A common "cleanup" task in Word is to remove any formatting from a document that hasn't been applied with a style. It's a bit of a chore within Word, but it turns out to be remarkably concise in XSLT.

10.9.1 The Code

Enter the following code in a standard text editor such as Notepad and save it as removeDirectFormatting.xsl:

<xsl:stylesheet version="1.0"

As in [Hack #96], this hack uses an XSLT identity transformation. The first template rule copies all nodes that don't trigger the other two template rules. The other two are both empty, which means that any node matching them is stripped from the document.

There are two contexts in which you want to exclude elements: inside the w:pPr and w:rPr elements, particularly where they occur as children of w:p and w:r elements, respectively. Child elements of w:pPr and w:rPr set various formatting properties. There is one special child of each, however: the w:pStyle and w:rStyle elements are used not to apply direct formatting, but rather to associate the current paragraph or run with a paragraph or character style, respectively. Thus, these template rules are careful to avoid stripping out the w:pStyle and w:rStyle elements.

10.9.2 Running the Hack

To run this hack on a document named formatted.xml located in the same folder as the removeDirectFormatting.xsl stylesheet, type the following at a DOS command prompt in the same folder:

>msxsl formatted.xml removeDirectFormatting.xsl -o no-direct-formatting.xml

Evan Lenz
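
Section 10.9.1 describes three template rules: an identity rule and two empty rules that spare w:pStyle and w:rStyle. A minimal sketch of a stylesheet fitting that description might look like the following. It is an illustration inferred from the prose, not the author's listing: the match patterns are assumptions, and the w prefix is assumed to be bound to the Word 2003 WordprocessingML namespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

  <!-- Identity rule: copy every attribute and node that no other rule matches -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed pattern: drop each child of a paragraph's w:pPr except w:pStyle -->
  <xsl:template match="w:p/w:pPr/*[not(self::w:pStyle)]"/>

  <!-- Assumed pattern: drop each child of a run's w:rPr except w:rStyle -->
  <xsl:template match="w:r/w:rPr/*[not(self::w:rStyle)]"/>

</xsl:stylesheet>
```

In this sketch the two empty rules take precedence over the identity rule without an explicit priority attribute, because a pattern with a predicate has a default priority of 0.5 in XSLT 1.0, while node() and @* have -0.5.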