public final class NormalizeSpace extends XMLFilterImpl2
Removes whitespace in PCDATA and attributes in an XML file. Whitespace is defined as:
- space (#x20 hex)
- tab (#x9 hex or \t escaped Java character)
- line feed (#xA hex or \n escaped Java character)
- carriage return (#xD hex or \r escaped Java character)
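As an illustration of the definition above, the following minimal sketch tests for exactly those four characters. The class and method names (`XmlWhitespace`, `isXmlWhitespace`, `isWhitespaceOnly`) are hypothetical and are not part of the DeltaXML API; the real filter's internals may differ.

```java
public final class XmlWhitespace {
    private XmlWhitespace() {}

    /** True only for #x20, #x9, #xA and #xD (hypothetical helper). */
    public static boolean isXmlWhitespace(char c) {
        return c == 0x20 || c == 0x9 || c == 0xA || c == 0xD;
    }

    /** True if the text is empty or contains nothing but those four characters. */
    public static boolean isWhitespaceOnly(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (!isXmlWhitespace(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Note that `Character.isWhitespace` is not equivalent: it also accepts characters such as form feed, which fall outside the list above.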
To preserve whitespace in a specific section of the XML file, the xml:space attribute should be included in an
element and set to preserve. This attribute applies to a complete subtree beneath that element unless overridden
by another xml:space attribute at a lower level.
WARNING: We have found this attribute to successfully preserve space with the Saxon processor, but not with
Xalan-J. Therefore we strongly recommend the use of Saxon when any input data uses xml:space attributes.
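The subtree scoping of xml:space described above can be tracked with a stack while SAX events are processed. The sketch below is one possible way to do this, not the code used by NormalizeSpace; `XmlSpaceScope` and its methods are hypothetical names, and only the standard `org.xml.sax.Attributes` interface is assumed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;

public final class XmlSpaceScope {
    // One entry per open element: true if whitespace is preserved inside it.
    private final Deque<Boolean> stack = new ArrayDeque<>();

    /** Call on each start tag; returns whether space is preserved within the element. */
    public boolean enter(Attributes atts) {
        boolean inherited = preserving();
        String value = atts.getValue("xml:space"); // lookup by qualified name
        // No attribute: inherit from the parent. Otherwise the local value wins.
        boolean preserve = (value == null) ? inherited : "preserve".equals(value);
        stack.push(preserve);
        return preserve;
    }

    /** Call on each end tag to restore the enclosing element's setting. */
    public void leave() {
        stack.pop();
    }

    /** Setting of the innermost open element; false when no element is open. */
    public boolean preserving() {
        return !stack.isEmpty() && stack.peek();
    }
}
```

A filter built this way would call `enter` from `startElement`, `leave` from `endElement`, and consult `preserving` before discarding any text.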
By default, whitespace between elements is removed. In the following example, the space between the bold and italic words would
be deleted:
<para>some text <bold>bold words</bold> <italic>italic words</italic></para>
To change this behaviour, add the attribute deltaxml:mixed-content="true" at the paragraph level; the
whitespace is then normalised rather than removed. N.B. This attribute must be added at every level where it
is required; it is not inherited from parent elements.
NormalizeSpace should be used as the first filter in a pipeline to ensure that subsequent filters do not regard
whitespace as significant. It should be applied to either both input files or neither of them. Applying it to a single input
file may have an undesired effect.
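The placement rule above (first filter after the parser, one instance per input) can be sketched with the standard SAX filter API. To keep the block runnable with only the JDK, it uses `StandInFilter`, a hypothetical stand-in that merely drops whitespace-only text; it is not the DeltaXML implementation, and `PipelineSketch`, `Recorder` and `run` are likewise illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public final class PipelineSketch {
    private PipelineSketch() {}

    /** Hypothetical stand-in for NormalizeSpace: drops whitespace-only text events. */
    static final class StandInFilter extends XMLFilterImpl {
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            for (int i = start; i < start + length; i++) {
                char c = ch[i];
                if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                    super.characters(ch, start, length); // contains real text: pass on
                    return;
                }
            }
            // Whitespace-only: swallow the event.
        }
    }

    /** Records the events that reach the end of the pipeline as a string. */
    static final class Recorder extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    /** Runs one input through its own parser and its own filter instance. */
    public static String run(String xml) {
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            StandInFilter filter = new StandInFilter(); // never shared between inputs
            filter.setParent(parser);                   // first filter after the parser
            Recorder recorder = new Recorder();
            filter.setContentHandler(recorder);         // later filters would attach here
            filter.parse(new InputSource(new StringReader(xml)));
            return recorder.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `run` once per input gives each document its own filter instance, so two inputs that differ only in inter-element whitespace produce the same event stream for whatever follows.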
Note: This class has not been designed to be extended; to err on the side of caution, it has been declared final.
| Constructor and Description |
|---|
| NormalizeSpace() Creates a new instance of NormalizeSpace. |
| Modifier and Type | Method and Description |
|---|---|
| void | characters(char[] ch, int start, int length) Overrides the default characters method. |
| void | elementDecl(java.lang.String name, java.lang.String model) Overrides the default elementDecl method. |
| void | endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName) Overrides the default endElement method. |
| void | ignorableWhitespace(char[] ch, int start, int length) Overrides the default ignorableWhitespace method. |
| void | setnormalizeAttValues(java.lang.String value) Specifies whether to normalize attribute values. |
| void | startDocument() Overrides the default startDocument method. |
| void | startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts) Overrides the default startElement method. |
| void | startPrefixMapping(java.lang.String prefix, java.lang.String uri) Overrides the default startPrefixMapping method. |
Methods inherited from class XMLFilterImpl2: attributeDecl, comment, endCDATA, endDTD, endEntity, externalEntityDecl, getProperty, internalEntityDecl, parse, parse, setProperty, startCDATA, startDTD, startEntity
Methods inherited from class org.xml.sax.helpers.XMLFilterImpl: endDocument, endPrefixMapping, error, fatalError, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getParent, notationDecl, processingInstruction, resolveEntity, setContentHandler, setDocumentLocator, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setParent, skippedEntity, unparsedEntityDecl, warning
public NormalizeSpace()
Creates a new instance of NormalizeSpace. An instance cannot be shared amongst pipelines: it can only receive
SAX event inputs from a single source and send events to a single output. Two instances of this class are therefore typically
required when used in conjunction with the DeltaXML XMLComparator.
public void setnormalizeAttValues(java.lang.String value)
Specifies whether to normalize attribute values.
Parameters:
value - whether to normalize attribute values
public void elementDecl(java.lang.String name,
java.lang.String model)
throws org.xml.sax.SAXException
Overrides the default elementDecl method.
Specified by: elementDecl in interface org.xml.sax.ext.DeclHandler
Overrides: elementDecl in class XMLFilterImpl2
Parameters:
name - the element type name
model - the content model as a normalized string
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
See Also: DeclHandler.elementDecl(String, String)
public void startPrefixMapping(java.lang.String prefix,
java.lang.String uri)
throws org.xml.sax.SAXException
Overrides the default startPrefixMapping method.
Specified by: startPrefixMapping in interface org.xml.sax.ContentHandler
Overrides: startPrefixMapping in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
prefix - the namespace prefix
uri - the namespace URI
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
See Also: XMLFilterImpl.startPrefixMapping(String, String)
public void startDocument()
throws org.xml.sax.SAXException
Overrides the default startDocument method. This method performs internal operations.
Specified by: startDocument in interface org.xml.sax.ContentHandler
Overrides: startDocument in class org.xml.sax.helpers.XMLFilterImpl
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
public void startElement(java.lang.String uri,
java.lang.String localName,
java.lang.String qName,
org.xml.sax.Attributes atts)
throws org.xml.sax.SAXException
Overrides the default startElement method. This method performs internal operations.
Specified by: startElement in interface org.xml.sax.ContentHandler
Overrides: startElement in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
uri - the element's namespace URI
localName - the element's local name
qName - the element's qualified name
atts - the element's attributes
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
See Also: XMLFilterImpl.startElement(String, String, String, Attributes)
public void endElement(java.lang.String uri,
java.lang.String localName,
java.lang.String qName)
throws org.xml.sax.SAXException
Overrides the default endElement method. This method performs internal operations.
Specified by: endElement in interface org.xml.sax.ContentHandler
Overrides: endElement in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
uri - the element's namespace URI
localName - the element's local name
qName - the element's qualified name
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
See Also: XMLFilterImpl.endElement(String, String, String)
public void characters(char[] ch,
int start,
int length)
throws org.xml.sax.SAXException
Overrides the default characters method. This version of the method removes whitespace from PCDATA within the
XML file unless the xml:space attribute is set to preserve.
Specified by: characters in interface org.xml.sax.ContentHandler
Overrides: characters in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
ch - an array of characters
start - the starting position in the array
length - the number of characters to use from the array
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
See Also: XMLFilterImpl.characters(char[], int, int)
public void ignorableWhitespace(char[] ch,
int start,
int length)
throws org.xml.sax.SAXException
Overrides the default ignorableWhitespace method. When a DTD is present in the XML file, inter-element
whitespace causes an ignorableWhitespace SAX event to occur rather than the normal characters SAX event. This method ensures
that such space is removed should a DTD be present.
Specified by: ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides: ignorableWhitespace in class org.xml.sax.helpers.XMLFilterImpl
Parameters:
ch - an array of characters
start - the starting position in the array
length - the number of characters to use from the array
Throws: org.xml.sax.SAXException - the superclass may throw an exception during processing.
See Also: XMLFilterImpl.ignorableWhitespace(char[], int, int)
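
To see why both characters and ignorableWhitespace need handling, the following probe counts how many whitespace characters arrive through each callback. It is a sketch using only the JDK's SAX parser; `WhitespaceEventProbe` and `probe` are hypothetical names. Without a DTD nothing can be reported as ignorable; with a DTD that declares element-only content, a parser that reads the content model may route the same whitespace through ignorableWhitespace instead, though the exact split is parser-dependent.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public final class WhitespaceEventProbe extends DefaultHandler {
    /** Whitespace characters delivered through characters(). */
    public int viaCharacters;
    /** Characters delivered through ignorableWhitespace(). */
    public int viaIgnorable;

    @Override
    public void characters(char[] ch, int start, int length) {
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                viaCharacters++;
            }
        }
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        viaIgnorable += length;
    }

    /** Parses the given document and returns the populated probe. */
    public static WhitespaceEventProbe probe(String xml) {
        try {
            WhitespaceEventProbe p = new WhitespaceEventProbe();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), p);
            return p;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whichever callback the parser chooses, the total amount of inter-element whitespace is the same, so a filter that handles only one of the two callbacks would behave differently depending on whether a DTD is present.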