public class LexicalPreservationConfig
extends java.lang.Object
Configures the way that Lexical Preservation is applied during the document loading, preservation processing and output/serialisation phases of a pipelined comparison. Here, the:
Normally an XML parser or 'XML processor' (a term defined in the XML specification) disregards 'doctype', 'ignorable whitespace', 'CDATA sections' and other 'lexical' aspects of the XML input during processing. Both the PipelinedComparatorS9 and the DocumentComparator can be configured to convert the 'lexical' items into markup that can be processed by the underpinning comparator (i.e. element, attribute, and text nodes). Note that comments and processing instructions are also treated as 'lexical' aspects of the input, as the underpinning comparator ignores them.
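The parser behaviour described above can be observed with the JDK's own SAX API. The following is a minimal sketch, not DeltaXML code: it assumes a JAXP parser that supports the standard SAX2 lexical-handler property, and the class and method names (LexicalEventsDemo, events) are invented for illustration. Comments and CDATA boundaries are reported only when a LexicalHandler is registered; otherwise the application just sees the character data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class; uses only JDK SAX classes, no DeltaXML API.
class LexicalEventsDemo {

    // Parses the XML and lists the text and lexical events that were reported.
    static List<String> events(String xml, boolean registerLexicalHandler) {
        final List<String> seen = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void characters(char[] ch, int start, int length) {
                seen.add("text:" + new String(ch, start, length));
            }
            @Override
            public void comment(char[] ch, int start, int length) {
                seen.add("comment:" + new String(ch, start, length));
            }
            @Override
            public void startCDATA() {
                seen.add("startCDATA");
            }
            @Override
            public void endCDATA() {
                seen.add("endCDATA");
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            if (registerLexicalHandler) {
                // Standard SAX2 property; without it lexical events are not delivered.
                reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            }
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<doc><!-- note --><![CDATA[a < b]]></doc>";
        System.out.println("content handler only: " + events(xml, false));
        System.out.println("with lexical handler: " + events(xml, true));
    }
}
```

Lexical preservation has to capture this second kind of event and turn it into ordinary markup so that it survives the comparison.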
Note that some aspects of XML are not reported by an XML Parser and so we cannot ensure complete preservation of all lexical aspects of an input file. Some of these aspects include:
Some of the things that can be preserved include:
This configuration class should be set up as required and then passed as the parameter to the PipelinedComparatorS9.setLexicalPreservationConfig(LexicalPreservationConfig) method.
Some marked up items cannot be placed at their original locations whilst maintaining a well-formed result. This primarily relates to information outside the root element. For these areas the markup is moved inside the root element and contained in the first few children of the root element or the last child. Generally only comments and processing instructions can appear outside the root element, however the internal subset contains other items, as does the XML declaration. When all types of information are present the output will have this structure:
<root>
  <preserve:xmldecl xml-version="1.0" encoding="UTF-8" standalone="no"/>
  <preserve:comments-and-pis region="BEFORE_DTD"> ... </preserve:comments-and-pis>
  <preserve:doctype> ... </preserve:doctype>
  <preserve:comments-and-pis region="AFTER_DTD"> ... </preserve:comments-and-pis>
  <child> first child element of original root element ... </child>
  ...
  <child> last child element of original root element ... </child>
  <preserve:comments-and-pis region="AFTER_BODY"> ... </preserve:comments-and-pis>
</root>
Three of the settings provided for handling entities interact in various ways. Some observations to note include:
- Setting LexicalPreservationConfig.setPreserveNestedEntityReferences(boolean) to true only makes sense when both LexicalPreservationConfig.setPreserveEntityReferences(boolean) and LexicalPreservationConfig.setPreserveEntityReplacementText(boolean) are also true.
- Setting both LexicalPreservationConfig.setPreserveEntityReferences(boolean) and LexicalPreservationConfig.setPreserveEntityReplacementText(boolean) to false means that information is lost completely; this is not recommended.
Lexical preservation creates elements in several namespaces; the following table provides a summary:
Usual prefix | Namespace URI | Description |
---|---|---|
preserve | http://www.deltaxml.com/ns/preserve | All generated markup uses this namespace unless one of those mentioned below |
er | http://www.deltaxml.com/ns/entity-references | Entity references are represented as elements using this namespace and a local name based on the entity name |
pi | http://www.deltaxml.com/ns/processing-instructions | Processing instructions are represented as elements using this namespace and a local name based on the PI target |
Lexical preservation is now a feature setting on a PipelinedComparatorS9, rather than being an XMLFilter that is added at the start of the input pipelines. This method of preserving items replaces the previous LexicalPreservation filter, which has been removed.
Modifier and Type | Field and Description |
---|---|
static boolean | ENABLED_PROP_DVAL The default value of the configuration property for enabling lexical preservation during comparator construction. |
static java.lang.String | ENABLED_PROP_NAME The name of the configuration property for enabling lexical preservation during comparator construction. |
Constructor and Description |
---|
LexicalPreservationConfig() Creates a new Configuration for lexical preservation. |
LexicalPreservationConfig(LexicalPreservationConfig base) Creates a new Configuration for lexical preservation using the specified base configuration. |
LexicalPreservationConfig(PresetPreservationMode preserveItemSetName) Creates a new Configuration for lexical preservation using the specified mode. |
LexicalPreservationConfig(java.lang.String preserveItemSetName) Creates a new Configuration for lexical preservation using the specified mode. |
Modifier and Type | Method and Description |
---|---|
AdvancedEntityRefUsage | getAdvancedEntityReferenceUsage() Return whether entity references or their replacement text appear in the output. |
PreservationOutputType | getCDATAOutputType() Return the current PreservationOutputType for CDATA sections. |
PreservationProcessingMode | getCDATAProcessingMode() Return the current PreservationProcessingMode for CDATA blocks. |
PreservationOutputType | getCommentOutputType() Return the current PreservationOutputType for comments. |
PreservationProcessingMode | getCommentProcessingMode() Return the current PreservationProcessingMode for comments. |
PreservationOutputType | getDefaultedAttributeInfoOutputType() Return the current PreservationOutputType for defaulted attributes. |
DefaultAttProcessingMode | getDefaultedAttributeInfoProcessingMode() Return the current DefaultAttProcessingMode for defaulted attributes. |
PreservationOutputType | getDefaultOutputType() Return the current default PreservationOutputType for preserved items. |
PreservationProcessingMode | getDefaultProcessingMode() Return the current default PreservationProcessingMode. |
PreservationOutputType | getDoctypeOutputType() Return the current PreservationOutputType for document type (and internal subset). |
PreservationProcessingMode | getDoctypeProcessingMode() Return the current PreservationProcessingMode for doctype declarations. |
PreservationOutputType | getEntityRefOutputType() Return the current PreservationOutputType for entity references. |
PreservationProcessingMode | getEntityRefProcessingMode() Return the current PreservationProcessingMode for entity references. |
PreservationOutputType | getIgnorableWhitespaceOutputType() Return the current PreservationOutputType for ignorable whitespace. |
PreservationProcessingMode | getIgnorableWhitespaceProcessingMode() Return the current PreservationProcessingMode for ignorable whitespace. |
PreservationOutputType | getOuterPiAndCommentOutputType() Return the PreservationOutputType to use for changes to processing instructions and comments outside the root element (and outside the internal subset). |
PreservationProcessingMode | getOuterPiAndCommentProcessingMode() Return the current PreservationProcessingMode for processing instructions and comments outside the root element. |
boolean | getPreserveCDATA() Reports the current CDATA marker status. |
boolean | getPreserveComments() Reports the current comment conversion/preservation status. |
boolean | getPreserveContentModel() Reports the current setting of the content model preservation feature. |
boolean | getPreserveDefaultAttributeInfo() Reports whether information on which attributes were provided by a DTD is being stored. |
boolean | getPreservedEntityReferences() Reports the current setting of the entity reference preservation feature. |
boolean | getPreserveDoctype() Reports the current setting for DTD internal subset preservation. |
boolean | getPreserveDocumentLocation() Reports the current setting of the document location preservation feature. |
boolean | getPreserveEntityReplacementText() Reports whether entity replacement text is preserved. |
boolean | getPreserveIgnorableWhitespace() Reports the current whitespace preservation setting. |
boolean | getPreserveNestedEntityReferences() Reports the current nested entity references setting. |
boolean | getPreserveProcessingInstructions() Reports the current processing instructions conversion status. |
boolean | getPreserveXMLDeclaration() Reports whether XML Declarations are currently converted into markup. |
PreservationOutputType | getProcessingInstructionOutputType() Return the current PreservationOutputType for processing instructions. |
PreservationProcessingMode | getProcessingInstructionProcessingMode() Return the current PreservationProcessingMode for processing instructions. |
PreservationOutputType | getXMLDeclarationOutputType() Return the current PreservationOutputType for XML declaration. |
PreservationProcessingMode | getXMLDeclarationProcessingMode() Return the current PreservationProcessingMode for xml declaration changes. |
boolean | isPreservingItems() States whether this LexicalPreservationConfig object is preserving any items on the inputs. |
void | setAdvancedEntityReferenceUsage(AdvancedEntityRefUsage usageMode) Specify advanced behaviour of entity reference processing. |
void | setAllPreservationItems(boolean preserve) Sets the preservation status of all PreserveItems. |
void | setCDATAOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to CDATA sections. |
void | setCDATAProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode to use for changes to CDATA blocks. |
void | setCommentOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to comments. |
void | setCommentProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode to use for changes to comments. |
void | setDefaultAttributeInfoOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to defaulted attributes. |
void | setDefaultAttributeInfoProcessingMode(DefaultAttProcessingMode mode) Set the DefaultAttProcessingMode to use for defaulted attributes. |
void | setDefaultOutputType(PreservationOutputType type) Set the default PreservationOutputType for changes to preserved items. |
void | setDefaultProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode to use as the default behaviour for changed lexical preservation items. |
void | setDoctypeOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to document type (and internal subset). |
void | setDoctypeProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode to use for changes to doctype declarations. |
void | setEntityRefOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to entity references. |
void | setEntityRefProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode for changes to entity references. |
void | setIgnorableWhitespaceOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to ignorable whitespace. |
void | setIgnorableWhitespaceProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode for changes to ignorable whitespace. |
void | setOuterPiAndCommentOutputType(PreservationOutputType type) Set the PreservationOutputType to use for changes to processing instructions and comments outside the root element (and outside the internal subset). |
void | setOuterPiAndCommentProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode to use for changes to processing instructions and comments outside the root element. |
void | setPreserveCDATA(boolean preserve) Controls whether marker elements are inserted to record where CDATA sections were used. |
void | setPreserveComments(boolean preserve) Controls whether XML Comments are converted into XML markup. |
void | setPreserveContentModel(boolean preserve) Controls whether markup used to record content model information is persisted. |
void | setPreserveDefaultAttributeInfo(boolean preserve) Adds information about which attributes arose through the use of default attribute values in the DTD, as opposed to having explicit values. |
void | setPreserveDoctype(boolean preserve) Controls whether items in XML DOCTYPE declaration and the DTD internal subset are converted into XML markup. |
void | setPreserveDocumentLocation(boolean preserve) Controls whether markup is added to record the document location information |
void | setPreserveEntityReferences(boolean preserve) Controls whether markup is used to record where entity references were used. |
void | setPreserveEntityReplacementText(boolean preserve) Controls whether entity replacement text is preserved by this filter. |
void | setPreserveIgnorableWhitespace(boolean preserve) Controls whether ignorableWhitespace is converted into standard character data. |
void | setPreserveNestedEntityReferences(boolean preserve) Controls whether entity references are included in entity replacement text results. |
void | setPreserveProcessingInstructions(boolean preserve) Controls whether processing instructions are converted into XML markup. |
void | setPreserveXMLDeclaration(boolean preserve) Controls whether XML Declaration related information is converted into XML markup. |
void | setProcessingInstructionOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to processing instructions. |
void | setProcessingInstructionProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode for changes to processing instructions. |
void | setXMLDeclarationOutputType(PreservationOutputType type) Set the PreservationOutputType for changes to XML declaration. |
void | setXMLDeclarationProcessingMode(PreservationProcessingMode mode) Set the PreservationProcessingMode to use for changes to the xml declaration. |
java.util.EnumSet<LexicalPreservationBase.PreserveItem> | toPreserveItemEnumSet() This method is for internal use. |
public static final java.lang.String ENABLED_PROP_NAME
public static final boolean ENABLED_PROP_DVAL
public LexicalPreservationConfig()
Creates a new Configuration for lexical preservation.
Note: the default behaviour of the lexical preservation is PresetPreservationMode.ROUND_TRIP, unless this has been overridden by supplying configuration properties as discussed in the Lexical Preservation Guide.
public LexicalPreservationConfig(java.lang.String preserveItemSetName)
Creates a new Configuration for lexical preservation using the specified mode.
Note: the default behaviour of the lexical preservation is PresetPreservationMode.ROUND_TRIP, unless this has been overridden by supplying configuration properties as discussed in the Lexical Preservation Guide.
preserveItemSetName - a String specifying the preservation mode to create the Configuration with. Invalid values result in the default setting of PresetPreservationMode.ROUND_TRIP being used.
public LexicalPreservationConfig(PresetPreservationMode preserveItemSetName)
Creates a new Configuration for lexical preservation using the specified mode.
Note: the default behaviour of the lexical preservation is PresetPreservationMode.ROUND_TRIP, unless this has been overridden by supplying configuration properties as discussed in the Lexical Preservation Guide.
preserveItemSetName - the preservation mode to create the Configuration with, or null for the default behaviour.
public LexicalPreservationConfig(LexicalPreservationConfig base)
base - The lexical preservation object to be used as the base configuration.
public boolean isPreservingItems()
States whether this LexicalPreservationConfig object is preserving any items on the inputs.
This is a shorthand way of determining if any of the getPreserve...() methods return true.
public void setPreserveDoctype(boolean preserve)
Controls whether items in XML DOCTYPE declaration and the DTD internal subset are converted into XML markup.
The XML DOCTYPE declaration and associated internal subset can be converted into XML markup for subsequent pipeline comparison and processing. The use of an external DTD is recorded and, in addition to the conversion, the parser will validate the content using any declarations specified in an external DTD or internal DTD subset.
For example, when true has been passed to this method, the following DOCTYPE, in an input file:
<!DOCTYPE article SYSTEM "http://www.docbook.org/xml/4.5/docbookx.dtd" [
  <!ENTITY genEnt "an <emphasis role='bold'>internal (parsed) general</emphasis> entity.">
]>
will be converted into output containing:
<preserve:doctype name="article" systemId="http://www.docbook.org/xml/4.5/docbookx.dtd">
  <preserve:internalParsedGeneralEntityDecl name="genEnt" deltaxml:key="entity_gen_genEnt" value="an !(*lt!)emphasis role=!(*apos!)bold!(*apos!)!(*gt!)internal (parsed) general!(*lt!)/emphasis!(*gt!) entity."/>
</preserve:doctype>
preserve - if true, internal subset items are converted and preserved
public boolean getPreserveDoctype()
Reports the current setting for DTD internal subset preservation.
Returns: true if internal subset items are currently being preserved
See also: LexicalPreservationConfig.setPreserveDoctype(boolean)
public void setPreserveXMLDeclaration(boolean preserve)
Controls whether XML Declaration related information is converted into XML markup.
An XML declaration can specify the encoding, XML version and whether an XML file is 'standalone'. This information can be explicitly specified in an XML file but if it is not present, the parser will determine the information based on rules defined in the XML Specifications (see below).
The input settings will be preserved and used in the result file as long as they are not overridden with pipeline outputProperties. For example, if the inputs specify an encoding of UTF-16BE, this will be used as the encoding for files written by the comparison. However, if the encoding output property is set to UTF-8, the result file will be encoded using UTF-8.
N.B. Having no XML declaration in the inputs does not stop one from being output in the result file. The result will always contain a declaration specifying the XML version and file encoding unless the omit-xml-declaration output property has been set to 'yes'.
For example, when this method has been passed a value of true, an XML file with this declaration:
<?xml version="1.0" encoding="UTF-8"?>
will produce output containing:
<preserve:xmldecl xml-version="1.0" encoding="UTF-8"/>
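As a rough illustration of where this information comes from, the sketch below reads the declaration with the JDK's StAX API and formats it in the same shape as the example above. It is not the DeltaXML implementation; the class and method names (XmlDeclarationSketch, describe) are hypothetical, and the choice to emit the standalone attribute only when it was declared is an assumption of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch using JDK StAX only; it mimics the markup shown above.
class XmlDeclarationSketch {

    // Builds a preserve:xmldecl-style element from the declaration of the given document.
    static String describe(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // A new reader is positioned on START_DOCUMENT, where declaration details are available.
            StringBuilder out = new StringBuilder("<preserve:xmldecl");
            if (reader.getVersion() != null) {
                out.append(" xml-version=\"").append(reader.getVersion()).append('"');
            }
            if (reader.getCharacterEncodingScheme() != null) {
                out.append(" encoding=\"").append(reader.getCharacterEncodingScheme()).append('"');
            }
            if (reader.standaloneSet()) {
                out.append(" standalone=\"").append(reader.isStandalone() ? "yes" : "no").append('"');
            }
            reader.close();
            return out.append("/>").toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the XML declaration", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("<?xml version=\"1.0\" encoding=\"UTF-8\"?><doc/>"));
    }
}
```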
preserve - whether or not to preserve XML declaration settings
See also: PipelinedComparatorS9.setOutputProperty(net.sf.saxon.s9api.Serializer.Property, java.lang.String), W3C XML 1.0 Specification (Fifth Edition), Appendix F: Autodetection of Character Encodings, W3C XML 1.1 Specification (Second Edition), Section 4.3.4: Version Information in Entities
public boolean getPreserveXMLDeclaration()
Reports whether XML Declarations are currently converted into markup.
Returns: true when XML declaration information is converted
public void setPreserveDefaultAttributeInfo(boolean preserve)
Adds information about which attributes arose through the use of default attribute values in the DTD, as opposed to having explicit values.
A DTD can contain attribute definitions such as the following:
<!ELEMENT myElement ANY>
<!ATTLIST myElement myAttribute CDATA "defaultValue">
When a value is defined in quotes like this, and the XML document is associated with this DTD using the DOCTYPE declaration, the attribute myAttribute will be present on every myElement element whether it has been explicitly added or not. If it is added by the parser, it will have the default value of defaultValue as defined in the DTD.
When true has been passed to this method, attributes that have default values assigned by the parser in this way will be marked by adding an attribute to the element like this:
<myElement myAttribute="defaultValue" preserve:defaultAttributes="{}myAttribute">
where the attribute name is encoded in the form {URI}localName.
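The distinction between specified and defaulted attributes is something a SAX parser can report. The following sketch, which uses only JDK classes and is not DeltaXML code, lists defaulted attributes in the {URI}localName form described above. It assumes the parser supplies the optional SAX2 Attributes2 extension (the JDK's built-in parser is expected to); the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; relies on the optional SAX2 Attributes2 extension.
class DefaultedAttributeSketch {

    // Returns the names of attributes that were supplied by DTD defaults, as {URI}localName.
    static List<String> defaulted(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (!(atts instanceof Attributes2)) {
                    return; // parser cannot tell us which attributes were defaulted
                }
                Attributes2 atts2 = (Attributes2) atts;
                for (int i = 0; i < atts.getLength(); i++) {
                    if (!atts2.isSpecified(i)) {
                        names.add("{" + atts.getURI(i) + "}" + atts.getLocalName(i));
                    }
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getURI and getLocalName
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE myElement ["
                + "<!ELEMENT myElement EMPTY>"
                + "<!ATTLIST myElement myAttribute CDATA \"defaultValue\" other CDATA #IMPLIED>"
                + "]><myElement other=\"x\"/>";
        System.out.println(defaulted(xml));
    }
}
```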
preserve - if true, information about defaulted attributes is added to the markup
public boolean getPreserveDefaultAttributeInfo()
Reports whether information on which attributes were provided by a DTD is being stored.
public void setPreserveEntityReferences(boolean preserve)
Controls whether markup is used to record where entity references were used.
An XML start-tag and end-tag will usually mark the position of the entity reference. This marker element will contain, by default, the entity replacement text as it was expanded by the parser.
Here is an example showing use of the XML predefined ampersand entity; more complex entities are also supported, including longer sequences of text and markup (elements):
<para>Hide &amp; seek</para>
When this method is configured to false, the output will be:
<para>Hide & seek</para>
The parser converts the entity into a literal Unicode character (which may be serialized back into an entity at the end of the pipeline). With the setting true we see an XML element (using the 'er' namespace and a local name taken from the entity name) which records the details of the entity reference:
<para>Hide <er:amp>&</er:amp> seek</para>
Please see the method description of LexicalPreservationConfig.setPreserveEntityReplacementText(boolean) for a more detailed description of how these two settings interact.
preserve - whether or not to record where entity references were used
See also: LexicalPreservationConfig.setPreserveEntityReplacementText(boolean)
public boolean getPreservedEntityReferences()
public void setPreserveContentModel(boolean preserve)
Controls whether markup used to record content model information is persisted.
preserve - whether or not to persist content model information
public boolean getPreserveContentModel()
public void setPreserveDocumentLocation(boolean preserve)
Controls whether markup is added to record the document location information.
The document location is stored by adding an xml:base attribute to the root element.
N.B. If the xml:base attribute is already present on the parsed input, it will NOT be replaced.
preserve - whether or not to preserve document location information
public boolean getPreserveDocumentLocation()
public void setPreserveNestedEntityReferences(boolean preserve)
Controls whether entity references are included in entity replacement text results.
An entity definition can itself contain an entity reference (general or parameter) and this method controls whether such entity references appear in the output.
When false, entity reference elements will not be nested. Conversely, when this parameter is set to true, the result may include nested entity reference elements (in the er namespace).
The nesting corresponds to the use of entity references in the definition of other entities.
preserve - if true, nested entities are converted
public boolean getPreserveNestedEntityReferences()
Reports the current nested entity references setting.
Returns: true if nested entity references are converted
See also: LexicalPreservationConfig.setPreserveNestedEntityReferences(boolean)
public void setPreserveEntityReplacementText(boolean preserve)
Controls whether entity replacement text is preserved by this filter.
As well as being able to use an element to describe the details of an entity reference it is also possible to control whether the replacement text is preserved in the output.
The term 'Entity Replacement Text' is used in the W3C XML Specification and section 4.5 describes the process of replacing entity references.
For the following example input markup:
<para>Hide &amp; seek</para>
The following table documents the effects of the preservation settings for entity replacement text and also entity references:
setPreserveEntityReplacementText | setPreserveEntityReferences | result |
---|---|---|
true | true | <para>Hide <er:amp>&</er:amp> seek</para> |
true | false | <para>Hide & seek</para> |
false | true | <para>Hide <er:amp></er:amp> seek</para> |
false | false | <para>Hide seek</para> |
The entity replacement text for the &amp; entity is the Unicode ampersand character (U+0026) and this character appears in the results generated by this filter. An output filter or serializer at the end of a pipeline may subsequently re-serialize this character back into an entity reference such as &amp; or a character reference such as &#38; so that the pipeline result is well-formed.
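The four rows of the table can be reproduced with a small toy model. This is only an illustration of how the two settings combine, written in plain Java with invented names (EntitySettingsModel, render); it does not call or reflect the DeltaXML implementation. As in the table, the ampersand is shown as the literal character that flows through the pipeline, before any final serialization.

```java
// Toy model of the settings table above; not DeltaXML code.
class EntitySettingsModel {

    // Renders the content of <para>Hide &amp; seek</para> for a given pair of settings.
    static String render(boolean preserveReplacementText, boolean preserveReferences) {
        String entityName = "amp"; // name of the referenced entity
        String replacement = "&";  // its replacement text, U+0026
        StringBuilder out = new StringBuilder("<para>Hide ");
        if (preserveReferences) {
            out.append("<er:").append(entityName).append(">");
        }
        if (preserveReplacementText) {
            out.append(replacement);
        }
        if (preserveReferences) {
            out.append("</er:").append(entityName).append(">");
        }
        return out.append(" seek</para>").toString();
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean text : values) {
            for (boolean refs : values) {
                System.out.println(text + " | " + refs + " | " + render(text, refs));
            }
        }
    }
}
```

With both settings false neither the marker element nor the replacement text is emitted, which is why that combination loses information.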
preserve - when true, replacement text is preserved
public boolean getPreserveEntityReplacementText()
Reports whether entity replacement text is preserved.
See also: LexicalPreservationConfig.setPreserveEntityReplacementText(boolean)
public void setPreserveCDATA(boolean preserve)
Controls whether marker elements are inserted to record where CDATA sections were used.
A CDATA section removes the need for entities and is a useful shorthand when authoring. The characters that an XML processor receives are identical irrespective of whether CDATA or entities are used.
This feature uses an element to record the position of CDATA sections in the input file.
<para><![CDATA[Hide & seek]]></para>
When this feature is true the output of this filter will be:
<para><preserve:cdata>Hide & seek</preserve:cdata></para>
When false, the text content is as before, only the marker element is missing:
<para>Hide & seek</para>
preserve - when true, insert CDATA markers
public boolean getPreserveCDATA()
Reports the current CDATA marker status.
Returns: true when CDATA is marked
See also: LexicalPreservationConfig.setPreserveCDATA(boolean)
public void setPreserveComments(boolean preserve)
Controls whether XML Comments are converted into XML markup.
For example, when the preserve parameter is true, with this input:
<!-- add another section here -->
the comment would be converted into:
<preserve:comment> add another section here </preserve:comment>
When the preserve parameter is false, i.e. comments are not converted, it is still possible for the subsequent filters in a filter chain to receive comment events. This depends on whether the subsequent filters have configured the use of a LexicalHandler and/or extend XMLFilterImpl2 or XMLFilterImpl3.
The LexicalPreservationConfig.setPreserveProcessingInstructions(boolean) documentation also describes in which contexts the comment markup appears in the result.
preserve - controls whether comments are converted
See also: LexicalPreservationConfig.setPreserveProcessingInstructions(boolean)
public boolean getPreserveComments()
Reports the current comment conversion/preservation status.
Returns: true if comments are converted to markup, false otherwise
public void setPreserveProcessingInstructions(boolean preserve)
Controls whether processing instructions are converted into XML markup.
For example, with this input and setting of true for the preserve parameter:
<?dbfo table-width="50%"?>
the processing instruction would be converted into:
<pi:dbfo>table-width="50%"</pi:dbfo>
When the processing instructions are contained within the root element of an XML file they appear in their converted form as in the example above. However, when outside of the root element they need to be moved and they will then appear as a child of either a <preserve:comments-and-pis> element with a region attribute indicating their position, or the <preserve:doctype> element.
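The conversion of a processing instruction inside the root element can be sketched with a plain SAX handler. The code below uses only JDK classes and invented names (ProcessingInstructionSketch, convert); it is an approximation of the markup shape shown above, not the DeltaXML implementation, and it omits the namespace declaration for the pi prefix as well as the relocation of instructions found outside the root element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: turns SAX processing-instruction events into pi-prefixed elements.
class ProcessingInstructionSketch {

    // Returns one converted element (as text) per processing instruction in the input.
    static List<String> convert(String xml) {
        final List<String> converted = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void processingInstruction(String target, String data) {
                // Escape characters that would break well-formedness in element content.
                String text = data.replace("&", "&amp;").replace("<", "&lt;");
                converted.add("<pi:" + target + ">" + text + "</pi:" + target + ">");
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return converted;
    }

    public static void main(String[] args) {
        System.out.println(convert("<doc><?dbfo table-width=\"50%\"?></doc>"));
    }
}
```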
preserve - controls whether XML Processing Instructions are converted
public boolean getPreserveProcessingInstructions()
Reports the current processing instructions conversion status.
See also: LexicalPreservationConfig.setPreserveProcessingInstructions(boolean)
public void setPreserveIgnorableWhitespace(boolean preserve)
Controls whether ignorableWhitespace is converted into standard character data.
Ignorable whitespace is reported by parsers when the input file is associated with a DTD. The DTD allows a parser to differentiate between mixed content and element-only content where ignorable whitespace is reported. A true value will allow all whitespace to flow through a comparison pipeline including what is typically regarded as 'indentation whitespace' in XML and this may be important when round trip processing is required.
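To make the distinction concrete, the following sketch separates ignorable whitespace from ordinary character data using the JDK's SAX parser. It is illustrative only and not DeltaXML code: the names (IgnorableWhitespaceSketch, split) are invented, and DTD validation is switched on here as an assumption so that the parser reliably knows which elements have element-only content.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects ignorable whitespace and character data separately.
class IgnorableWhitespaceSketch {

    // Returns a two-element array: [0] ignorable whitespace, [1] ordinary character data.
    static String[] split(String xml) {
        final StringBuilder ignorable = new StringBuilder();
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                ignorable.append(ch, start, length);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // assumption: validate so content models are applied
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return new String[] {ignorable.toString(), text.toString()};
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc [\n"
                + "  <!ELEMENT doc (item)>\n"
                + "  <!ELEMENT item (#PCDATA)>\n"
                + "]>\n"
                + "<doc>\n  <item>a b</item>\n</doc>";
        String[] parts = split(xml);
        System.out.println("ignorable characters: " + parts[0].length());
        System.out.println("character data: " + parts[1]);
    }
}
```

The indentation around the item element belongs to element-only content and is reported separately, whereas the space inside the item element is ordinary text.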
preserve - specifies whether whitespace is converted to characters
public boolean getPreserveIgnorableWhitespace()
Reports the current whitespace preservation setting.
Returns: true if whitespace is converted to characters
See also: LexicalPreservationConfig.setPreserveIgnorableWhitespace(boolean)
public void setAllPreservationItems(boolean preserve)
Sets the preservation status of all PreserveItems.
This method provides a shorthand way of setting all PreserveItems to the same value. It is useful if you only wish to set one or two of the items to be preserved. If this is the case, pass false to this method and subsequently pass true to the individual set methods for the items you wish to preserve.
preserve - if true, preserve all PreserveItems; if false, preserve none
public void setDefaultProcessingMode(PreservationProcessingMode mode)
PreservationProcessingMode
to use as the default behaviour for changed lexical preservation items. Note this
does not affect the setting of the default attribute processing mode.mode
- the PreservationProcessingMode
to use as the default behaviour for changes to lexical preservation itemspublic PreservationProcessingMode getDefaultProcessingMode()
PreservationProcessingMode
.PreservationProcessingMode
detailing the default behaviour for changed lexical preservation itemspublic void setXMLDeclarationProcessingMode(PreservationProcessingMode mode)
PreservationProcessingMode
to use for changes to the xml declaration.mode
- the PreservationProcessingMode
to use for changes to the xml declarationpublic PreservationProcessingMode getXMLDeclarationProcessingMode()
PreservationProcessingMode
for xml declaration changes.PreservationProcessingMode
detailing how a changed xml declaration will be outputpublic void setDoctypeProcessingMode(PreservationProcessingMode mode)
PreservationProcessingMode
to use for changes to doctype declarations.mode
- the PreservationProcessingMode
to use for changes to doctype declarationspublic PreservationProcessingMode getDoctypeProcessingMode()
PreservationProcessingMode
for doctype declarations.PreservationProcessingMode
detailing how changed doctype declarations will be outputpublic void setDefaultAttributeInfoProcessingMode(DefaultAttProcessingMode mode)
DefaultAttProcessingMode
to use for defaulted attributes. Note that when this mode is set to 'atomatic' it
behaves as 'excplicit' rather than as specified by the default processing mode.mode
- the DefaultAttProcessingMode
to use for defaulted attributes.for details on what output should be expected
public DefaultAttProcessingMode getDefaultedAttributeInfoProcessingMode()
DefaultAttProcessingMode
for defaulted attributes.DefaultAttProcessingMode
detailing how defaulted attributes will be outputpublic void setOuterPiAndCommentProcessingMode(PreservationProcessingMode mode)
PreservationProcessingMode
to use for changes to processing instructions and comments outside the root
element. Note if this element is set to PreservationProcessingMode.CHANGE
, then the processing instructions and
comments outside the root element are handled in the same manner as those inside the root element.mode
- the PreservationProcessingMode
to use for changes to processing instructions and comments outside the
root elementpublic PreservationProcessingMode getOuterPiAndCommentProcessingMode()
PreservationProcessingMode
for processing instructions and comments outside the root element.PreservationProcessingMode
detailing how changed processing instructions and comments outside the root
element will be outputpublic void setCommentProcessingMode(PreservationProcessingMode mode)
PreservationProcessingMode
to use for changes to comments.mode
- the PreservationProcessingMode
to use for changes to commentspublic PreservationProcessingMode getCommentProcessingMode()
PreservationProcessingMode
for comments.PreservationProcessingMode
detailing how changed comments will be outputpublic void setCDATAProcessingMode(PreservationProcessingMode mode)
Sets the PreservationProcessingMode to use for changes to CDATA blocks.
mode - the PreservationProcessingMode to use for changes to CDATA blocks
public PreservationProcessingMode getCDATAProcessingMode()
Gets the PreservationProcessingMode for CDATA blocks.
Returns the PreservationProcessingMode detailing how changed CDATA blocks will be output.
public void setProcessingInstructionProcessingMode(PreservationProcessingMode mode)
Sets the PreservationProcessingMode for changes to processing instructions.
mode - the PreservationProcessingMode to use for changes to processing instructions
public PreservationProcessingMode getProcessingInstructionProcessingMode()
Gets the PreservationProcessingMode for processing instructions.
Returns the PreservationProcessingMode detailing how processing instructions will be output.
public void setIgnorableWhitespaceProcessingMode(PreservationProcessingMode mode)
Sets the PreservationProcessingMode for changes to ignorable whitespace.
mode - the PreservationProcessingMode to use for changes to ignorable whitespace
public PreservationProcessingMode getIgnorableWhitespaceProcessingMode()
Gets the PreservationProcessingMode for ignorable whitespace.
Returns the PreservationProcessingMode detailing how ignorable whitespace will be output.
public void setEntityRefProcessingMode(PreservationProcessingMode mode)
Sets the PreservationProcessingMode for changes to entity references.
mode - the PreservationProcessingMode to use for changes to entity references
public PreservationProcessingMode getEntityRefProcessingMode()
Gets the PreservationProcessingMode for entity references.
Returns the PreservationProcessingMode detailing how entity references will be output.
public void setAdvancedEntityReferenceUsage(AdvancedEntityRefUsage usageMode)
Specifies advanced behaviour of entity reference processing. In particular, it determines whether a 'compared' encoded entity reference should be replaced by its content, split into an 'old' and a 'new' version on detection of change, or left with the full change information. This method is intended for expert use and should typically be left on automatic, as this configures it appropriately for non-specialist use cases (i.e. cases where the input and output preservation settings are consistent). The four modes are interpreted as follows:
Note that entity references are encoded if, and only if, LexicalPreservationConfig.getPreservedEntityReferences() returns true, and that the entity replacement text is kept within an encoded entity reference when the LexicalPreservationConfig.getPreserveEntityReplacementText() method returns true.
Warning: specifying that the encoded entity replacement text should be used when it does not exist (see note above) will result in neither the entity reference nor its replacement text appearing in the output.
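The warning above can be illustrated with a small stand-alone model. The sketch below does not use the library: EncodedEntityRef, render and the useReplacementText flag are hypothetical names introduced only to show why asking for replacement text that was never preserved leaves nothing in the output.

```java
import java.util.Optional;

/** Stand-alone sketch; nothing in this class is part of the library API. */
final class EntityRefWarningSketch {

    /** Hypothetical model of an encoded entity reference. */
    static final class EncodedEntityRef {
        final String name;
        // Present only if the replacement text was preserved on input.
        final Optional<String> replacementText;

        EncodedEntityRef(String name, Optional<String> replacementText) {
            this.name = name;
            this.replacementText = replacementText;
        }
    }

    /**
     * Hypothetical renderer: emits either the entity reference itself or
     * its preserved replacement text, depending on the requested usage.
     */
    static String render(EncodedEntityRef ref, boolean useReplacementText) {
        if (useReplacementText) {
            // If no replacement text was preserved, nothing is emitted at all:
            // neither the reference nor its text (the case the warning describes).
            return ref.replacementText.orElse("");
        }
        return "&" + ref.name + ";";
    }

    public static void main(String[] args) {
        EncodedEntityRef withText = new EncodedEntityRef("copy", Optional.of("(c)"));
        EncodedEntityRef withoutText = new EncodedEntityRef("copy", Optional.<String>empty());
        System.out.println("[" + render(withText, true) + "]");
        System.out.println("[" + render(withoutText, true) + "]");
        System.out.println("[" + render(withoutText, false) + "]");
    }
}
```

The second call in main is the situation to avoid: the requested usage and the preservation settings disagree, so the item disappears from the rendered result.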
usageMode - the AdvancedEntityRefUsage to use for entity references
public AdvancedEntityRefUsage getAdvancedEntityReferenceUsage()
Gets the AdvancedEntityRefUsage, which states whether entity references or their replacement text will be output.
public void setDefaultOutputType(PreservationOutputType type)
Sets the default PreservationOutputType for changes to preserved items.
type - the default PreservationOutputType to use for changes to preserved items
public PreservationOutputType getDefaultOutputType()
Gets the default PreservationOutputType for preserved items.
Returns the PreservationOutputType detailing how preserved items will be output.
public void setCDATAOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to CDATA sections.
type - the PreservationOutputType to use for changes to CDATA sections
public PreservationOutputType getCDATAOutputType()
Gets the PreservationOutputType for CDATA sections.
Returns the PreservationOutputType detailing how CDATA sections will be output.
public void setCommentOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to comments.
type - the PreservationOutputType to use for changes to comments
public PreservationOutputType getCommentOutputType()
Gets the PreservationOutputType for comments.
Returns the PreservationOutputType detailing how comments will be output.
public void setIgnorableWhitespaceOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to ignorable whitespace.
type - the PreservationOutputType to use for changes to ignorable whitespace
public PreservationOutputType getIgnorableWhitespaceOutputType()
Gets the PreservationOutputType for ignorable whitespace.
Returns the PreservationOutputType detailing how ignorable whitespace will be output.
public void setDefaultAttributeInfoOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to defaulted attributes.
type - the PreservationOutputType to use for changes to defaulted attributes
public PreservationOutputType getDefaultedAttributeInfoOutputType()
Gets the PreservationOutputType for defaulted attributes.
Returns the PreservationOutputType detailing how defaulted attributes will be output.
public void setDoctypeOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to the document type (and internal subset).
type - the PreservationOutputType to use for changes to the document type (and internal subset)
public PreservationOutputType getDoctypeOutputType()
Gets the PreservationOutputType for the document type (and internal subset).
Returns the PreservationOutputType detailing how the document type (and internal subset) will be output.
public void setOuterPiAndCommentOutputType(PreservationOutputType type)
Sets the PreservationOutputType to use for changes to processing instructions and comments outside the root element (and outside the internal subset).
type - the PreservationOutputType to use for changes to processing instructions and comments outside the root element.
public PreservationOutputType getOuterPiAndCommentOutputType()
Gets the PreservationOutputType to use for changes to processing instructions and comments outside the root element (and outside the internal subset).
Returns the PreservationOutputType detailing how changes to processing instructions and comments outside the root element will be output.
public void setEntityRefOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to entity references.
type - the PreservationOutputType to use for changes to entity references
public PreservationOutputType getEntityRefOutputType()
Gets the PreservationOutputType for entity references.
Returns the PreservationOutputType detailing how entity references will be output.
public void setProcessingInstructionOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to processing instructions.
type - the PreservationOutputType to use for changes to processing instructions
public PreservationOutputType getProcessingInstructionOutputType()
Gets the PreservationOutputType for processing instructions.
Returns the PreservationOutputType detailing how processing instructions will be output.
public void setXMLDeclarationOutputType(PreservationOutputType type)
Sets the PreservationOutputType for changes to the XML declaration.
type - the PreservationOutputType to use for changes to the XML declaration
public PreservationOutputType getXMLDeclarationOutputType()
Gets the PreservationOutputType for the XML declaration.
Returns the PreservationOutputType detailing how the XML declaration will be output.
public java.util.EnumSet<LexicalPreservationBase.PreserveItem> toPreserveItemEnumSet()
This method is for internal use.