public class ResultReadabilityOptions
extends java.lang.Object
Sets options to change the granularity and ordering of changes in the result in order to improve readability.
The underlying comparison engine attempts to produce a result based on the Levenshtein distance between the inputs. While this produces a mathematically optimal result, the individual word changes in a block of text are not always very readable, because they can appear as a mix of added, deleted and unchanged items. This configuration object can be used to make the result more readable using a variety of techniques, including element splitting, orphaned word detection, and change gathering (reordering consecutive changes).
Changes to whitespace within a document can be insignificant, such as when editors automatically add line wrapping and indentation within a DITA, DocBook, or HTML paragraph. In these cases, it would be useful if changes in whitespace are not reported. This can be achieved in a number of ways, including: normalizing the whitespace in the inputs; and identifying and then ignoring modified whitespace in the raw comparison output. Here whitespace, at a given point in the document, is considered to be modified if, and only if, both documents have some whitespace at this point which differs.
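The definition of modified whitespace above can be sketched as a small predicate. This is only an illustration of the stated rule, not the comparison engine's implementation; the class and method names are hypothetical, and it assumes whitespace means the four XML whitespace characters.

```java
public class ModifiedWhitespaceSketch {

    /** Returns true if the text is non-empty and made up only of XML whitespace characters. */
    public static boolean isWhitespace(String text) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }

    /**
     * Whitespace at a given point is modified if, and only if, both documents
     * have some whitespace at that point and the two runs differ.
     */
    public static boolean isModifiedWhitespace(String inA, String inB) {
        return isWhitespace(inA) && isWhitespace(inB) && !inA.equals(inB);
    }

    public static void main(String[] args) {
        // A single space re-wrapped to a newline plus indentation: modified whitespace.
        System.out.println(isModifiedWhitespace(" ", "\n    "));
        // Whitespace present in only one document: an insertion, not a modification.
        System.out.println(isModifiedWhitespace("", " "));
        // Identical whitespace in both documents: unchanged.
        System.out.println(isModifiedWhitespace(" ", " "));
    }
}
```

Under this rule, whitespace that exists in only one document is treated as an insertion or deletion rather than a modification.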
Note: if there are subtrees in a document where whitespace change is important, this can be identified by adding the standard XML xml:space="preserve" attribute to the top element of the subtree, in the input filtering. Conversely, it is possible to specify that the whitespace within a subtree does not need to be preserved by adding the xml:space="default" attribute to the top element of that subtree, which is the implicit setting.
Constructor and Description |
---|
ResultReadabilityOptions(): Constructs a new ResultReadabilityOptions instance. |
Modifier and Type | Method and Description |
---|---|
int | getElementSplittingDebugTextReportSize(): Returns the current limit on the length of each debug report output by the element splitting filter. |
int | getElementSplittingThreshold(): Returns the percentage of unchanged text present in a modified element below which the element will be split. |
MixedContentDetectionScope | getMixedContentDetectionScope(): Returns the current MixedContentDetectionScope for determining whether elements are of a mixed-content type. |
ModifiedWhitespaceBehaviour | getModifiedWhitespaceBehaviour(): Returns the current ModifiedWhitespaceBehaviour for handling changes in whitespace. |
java.lang.String | getMoveAttributeXpath(): Deprecated. Please use MoveDetectionConfig instead. |
int | getOrphanedWordLengthLimit(): Returns the current maximum number of words that could be considered orphaned. |
int | getOrphanedWordMaxPercentage(): Returns the maximum proportion of the total change size that orphaned words can take while still being considered orphans. |
boolean | isChangeGatheringEnabled(): States whether or not the change reordering functionality is currently enabled. |
boolean | isCharacterByCharacterEnabled(): Returns the setting which enables the character by character comparison. |
boolean | isDetectMoves(): Deprecated. Please use MoveDetectionConfig instead. |
boolean | isElementSplittingDebug(): States whether the element splitting filter is set to output internal debug information. |
boolean | isElementSplittingEnabled(): States whether modified elements containing text are split when the amount of unchanged text falls below a given percentage. |
boolean | isOrphanedWordDetectionEnabled(): States whether or not orphaned word detection is enabled. |
boolean | isRemoveMoveSource(): Deprecated. Please use MoveDetectionConfig instead. |
void | setChangeGatheringEnabled(boolean enabled): Sets whether to change the order of consecutive changed items to improve readability. |
void | setCharacterByCharacterEnabled(boolean characterByCharacter): Sets whether to use character by character comparison. |
void | setDetectMoves(boolean value): Deprecated. Please use MoveDetectionConfig instead. |
void | setElementSplittingDebug(boolean debug): Sets whether the element splitting filter should output internal debug information. |
void | setElementSplittingDebugTextReportSize(int size): Sets a limit on the length of each debug report output by the element splitting filter. |
void | setElementSplittingEnabled(boolean enabled): Sets whether modified elements containing text should be split when the amount of unchanged text falls below a given percentage. |
void | setElementSplittingThreshold(int percentage): Sets the percentage of unchanged text present in a modified element below which the element will be split. |
void | setMixedContentDetectionScope(MixedContentDetectionScope scope): Sets the MixedContentDetectionScope to use for determining whether elements are of a mixed-content type. |
void | setModifiedWhitespaceBehaviour(ModifiedWhitespaceBehaviour mode): Sets the ModifiedWhitespaceBehaviour to use for changes to whitespace. |
void | setMoveAttributeXpath(java.lang.String value): Deprecated. Please use MoveDetectionConfig instead. |
void | setOrphanedWordDetectionEnabled(boolean enabled): Sets whether or not to enable orphaned word detection and fix-up. |
void | setOrphanedWordLengthLimit(int length): Sets the maximum number of words to consider for orphaned word detection. |
void | setOrphanedWordMaxPercentage(int percentage): Sets the maximum proportion of the total change size that orphaned words can take while still being considered orphans. |
void | setRemoveMoveSource(boolean removeMoveSource): Deprecated. Please use MoveDetectionConfig instead. |
public ResultReadabilityOptions()
Constructs a new ResultReadabilityOptions instance.
public void setElementSplittingEnabled(boolean enabled)
Sets whether modified elements containing text should be split when the amount of unchanged text falls below a given percentage.
The percentage at which this behaviour is triggered can be set using ResultReadabilityOptions.setElementSplittingThreshold(int).
enabled - whether or not to enable element splitting
See also: ResultReadabilityOptions.setElementSplittingThreshold(int)
public boolean isElementSplittingEnabled()
States whether modified elements containing text are split when the amount of unchanged text falls below a given percentage.
public void setElementSplittingThreshold(int percentage) throws java.lang.IllegalArgumentException
Sets the percentage of unchanged text present in a modified element below which the element will be split.
percentage - an integer in the range 0-100 specifying the percentage of unchanged text
java.lang.IllegalArgumentException - if the supplied parameter is not in the range 0-100
public int getElementSplittingThreshold()
Returns the percentage of unchanged text present in a modified element below which the element will be split.
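The threshold rule can be illustrated with a short sketch. It is not the library's element splitting filter: the class and method names are hypothetical, and it assumes the amount of text is measured in characters and that the percentage is computed with integer division, neither of which is specified here.

```java
public class ElementSplittingSketch {

    /** Mirrors the documented contract: the threshold must lie in the range 0-100. */
    public static void checkThreshold(int percentage) {
        if (percentage < 0 || percentage > 100) {
            throw new IllegalArgumentException("percentage must be in the range 0-100: " + percentage);
        }
    }

    /**
     * Decides whether a modified element would be split: the element is split
     * when its percentage of unchanged text falls below the threshold.
     * Assumption: text is measured in characters.
     */
    public static boolean shouldSplit(int unchangedChars, int totalChars, int threshold) {
        checkThreshold(threshold);
        if (totalChars <= 0) {
            return false; // no text, so nothing to split
        }
        int unchangedPercentage = (unchangedChars * 100) / totalChars;
        return unchangedPercentage < threshold;
    }

    public static void main(String[] args) {
        // 10% of the text is unchanged, which is below a threshold of 25.
        System.out.println(shouldSplit(10, 100, 25));
        // 50% of the text is unchanged, which is not below a threshold of 25.
        System.out.println(shouldSplit(50, 100, 25));
    }
}
```

In this sketch an element whose unchanged percentage exactly equals the threshold is not split, following the "below which" wording.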
public void setElementSplittingDebug(boolean debug)
Sets whether the element splitting filter should output internal debug information.
The internal debug output is passed using <xsl:message/>. This is typically passed to std.out, but this can be changed by configuring the relevant Saxon Configuration object.
debug - whether the element splitting filter should output internal debug information
See also: Configuration.setStandardErrorOutput(java.io.PrintStream)
public boolean isElementSplittingDebug()
States whether the element splitting filter is set to output internal debug information.
public void setElementSplittingDebugTextReportSize(int size) throws java.lang.IllegalArgumentException
Sets a limit on the length of each debug report output by the element splitting filter.
size - the size limit for the debug output report
java.lang.IllegalArgumentException - if the supplied parameter is not a positive integer
public int getElementSplittingDebugTextReportSize()
Returns the current limit on the length of each debug report output by the element splitting filter.
public void setOrphanedWordDetectionEnabled(boolean enabled)
Sets whether or not to enable orphaned word detection and fix-up.
enabled - whether or not to enable orphaned word detection
public boolean isOrphanedWordDetectionEnabled()
States whether or not orphaned word detection is enabled.
public void setOrphanedWordLengthLimit(int length) throws java.lang.IllegalArgumentException
Sets the maximum number of words to consider for orphaned word detection. Sequences of words longer than the specified length will never be detected as orphaned words, regardless of the number of changed words around them.
length - the maximum number of consecutive words that could be considered to be orphaned
java.lang.IllegalArgumentException - if the supplied parameter is not a positive integer
public int getOrphanedWordLengthLimit()
Returns the current maximum number of words that could be considered orphaned.
public void setOrphanedWordMaxPercentage(int percentage) throws java.lang.IllegalArgumentException
Sets the maximum proportion of the total change size that orphaned words can take while still being considered orphans.
If the percentage value for a possibly orphaned section is less than or equal to this value, then it is classified as orphaned (unless there are more words than the length limit allows). The percentage value for a possibly orphaned section is calculated as follows:
(possibly_orphaned_words_count * 100) / (preceding_changed_words_count + possibly_orphaned_words_count + following_changed_words_count)
percentage - the maximum proportion of a changed section that orphaned words can take and still be considered orphans
java.lang.IllegalArgumentException - if the supplied value is not in the range 0-100
public int getOrphanedWordMaxPercentage()
Returns the maximum proportion of the total change size that orphaned words can take while still being considered orphans.
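The percentage formula given for setOrphanedWordMaxPercentage(int), combined with the length limit, can be worked through in a small sketch. The class and method names are hypothetical, and integer division is an assumption; the formula above does not say how the result is rounded.

```java
public class OrphanedWordSketch {

    /** (possibly_orphaned * 100) / (preceding_changed + possibly_orphaned + following_changed) */
    public static int orphanPercentage(int precedingChanged, int possiblyOrphaned, int followingChanged) {
        if (possiblyOrphaned <= 0) {
            throw new IllegalArgumentException("there must be at least one possibly orphaned word");
        }
        return (possiblyOrphaned * 100) / (precedingChanged + possiblyOrphaned + followingChanged);
    }

    /**
     * A section is classified as orphaned when it is no longer than the length
     * limit and its percentage is less than or equal to the maximum percentage.
     */
    public static boolean isOrphaned(int precedingChanged, int possiblyOrphaned, int followingChanged,
                                     int lengthLimit, int maxPercentage) {
        if (possiblyOrphaned > lengthLimit) {
            return false; // longer sequences are never treated as orphans
        }
        return orphanPercentage(precedingChanged, possiblyOrphaned, followingChanged) <= maxPercentage;
    }

    public static void main(String[] args) {
        // One unchanged word between 4 and 5 changed words: (1 * 100) / 10 = 10.
        System.out.println(orphanPercentage(4, 1, 5));
        // With a length limit of 2 and a maximum of 20 percent, that word is orphaned.
        System.out.println(isOrphaned(4, 1, 5, 2, 20));
        // Two unchanged words between 1 and 1 changed words: (2 * 100) / 4 = 50, too large.
        System.out.println(isOrphaned(1, 2, 1, 2, 20));
    }
}
```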
public void setChangeGatheringEnabled(boolean enabled)
Sets whether to change the order of consecutive changed items to improve readability.
If the result contains a sequence of elements whose deltaxml:deltaV2 attribute values are a mixed sequence of As and Bs, enabling this feature will cause them to be reordered so that they are not mixed.
For example,
<elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="B"/> <elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="B"/> <elem deltaxml:deltaV2="B"/> <elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="A"/>
would be reordered to
<elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="A"/> <elem deltaxml:deltaV2="B"/> <elem deltaxml:deltaV2="B"/> <elem deltaxml:deltaV2="B"/>
enabled - whether the reordering functionality should be enabled
public boolean isChangeGatheringEnabled()
States whether or not the change reordering functionality is currently enabled.
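The reordering shown in the setChangeGatheringEnabled(boolean) example behaves like a stable partition of a run of consecutive changes. The sketch below reproduces that example on a list of deltaxml:deltaV2 values; it is an illustration rather than the library's algorithm, the class and method names are hypothetical, and it assumes the run contains only "A" and "B" values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChangeGatheringSketch {

    /**
     * Gathers a run of consecutive changes so that all "A" items come first,
     * followed by all "B" items, keeping the relative order within each group.
     */
    public static List<String> gather(List<String> deltaValues) {
        List<String> aItems = new ArrayList<>();
        List<String> bItems = new ArrayList<>();
        for (String value : deltaValues) {
            if ("A".equals(value)) {
                aItems.add(value);
            } else if ("B".equals(value)) {
                bItems.add(value);
            } else {
                // Assumption: anything else (such as an unchanged item) ends the run
                // and is handled outside this sketch.
                throw new IllegalArgumentException("unexpected deltaV2 value: " + value);
            }
        }
        List<String> gathered = new ArrayList<>(aItems);
        gathered.addAll(bItems);
        return gathered;
    }

    public static void main(String[] args) {
        List<String> mixed = Arrays.asList("A", "B", "A", "B", "B", "A", "A");
        // Prints [A, A, A, A, B, B, B]
        System.out.println(gather(mixed));
    }
}
```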
public void setModifiedWhitespaceBehaviour(ModifiedWhitespaceBehaviour mode)
Sets the ModifiedWhitespaceBehaviour to use for changes to whitespace. Here, both documents must have some whitespace at a given point in order for there to be a change in whitespace. This will then be processed in accordance with the specified behaviour. Whitespace insertions and deletions are not affected by the modified whitespace behaviour.
mode - the ModifiedWhitespaceBehaviour to use for changes in whitespace
public ModifiedWhitespaceBehaviour getModifiedWhitespaceBehaviour()
Returns the current ModifiedWhitespaceBehaviour for handling changes in whitespace.
Returns: the ModifiedWhitespaceBehaviour detailing how changed whitespace will be output
public void setMixedContentDetectionScope(MixedContentDetectionScope scope)
Sets the MixedContentDetectionScope to use for determining whether elements are of a mixed-content type. This property has no effect if DTD or XML Schema validation is enabled. The mixed-content type of an element affects whitespace handling.
scope - the MixedContentDetectionScope to use for determining if elements are mixed-content
public MixedContentDetectionScope getMixedContentDetectionScope()
Returns the current MixedContentDetectionScope for determining whether elements are of a mixed-content type.
Returns: the MixedContentDetectionScope used for determining if elements are mixed-content
@Deprecated public void setDetectMoves(boolean value)
Deprecated. Please use MoveDetectionConfig instead.
See also: ResultReadabilityOptions.setMoveAttributeXpath(java.lang.String)
value - a boolean value that enables or disables the move detection feature
@Deprecated public boolean isDetectMoves()
Deprecated. Please use MoveDetectionConfig instead.
@Deprecated public void setMoveAttributeXpath(java.lang.String value)
Deprecated. Please use MoveDetectionConfig instead.
value - XPath for the id attribute
@Deprecated public java.lang.String getMoveAttributeXpath()
Deprecated. Please use MoveDetectionConfig instead.
@Deprecated public boolean isRemoveMoveSource()
Deprecated. Please use MoveDetectionConfig instead.
@Deprecated public void setRemoveMoveSource(boolean removeMoveSource)
Deprecated. Please use MoveDetectionConfig instead.
removeMoveSource - whether to remove the move source
public boolean isCharacterByCharacterEnabled()
Returns the setting which enables the character by character comparison.
public void setCharacterByCharacterEnabled(boolean characterByCharacter)
Sets whether to use character by character comparison.
characterByCharacter - whether to use character by character comparison