According to the XML standard (e.g. http://www.w3.org/TR/xml/#sec-references),
one can refer to any character (including non-printable ones) using a character reference of the form &#\d+; or &#x\h+;.
However, hexpat seems to ignore these characters.
For example, the HUnit test case
testSingleEscapedTextNode :: Test
testSingleEscapedTextNode = TestCase $
  let nodeName = "singleNode" in
  let nodeText = "a text with escaped characters & < >    is not correctly handled" in
  let (xml, mErr) = ( parse defaultParseOptions (pack $ map c2w ("<" ++ nodeName ++ ">" ++ nodeText ++ "</" ++ nodeName ++ ">") ) ) :: (UNode String, Maybe XMLParseError) in do
    assertEqual "Single Node" xml (Element nodeName [] [Text nodeText])
yields as output
### Failure in: 0:Library Tests:1:Text.XML.Expat.Tree:3:Single Escaped Text Node
Single Node
expected: Element "singleNode" [] [Text "a text with escaped ",Text "characters ",Text "&",Text " ",Text "<",Text " ",Text ">",Text " "]
but got: Element "singleNode" [] [Text "a text with escaped characters & < >    is not correctly handled"]
The parsed list is wrong: it contains no elements after the first &#x\h+; character reference.
Note: I know there is also an error in my test: it assumes a single Text element rather than a list of Text elements. That is irrelevant to this problem, though!
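To make the two reference forms mentioned above concrete, here is a minimal sketch of a decoder for a single character reference. It is not hexpat or Expat code: decodeCharRef and decodeWith are hypothetical helper names introduced only for illustration, and the sketch checks only the Unicode range, not whether the XML version in use actually permits the resulting character.

```haskell
import Data.Char (chr, isDigit, isHexDigit)
import Numeric (readDec, readHex)

-- | Decode one character reference such as "&#60;" or "&#x3C;".
-- Hypothetical helper for illustration; not part of hexpat.
-- Returns Nothing if the string is not a well-formed reference
-- or the code point lies beyond U+10FFFF.
decodeCharRef :: String -> Maybe Char
decodeCharRef ('&' : '#' : 'x' : rest) = decodeWith isHexDigit readHex rest
decodeCharRef ('&' : '#' : rest)       = decodeWith isDigit readDec rest
decodeCharRef _                        = Nothing

-- | Read a non-empty run of digits terminated by exactly ";".
-- Integer is used so that very long digit runs cannot overflow.
decodeWith :: (Char -> Bool) -> ReadS Integer -> String -> Maybe Char
decodeWith isValidDigit reader s =
  case span isValidDigit s of
    (digits@(_ : _), ";") ->
      case reader digits of
        [(n, "")] | n <= 0x10FFFF -> Just (chr (fromInteger n))
        _                         -> Nothing
    _ -> Nothing
```

With this sketch, both "&#60;" and "&#x3C;" decode to '<', while a reference with no digits, such as "&#;", is rejected.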
Please note that the underlying parser implements XML 1.0 fourth edition, neither XML 1.1 nor XML 1.0 fifth edition. Expat ticket libexpat/libexpat#171 may be of interest.
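As a rough illustration of why the XML edition matters here: XML 1.0 requires a character reference to resolve to a character matching its Char production, which excludes most C0 control characters. The sketch below encodes that production as a predicate. The name isXml10Char is a hypothetical helper introduced for this example, not something hexpat or Expat exposes, and how Expat reacts to an illegal reference is not shown.

```haskell
import Data.Char (ord)

-- | True if the character matches the XML 1.0 Char production:
--   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
-- Hypothetical helper for illustration; not part of hexpat.
isXml10Char :: Char -> Bool
isXml10Char c =
  n == 0x9 || n == 0xA || n == 0xD
    || (n >= 0x20 && n <= 0xD7FF)
    || (n >= 0xE000 && n <= 0xFFFD)
    || (n >= 0x10000 && n <= 0x10FFFF)
  where
    n = ord c
```

Under this predicate, tab, line feed and carriage return are accepted, but a control character such as U+0001 is not, so a reference like &#x1; would not be a legal character reference in an XML 1.0 document.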