Code Project

Link Unit

Monday, September 11, 2017

XML parsing: line 1, character 75, illegal name character

When saving strings to XML, or when trying to extract text within tags it important to escape invalid characters . The following table shows the invalid XML characters and their escaped equivalents.

Invalid XML Character Replaced With
<                          <
>                          >
" "
' '
& &

if we try following code in SQL window, where we are trying to extract text from within html tags.

declare @v varchar(40)
Set @v='a & b'
Select cast(@v as XML).value('.','varchar(max)')

we will receive error like "XML parsing: line 1, character xx, illegal name character"

Solution: 

declare @v varchar(40)
Set @v='a & b'
Select cast(replace(replace(replace(@v,'>','><![CDATA['),'</',']]></')+']]>','<![CDATA[]]>','') as XML).value('.','varchar(max)')

As we know CDATA section is "a section of element content that is marked for the parser to interpret as only character data, not markup." so we will include CDATA in such a way that text is inside it and parsing of it wouldn't result in error. 

Hope it helps

No comments: