Solved
[XML Records] Two header/declaration rows causing problems
Question
Forge component by Afonso Carvalho
44
Published on 25 Nov 2019

Hi,
XmlToRecordList has an input parameter for ignoring the declaration row if there is one. I'm working with XML files that have two such rows. Is there a way to ignore both?

Example:

<?xml version="1.0" encoding="ISO-8859-15"?>
<?xml-stylesheet type="text/xsl" href="Style.xsl"?>

<note >
<to>Tove</to>
<from>Tim</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Solution

Hi,

It looks like a bug: the action cannot ignore more than one header row. I found a workaround.

Use the Regex_Replace action from the Text extension to remove all header rows before you pass the XML string to XmlToRecordList. The steps are below.

 

The regular expression "<\?xml.*\?>" removes the header rows from the XML string. In the example, these are the two lines that get removed:

<?xml version="1.0" encoding="ISO-8859-15"?>
<?xml-stylesheet type="text/xsl" href="Style.xsl"?>

Then pass Regex_Replace.Result as the input to XmlToRecordList.
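For anyone who wants to try the pattern outside OutSystems, here is a minimal Python sketch of the same idea. It is only a stand-in for Regex_Replace, which runs on a different regex engine. The function name `strip_xml_headers` is hypothetical, and the lazy quantifier (`.*?`) is an assumption added here so the match stops at the first `?>` even when both headers sit on one line.

```python
import re

# Pattern from the post, with a lazy quantifier (.*?) as an added assumption
# so each match ends at the nearest "?>" instead of the last one on the line.
HEADER_PATTERN = re.compile(r"<\?xml.*?\?>")


def strip_xml_headers(xml_text: str) -> str:
    """Remove the XML declaration and any <?xml-...?> instructions."""
    return HEADER_PATTERN.sub("", xml_text).strip()


sample = '''<?xml version="1.0" encoding="ISO-8859-15"?>
<?xml-stylesheet type="text/xsl" href="Style.xsl"?>

<note>
<to>Tove</to>
<from>Tim</from>
</note>'''

cleaned = strip_xml_headers(sample)
print(cleaned)
```

The cleaned string starts directly at the root element, which is the shape XmlToRecordList would then receive.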


I tested this and it works.


Thanks

Sourav


Olli, I'm glad you got this sorted.

Just to provide some context here, this is less a bug and more a mismatch in expectations. The input you mention is indeed used to ignore the XML declaration, and a document can have only one of those: more than one is not allowed in well-formed XML. The xml-stylesheet line is a processing instruction, which is a separate construct and not part of the declaration, so it is not affected by the input.
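The distinction can be seen with any standard XML parser. The sketch below uses Python's built-in `xml.dom.minidom` on a shortened version of the sample from the question. This is an illustration only and says nothing about how the extension parses XML internally. The stylesheet line shows up as a processing-instruction node, while a second XML declaration is rejected by the parser.

```python
from xml.dom import Node, minidom
from xml.parsers.expat import ExpatError

# Shortened sample: one declaration, one stylesheet instruction, one root element.
sample = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/xsl" href="Style.xsl"?>\n'
    '<note><to>Tove</to></note>'
)

doc = minidom.parseString(sample)

# Collect the targets of all top-level processing instructions.
pi_targets = [
    node.target
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
print(pi_targets)

# A document with two XML declarations is not well-formed, so parsing fails.
try:
    minidom.parseString('<?xml version="1.0"?>\n<?xml version="1.0"?>\n<note/>')
    second_declaration_rejected = False
except ExpatError:
    second_declaration_rejected = True
print(second_declaration_rejected)
```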

I'm going to review the documentation and the Action inputs to try to make this clearer. I'll think of a way to add this functionality in the future - I want to avoid creating a separate input for each optional XML tag. This might involve accepting a list of tags to ignore, but I'll speak with the team.
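To picture the "list of tags to ignore" idea, here is a rough Python sketch. It is purely illustrative and is not the extension's API: `strip_instructions` and its `targets` parameter are hypothetical names, and a regex-based approach is just one possible way to do it.

```python
import re


def strip_instructions(xml_text, targets):
    """Remove <?target ...?> instructions whose target name is in `targets`."""
    if not targets:
        return xml_text
    names = "|".join(re.escape(t) for t in targets)
    # The lookahead requires whitespace or "?" right after the name,
    # so the target "xml" does not also match "xml-stylesheet".
    pattern = re.compile(rf"<\?(?:{names})(?=[\s?]).*?\?>", re.DOTALL)
    return pattern.sub("", xml_text)


sample = (
    '<?xml version="1.0" encoding="ISO-8859-15"?>\n'
    '<?xml-stylesheet type="text/xsl" href="Style.xsl"?>\n'
    '<note><to>Tove</to></note>'
)

# Drop only the stylesheet instruction and keep the declaration.
without_stylesheet = strip_instructions(sample, ["xml-stylesheet"])

# Drop both header rows.
without_headers = strip_instructions(sample, ["xml", "xml-stylesheet"]).strip()
print(without_headers)
```

A single list input like this would avoid one separate input per optional tag, which is the concern raised above.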

Thank you for your interest in the extension, and thank you Sourav for providing a solution.