
process XML file


rudi

Hello,

After some initial searching I found that there is an XML.au3 UDF available.

 

The XML data are coming from a web services interface of a PLC (retrieving these data from http://10.27.20.101:8080/user/errors is no problem).

 

A "no-error-situation" result will look like this:

<eta version="1.0">
  <errors uri="/user/errors">
    <fub uri="/25/10341" name="Kessel2"/>
    <fub uri="/25/10241" name="Sys2"/>
    <fub uri="/26/10301" name="Kessel"/>
    <fub uri="/24/10341" name="Kessel1"/>
    <fub uri="/24/10241" name="Sys1"/>
    <fub uri="/33/10361" name="Asche2"/>
    <fub uri="/32/10361" name="Asche1"/>
    <fub uri="/123/10251" name="Puffer"/>
    <fub uri="/123/10241" name="AF"/>
  </errors>
</eta>

 

An example error XML data set I received from the vendor looks like this:

<eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">
    <errors uri="/user/errors">
        <fub uri="/121/10441" name="Raum 1.1">
            <error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">
                Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung
            </error>
        </fub>
        <fub uri="/120/10601" name="PufferFlex"/>
        <fub uri="/120/10111" name="WW"/>
        <fub uri="/120/10101" name="HK"/>
        <fub uri="/120/10481" name="ER-HK"/>
    </errors>
</eta>

 

Querying the error node URL for /121/10441 explicitly returns this partial XML content:

<eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">
    <errors uri="/user/errors/121/10441">
        <fub uri="/121/10441" name="Raum 1.1">
            <error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">
                Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung
            </error>
        </fub>
    </errors>
</eta>

 

The data I need to extract from the error example would be:

$Location="Raum 1.1"
$Device="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung"
$Status="Fehler" 
$Time="2019-04-18 12:36:02"
$Message="Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung"

 

What's the best approach to grab only the error information from XML content like the examples shown above?

 

TIA, Rudi.

 

Edited by rudi

Earth is flat, pigs can fly, and Nuclear Power is SAFE!


@rudi
You could use a regular expression (StringRegExp) to extract the fields from your XML file and then arrange them like this:

#include <Array.au3>
#include <StringConstants.au3>

Global $strFileContent = '<fub uri="/121/10441" name="Raum 1.1">' & @CRLF & _
                         '<error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & _
                         'Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @CRLF & _
                         '</error>' & @CRLF & _
                         '</fub>', _
       $arrResult, _
       $arrMessages[5] = ["Location", "Device", "Status", "Time", "Message"], _
       $arrArray[0][2]

$arrResult = StringRegExp($strFileContent, '(?s)<fub uri="[^"]+" name="([^"]+)">.*?<error msg="([^"]+)" priority="([^"]+)" time="([^"]+)">([^<]+)<\/error>', $STR_REGEXPARRAYGLOBALMATCH)
If IsArray($arrResult) Then

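    ; Flag 3 returns one flat array with five captures per matched <error>,
    ; so Mod() selects the right label even when several errors are found.
    For $i = 0 To UBound($arrResult) - 1 Step 1

        ; Trim the whitespace around each value, then replace any remaining @CRLF
        ; (the row delimiter of _ArrayAdd) with @CR
        _ArrayAdd($arrArray, $arrMessages[Mod($i, 5)] & "|" & StringReplace(StringStripWS($arrResult[$i], $STR_STRIPLEADING + $STR_STRIPTRAILING), @CRLF, @CR))
        If @error Then Exit ConsoleWrite("_ArrayAdd() ERR: " & @error & @CRLF)
    Next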

    _ArrayDisplay($arrArray)

EndIf

:)
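Because flag 3 returns one flat array, several errors in one response arrive as consecutive groups of five captures. Below is a minimal sketch of walking that array in steps of five and filling the variables from the original question. The second error entry ("HK", "Warnung", its time and its texts) is made-up sample data for illustration, not something the PLC returned.

```autoit
#include <StringConstants.au3>

; Sample input: the first <fub> is from the thread, the third one is invented test data.
Local $sXml = '<fub uri="/121/10441" name="Raum 1.1">' & _
        '<error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & @CRLF & _
        '    Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @CRLF & _
        '</error></fub>' & _
        '<fub uri="/120/10111" name="WW"/>' & _
        '<fub uri="/120/10101" name="HK">' & _
        '<error msg="Hypothetical device" priority="Warnung" time="2019-04-18 12:40:00">Hypothetical message</error></fub>'

Local $sPattern = '(?s)<fub uri="[^"]+" name="([^"]+)">.*?<error msg="([^"]+)" priority="([^"]+)" time="([^"]+)">([^<]+)<\/error>'

Local $aFlat = StringRegExp($sXml, $sPattern, $STR_REGEXPARRAYGLOBALMATCH)
If @error Then Exit ConsoleWrite("No error entries found" & @CRLF)

; Each matched <error> contributes exactly five consecutive elements.
For $i = 0 To UBound($aFlat) - 5 Step 5
    Local $sLocation = $aFlat[$i]
    Local $sDevice = $aFlat[$i + 1]
    Local $sStatus = $aFlat[$i + 2]
    Local $sTime = $aFlat[$i + 3]
    ; The message text is surrounded by line breaks and indentation in the XML.
    Local $sMessage = StringStripWS($aFlat[$i + 4], $STR_STRIPLEADING + $STR_STRIPTRAILING)

    ConsoleWrite($sTime & " | " & $sLocation & " | " & $sStatus & " | " & $sDevice & " | " & $sMessage & @CRLF)
Next
```

Note that the pattern only picks up the first `<error>` inside each `<fub>`; a block with several errors would need a different pattern or a second pass.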


@FrancescoDiMuro

 

Wow!

Standing ovation!

I had tried to work out a regex myself before, but failed miserably!

 

$RegEx='(?s)<fub uri="[^"]+" name="([^"]+)">.*?<error msg="([^"]+)" priority="([^"]+)" time="([^"]+)">([^<]+)<\/error>'

 

Thanks!

Rudi.


Just for fun...

#Include <Array.au3>

$txt = '<eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">' & @crlf & _ 
    '    <errors uri="/user/errors/121/10441">' & @crlf & _ 
    '        <fub uri="/121/10441" name="Raum 1.1">' & @crlf & _ 
    '            <error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & @crlf & _ 
    '                Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @crlf & _ 
    '            </error>' & @crlf & _ 
    '        </fub>' & @crlf & _ 
    '    </errors>' & @crlf & _ 
    '</eta>'
; Msgbox(0,"", $txt)

$res = StringRegExp($txt, '(?|(?:name|msg|priority|time)="([^"]*)|(?m)^\h*([^<]+?)$)', 3)
 _ArrayDisplay($res)

 


I'm familiar with alternation, as well as with [non-]capturing groups.

But what exactly does a "Branch Reset Group" do?

And the 2nd alternative, hm...

  • (?m) = ^ and $ match at the beginning and end of each line (not of the full string) --> doesn't that apply to the 1st alternative as well? :hm:
  • \h* = "skip leading whitespace, if any" (outside the capturing group)?
  • [^<]+? = match anything but "<" (lazy)
  • The explanation I read says it matches a literal "<"; where is that specified? (I don't see a literal "<".)

And I don't understand how the result array is populated.

 

<edit> With the help of this explanation I'm pretty close to understanding your regex:
 

https://www.regular-expressions.info/branchreset.html

 

How the array gets populated still isn't clear to me at all...

Edited by rudi


For  ?|  the help file says: "Non-capturing group with reset. Resets capturing group numbers in each top-level alternative it contains"
In practice, this means that both alternatives store their capture in group 1, so you won't get an unwanted empty element in the resulting array when the first alternative fails and the second one matches. You can check this by replacing  ?|  with  ?:
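The suggested check can be run side by side. This is a small sketch on a shortened, hypothetical input (the attribute values "A" and "T" and the text are placeholders): the only difference between the two calls is `?|` versus `?:`, and the second result should show the empty element described above at the position where the first alternative did not match.

```autoit
#include <Array.au3>

; Shortened stand-in for the PLC response; the values are placeholders.
Local $sTxt = '<error msg="A" time="T">' & @CRLF & _
        '    some text' & @CRLF & _
        '</error>'

; Branch reset group: both alternatives write to capture group 1.
Local $aReset = StringRegExp($sTxt, '(?|(?:msg|time)="([^"]*)|(?m)^\h*([^<]+?)$)', 3)

; Plain non-capturing group: the alternatives are numbered as groups 1 and 2.
Local $aPlain = StringRegExp($sTxt, '(?:(?:msg|time)="([^"]*)|(?m)^\h*([^<]+?)$)', 3)

_ArrayDisplay($aReset, "With ?| (branch reset)")
_ArrayDisplay($aPlain, "With ?: (no reset)")
```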

For the 2nd alternative:
(?m)  allows ^ and $ to "match beginning and end of line (not full string)". Only the 2nd alternative uses this feature, which is why I put it there, but it could be placed at the beginning of the expression as well.

In plain language, (?m)^\h*([^<]+?)$ means:
Between the beginning and the end of a line (the anchors are important because they force the whole line to be checked), match 0 or more horizontal whitespace characters (not captured) followed by 1 or more non-"<" characters (captured). So if there is a < char in the line, the whole match fails for that line.
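To see this line-by-line behaviour in isolation, the second alternative can be tried on its own against three single lines taken from the sample XML (the message line is shortened here). This is only a sketch: under the reading above, the two tag lines should report no match, and the text line should be captured without its leading indentation because `\h*` sits outside the capturing group.

```autoit
; Three sample lines: two contain a "<", one is plain text.
Local $aLines[3] = ['        <fub uri="/121/10441" name="Raum 1.1">', _
        '                Sicherung am Ventilcontroller 0 defekt', _
        '        </fub>']

For $sLine In $aLines
    ; Flag 1 returns the captures of the first match, or sets @error if nothing matches.
    Local $aHit = StringRegExp($sLine, '(?m)^\h*([^<]+?)$', 1)
    If @error Then
        ConsoleWrite("no match : " & $sLine & @CRLF)
    Else
        ConsoleWrite("captured : [" & $aHit[0] & "]" & @CRLF)
    EndIf
Next
```

This is also how the result array of the full expression gets populated: every successful match, from either alternative, appends its group 1 capture in the order the matches occur in the text.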

:)
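As an alternative to the regex approaches in this thread, the same fields can be read with an XML parser. The sketch below is not what anyone above posted; it assumes the MSXML COM component (`Microsoft.XMLDOM`) is available on the machine, and it uses `local-name()` in the XPath so that it works both with and without the `xmlns` declaration seen in the two samples.

```autoit
#include <StringConstants.au3>

; Sample response from the thread, with one error and one error-free <fub>.
Local $sXml = '<eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">' & _
        '<errors uri="/user/errors"><fub uri="/121/10441" name="Raum 1.1">' & _
        '<error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & @CRLF & _
        '    Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @CRLF & _
        '</error></fub><fub uri="/120/10111" name="WW"/></errors></eta>'

; Assumption: MSXML is installed (it ships with Windows).
Local $oXml = ObjCreate("Microsoft.XMLDOM")
If Not IsObj($oXml) Then Exit ConsoleWrite("MSXML is not available" & @CRLF)

$oXml.async = False
If Not $oXml.loadXML($sXml) Then Exit ConsoleWrite("Parse error: " & $oXml.parseError.reason & @CRLF)

; Use full XPath so that local-name() can be used to ignore the namespace.
$oXml.setProperty("SelectionLanguage", "XPath")

For $oError In $oXml.selectNodes("//*[local-name()='error']")
    ; The parent <fub> element carries the location name.
    ConsoleWrite("Location: " & $oError.parentNode.getAttribute("name") & @CRLF)
    ConsoleWrite("Device:   " & $oError.getAttribute("msg") & @CRLF)
    ConsoleWrite("Status:   " & $oError.getAttribute("priority") & @CRLF)
    ConsoleWrite("Time:     " & $oError.getAttribute("time") & @CRLF)
    ConsoleWrite("Message:  " & StringStripWS($oError.text, $STR_STRIPLEADING + $STR_STRIPTRAILING) & @CRLF)
Next
```

A parser-based version does not depend on attribute order or line layout, and it also handles several `<error>` elements inside one `<fub>`, at the cost of relying on a COM object.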

Edited by mikell
