Posted (edited)

Hi Experts,

Hope everyone had a good day!😄

I need your help with capturing the abbreviated terms below, together with their definitions, as XML. I have tried regex without success (maybe I need to call a friend for this😅), and I have been searching for a way to produce the XML. Please see below.

I have the following abbreviated terms and the definition of each term:

CD68    cluster of differentiation 68

CSI    chemical shift imaging

GFAP    glial fibrillary acidic protein

; etc. (more entries here)

Below is the sample XML output that I want to produce.

<definition xml:id="autoit4164-dl-0001">
<termPaired xml:id="autoit4164-lp-0001">
<termItemPair>
    <termItem xml:id="autoit4164-li-0001">CD68</termItem>
    <defItem xml:id="autoit4164-li-0001">cluster of differentiation 68</defItem>
</termItemPair>
<termItemPair>
    <termItem xml:id="autoit4164-li-0002">CSI</termItem>
    <defItem xml:id="autoit4164-li-0002">chemical shift imaging</defItem>
</termItemPair>
<termItemPair>
    <termItem xml:id="autoit4164-li-0003">GFAP</termItem>
    <defItem xml:id="autoit4164-li-0003">glial fibrillary acidic protein</defItem>
</termItemPair>
</termPaired>
</definition>

 

I tried searching for how to write the code but could not find any topic on it.😥 If anyone has an idea or a link, please guide me, Experts.

 

I need your help! Thanks in advance.

 

KS15

Edited by KickStarter15

Programming is "To make it so simple that there are obviously no deficiencies" or "To make it so complicated that there are no obvious deficiencies" by C.A.R. Hoare.

Posted

@FrancescoDiMuro, below is my attempt, based on something I found while searching, but it is not good. It does not create the output as expected; please read my concerns in the comments.

#include <FileConstants.au3>
#include <MsgBoxConstants.au3>
#include <WinAPIFiles.au3>
Local $oXML = ObjCreate("Microsoft.XMLDOM")
$oXML.loadxml("<definition />")
; I need to declare the second parent element here but how? "<termPaired xml:id="autoit4164-lp-0001">"
$oRoot = $oXML.selectSingleNode("//definition")


AddXML($oXML,$oRoot)
AddXML($oXML,$oRoot)
AddXML($oXML,$oRoot)


ConsoleWrite($oXML.xml & @CRLF)
$sFilePath = @ScriptDir&"\XML.xml"
$hFileOpen = FileOpen($sFilePath, $FO_APPEND)
FileWriteLine($hFileOpen, $oXML.xml)
FileClose($hFileOpen)
Exit

Func AddXML($oXML,$o)
    Local $i = $o.selectNodes("./termItemPaired").length + 1
    Local $oChild = $oXML.createElement("termItemPaired")
    $oChild.SetAttribute("xml:id", "autoit4164-dl-000" & $i)

; I also need to add the title here... <title type="main">Abbreviations used</title>
    Local $oChild_1 = $oXML.createElement("termItem")
    Local $oChild_2 = $oXML.createElement("defItem")

;~ in this part, the below text is repeating in every element which is incorrect.
    $oChild_1.text = "CD68"
    $oChild_2.text = "cluster of differentiation 68"

    $oChild.appendChild($oChild_1)
    $oChild.appendChild($oChild_2)

    $o.appendChild($oChild)
EndFunc

 

@junkew,

Thanks for the link. Yes, I checked there, but I could not find the part that covers my case. Most of what I found is not what I need; there are some regex examples, but I would like to avoid regex for this as much as possible.😅

 


Posted

Hi Experts,

Is there any other way to resolve this?😅

 

Thanks.

KS15


Posted (edited)

You can use the basic String* way with loop(s)

; Sample input: one "TERM<whitespace>definition" pair per line
$txt = "CD68    cluster of differentiation 68" & @CRLF & _
    "CSI    chemical shift imaging" & @CRLF & _
    "GFAP    glial fibrillary acidic protein"
; MsgBox(0, "", $txt)

$a = StringSplit($txt, @CRLF, 1) ; flag 1: treat @CRLF as one delimiter string
$out = ""
$n = StringFormat("%04i", 1)
$out &= '<definition xml:id="autoit4164-dl-' & $n & '">' & @CRLF
$out &= '<termPaired xml:id="autoit4164-lp-' & $n & '">' & @CRLF
For $i = 1 To $a[0]
    $n = StringFormat("%04i", $i)
    $out &= '<termItemPair>' & @CRLF
    ; term = leading run of word characters; definition = the rest after the whitespace
    $out &= @TAB & '<termItem xml:id="autoit4164-li-' & $n & '">' & StringRegExpReplace($a[$i], '^(\w+).*', "$1") & '</termItem>' & @CRLF
    $out &= @TAB & '<defItem xml:id="autoit4164-li-' & $n & '">' & StringRegExpReplace($a[$i], '^(\w+\h*)', "") & '</defItem>' & @CRLF
    $out &= '</termItemPair>' & @CRLF
Next
$out &= '</termPaired>' & @CRLF & '</definition>'
MsgBox(0, "", $out)
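
If you would rather stay with the XMLDOM object from the earlier attempt and avoid regex, something along these lines might work. Treat it as a rough sketch rather than a finished solution: `_AddPair` and the variable names are made up for illustration, and it assumes the term never contains a space and that term and definition are separated by spaces (not tabs). It creates the `termPaired` parent once, then passes each term and definition in as parameters so the text no longer repeats.

```autoit
; Sketch only: same structure as above, built with the MSXML DOM and no regex.
Local $sTxt = "CD68    cluster of differentiation 68" & @CRLF & _
        "CSI    chemical shift imaging" & @CRLF & _
        "GFAP    glial fibrillary acidic protein"

Local $oXML = ObjCreate("Microsoft.XMLDOM")
$oXML.loadXML("<definition />")
Local $oRoot = $oXML.documentElement
$oRoot.setAttribute("xml:id", "autoit4164-dl-0001")

; Second-level parent: created once and appended to the root
Local $oPaired = $oXML.createElement("termPaired")
$oPaired.setAttribute("xml:id", "autoit4164-lp-0001")
$oRoot.appendChild($oPaired)

Local $aLines = StringSplit($sTxt, @CRLF, 1)
Local $sLine, $iPos, $iCount = 0
For $i = 1 To $aLines[0]
    $sLine = StringStripWS($aLines[$i], 3) ; trim leading and trailing whitespace
    $iPos = StringInStr($sLine, " ")       ; assumption: the first space ends the term
    If $iPos = 0 Then ContinueLoop         ; skip blank or malformed lines
    $iCount += 1
    _AddPair($oXML, $oPaired, StringLeft($sLine, $iPos - 1), _
            StringStripWS(StringMid($sLine, $iPos + 1), 3), $iCount)
Next

ConsoleWrite($oXML.xml & @CRLF)

; Hypothetical helper: appends one <termItemPair> holding the given term and definition
Func _AddPair($oXML, $oParent, $sTerm, $sDef, $iIndex)
    Local $sId = "autoit4164-li-" & StringFormat("%04i", $iIndex)
    Local $oPair = $oXML.createElement("termItemPair")
    Local $oTerm = $oXML.createElement("termItem")
    $oTerm.setAttribute("xml:id", $sId)
    $oTerm.text = $sTerm
    Local $oDef = $oXML.createElement("defItem")
    $oDef.setAttribute("xml:id", $sId) ; same id on both items, mirroring the sample output
    $oDef.text = $sDef
    $oPair.appendChild($oTerm)
    $oPair.appendChild($oDef)
    $oParent.appendChild($oPair)
EndFunc
```

Setting `.text` lets the DOM escape characters such as `&` and `<` in the definitions, which the string-concatenation version does not do. The `.xml` property generally returns the markup without indentation, so the string version may still be the easier route if the exact layout matters.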

 

Edited by mikell
typo
Posted

@mikell, thank you so much. Even though it is basic, it did not come to my mind to do it that way. Very nice and understandable code. Thumbs up for you, mikell. Thanks!😁
