String Parse


Gaviel

Good Day,

I just want to ask: if I download a page's source and store it in a variable, $sHTML, how can I parse some text out of it? For example, here's the source:

<DIV class="log log-child">
<DIV class=lmain>Neuer Schüler: VanBlattidCat ! </DIV></DIV>
<DIV class="log log-child">
<DIV class=lmain>Neuer Schüler: VanThenCrash ! </DIV></DIV>
<DIV class="log log-child">
<DIV class=lmain>Neuer Schüler: VanLogicalEwe ! </DIV>
<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>
<DIV class="log log-child">
<DIV class=lmain>Neuer Schüler: VanEthicalYam ! </DIV></DIV>
<DIV class="log log-child">
<DIV class=lmain>Neuer Schüler: VanRefinedBow ! </DIV>
<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>
<DIV class="log log-child">
<DIV class=lmain>Neuer Schüler: VanZestyClew ! </DIV></DIV>

Inside the source, the script should find the text VanBlattidCat. If the next DIV class is ldetails, the script should return True; if not, it should return False. If VanBlattidCat doesn't exist at all, it should return an error.

Any help, please?

Thank you.


You could parse it out with string manipulation, but nested tags can be confusing when doing that.

It might be easier to load the HTML to an instance of IE and use the DOM with _IE* functions to find what you want.

;)

Valuater's AutoIt 1-2-3, Class... Is now in Session!For those who want somebody to write the script for them: RentACoder"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law
What do you mean by that?

I'm sorry, I'm really new to this. Could you show me some sample code, if that's OK with you? :)

What I use to download the source is:

$sHTML = _IEDocReadHTML ($oIE)
Edited by Gaviel

What about this?

$sHTML = _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanBlattidCat ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanThenCrash ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanLogicalEwe ! </DIV>' & @LF & _
            '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanEthicalYam ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanRefinedBow ! </DIV>' & @LF & _
            '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanZestyClew ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">'

MsgBox(0, "Check", "Found: " & Check("VanThenCrash", $sHTML) & @CRLF & "Error: " & @error)
MsgBox(0, "Check", "Found: " & Check("VanRefinedBow", $sHTML) & @CRLF & "Error: " & @error)
MsgBox(0, "Check", "Found: " & Check("Test", $sHTML) & @CRLF & "Error: " & @error)

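Func Check($search, $sHTML, $2nd_line = "ldetails>")
    ; Split the source into lines (flag 2: no element count in $aSearch[0])
    Local $i, $aSearch = StringSplit($sHTML, @LF, 2)
    For $i = 0 To UBound($aSearch) - 1
        If StringInStr($aSearch[$i], $search & " !") Then
            ; Name found: True only if a next line exists and contains the ldetails DIV
            If $i + 1 < UBound($aSearch) Then
                If StringInStr($aSearch[$i + 1], $2nd_line) Then Return SetError(0, 0, True)
            EndIf
            Return SetError(0, 0, False)
        EndIf
    Next
    ; Name not found anywhere in the source
    Return SetError(1, 0, "Error")
EndFunc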

Br,

UEZ

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ

But that would be hard-coded with the HTML source I provided, right?

The HTML source will change from time to time, and so will the text I'm parsing.

So all I need is this function?

Func Check($search, $sHTML, $2nd_line = "ldetails>")
    Local $i, $aSearch = StringSplit($sHTML, @LF, 2)
    For $i = 0 To UBound($aSearch) - 2
        If StringInStr($aSearch[$i], $search & " !") Then
            If StringInStr($aSearch[$i + 1], $2nd_line) Then
                Return SetError(0, 0, True)
            Else
                Return SetError(0, 0, False)
            EndIf
        EndIf
    Next
    Return SetError(1, 0, "Error")
EndFunc

Thanks for the reply.


Dynamic code is hard to analyse every time! What will stay the same every time in the HTML code?

<DIV class=lmain>Neuer Schüler: VanBlattidCat ! </DIV></DIV>

"Neuer Schüler:"? Or "! </DIV></DIV>"?

Br,

UEZ


See _IEDocWriteHTML() in the help file to write your source to an instance of IE. Then use something like _IETagNameGetCollection($oIE, "div") to get a collection of all DIV tags and do with them as you please.

;)
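
A minimal sketch of that suggestion, assuming a blank IE window and a shortened copy of the sample HTML from the first post. It writes the source into the browser with _IEDocWriteHTML() and then lists each DIV's class and text; the <html>/<body> wrapper and the ConsoleWrite output format are illustrative choices, not part of the original source.

```autoit
#include <IE.au3>

; Shortened sample source (assumption: same structure as the real page)
Local $sHTML = '<html><body>' & _
        '<DIV class="log log-child">' & _
        '<DIV class=lmain>Neuer Schüler: VanLogicalEwe ! </DIV>' & _
        '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV>' & _
        '</DIV>' & _
        '</body></html>'

; Open a blank browser window and replace its document with the sample source
Local $oIE = _IECreate("about:blank")
_IEDocWriteHTML($oIE, $sHTML)

; Walk every DIV the browser parsed and show its class name and visible text
Local $colDiv = _IETagNameGetCollection($oIE, "div")
For $oDiv In $colDiv
    ConsoleWrite($oDiv.className & " -> " & _IEPropertyGet($oDiv, "innertext") & @CRLF)
Next

_IEQuit($oIE)
```

Once the DIVs are available as objects like this, checking the class of the element that follows a given name becomes a property lookup rather than a string search.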

"<DIV class=lmain>Neuer Schüler:" and "! </DIV>" will remain the same in HTML code ..

Only VanBlattidCat will change from time to time.

To make this clear: "VanBlattidCat", "VanThenCrash", "VanLogicalEwe" and so on are all names.

So every time I submit a new name, e.g. "NewName1", it will appear there:

<DIV class=lmain>Neuer Schüler: NewName1 ! </DIV>

Then I need to find "NewName1" in the HTML code and see if the next DIV class is "ldetails". If so, return True; otherwise return False. And if "NewName1" doesn't exist anywhere in the HTML code, return an error.

Thanks.
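
For reference, the requirement as stated can also be sketched with plain string functions that do not depend on where the line breaks fall. This is only a sketch: _HasDetails is a hypothetical helper name, and it assumes the source really contains the fixed text "Neuer Schüler: " before each name and writes the class as class=ldetails without quotes, as in the sample.

```autoit
; Hypothetical helper, not from the thread: inspects the first opening DIV tag
; that follows the name instead of relying on line breaks.
Func _HasDetails($sHTML, $sName)
    Local Const $sDetailsTag = "<DIV class=ldetails"
    ; Locate the name together with the fixed text around it
    Local $iPos = StringInStr($sHTML, "Neuer Schüler: " & $sName & " !")
    If $iPos = 0 Then Return SetError(1, 0, False) ; name is not in the source
    ; "</DIV>" does not match "<DIV ", so this finds the next opening tag only
    Local $iNext = StringInStr($sHTML, "<DIV ", 0, 1, $iPos)
    If $iNext = 0 Then Return SetError(0, 0, False) ; no DIV follows the name
    ; True if that opening tag is the ldetails DIV
    Return SetError(0, 0, StringMid($sHTML, $iNext, StringLen($sDetailsTag)) = $sDetailsTag)
EndFunc   ;==>_HasDetails

; Shortened sample source in the same format as the first post
Local $sSample = '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanBlattidCat ! </DIV></DIV>' & @LF & _
        '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanLogicalEwe ! </DIV>' & @LF & _
        '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>'

Local $aNames = StringSplit("VanBlattidCat|VanLogicalEwe|NoSuchName", "|", 2)
Local $bRet, $iErr
For $sName In $aNames
    $bRet = _HasDetails($sSample, $sName)
    $iErr = @error ; copy @error before any other function call resets it
    ConsoleWrite($sName & ": result=" & $bRet & ", @error=" & $iErr & @CRLF)
Next
```

With this sample, VanBlattidCat should give False, VanLogicalEwe True, and NoSuchName @error = 1. If the real page quotes the class attribute differently, the $sDetailsTag constant would need adjusting.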


Then the function I wrote should work, if I haven't forgotten something ;).

I'm not familiar with the IE stuff PsaltyDS wrote. Maybe it is much simpler to use the IE stuff.

Br,

UEZ

So that means I don't need to include this code?

$sHTML = _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanBlattidCat ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanThenCrash ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanLogicalEwe ! </DIV>' & @LF & _
            '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanEthicalYam ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanRefinedBow ! </DIV>' & @LF & _
            '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">' & @LF & _
            '<DIV class=lmain>Neuer Schüler: VanZestyClew ! </DIV></DIV>' & @LF & _
            '<DIV class="log log-child">'

Letting the IE DOM do the parsing for you, the function is only eight lines of code:

#include <IE.au3>

$sHTML = '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanBlattidCat ! </DIV>' & @LF & _
        '</DIV>' & @LF & _
        '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanThenCrash ! </DIV>' & @LF & _
        '</DIV>' & @LF & _
        '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanLogicalEwe ! </DIV>' & @LF & _
        '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV>' & @LF & _
        '</DIV>' & @LF & _
        '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanEthicalYam ! </DIV>' & @LF & _
        '</DIV>' & @LF & _
        '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanRefinedBow ! </DIV>' & @LF & _
        '<DIV class=ldetails>1 Erfahrungspunkt gewonnen. </DIV>' & @LF & _
        '</DIV>' & @LF & _
        '<DIV class="log log-child">' & @LF & _
        '<DIV class=lmain>Neuer Schüler: VanZestyClew ! </DIV>' & @LF & _
        '</DIV>' & @LF

$oIE = _IECreate()
_IEBodyWriteHTML($oIE, $sHTML)

$RET = _TestDiv("VanBlattidCat")
MsgBox(64, "Result", "VanBlattidCat = " & $RET)

$RET = _TestDiv("VanLogicalEwe")
MsgBox(64, "Result", "VanLogicalEwe = " & $RET)

Func _TestDiv($sSearch)
    $colDiv = _IETagNameGetCollection($oIE, "DIV")
    For $oDiv In $colDiv
        If StringInStr(_IEPropertyGet($oDiv, "innerText"), $sSearch) Then
            $oSib = $oDiv.nextSibling
            If IsObj($oSib) And $oSib.className & "" = "ldetails" Then Return True
        EndIf
    Next
    Return False
EndFunc   ;==>_TestDiv

;)

P.S. The @LF's in $sHTML are only there for easy human reading, the browser engine doesn't care if it was all one long line.

:)

Edited by PsaltyDS

I see. What's the difference between your code and the other code above?

I also got it working with the function from the post above yours:

#include <IE.au3>
#include <file.au3>
$oIE = _IECreate()
$sHTML = _IEDocReadHTML ($oIE)

$x = Check("INetSourceTest", $sHTML)

Func Check($search, $sHTML, $2nd_line = "ldetails>")
    Local $i, $aSearch = StringSplit($sHTML, @LF, 2)
    For $i = 0 To UBound($aSearch) - 2
        If StringInStr($aSearch[$i], $search & " !") Then
            If StringInStr($aSearch[$i + 1], $2nd_line) Then
                Return SetError(0, 0, True)
            Else
                Return SetError(0, 0, False)
            EndIf
        EndIf
    Next
    Return SetError(1, 0, "Error")
EndFunc

No difference in results if it works for you either way. I would expect using the DOM to directly address and read the desired elements would be easier to customize and maintain than string manipulation of the HTML, but that may be outside your comfort zone.

;)

I see, so the _TestDiv function is mainly for that? Which approach is the more accurate one? And could you add another return there, please? I mean, if the script doesn't find the name it's looking for, it should return an error. Also, how can I store the return value in a variable?

Thank you so much.


The example I posted already puts the return value in $RET before displaying it. If you don't want to use the Boolean True/False return value, just change True/False to whatever value you want after "Return".

;)

No, I'm fine with the Boolean. But can you add an error for when the name I'm looking for doesn't exist? ;)


You could make the returns something like:

Return SetError(0, 0, True) ; Success/Found

; ...

Return SetError(1, 0, False) ; Fail/Not found

Then in my example you could test the results of the function with any of the following:

If _TestDiv("VanBlattidCat") Then
    ; Success
Else
    ; Fail
EndIf

; --- or ---

$RET = _TestDiv("VanBlattidCat")
If $RET Then 
    ; success
Else
    ; fail
EndIf

; --- or ---

_TestDiv("VanBlattidCat")
If Not @error Then
    ; success
Else
    ; fail
EndIf

Did you mean a literal string value of "ERROR"? That would be a little odd, but you could do this and get either the matching string or "ERROR":

Func _TestDiv($sSearch)
    $colDiv = _IETagNameGetCollection($oIE, "DIV")
    For $oDiv In $colDiv
        If StringInStr(_IEPropertyGet($oDiv, "innerText"), $sSearch) Then
            $oSib = $oDiv.nextSibling
            If IsObj($oSib) And $oSib.className & "" = "ldetails" Then Return SetError(0, 0, $oDiv.innerText)
        EndIf
    Next
    Return SetError(1, 0, "ERROR")
EndFunc   ;==>_TestDiv

;)


I mean something like this. Let me try to explain:

For example, the script looks for VanBlattidCat. If the next DIV class is ldetails, it returns True; if not, it returns False. But if VanBlattidCat doesn't exist in the source at all, it returns an error:

            If StringInStr($aSearch[$i + 1], $2nd_line) Then
                Return SetError(0, 0, True)
            Else
                Return SetError(0, 0, False)
            EndIf
        EndIf
    Next
    Return SetError(1, 0, "Error")

OK, I think you mean like this:

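Func _TestDiv($sSearch)
    ; $bolRET: True once a matching DIV is followed by an ldetails DIV
    ; $iErr: stays 1 unless the search string is found in some DIV
    Local $bolRET = False, $iErr = 1, $oSib
    Local $colDiv = _IETagNameGetCollection($oIE, "DIV")
    For $oDiv In $colDiv
        If StringInStr(_IEPropertyGet($oDiv, "innerText"), $sSearch) Then
            $iErr = 0 ; the name exists somewhere in the page
            $oSib = $oDiv.nextSibling
            ; & "" forces className to a string before comparing
            If IsObj($oSib) And $oSib.className & "" = "ldetails" Then
                $bolRET = True
                ExitLoop
            EndIf
        EndIf
    Next
    Return SetError($iErr, 0, $bolRET)
EndFunc   ;==>_TestDiv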

This always returns True/False for the complete match, and also sets @error = 1 if the initial search string is never found at all.
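
The three outcomes described above (found with ldetails, found without it, not found) can be told apart by looking at both the return value and @error. The sketch below uses a hypothetical helper, _DescribeResult, which is not part of the thread's code; it only shows how the two values combine.

```autoit
; Hypothetical helper (not from the thread): turns the return value and the
; @error code of a _TestDiv()-style function into one of three messages.
Func _DescribeResult($bResult, $iError)
    If $iError Then Return "name not found in the page"
    If $bResult Then Return "name found, followed by an ldetails DIV"
    Return "name found, but no ldetails DIV after it"
EndFunc   ;==>_DescribeResult

; The three possible combinations, passed in by hand so the sketch runs on its own
ConsoleWrite(_DescribeResult(True, 0) & @CRLF)
ConsoleWrite(_DescribeResult(False, 0) & @CRLF)
ConsoleWrite(_DescribeResult(False, 1) & @CRLF)

; With the real function, copy @error right after the call, because the next
; function call resets it:
;   Local $bRet = _TestDiv("VanLogicalEwe")
;   Local $iErr = @error
;   ConsoleWrite(_DescribeResult($bRet, $iErr) & @CRLF)
```

Checking @error first matters here, since the return value is False both when the name is missing and when it is present without an ldetails DIV.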

;)
