
Problem with _StringBetween(HTML,tag_START,tag_END)



I'm trying to extract the content inside an HTML tag and haven't found a solution yet.

If you have any ideas, please help. Thank you.

Script:

#include <String.au3>

Global $HTML_Test
$HTML_Test &= '<div class="accordion-item">' & @CRLF  ; <!---- START GET-->
$HTML_Test &= ' <div class="accordion-inner">' & @CRLF
$HTML_Test &= '     <p>Khoá an toàn giúp bếp luôn được an toàn</p>' & @CRLF
$HTML_Test &= ' </div>' & @CRLF
$HTML_Test &= ' <a href="#" class="accordion-title plain">' & @CRLF
$HTML_Test &= '     <button class="toggle">' & @CRLF
$HTML_Test &= '         <i class="icon-angle-down"></i>' & @CRLF
$HTML_Test &= '     </button>' & @CRLF
$HTML_Test &= '     <span>Khoá an toàn</span>' & @CRLF
$HTML_Test &= ' </a>' & @CRLF
$HTML_Test &= '</div>' & @CRLF  ;<!---- END GET -->

Global $aSearch = _StringBetween($HTML_Test, '<div class="accordion-item">', '</div>')
If IsArray($aSearch) Then
    For $i = 0 To UBound($aSearch) - 1
        ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
    Next
Else
    ConsoleWrite('! SB: No strings found. ' & @CRLF)
EndIf

 

Unexpected output:

<div class="accordion-inner">
        <p>Khoá an toàn giúp bếp luôn được an toàn</p>

 

Input:

<div class="accordion-item">  
    <div class="accordion-inner">
        <p>Khoá an toàn giúp bếp luôn được an toàn</p>
    </div>
    <a href="#" class="accordion-title plain">
        <button class="toggle">
            <i class="icon-angle-down"></i>
        </button>
        <span>Khoá an toàn</span>
    </a>
</div>

Desired output:

<div class="accordion-inner">
        <p>Khoá an toàn giúp bếp luôn được an toàn</p>
    </div>
    <a href="#" class="accordion-title plain">
        <button class="toggle">
            <i class="icon-angle-down"></i>
        </button>
        <span>Khoá an toàn</span>
    </a>

 


Regards,
 


Hm, the cause seems simple: _StringBetween is doing what it should. It returns the text between your

<div class="accordion-item">

and the first

</div>

without including the search strings themselves. So you'd need to find a different end string or solve it with a regex. In your case, you could extend the end string so that it only matches the outer closing tag:

#include <String.au3>

Global $HTML_Test
$HTML_Test &= '<div class="accordion-item">' & @CRLF  ; <!---- START GET-->
$HTML_Test &= ' <div class="accordion-inner">' & @CRLF
$HTML_Test &= '     <p>Khoá an toàn giúp bếp luôn được an toàn</p>' & @CRLF
$HTML_Test &= ' </div>' & @CRLF
$HTML_Test &= ' <a href="#" class="accordion-title plain">' & @CRLF
$HTML_Test &= '     <button class="toggle">' & @CRLF
$HTML_Test &= '         <i class="icon-angle-down"></i>' & @CRLF
$HTML_Test &= '     </button>' & @CRLF
$HTML_Test &= '     <span>Khoá an toàn</span>' & @CRLF
$HTML_Test &= ' </a>' & @CRLF
$HTML_Test &= '</div>' & @CRLF  ;<!---- END GET -->

Global $aSearch = _StringBetween($HTML_Test, '<div class="accordion-item">', '</a>' & @CRLF & '</div>')
If IsArray($aSearch) Then
    For $i = 0 To UBound($aSearch) - 1
        ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
    Next
Else
    ConsoleWrite('! SB: No strings found. ' & @CRLF)
EndIf

This gives you the inner text. Of course, the closing </a> is lost because it is part of the $sEnd string.

Or with a regex:

#include <String.au3>

Global $HTML_Test
$HTML_Test &= '<div class="accordion-item">' & @CRLF  ; <!---- START GET-->
$HTML_Test &= ' <div class="accordion-inner">' & @CRLF
$HTML_Test &= '     <p>Khoá an toàn giúp bếp luôn được an toàn</p>' & @CRLF
$HTML_Test &= ' </div>' & @CRLF
$HTML_Test &= ' <a href="#" class="accordion-title plain">' & @CRLF
$HTML_Test &= '     <button class="toggle">' & @CRLF
$HTML_Test &= '         <i class="icon-angle-down"></i>' & @CRLF
$HTML_Test &= '     </button>' & @CRLF
$HTML_Test &= '     <span>Khoá an toàn</span>' & @CRLF
$HTML_Test &= ' </a>' & @CRLF
$HTML_Test &= '</div>' & @CRLF  ;<!---- END GET -->

; Global $aSearch = _StringBetween($HTML_Test, '<div class="accordion-item">', '</a>' & @CRLF & '</div>')
; (?s) lets . match line breaks; the greedy (.*) runs up to the LAST </div> in the input.
Global $aSearch = StringRegExp($HTML_Test, '(?s)<div class="accordion-item">(.*)</div>', 3)
If IsArray($aSearch) Then
    For $i = 0 To UBound($aSearch) - 1
        ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
    Next
Else
    ConsoleWrite('! SB: No strings found. ' & @CRLF)
EndIf

best regards,

Marc


Any of my own codes posted on the forum are free for use by others without any restriction of any kind. (WTFPL)


I can't make the closing div tag any more specific than that for _StringBetween().
And your regex code doesn't work correctly either.

Script with regex:

#include <String.au3>

Global $HTML_Test
$HTML_Test &= '<div class="0">[' & @CRLF  ;
$HTML_Test &= '<code unknown code 1>' & @CRLF  ;
$HTML_Test &= '<div class="1">[' & @CRLF  ;
$HTML_Test &= '<code unknown code 2>' & @CRLF  ;
$HTML_Test &= '<div class="2 3 4">[' & @CRLF  ;
$HTML_Test &= '<code unknown code 3>' & @CRLF  ;
$HTML_Test &= '2]</div>' & @CRLF  ;
$HTML_Test &= '<code unknown code 4>' & @CRLF  ;
$HTML_Test &= '1]</div>' & @CRLF  ;
$HTML_Test &= '<code unknown code 5>' & @CRLF  ;
$HTML_Test &= '0]</div>' & @CRLF  ;

Global $rSearch = _BetweenString($HTML_Test, '<div class="1">', '</div>')
ConsoleWrite('! ============================' & @CRLF & $rSearch & @CRLF & '! ============================' & @CRLF)
Exit

Func _BetweenString($iString, $iStart, $iEnd)
    Local $aSearch = StringRegExp($iString, '(?s)' & $iStart & '(.*)' & $iEnd, 3)
    If IsArray($aSearch) Then
        For $i = 0 To UBound($aSearch) - 1
            ;ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
            If ($aSearch[$i] <> "") Then Return $aSearch[$i]
        Next
    Else
        ConsoleWrite('! SB: No strings found. ' & @CRLF)
    EndIf
EndFunc   ;==>_BetweenString

 

Regards,
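
One possible way around both problems: neither _StringBetween() nor a single (.*) or (.*?) capture can tell which </div> closes which <div>, so the matching close has to be found by counting nesting depth. The sketch below is not from the thread. _DivInner and its parameter names are hypothetical, and it assumes the markup is balanced and that every "<div" in the input really starts a div tag (no comments, scripts, or similarly named tags).

```autoit
; Shortened test input with the same nesting as in the post above.
Global $sTest = '<div class="0">[' & @CRLF & _
        '<div class="1">[' & @CRLF & _
        '<div class="2 3 4">[' & @CRLF & _
        '2]</div>' & @CRLF & _
        '1]</div>' & @CRLF & _
        '0]</div>' & @CRLF

Global $sInner = _DivInner($sTest, '<div class="1">')
If @error Then
    ConsoleWrite('! No balanced match found.' & @CRLF)
Else
    ConsoleWrite($sInner & @CRLF)
EndIf

; Hypothetical helper: returns the text between $sOpenTag and its matching
; </div>, counting nested <div ...> / </div> pairs on the way.
; @error = 1: $sOpenTag not found, @error = 2: no matching </div>.
Func _DivInner($sHtml, $sOpenTag)
    Local $iStart = StringInStr($sHtml, $sOpenTag)
    If $iStart = 0 Then Return SetError(1, 0, "")

    Local $iInner = $iStart + StringLen($sOpenTag) ; first character after the opening tag
    Local $iPos = $iInner
    Local $iDepth = 1 ; we are inside the requested div
    Local $iOpen, $iClose

    While $iDepth > 0
        ; Next opening and closing div tag at or after the current position.
        $iOpen = StringInStr($sHtml, "<div", 0, 1, $iPos)
        $iClose = StringInStr($sHtml, "</div>", 0, 1, $iPos)
        If $iClose = 0 Then Return SetError(2, 0, "") ; unbalanced markup

        If $iOpen > 0 And $iOpen < $iClose Then
            ; A nested div opens before the next close: go one level deeper.
            $iDepth += 1
            $iPos = $iOpen + StringLen("<div")
        Else
            ; The next tag is a close: go one level up.
            $iDepth -= 1
            If $iDepth = 0 Then Return StringMid($sHtml, $iInner, $iClose - $iInner)
            $iPos = $iClose + StringLen("</div>")
        EndIf
    WEnd
EndFunc   ;==>_DivInner
```

With the input above, the result should start right after <div class="1"> and end just before the </div> that follows 1], so the inner <div class="2 3 4"> block stays intact. For real-world pages, a proper HTML parser is the more robust choice; this is only a minimal string-scanning sketch.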
 
