How to use _StringBetween in succession?


youtuber

Hi, I would like to avoid the complexity of regex and keep things simple. How can I use _StringBetween several times in succession, or is there another way? Thanks.

#include <Array.au3>
#include <String.au3>

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content">','<span class="synved-social-container') ; My question is about exactly this line
                $StringBetw2 = StringRegExpReplace($StringBetw2, "(?is)(<script[^>]+javascript.*?/script>)", "")
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
            Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

Func _oHTTPGet($aUrL)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.Open("GET", $aUrL, False)
    $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0")
    $oHTTP.Send()
    If @error Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " Not Connect " & @CRLF)
        $oHTTP = 0
        Return SetError(1)
    EndIf

    If $oHTTP.Status = 200 Then
        Local $sReceived = $oHTTP.ResponseText
        $oHTTP = Null
        Return $sReceived
    EndIf
   $oHTTP = Null
    Return -1
EndFunc

 


Have a look at the string that is in $aPostData[$s]. I don't think it contains the substrings you are searching for in your _StringBetween call. I wrote that text to the console, then copied and pasted it into Notepad++, and could not find the start or end substrings.
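
The same check can also be done in the script itself. The following is only a minimal sketch: the sample text in $sHtml is made up for illustration and stands in for $aPostData[$s], and the delimiters are the ones from the question. StringInStr returns 0 when a substring is missing, and _StringBetween sets @error instead of returning an array when nothing matches.

```autoit
#include <String.au3>

; Hypothetical sample text standing in for $aPostData[$s]
Local $sHtml = '<div class="entry-content"><p>Hello</p></div><!-- .entry-content -->'
Local $sStart = '<div class="entry-content">'
Local $sEnd = '<span class="synved-social-container'

; StringInStr returns the position of the substring, or 0 if it is not present
ConsoleWrite("Start delimiter position: " & StringInStr($sHtml, $sStart) & @CRLF)
ConsoleWrite("End delimiter position:   " & StringInStr($sHtml, $sEnd) & @CRLF)

; _StringBetween sets @error when no match is found, so test that
; before treating the result as an array
Local $aResult = _StringBetween($sHtml, $sStart, $sEnd)
If @error Then
    ConsoleWrite("No match, check the delimiters" & @CRLF)
Else
    ConsoleWrite("First match: " & $aResult[0] & @CRLF)
EndIf
```

If either position comes back as 0, the delimiter has to be adjusted to something that really occurs in the page source.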


I understand, thank you @mikell.
Do you see a problem in the rest of my code?

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->')
                For $i = 0 To UBound($StringBetw2) - 1
                $StringBetw2 = StringRegExpReplace($StringBetw2[$i], "(?is)(<script[^>]+javascript.*?/script>)", "")
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
                Next
                Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

 


Mikell is right, but it's also easy enough to stringify that array with _ArrayToString, and _StringBetween iterates its own damn self, so isn't this the same thing (save for whatever cleaning you were doing for the GET)?

For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _ArrayToString(_StringBetween($aPostData[$s], '<div class="entry-content">','<span class="synved-social-container'));My question is exactly for this line
;~                 $StringBetw2 = StringRegExpReplace($StringBetw2, "(?is)(<script[^>]+javascript.*?/script>)", "")
;~                 $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
;~                 $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
            Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

 


20 minutes ago, iamtheky said:

so isn't this the same thing (save for whatever cleaning you were doing for the GET)?

I am using those lines to strip the HTML tags from the output.

With the clean-up lines disabled:

[screenshot: kOhDv4WgQnyRckiLkTRL-w.png]

With them enabled:

[screenshot: IfOAejuCSgOuGAQta-lvUg.png]

So should I use this?

#include <Array.au3>
#include <String.au3>

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _ArrayToString(_StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->'));My question is exactly for this line
                $StringBetw2 = StringRegExpReplace($StringBetw2, "(?is)(<script[^>]+javascript.*?/script>)", "")
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
            Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

Func _oHTTPGet($aUrL)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.Open("GET", $aUrL, False)
    $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0")
    $oHTTP.Send()
    If @error Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " Not Connect " & @CRLF)
        $oHTTP = 0
        Return SetError(1)
    EndIf

    If $oHTTP.Status = 200 Then
        Local $sReceived = $oHTTP.ResponseText
        $oHTTP = Null
        Return $sReceived
    EndIf
   $oHTTP = Null
    Return -1
EndFunc

 


nice, combining some of those seems like a sporting next task, prior to nesting all that shit on one line for fun.
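
As a rough sketch of that combining step, the three clean-up passes used in the scripts above could be wrapped in one helper. The name _CleanHtml is hypothetical and not part of any UDF, and the sample input on the last line is made up. The patterns are copied from the posted code as they are.

```autoit
; Hypothetical helper (not from the thread): applies the three clean-up
; passes from the posted scripts to one string and returns the result
Func _CleanHtml($sHtml)
    ; Drop inline javascript blocks
    $sHtml = StringRegExpReplace($sHtml, "(?is)(<script[^>]+javascript.*?/script>)", "")
    ; Replace every remaining tag with a line break
    $sHtml = StringRegExpReplace($sHtml, '(?s)<.*?>', @CRLF)
    ; Remove runs of &nbsp; entities
    $sHtml = StringRegExpReplace($sHtml, '(&nbsp;)+', "")
    Return $sHtml
EndFunc

; Usage with made-up sample input
ConsoleWrite(_CleanHtml('<p>Hello&nbsp;world</p>') & @CRLF)
```

Inside the loop this would be called once per element returned by _StringBetween, which also keeps the array itself untouched.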


@mikell So is this okay?

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->')
                For $i = 0 To UBound($StringBetw2) - 1
                    If IsArray($StringBetw2) Then
                        $StringBetw2 = StringRegExpReplace($StringBetw2[$i], "(?is)(<script[^>]+javascript.*?/script>)", "")
                        $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                        $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
                        EndIf
                    Next
                    Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

 


Well, I meant: check the array before the loop, so you can see the results/errors.

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
If IsArray($aData) Then
  For $j = 0 To UBound($aData) - 1
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        If IsArray($aPostData) Then
           For $s = 0 To UBound($aPostData) - 1
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->')
                If IsArray($StringBetw2) Then
                   For $i = 0 To UBound($StringBetw2) - 1
                        ; Work on a separate string so the $StringBetw2 array is not overwritten inside its own loop
                        $sClean = StringRegExpReplace($StringBetw2[$i], "(?is)(<script[^>]+javascript.*?/script>)", "")
                        $sClean = StringRegExpReplace($sClean, '(?s)<.*?>', "" & @CRLF)
                        $sClean = StringRegExpReplace($sClean, '(&nbsp;)+', "")
                        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $sClean & @CRLF)
                    Next
                Else
                    ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem $StringBetw2" & @CRLF)
                EndIf
           Next
       Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem $aPostData" & @CRLF)
       EndIf
  Next
Else
     ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem $aData" & @CRLF)
EndIf

 
