How to use _StringBetween in succession?

youtuber

Hi, I would like to avoid the complexity of regex and keep things simpler. How can I use _StringBetween in succession (that is, call it again on the result of a previous _StringBetween), or is there another way? Thanks.

#include <Array.au3>
#include <String.au3>

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content">','<span class="synved-social-container');My question is exactly for this line
                $StringBetw2 = StringRegExpReplace($StringBetw2, "(?is)(<script[^>]+javascript.*?/script>)", "")
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
            Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

Func _oHTTPGet($aUrL)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.Open("GET", $aUrL, False)
    $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0")
    $oHTTP.Send()
    If @error Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " Not Connect " & @CRLF)
        $oHTTP = 0
        Return SetError(1)
    EndIf

    If $oHTTP.Status = 200 Then
        Local $sReceived = $oHTTP.ResponseText
        $oHTTP = Null
        Return $sReceived
    EndIf
   $oHTTP = Null
    Return -1
EndFunc

 

Jfish

Have a look at the string that is in $aPostData[$s].  I don't think it contains the substrings you are searching on in your _StringBetween.  I put that text to the console, then copied and pasted it into NotePad++ and could not find the start or end substrings.  
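
That check can also be done from the script instead of by eye. A rough sketch, assuming nothing beyond the built-in StringInStr: `_HasDelimiters` is a hypothetical helper name (not part of String.au3), and the sample string is made up for illustration rather than copied from the real page.

```autoit
; Hypothetical helper: True only if both delimiters occur in $sText,
; with the end delimiter somewhere after the start delimiter.
Func _HasDelimiters($sText, $sStart, $sEnd)
    Local $iStart = StringInStr($sText, $sStart)
    If $iStart = 0 Then Return False
    ; Look for the end delimiter, starting right after the start delimiter
    Return StringInStr($sText, $sEnd, 0, 1, $iStart + StringLen($sStart)) > 0
EndFunc

; Made-up sample: the class attribute carries an extra value, so the exact
; text '<div class="entry-content">' does not occur in it.
Local $sSample = '<div class="entry-content clearfix"><p>Hello</p></div><!-- .entry-content -->'

ConsoleWrite(_HasDelimiters($sSample, '<div class="entry-content">', '<!-- .entry-content -->') & @CRLF)
ConsoleWrite(_HasDelimiters($sSample, '<div class="entry-content', '<!-- .entry-content -->') & @CRLF)
```

Running this kind of test before each _StringBetween call shows which delimiter is missing, which is usually quicker than pasting the text into an editor.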

youtuber

The substrings are there in the page source:

view-source:https://www.autoitscript.com/site/autoit-news/autoit-v3-3-14-0-released/

 

youtuber

I understand, thank you @mikell.
Do you see any problems in the rest of my code?

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->')
                For $i = 0 To UBound($StringBetw2) - 1
                $StringBetw2 = StringRegExpReplace($StringBetw2[$i], "(?is)(<script[^>]+javascript.*?/script>)", "")
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
                Next
                Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

 

iamtheky

Mikell is right, but it's also easy enough to stringify that array, and _StringBetween iterates its own damn self, so isn't this the same thing (save for whatever cleaning you were doing for the GET)?

For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _ArrayToString(_StringBetween($aPostData[$s], '<div class="entry-content">','<span class="synved-social-container'));My question is exactly for this line
;~                 $StringBetw2 = StringRegExpReplace($StringBetw2, "(?is)(<script[^>]+javascript.*?/script>)", "")
;~                 $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
;~                 $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
            Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next
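
For reference, a small sketch of what that stringify step does, run on a made-up string (the sample HTML is an assumption for illustration). _StringBetween returns an array holding every match, or sets @error when nothing is found, and _ArrayToString joins the elements with "|" unless another delimiter is passed.

```autoit
#include <Array.au3>
#include <String.au3>

; Made-up sample with two matches
Local $sHtml = '<p>first</p><p>second</p>'
Local $aParts = _StringBetween($sHtml, '<p>', '</p>')
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ; Join all matches with a line break instead of the default "|"
    Local $sJoined = _ArrayToString($aParts, @CRLF)
    ConsoleWrite($sJoined & @CRLF)
EndIf
```

Once the matches are joined into one string, the StringRegExpReplace calls can be applied to it directly, since they expect a string and not an array.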

 

youtuber
20 minutes ago, iamtheky said:

so isn't this the same thing (save for whatever cleaning you were doing for the GET)?

I am using those lines to clean the HTML tags out of the result.

With those lines disabled: [screenshot: kOhDv4WgQnyRckiLkTRL-w.png]

With those lines enabled: [screenshot: IfOAejuCSgOuGAQta-lvUg.png]

So should I use this?

#include <Array.au3>
#include <String.au3>

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _ArrayToString(_StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->'));My question is exactly for this line
                $StringBetw2 = StringRegExpReplace($StringBetw2, "(?is)(<script[^>]+javascript.*?/script>)", "")
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
            Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

Func _oHTTPGet($aUrL)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.Open("GET", $aUrL, False)
    $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0")
    $oHTTP.Send()
    If @error Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " Not Connect " & @CRLF)
        $oHTTP = 0
        Return SetError(1)
    EndIf

    If $oHTTP.Status = 200 Then
        Local $sReceived = $oHTTP.ResponseText
        $oHTTP = Null
        Return $sReceived
    EndIf
   $oHTTP = Null
    Return -1
EndFunc

 

iamtheky

nice, combining some of those seems like a sporting next task, prior to nesting all that shit on one line for fun.

mikell

May I add... it could be a great idea to swap these

For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then

because running a For/Next loop through an array works - generally - much better if the concerned array is really an array  :)

youtuber

@mikell So is it okay like this?

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
For $j = 0 To UBound($aData) - 1
    If IsArray($aData) Then
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        For $s = 0 To UBound($aPostData) - 1
            If IsArray($aPostData) Then
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->')
                For $i = 0 To UBound($StringBetw2) - 1
                    If IsArray($StringBetw2) Then
                        $StringBetw2 = StringRegExpReplace($StringBetw2[$i], "(?is)(<script[^>]+javascript.*?/script>)", "")
                        $StringBetw2 = StringRegExpReplace($StringBetw2, '(?s)<.*?>', "" & @CRLF)
                        $StringBetw2 = StringRegExpReplace($StringBetw2, '(&nbsp;)+', "")
                        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $StringBetw2 & @CRLF)
                        EndIf
                    Next
                    Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem " & @CRLF)
            EndIf
        Next
    EndIf
Next

 

mikell

Well, I meant: check the array before the loop, so you can see the results/errors.

$string = _oHTTPGet("https://www.autoitscript.com/site/post-sitemap.xml")
$string = StringRegExpReplace($string, '(?s)[\n\r\t\v]', '')
$string = StringStripWS($string, 7)
$aData = _StringBetween($string, '<loc>','</loc>')
If IsArray($aData) Then
  For $j = 0 To UBound($aData) - 1
        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $aData[$j] & @CRLF)
        $string2 = _oHTTPGet($aData[$j])
        $string2 = StringRegExpReplace($string2, '(?s)[\n\r\t\v]', '')
        $string2 = StringStripWS($string2, 7)
        $aPostData = _StringBetween($string2, '</head>','<footer')

        If IsArray($aPostData) Then
           For $s = 0 To UBound($aPostData) - 1
                $StringBetw2 = _StringBetween($aPostData[$s], '<div class="entry-content','<!-- .entry-content -->')
                If IsArray($StringBetw2) Then
                   For $i = 0 To UBound($StringBetw2) - 1
                        ; Work on a separate string so the array being looped over is not overwritten
                        $sText = StringRegExpReplace($StringBetw2[$i], "(?is)(<script[^>]+javascript.*?/script>)", "")
                        $sText = StringRegExpReplace($sText, '(?s)<.*?>', "" & @CRLF)
                        $sText = StringRegExpReplace($sText, '(&nbsp;)+', "")
                        ConsoleWrite("Line : " & @ScriptLineNumber & " : " & $sText & @CRLF)
                    Next
                Else
                    ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem $StringBetw2" & @CRLF)
                EndIf
           Next
       Else
            ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem $aPostData" & @CRLF)
       EndIf
  Next
Else
     ConsoleWrite("Line : " & @ScriptLineNumber & " : " & " Problem $aData" & @CRLF)
EndIf
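
If only the first match of each step is needed, calling _StringBetween in succession can be wrapped in a small function that hands back a plain string, so each result feeds straight into the next call. This is only a sketch: `_FirstBetween` is a hypothetical helper name, and the sample page is made up to mirror the delimiters used in this thread.

```autoit
#include <String.au3>

; Hypothetical helper: returns the first match as a plain string,
; or "" with @error set to 1 when nothing is found.
Func _FirstBetween($sText, $sStart, $sEnd)
    Local $aMatch = _StringBetween($sText, $sStart, $sEnd)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc

; Made-up sample page
Local $sPage = '<html><head></head><body><div class="entry-content"><p>Text</p></div><!-- .entry-content --></body><footer></footer></html>'

; First pass: everything between </head> and <footer
Local $sBody = _FirstBetween($sPage, '</head>', '<footer')
If Not @error Then
    ; Second pass: run _StringBetween again on the result of the first pass
    Local $sEntry = _FirstBetween($sBody, '<div class="entry-content', '<!-- .entry-content -->')
    If Not @error Then ConsoleWrite($sEntry & @CRLF)
EndIf
```

The nested loops above remain the better fit when a page can contain several matches per step; the helper simply trades that generality for shorter code.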

 

youtuber

@mikell Thank you.

How can I avoid the long vertical gaps that appear in the output?

[screenshot: Kc3P0XYAR9mV_jhtTP808g.png]

 

mikell

It is a matter of the patterns used in the StringRegExpReplace (SRER) calls, depending on the expected result.
But the gaps are whitespace, so StringStripWS with the appropriate flag could do the job.
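
As one possible illustration of that suggestion (a sketch, not the only way to do it): because every tag in the script is replaced by @CRLF, consecutive tags leave runs of empty lines. The hypothetical helper `_CollapseBlankLines` below collapses any whitespace run that contains a line break into a single @CRLF, then trims both ends with StringStripWS flag 3. The sample text is made up.

```autoit
; Hypothetical helper: squeeze runs of blank lines down to one line break.
Func _CollapseBlankLines($sText)
    ; Any whitespace run containing at least one line break becomes one @CRLF
    $sText = StringRegExpReplace($sText, '\s*\R\s*', @CRLF)
    ; 3 = strip leading (1) + trailing (2) whitespace
    Return StringStripWS($sText, 3)
EndFunc

; Made-up sample with several empty lines between two lines of text
Local $sSample = "Title" & @CRLF & @CRLF & "   " & @CRLF & @CRLF & "Body text" & @CRLF & @CRLF
ConsoleWrite(_CollapseBlankLines($sSample) & @CRLF)
```

Note that this pattern also removes indentation at the start of each line, which is normally fine for text extracted from HTML but would need adjusting if leading spaces matter.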
