Overlane

Error when getting a URL from a file and going to it

24 posts in this topic

 

 

Hello,

I'm trying to read a URL from a text file, visit it, and search the page for "word1" or "word2". If either is found, the script should stop; otherwise it should go to the next URL and start over.

But the script reports that the word was found even when it is not there (it stops at the first URL).

My code:

#include <File.au3>
#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <INet.au3>

$file = "c:\servers.txt"
FileOpen($file, 0)

For $i = 1 to _FileCountLines($file)
    $line = FileReadLine($file, $i)

    $page = _INetGetSource($line)
      $pageArray = StringSplit($page, @CRLF)

For $i = 1 To $pageArray[0]
    If StringInStr($pageArray[$i], "word1" or "word2") Then MsgBox(0, "Found it", $pageArray[$i])
Next

Next
 




#2 ·  Posted (edited)

You are using the same variable $i in both your loops.

You need to use a different one in your loop within a loop .... try $a for instance.

The second instance changes the value for the first, and that is where your problem lies.

P.S. Don't forget to change all the other required instances.

#include <File.au3>
#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <INet.au3>

$file = "c:\servers.txt"
FileOpen($file, 0)

For $i = 1 to _FileCountLines($file)
      $line = FileReadLine($file, $i)

      $page = _INetGetSource($line)
      $pageArray = StringSplit($page, @CRLF)

      For $a = 1 To $pageArray[0]
          If StringInStr($pageArray[$a], "word1" or "word2") Then MsgBox(0, "Found it", $pageArray[$a])
      Next

Next
Edited by TheSaint
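
For illustration, here is a minimal sketch of nested loops that each keep their own counter. The loop bounds and the $iTotal variable are made up for the example and are not from the original script.

```autoit
; Minimal sketch: nested For loops with separate counters.
; The bounds (3 and 2) and $iTotal are hypothetical, chosen only for the demo.
Local $iTotal = 0
For $i = 1 To 3          ; outer counter
    For $a = 1 To 2      ; inner counter uses its own variable
        $iTotal += 1
    Next
Next
ConsoleWrite($iTotal & @CRLF) ; 6 (3 outer passes x 2 inner passes)
```

Because the inner loop never touches $i, the outer loop runs all of its passes.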


#3 ·  Posted (edited)

 


 

Same issue: I'm still getting a false positive at the first URL.

I changed word1 to a word that I know appears only on the website listed on line 6. It still stops at the first attempt.

Edited by Overlane


$str = "ab"
If StringInStr($str, "a" or "b") Then MsgBox(0, "", "Found it")

This doesn't work for me.

Is it new syntax implemented in the beta? :huh2:


Try simplifying it (well, this reads easier for me than yours at least, it may not for others):

#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <INet.au3>

$file = "c:\servers.txt"

; splitting up the file into lines
Global $gaURLs = StringSplit(StringStripCR(FileRead($file)), @LF)
; putting words to find in an array
Global $gaWords[2] = ["word1", "word2"]
Global $gsSource = ""

; create match pattern for regex engine
Global $gsFind = "(?si)"
For $words = 0 To UBound($gaWords) - 1
    $gsFind &= "\Q" & $gaWords[$words] & "\E|"
Next
$gsFind = StringTrimRight($gsFind, 1)

For $i = 1 To $gaURLs[0]
    $gsSource = _INetGetSource($gaURLs[$i])
    If StringRegExp($gsSource, $gsFind) Then
        ; found
        MsgBox(0, 0, "Data found")
    EndIf
Next
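
To see what the pattern-building loop above produces, here is a small sketch that builds the same kind of pattern and prints it. The variable names ($aWords, $sFind, $n) and the sample HTML strings are made up for the example.

```autoit
; Sketch: build the alternation pattern and print it for inspection.
; $aWords, $sFind and the sample strings are hypothetical stand-ins.
Local $aWords[2] = ["word1", "word2"]
Local $sFind = "(?si)" ; s = dot matches newlines, i = case-insensitive
For $n = 0 To UBound($aWords) - 1
    ; \Q...\E makes the regex engine treat the word literally
    $sFind &= "\Q" & $aWords[$n] & "\E|"
Next
; drop the trailing "|" left by the loop
$sFind = StringTrimRight($sFind, 1)
ConsoleWrite($sFind & @CRLF) ; (?si)\Qword1\E|\Qword2\E

; with the default flag, StringRegExp returns 1 on a match and 0 otherwise
ConsoleWrite(StringRegExp("<p>Some WORD2 here</p>", $sFind) & @CRLF) ; 1
ConsoleWrite(StringRegExp("<p>nothing here</p>", $sFind) & @CRLF) ; 0
```

Quoting each word with \Q...\E means a search word containing characters such as "." or "?" is still matched as plain text.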


Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


#6 ·  Posted (edited)

$str = "True"
If StringInStr($str, "a" or "b") Then MsgBox(0, "", "Found it")

It's obfuscation :thumbsup:

Edited by boththose


 

Worked perfectly, thank you.  :thumbsup:


I have two more questions:

1 - Can I read the lines in random order instead of sequentially?

2 - Can I make it load and access more than one line/URL at a time?


1) You can use the Random() function to pick a line, but you would have to keep another array recording the line numbers already used, to make sure you do not access the same line more than once.

2) You can use _FileReadToArray to load all the lines, and then use If/Then on those elements, i.e. search for your keywords in each element.
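
A rough sketch of the idea in point 1, assuming a file with 5 lines. The line count and the names $abUsed, $iPick and $iDone are made up for the example, and the script only prints which line it would process.

```autoit
; Sketch: visit every line number exactly once, in random order,
; by remembering which numbers were already picked.
Local $iLines = 5 ; hypothetical number of lines in the file
Local $abUsed[$iLines + 1] ; $abUsed[n] becomes True once line n was picked
Local $iPick, $iDone = 0

While $iDone < $iLines
    $iPick = Random(1, $iLines, 1) ; flag 1 = return a whole number
    If $abUsed[$iPick] Then ContinueLoop ; already visited, draw again
    $abUsed[$iPick] = True
    $iDone += 1
    ConsoleWrite("Processing line " & $iPick & @CRLF)
WEnd
```

Redrawing until an unused number comes up gets slower towards the end of a long list, so for large files shuffling the whole array once is probably the simpler choice.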



1.

#include <Array.au3>
#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <INet.au3>

$file = "c:\servers.txt"

; splitting up the file into lines
Global $gaURLs = StringSplit(StringStripCR(FileRead($file)), @LF)
; putting words to find in an array
Global $gaWords[2] = ["word1", "word2"]
Global $gsSource = ""

; create match pattern for regex engine
Global $gsFind = "(?si)"
For $words = 0 To UBound($gaWords) - 1
    $gsFind &= "\Q" & $gaWords[$words] & "\E|"
Next
$gsFind = StringTrimRight($gsFind, 1)

; uncomment below to see array urls
;~ _ArrayDisplay($gaURLs)

; randomize array
_ArrayShuffle($gaURLs, 1)

; uncomment below to see array urls randomized
;~ _ArrayDisplay($gaURLs)

For $i = 1 To $gaURLs[0]
    $gsSource = _INetGetSource($gaURLs[$i])
    If StringRegExp($gsSource, $gsFind) Then
        ; found
        MsgBox(0, 0, "Data found")
    EndIf
Next

2. Why? (makes little sense if you're trying to find the 1 url that contains your string)

You'd have to use Step in the For/Next loop then put more If/Then's, but it wouldn't make it any faster.

If you're trying to check them all, you could just concatenate the $gsSource string in the loop and do one regex comparison at the end of the concatenation.
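
A minimal sketch of that concatenation idea. To keep it runnable without a network connection, the hypothetical $aSources array stands in for the strings _INetGetSource would return, and the pattern is written out by hand.

```autoit
; Sketch: join several page sources and run one regex over the result.
; $aSources is hypothetical sample data standing in for downloaded pages.
Local $aSources[3] = ["<html>first page</html>", "<html>has word1</html>", "<html>third</html>"]
Local $sAll = ""
For $n = 0 To UBound($aSources) - 1
    $sAll &= $aSources[$n] & @LF ; append each source, separated by a line feed
Next

; one comparison at the end instead of one per page
If StringRegExp($sAll, "(?si)\Qword1\E|\Qword2\E") Then
    ConsoleWrite("At least one page contains a search word" & @CRLF)
EndIf
```

Note that this only tells you a match exists somewhere, not which URL it came from, and the pages are still downloaded one after another.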



#11 ·  Posted (edited)

$str = "ab"
If StringInStr($str, "a" or "b") Then MsgBox(0, "", "Found it")

This doesn't work for me

Is it a new syntax implemented in the Beta ? :huh2:

 

It's perfectly fine code that searches for the string representation of a boolean in a string :)

; This for instance gives you a messagebox, since the truth value of ("a" or "b") is true:

$str = "True"
If StringInStr($str, "a" or "b") Then MsgBox(0, "", "Found it")
; This also works, since 1 != 2:

$str = "False"
If StringInStr($str, 1 == 2) Then MsgBox(0, "", "Found it")

Well... Perfectly fine... In the sense that it works predictably. Not that anyone should ever use logic in AutoIt like this, ever :)

Edited by SadBunny
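
For comparison, a minimal sketch of checking for either of two words with StringInStr, with the Or placed between two separate calls. $sText is a made-up sample string standing in for one line of page source.

```autoit
; Sketch: check a string for either of two words.
; $sText is hypothetical sample data.
Local $sText = "this page mentions word2 somewhere"

; Each StringInStr call returns the match position (0 if absent),
; so the two results are combined with Or outside the calls.
If StringInStr($sText, "word1") Or StringInStr($sText, "word2") Then
    ConsoleWrite("Found it" & @CRLF)
Else
    ConsoleWrite("Not found" & @CRLF)
EndIf
```

Since StringInStr is case-insensitive by default, the original form StringInStr($line, "word1" or "word2") would report a hit on any line containing the text "true", which would fit the false positives described in the first post.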


#12 ·  Posted (edited)


 

Thanks

1. _ArrayShuffle won't return the same line more than once, right?

2. Because I need to check more than 300 links. This script takes around 1000 ms per link, plus slow server responses, so the total time is too long for what I need.

    Fortunately I have a good internet connection, so accessing multiple links at once wouldn't be a problem. About the source: yes, the pages could all be joined together, since the information I need is around the string.

Edited by Overlane


1. Test it. I left that there for you; just uncomment the two _ArrayDisplay calls.

2. There's no multi-threading in AutoIt, so it's not going to get much faster unless you run multiple scripts.




1. I tested it; it doesn't repeat :D

2. I tried concatenating like you said, but nothing changed. I'm still getting 1 link per second.

#include <Array.au3>
#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <INet.au3>

$file = "c:\servers.txt"
$file2 = "c:\servers2.txt"

; splitting up the file into lines
Global $gaURLs = StringSplit(StringStripCR(FileRead($file)), @LF)
Global $gaURLs2 = StringSplit(StringStripCR(FileRead($file2)), @LF)
; putting words to find in an array
Global $gaWords[1] = ["word"]
Global $gsSource = ""
Global $gsSource2 = ""

; create match pattern for regex engine
Global $gsFind = "(?si)"
For $words = 0 To UBound($gaWords) - 1
    $gsFind &= "\Q" & $gaWords[$words] & "\E|"
Next
$gsFind = StringTrimRight($gsFind, 1)

; uncomment below to see array urls
;_ArrayDisplay($gaURLs)

; randomize array
_ArrayShuffle($gaURLs, 1)
_ArrayShuffle($gaURLs2, 1)

; uncomment below to see array urls randomized
_ArrayDisplay($gaURLs)
_ArrayDisplay($gaURLs2)

For $i = 1 To $gaURLs[0]
For $i2 = 1 To $gaURLs2[0]
    $gsSource = _INetGetSource($gaURLs2[$i])
    $gsSource2 = _INetGetSource($gaURLs2[$i2])
    $gsSource2 &= $gsSource ; concatenate
   If StringRegExp($gsSource2, $gsFind) Then
        ; found
        MsgBox(0, "Data found", $gsSource)
        Exit
     EndIf
  Next
  Next


I've explained why you are only ever going to get 1 per second.



Well, I thought this limitation applied only to For/Next loops, since you said "If you're trying to check them all, you could just concatenate..."  :ermm: 


#17 ·  Posted (edited)

I have NOT debugged or run the code I'm about to provide; I'm merely providing the logic and hope it runs out of the box for you.  You can do the leg work and debug.

This script takes 2 command-line arguments: the URL and the word pattern for the regex.

Compile this script (name it something like myInetGetSource.au3 and compile).

#include <INet.au3>
#NoTrayIcon

; there are a minimum of 2 switches
; 1: url
; 2: word to find, each command line after would be a word to find
If Not ($CmdLine[0] = 2) Then Exit 0

If StringRegExp(_INetGetSource($cmdline[1]), $CmdLine[2]) Then Exit 1

Exit 0 ; nothing found

From here you can manipulate your main script... something like:

#include <Array.au3>
#include <ProcessConstants.au3>
#include <WinAPIProc.au3>
#include <WinAPISys.au3>

Global $gsFile = "c:\servers.txt"

Global $gsGetSourceExe = "name of exe here";
Global $giMaxRun = 5 ; how many exe's to run at a time

; splitting up the file into lines
Global $gaURLs = StringSplit(StringStripCR(FileRead($gsFile)), @LF)
; putting words to find in an array
Global $gaWords[2] = ["word1", "word2"]

; create regexp to capture words
Global $gsFind = "(?si)"
For $words = 0 To UBound($gaWords) - 1 ; loop over every search word in $gaWords
    $gsFind &= "\Q" & $gaWords[$words] & "\E|"
Next
$gsFind = StringTrimRight($gsFind, 1)

; randomize array
_ArrayShuffle($gaURLs, 1)

Global $gaData[UBound($gaURLs)][4]; [n][0] = url, [n][1] = pid, [n][2] = process handle, [n][3] = exit code
Global $iCount, $iLoop
; launch and monitor
For $i = 1 To UBound($gaURLs) - 1 Step $giMaxRun
    $iCount = 0
    For $j = 0 To $giMaxRun - 1
        If ($i + $j) > (UBound($gaURLs) - 1) Then ExitLoop
        $gaData[$i + $j][0] = $gaURLs[$i + $j]
        $gaData[$i + $j][1] = _myRunToCmdLine($gsGetSourceExe, $gaURLs[$i + $j], $gsFind)
        $gaData[$i + $j][2] = _WinAPI_OpenProcess($PROCESS_QUERY_INFORMATION, 0, $gaData[$i + $j][1])
        $iCount += 1
    Next
    ; wait for them to be done
    ; now we could do a real monitor and it would speed it up even more
    ;  but I don't have the patience to write it
    $iLoop = $iCount
    While $iCount
        For $j = 0 To $iLoop - 1
            If Not StringLen($gaData[$i + $j][2]) Then
                If Not ProcessExists($gaData[$i + $j][1]) Then
                    $gaData[$i + $j][3] = Int(_WinAPI_GetExitCodeProcess($gaData[$i + $j][2]))
                    _WinAPI_CloseHandle($gaData[$i + $j][2])
                    $iCount -= 1
                EndIf
            EndIf
        Next
        Sleep(10) ; sanity sleep
    WEnd
Next

; $gaData array now holds all your data
; it will show url in [n][0] and the exit code in [n][3]
; if exit code is 1, then the word(s) was/were found
_ArrayDisplay($gaData)

Func _myRunToCmdLine($sExe, $sURL, $sPattern)

    If StringInStr($sExe, " ", 1, 1) Then
        $sExe = '"' & StringReplace($sExe, '"', '""', 0, 1) & '"'
    EndIf

    If StringInStr($sURL, " ", 1, 1) Then
        $sURL = '"' & $sURL & '"'
    EndIf

    If StringInStr($sPattern, " ", 1, 1) Then
        $sPattern = '"' & StringReplace($sPattern, '"', '""', 0, 1) & '"'
    EndIf

    Return Run($sExe & " " & $sURL & " " & $sPattern)
EndFunc

Good luck.

Edited by SmOke_N
had to edit for/next loop in while/wend loop


 

Sorry man, it's not working.

I added a TrayTip to see where it fails; it is somewhere around line 40.


Like I said in the first paragraph, wasn't really my concern if it worked or not.  It was up to you to debug, I merely provided you the concept code.  Didn't run it or debug it myself, that's your job ;) .



I was just spot looking at this, and realized I added another index to the 2nd dimension... so,

This:

If Not StringLen($gaData[$i + $j][2]) Then

Needs to be:

If Not StringLen($gaData[$i + $j][3]) Then
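
A small sketch of why the check on column [3] can tell "still waiting" from "finished". The one-row $aData array is made up for the example and only mirrors the column layout of $gaData.

```autoit
; Sketch: an unset array element is an empty string (length 0),
; while a stored exit code, even 0, has length 1 or more.
; $aData is a hypothetical one-row stand-in for $gaData.
Local $aData[1][4]
ConsoleWrite(StringLen($aData[0][3]) & @CRLF) ; 0: no exit code stored yet
$aData[0][3] = Int(0)
ConsoleWrite(StringLen($aData[0][3]) & @CRLF) ; 1: exit code 0 was recorded
```

Column [2] holds the process handle, which is filled in as soon as the process is launched, so testing it would skip every entry straight away.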

This topic is now closed to further replies.