SupGuvna

So, I want to save only the pages that have "blablabla" in them instead of every page. Is this possible with AutoIt?


I have written tools for pulling a range of pages and dumping their source to a text file.

Example:

#include <Inet.au3> ; provides _INetGetSource
For $i = $Start To $Finish
    $url = "http://DOMAIN.com/Pageid=" & $i
    $source = _INetGetSource($url)
    FileWrite("FILE.txt", $i & @CRLF)       ; page ID
    FileWrite("FILE.txt", $source & @CRLF)  ; full page source
Next

However, now I want to turn this into a tool that, instead of saving the entire page source, saves ONLY the URL if a certain line of text is in the source of the page.

As it is set up now, it just dumps the entire source of each page into a single text file, so I can use Notepad's find function to locate the pieces of text I want.

I suppose it could be considered a type of crawler. But instead of searching for just one thing, I would like it to search for, say, several phrases or lines. And only if that line of text exists in the page source should it write the page ID number ($i) to the list.

So, can anybody help me build something like this? It would help me out a lot.

Sorry for the complicated explanation, but I consider this complicated ><

Hello SupGuvna,

You can use the StringInStr() function, or similar functions, to check whether the string you're searching for is in the source.
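As a minimal sketch of that check: StringInStr() returns the 1-based position of the first match, or 0 when the text is not found, so the return value can be used directly as a condition. The sample source string below is made up for illustration.

```autoit
; Hypothetical page source, used only for illustration.
Local $sSource = "<html><body>HerroThere</body></html>"

; Returns the position of the match, or 0 if the text is not present.
; The search is case-insensitive unless the third parameter is set to 1.
Local $iPos = StringInStr($sSource, "HerroThere")

If $iPos > 0 Then
    ConsoleWrite("Found at position " & $iPos & @CRLF)
Else
    ConsoleWrite("Not found" & @CRLF)
EndIf
```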

Regards, Hannes

I experimented with it quite a bit, but all I can seem to do is get it to dump the URL to the text file along with "10".

Any ideas?

$source = _INetGetSource($url)
StringInStr($source, "HerroThere", 0, 1,0,0)
Local $result = StringInStr("I am a String", "RING")
FileWrite("test.txt", $result & @CRLF)

Not sure if I am using this properly or what I am doing wrong. I'm not exactly an expert when it comes to this ><
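A note on where the "10" comes from: the third line of the snippet searches the literal string "I am a String" rather than the page source, and the return value of the first StringInStr() call is never stored. A small sketch of the difference, using a made-up stand-in for the page source:

```autoit
; Hypothetical stand-in for a downloaded page source.
Local $sSource = "some text HerroThere more text"

; "RING" matches "ring" inside "String" because the search is
; case-insensitive by default; the match starts at position 10.
ConsoleWrite(StringInStr("I am a String", "RING") & @CRLF)

; To test the page itself, keep the return value of the call on $sSource.
Local $iPos = StringInStr($sSource, "HerroThere")
If $iPos Then ConsoleWrite("Found at position " & $iPos & @CRLF)
```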


#4 ·  Posted (edited)

Try this:

$source = _INetGetSource($url)
$Str = StringInStr($source, "HerroThere")
Local $result = StringMid($source, $Str)
FileWrite("test.txt", $result & @CRLF)
Edited by EndFunc

This works, but I was hoping it would write the variable used instead of the results themselves.

For example, say it finds the line HerroThere in page 5784.

Instead of writing the results, I want it to write the page it was found in <3

Understand? Still, this is definitely a big step in the right direction.


Here is the closest I can get...

8336

8337
HerroThere (Followed by the rest of the page source for some reason)

8338

Though that's by going this route:

FileWrite("test.txt", $i & @CRLF)
FileWrite("test.txt", $result & @CRLF)

Is it possible at all to write NOTHING to the text file except the page IDs (via $i) of pages that have the string HerroThere in them?
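As for the "rest of the page source" in the output: StringMid() called without a count returns everything from the start position to the end of the string, so $result holds the match plus all the text after it. A short sketch with a made-up string in place of the page source:

```autoit
; Hypothetical stand-in for a page source.
Local $sText = "aaa HerroThere bbb ccc"
Local $iPos = StringInStr($sText, "HerroThere")

; Without a count, StringMid returns everything from $iPos to the end.
ConsoleWrite(StringMid($sText, $iPos) & @CRLF)

; With a count, only the matched phrase is returned.
ConsoleWrite(StringMid($sText, $iPos, StringLen("HerroThere")) & @CRLF)
```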

Sorry for making things complicated x-x


This site could use an edit button. But anyway, I have gotten a step closer!

For $i = $Start To $Finish
    $url = "http://www.Domain.com/pageid/" & $i
    $source = _INetGetSource($url)
    ; StringInStr returns the match position, or 0 when the text is not found
    $Str = StringInStr($source, "HerroThere", 0)
    $Main = ($i & " " & $Str)
    FileWrite("test.txt", $Main & @CRLF)
Next

Now the output is down to this!

8336 0

8337 1787

8338 0

8339 0

8340 0

Anybody got a way to get through the final step? <3 Almost there!


#8 ·  Posted (edited)

This should show you how you might achieve what you want:

#include <Inet.au3> ; provides _INetGetSource
Local $sUrl = "http://www.Domain.com/pageid/"
Local $sFind = "HerroThere"
Local $sSource = ""
Local $iFirstPage = 1
Local $iLastPage = 20
For $i = $iFirstPage To $iLastPage
    $sSource = _INetGetSource($sUrl & $i)
    ; Write a line only when the search text is present (non-zero position)
    If StringInStr($sSource, $sFind, 0) Then
        FileWriteLine("test.txt", "Found " & $sFind & " on page " & $i & " of " & $sUrl)
    EndIf
Next
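The original question also mentioned searching for several phrases and writing only the page ID. One way the loop above might be extended is sketched below; the phrase list, page range, and output file name are placeholders, and the @error check assumes _INetGetSource() signals a failed download through @error.

```autoit
#include <Inet.au3> ; provides _INetGetSource

; Hypothetical phrases; replace with the text you are actually looking for.
Local $aFind[3] = ["HerroThere", "SecondPhrase", "ThirdPhrase"]
Local $sUrl = "http://www.Domain.com/pageid/"
Local $sSource = ""

For $i = 1 To 20
    $sSource = _INetGetSource($sUrl & $i)
    If @error Then ContinueLoop ; skip pages that could not be downloaded

    For $j = 0 To UBound($aFind) - 1
        If StringInStr($sSource, $aFind[$j]) Then
            FileWriteLine("test.txt", $i) ; write only the page ID
            ExitLoop ; one match is enough, so stop checking this page
        EndIf
    Next
Next
```

The inner ExitLoop keeps each page ID from being written more than once when several phrases match the same page.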
Edited by Bowmore

SupGuvna,

"This site could use an edit button"

Now you have 5 posts you should see one at bottom right. ;)

M23


#10 ·  Posted (edited)

Unfortunately, the code you wrote there always results in an error for me. I played around with it a bit and it is scanning, but nothing is being written to the file.

M23, thanks for the tip about the edit button <3

Edit:

I messed around with the code and cleaned it up a bit <3 It works just fine now. Thanks for the lovely education, you guys!

Edited by SupGuvna
