So, I want to save only pages that have blablabla in them instead of every page. Possible with AutoIt?



I have written tools for pulling pages across a range of IDs and dumping their source to a text file.

Example:

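#include <Inet.au3> ; provides _INetGetSource

; $Start and $Finish hold the first and last page IDs to fetch
For $i = $Start To $Finish
    $url = "http://DOMAIN.com/Pageid=" & $i
    $source = _INetGetSource($url)
    FileWrite("FILE.txt", $i & @CRLF)
    FileWrite("FILE.txt", $source & @CRLF)
Next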

However, now I want to turn this into a tool that, instead of pulling the entire page source, saves ONLY the URL if a certain line of text is in the source of the page.

As it is set up now, it just dumps the entire source of each page into a single text file, so I can use Notepad's find function to locate the pieces of text I want.

I suppose it could be considered a type of crawler. But instead of searching for just one thing, I would like it to search for, say, several phrases or lines. And only if that text exists in the page source would I like it to write the page ID number ($i) to the list.

So... can anybody help me with building something like this? It would help me out a lot.

Sorry for the complicated explanation, but I consider this complicated ><


Hello SupGuvna,

You can use the StringInStr() function, or similar functions, to check whether the string you're searching for is in the source.
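
A minimal sketch of that idea, assuming you only need to know whether the text is present: StringInStr() returns the 1-based position of the first match, or 0 when there is none, so the result can be tested directly. The sample string and search term below are just placeholders.

```autoit
; StringInStr returns the position of the first match, or 0 if there is none.
Local $sText = "I am a String"
Local $iPos = StringInStr($sText, "RING") ; case-insensitive by default

If $iPos > 0 Then
    ConsoleWrite("Found at position " & $iPos & @CRLF)
Else
    ConsoleWrite("Not found" & @CRLF)
EndIf
```

In a page-scanning loop, $sText would be the downloaded source and the test would decide whether to write anything at all.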

I experimented with it quite a bit... all I can seem to do is get it to dump the URL to the text file along with "10".

Any ideas?

$source = _INetGetSource($url)
StringInStr($source, "HerroThere", 0, 1,0,0)
Local $result = StringInStr("I am a String", "RING")
FileWrite("test.txt", $result & @CRLF)

Not sure if this is being used properly or what I am doing wrong. I'm not exactly an expert when it comes to this ><

Try this:

$source = _INetGetSource($url)
$Str = StringInStr($source, "HerroThere")
Local $result = StringMid($source, $Str)
FileWrite("test.txt", $result & @CRLF)

This works, but I was hoping it would write the variable used instead of the results themselves.

For example, it finds the line HerroThere on page 5784.

Instead of writing the results, I want it to write the page it was found in <3

Understand? Still, this is definitely a big step in the right direction.


Here is the closest I can get...

8336

8337
HerroThere (followed by the rest of the page source for some reason)

8338

Though that's by going this route:

FileWrite("test.txt", $i & @CRLF)
FileWrite("test.txt", $result & @CRLF)

Is it possible at all to write NOTHING to the text file except the page IDs (via $i) that have the string HerroThere in them?

Sorry for making things complicated x-x


This site could use an edit button... But anyway, I have gotten a step closer!

For $i = $Start To $Finish
    $url = "http://www.Domain.com/pageid/" & $i
    $source = _INetGetSource($url)

    ; StringInStr returns the position of the match, or 0 if there is none
    $Str = StringInStr($source, "HerroThere", 0)
    $Main = ($i & " " & $Str)
    FileWrite("test.txt", $Main & @CRLF)
Next

Now the output is down to this!

8336 0

8337 1787

8338 0

8339 0

8340 0

Anybody got a way to push to the final step? <3 Almost there!

This should show you how you might achieve what you want:

#include <Inet.au3> ; provides _INetGetSource

Local $sUrl = "http://www.Domain.com/pageid/"
Local $sFind = "HerroThere"
Local $sSource = ""
Local $iFirstPage = 1
Local $iLastPage = 20
For $i = $iFirstPage To $iLastPage
    $sSource = _INetGetSource($sUrl & $i)
    ; StringInStr returns 0 when $sFind is absent, so only matching pages are logged
    If StringInStr($sSource, $sFind, 0) Then
        FileWriteLine("test.txt", "Found " & $sFind & " on page " & $i & " of " & $sUrl)
    EndIf
Next
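
The original question also asked about several phrases and about writing only the page ID. One possible extension of the loop above, as a rough sketch rather than a tested solution: the extra phrases are made-up placeholders, and skipping a page when _INetGetSource sets @error is an assumption about how failed downloads should be handled.

```autoit
#include <Inet.au3> ; provides _INetGetSource

; Hypothetical values: adjust the URL, phrases, and page range to your site.
Local $sUrl = "http://www.Domain.com/pageid/"
Local $aFind[3] = ["HerroThere", "SecondPhrase", "ThirdPhrase"]
Local $iFirstPage = 1
Local $iLastPage = 20

For $i = $iFirstPage To $iLastPage
    Local $sSource = _INetGetSource($sUrl & $i)
    If @error Then ContinueLoop ; assumption: skip pages that could not be downloaded

    For $j = 0 To UBound($aFind) - 1
        If StringInStr($sSource, $aFind[$j]) Then
            FileWriteLine("test.txt", $i) ; write only the page ID
            ExitLoop ; one match is enough, move on to the next page
        EndIf
    Next
Next
```

With this shape, test.txt ends up with one page ID per line and nothing for pages where none of the phrases appear.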
SupGuvna,

"This site could use an edit button"

Now that you have 5 posts, you should see one at the bottom right. ;)

M23


Bowmore, unfortunately the code you wrote there always results in an error. I played around with it a bit and it is scanning, but nothing is being written to the file.

M23, thanks <3

Edit:

I messed around with the code and cleaned it up a bit <3 It works just fine now. Thanks for the lovely education, you guys!
