cypher175

search for a text url in a web page..?

20 posts in this topic

How would I search for a text URL in a web page that isn't a link?

I need to find a text URL in a web page and use the URL as a variable,

but the thing is that the last part of the URL (the red part in the original post, f65c6f361293823 below) changes every time the page is viewed.

http://website.com/f65c6f361293823/

So would there be a way to find and gather the whole URL based on just the front part of the URL: http://website.com/




You have not really given enough of the html code to be sure this is correct but you could use a regEx along the lines of

$sText = "code where you are returning the string http://website.com/f65c6f361293823/"
$aStr = StringRegExp($sText, "(?i)http://.+\.com/[0-9a-f]+/", 1)
If NOT @Error Then MsgBox(0, "Results", $aStr[0])

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


#3 ·  Posted (edited)

You have not really given enough of the html code to be sure this is correct

^^What @GEOSoft said, because if you're using _IE functions to get to this webpage you could sure pull the html using _IEDocReadHTML and then use @GEOSoft's Regular Expression to pull it out of it... but because you didn't share your code, we just don't know. :D

#include <IE.au3>
$oIE = _IECreate ("http://www.google.com")
MsgBox (0, "", _IEDocReadHTML($oIE))
_IENavigate ($oIE, "http://www.yahoo.com")
MsgBox (0, "", _IEDocReadHTML($oIE))
Edited by exodius


#4 ·  Posted (edited)

^^What @GEOSoft said, because if you're using _IE functions to get to this webpage you could sure pull the html using _IEDocReadHTML and then use @GEOSoft's Regular Expression to pull it out of it... but because you didn't share your code, we just don't know. :D

#include <IE.au3>
$oIE = _IECreate ("http://www.google.com")
MsgBox (0, "", _IEDocReadHTML($oIE))
_IENavigate ($oIE, "http://www.yahoo.com")
MsgBox (0, "", _IEDocReadHTML($oIE))
That's not really what I meant. I don't care how he's getting the html. I would have been able to test more if I had more of the html surrounding what he was looking for. Preferably a complete container (<div>, <span>, etc.) that encompassed that string so I could be sure to get the correct string using a container ID or some other consistent text.

EDIT: Even a link to a typical page would work so that I could read the source code.

Edited by GEOSoft

George


well actually the URL is coming from an email message from http://10minutemail.com

dunno if that helps or not, what other info do you guys need..??


well actually the URL is coming from an email message from http://10minutemail.com

dunno if that helps or not, what other info do you guys need..??

with 189 posts cypher, show us what you have tried!!!...???

8)



#7 ·  Posted (edited)

$sText = "This is a code that gets by your test: http://test.com/test/index.htm and http://website.com/f65c6f361293823/"
$aStr = StringRegExp($sText, "(?i)http://.+\.com/[0-9a-f]+/", 1)
If NOT @Error Then MsgBox(0, "Results", $aStr[0])

[Second code block lost to encoding damage in the original post.]

Because you said website.com is ALWAYS the one thing you're dealing with, you don't need to know anything else; a pattern anchored on it will search through the text for the string.
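A pattern anchored on the known host might look like the sketch below. It is not the code from the damaged block above, just one way to do it, and it assumes the changing part is always hex digits as in the sample URL. ConsoleWrite is used in place of MsgBox only to keep the example non-blocking.

```autoit
; Sketch only: assumes the changing part of the URL is always hex digits.
Local $sText = "This is a code that gets by your test: http://test.com/test/index.htm and http://website.com/f65c6f361293823/"

; Name the host explicitly and escape the dot, so other .com URLs
; earlier in the text cannot be swallowed by a greedy ".+".
Local $aStr = StringRegExp($sText, "(?i)http://website\.com/[0-9a-f]+/", 1)
If Not @error Then ConsoleWrite($aStr[0] & @CRLF)
```

Compared with the looser `http://.+\.com/` pattern, this one cannot start matching at the first URL in the text and run on to the second.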

Edited by TerarinK

0x576520616C6C206469652C206C697665206C69666520617320696620796F75207765726520696E20746865206C617374207365636F6E642E


that's the thing, though, Valuater..!! I've never done this type of code before, that's why I ask here. I'm not too sure what functions to use in all that..


He was referring to one of your mails that contains the website.com/SOMECODE/ URL. They usually want the page source, or failing that a description of your email's markup. Emails are usually contained in a <DIV>, so they would search by tag name, or, if you're lucky enough to have an ID or NAME, by that, which you can certainly get with the _IE object functions (e.g. _IEGetObjById or _IEGetObjByName).



you can always try _INetGetSource() and StringInStr()
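A rough sketch of that approach follows. The helper name `_FindUrlByPrefix` is made up for illustration, the address is the placeholder used earlier in the thread, and the URL is assumed to end at the first "/" after the known prefix.

```autoit
#include <Inet.au3>

; Hypothetical helper: returns the first URL in $sSource that starts with
; $sPrefix, or "" if none is found. Assumes the URL ends at the next "/".
Func _FindUrlByPrefix($sSource, $sPrefix)
    Local $iStart = StringInStr($sSource, $sPrefix)
    If $iStart = 0 Then Return ""
    ; Search for the closing "/" starting just past the prefix
    Local $iEnd = StringInStr($sSource, "/", 0, 1, $iStart + StringLen($sPrefix))
    If $iEnd = 0 Then Return ""
    Return StringMid($sSource, $iStart, $iEnd - $iStart + 1)
EndFunc   ;==>_FindUrlByPrefix

; Placeholder address from the thread; swap in the real page.
Local $sSource = _INetGetSource("http://website.com/")
If Not @error Then ConsoleWrite(_FindUrlByPrefix($sSource, "http://website.com/") & @CRLF)
```

Nothing is written to disk here; the page source stays in a string the whole time.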


You can download my projects at: Pulsar Software


This is just 1 of 16 different IE scripts in Welcome to Autoit 1-2-3

; demonstration to find characters that change between two standard points
; or just find a string
#include <IE.au3>
#include <String.au3>

#Region --- IE-Builder generated code Start ---

$oIE = _IECreate()

;------------- User input --------------
_IENavigate($oIE, "http://www.autoitscript.com/") ; web address
$Find = "Welcome to the "  ; my info shows after this line... or just find this line
$Before = "- the home "     ; my info shows before this line... or set as ""
; ------------ End User input -------------
Sleep(1000)
$body = _IEBodyReadHTML($oIE)
$sloc = @TempDir & "\stest.txt" 
FileDelete($sloc)
FileWrite($sloc, $body)
$sfile = FileOpen($sloc, 0)
$num = 0
While 2
    $num = $num + 1
    $sline = FileReadLine($sfile, $num)
    If @error Then
        MsgBox(262208, "Fail", "The string was NOT found   ")
        FileClose($sfile)
        Exit
    EndIf
    If StringInStr($sline, $Find) Then
        MsgBox(64, "Success", "The string " & $Find & " was found    " & @CRLF & " on line # " & $num, 5)
        If $Before = "" Then ExitLoop
        $Found = _StringBetween($sline, $Find, $Before)
        MsgBox(64, "Found", "The string is *" & $Found[0] & "*    ", 5)
        ExitLoop
    EndIf
WEnd

#EndRegion --- IE-Builder generated code End ---

Also you might want to look at IE Builder here...

http://www.autoitscript.com/forum/index.php?showtopic=19368

As a future resource

8)



theres: <DIV id=content> & <DIV class=article>

and the URL & message body are contained within those DIVS..

Does that help at all..??
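With a container ID like that, one option is to read only that DIV's text and run the regular expression on it. This is a sketch under assumptions: the ID is taken to be `content` as described above, the page is assumed to be reachable directly at the 10minutemail address, and the URL is assumed to end at the first "/" after the host.

```autoit
#include <IE.au3>

; Sketch only: the address and the "content" ID are taken from the thread,
; not verified against the live page.
Local $oIE = _IECreate("http://10minutemail.com")
Local $oDiv = _IEGetObjById($oIE, "content")
If Not @error Then
    ; Read just the visible text of the container, not the whole document
    Local $sText = _IEPropertyGet($oDiv, "innertext")
    Local $aUrl = StringRegExp($sText, "(?i)http://website\.com/[^/\s]+/", 1)
    If Not @error Then ConsoleWrite($aUrl[0] & @CRLF)
EndIf
```

Limiting the search to the container lowers the chance of picking up an unrelated URL elsewhere on the page.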


#13 ·  Posted (edited)

See the small area called user input?

For the code I posted?

;------------- User input --------------

_IENavigate($oIE, "http://website.com/") ; web address

$Find = "http://website.com/" ; my info shows after this line

$Before = "/" ; my info shows before this line... or set as ""

; ------------ End User input -------------

Want me to come over and run it too?... lol

8)

Edited by Valuater


sorry i think i posted that before i refreshed the page and saw yer post..

Would there be any way to click a link in the same way of the URL that always changes its last section..??

if the URL is http://website.com/f65c6f361293823/

would there be anyway to do _IELinkClickByText by only detecting the http://website.com/ part and then click the link..??


#15 ·  Posted (edited)

I am trying to understand you... but

If I used my script and found the link to be "http://website.com/f65c6f361293823/"

I would just use _IENavigate($oIE, "http://website.com/f65c6f361293823/")

That should do the same thing as "Clicking it" right?

8)

Edited by Valuater


Valuater, is there anyway to do your code without writing a temp stest.txt file to the disk..??


#include <IE.au3>

$oIE = _IECreate("http://www.autoitscript.com/") 

$sMyString = "http://website.com/"
$oLinks = _IELinkGetCollection($oIE)
For $oLink in $oLinks
    $sLinkText = _IEPropertyGet($oLink, "innerText")
    If StringInStr($sLinkText, $sMyString) Then
        _IEAction($oLink, "click")
        ExitLoop
    EndIf
Next

Right off from the help file on _IELinkGetCollection



This is Valuater's code without the temp file

#include <IE.au3>
#include <String.au3>

#Region --- IE-Builder generated code Start ---

$oIE = _IECreate()

;------------- User input --------------
_IENavigate($oIE, "http://www.autoitscript.com/") ; web address
$sFind = "Welcome to the " ; my info shows after this line... or just find this line
$sBefore = "- the home " ; my info shows before this line... or set as ""
; ------------ End User input -------------
Sleep(1000)
$sBody = _IEBodyReadHTML($oIE)

If StringInStr($sBody, $sFind) Then
    MsgBox(64, "Success", "The string " & $sFind & " was found", 5)
    If $sBefore = "" Then Exit
    $sFound = _StringBetween($sBody, $sFind, $sBefore)
    MsgBox(64, "Found", "The string is *" & $sFound[0] & "*    ", 5)
EndIf

#EndRegion --- IE-Builder generated code End ---

They work just the same



I even changed the _IE function so it can match on part of the link text, but remember this searches every link and clicks on the first one that contains the phrase

#include <IE.au3>
$oIE = _IE_Example("basic")
__IELinkClickByText($oIE, "forum")

;===============================================================================
;
; Function Name:    __IELinkClickByText() (modified copy of _IELinkClickByText)
; Description:      Simulate a mouse click on a link with text sub-string matching the string provided
; Parameter(s):     $o_object   - Object variable of an InternetExplorer.Application, Window or Frame object
;                   $s_linkText - Text displayed on the web page for the desired link to click
;                   $i_index    - Optional: If the link text occurs more than once, specify which instance
;                                   you want to click by 0-based index
;                   $f_wait     - Optional: specifies whether to wait for page to load before returning
;                                   0 = Return immediately, not waiting for page to load
;                                   1 = (Default) Wait for page load to complete before returning
; Requirement(s):   AutoIt3 V3.2 or higher
; Return Value(s):  On Success  - Returns -1
;                   On Failure  - Returns 0 and sets @ERROR
;                   @ERROR      - 0 ($_IEStatus_Success) = No Error
;                               - 1 ($_IEStatus_GeneralError) = General Error
;                               - 3 ($_IEStatus_InvalidDataType) = Invalid Data Type
;                               - 4 ($_IEStatus_InvalidObjectType) = Invalid Object Type
;                               - 6 ($_IEStatus_LoadWaitTimeout) = Load Wait Timeout
;                               - 7 ($_IEStatus_NoMatch) = No Match
;                               - 8 ($_IEStatus_AccessIsDenied) = Access Is Denied
;                               - 9 ($_IEStatus_ClientDisconnected) = Client Disconnected
;                   @Extended   - Contains invalid parameter number
; Author(s):        Dale Hohm
;
;===============================================================================
;
Func __IELinkClickByText(ByRef $o_object, $s_linkText, $i_index = 0, $f_wait = 1)
    If Not IsObj($o_object) Then
        __IEErrorNotify("Error", "_IELinkClickByText", "$_IEStatus_InvalidDataType")
        SetError($_IEStatus_InvalidDataType, 1)
        Return 0
    EndIf
    ;
    Local $found = 0, $link, $linktext, $links = $o_object.document.links
    $i_index = Number($i_index)
    For $link In $links
        $linktext = StringInStr($link.outerText, $s_linktext)
        If $linktext Then
            If ($found = $i_index) Then
                $link.click
                If $f_wait Then
                    _IELoadWait($o_object)
                    SetError(@error)
                    Return -1
                EndIf
                SetError($_IEStatus_Success)
                Return -1
            EndIf
            $found = $found + 1
        EndIf
    Next
    __IEErrorNotify("Warning", "_IELinkClickByText", "$_IEStatus_NoMatch")
    SetError($_IEStatus_NoMatch) ; Could be caused by parameter 2, 3 or both
    Return 0
EndFunc   ;==>__IELinkClickByText


what if i wanted to find all of the links in a webpage that contain the variable $Find

and then do a function for each link that it finds, how would i do that, without writing a temp file to disk..??

$Read_HTML = _IEDocReadHTML($IE)

$Find = _StringBetween($Read_HTML, "http://website.com/", "/")
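One possible way to do that without a temp file is to collect every match into an array and loop over it. The sketch below uses StringRegExp with flag 3 (return all matches) instead of _StringBetween, so each array element is a complete URL. `_FindAllUrls` and `_HandleUrl` are hypothetical names, and the pattern assumes each URL ends at the first "/" after the host.

```autoit
#include <IE.au3>

; Hypothetical per-link handler: replace the body with whatever each URL needs.
Func _HandleUrl($sUrl)
    ConsoleWrite("Found: " & $sUrl & @CRLF)
EndFunc   ;==>_HandleUrl

; Hypothetical helper: returns an array of every http://website.com/.../ URL
; in $sHtml, or sets @error if there are none.
Func _FindAllUrls($sHtml)
    Local $aUrls = StringRegExp($sHtml, "(?i)http://website\.com/[^/\s""'<>]+/", 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aUrls
EndFunc   ;==>_FindAllUrls

; Placeholder address from the thread; swap in the real page.
Local $oIE = _IECreate("http://website.com/")
Local $sHtml = _IEDocReadHTML($oIE)
Local $aFound = _FindAllUrls($sHtml)
If Not @error Then
    For $sUrl In $aFound
        _HandleUrl($sUrl)
    Next
EndIf
```

The HTML is held in `$sHtml` the whole time, so nothing is written to disk.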
