will88

Reading HTML Text


will88

I'm trying to read the text of a webpage. The page contains "Location:" followed by a value, which at the moment is "test". I want to grab everything to the right of the ":", store it in a variable, and show only that value in a MsgBox.

Here is some code, but I have only figured out how to find exact text:

#include <IE.au3>

$ripway = 'http://h1.ripway.com/blacknight9622/information.html'
$oIE = _IECreate($ripway, 0, 0, 0) ; no attach, hidden window, do not wait for load
Sleep(2000)
$HWND = _IEPropertyGet($oIE, "hwnd")
WinSetState($HWND, "", @SW_HIDE)
Sleep(2000)
$text = _IEBodyReadText($oIE)
$loc = 'Location:'
If StringInStr($text, $loc) Then
    MsgBox(0, "", $loc) ; shows only the search string, not the text after it
EndIf

It always finds the word "Location:".

What do I use to get only the text to the right of the ":"?

I hope you understand what I'm asking.

Thanks

Szhlopp

Regular expressions, my friend. There is an awesome tester in my sig =)

dbzfanatic

Try something like this.

Dim $n
$oIE = _IECreate("url",0,0,0,0)
$str = _IEBodyReadHtml($oIE)
$pat = StringRegExp($str,"(?i)location: [[:alnum:]]*",2)
For $n = 1 To UBound($pat) - 1
$results = $results & @crlf & StringReplace($pat[$n],"location: ", "")
Next

Untested but should at least give you the idea.

Szhlopp

Or better yet, use a non-capturing, case-insensitive group:

(?i:location:\s?)(.*)
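
A minimal sketch of how that pattern might be used, assuming a made-up sample string ($sText is hypothetical, not the real page body). With flag 1, StringRegExp returns only the captured groups, so element 0 should hold the text after the prefix:

```autoit
; Sketch only: $sText is a made-up sample, not the real page content.
Local $sText = "Some other text. Location: test"

; (?i:...) matches "location:" case-insensitively without capturing it,
; so the only captured group is whatever follows the prefix.
Local $aMatch = StringRegExp($sText, "(?i:location:\s?)(.*)", 1)

If @error Then
    MsgBox(0, "Result", "No match found")
Else
    MsgBox(0, "Result", $aMatch[0])
EndIf
```

Checking @error before indexing the result avoids a subscript error when the page does not contain the prefix at all.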

will88

Thanks for the replies. This seems way too confusing O.o

Dim $n
$oIE = _IECreate("url",0,0,0,0)
$str = _IEBodyReadHtml($oIE)
$pat = StringRegExp($str,"(?i)location: [[:alnum:]]*",2)
For $n = 1 To UBound($pat) - 1
$results = $results & @crlf & StringReplace($pat[$n],"location: ", "")
Next
msgbox(0, "", $str)

That works: the message box comes up with "Location:test", which is right. Is it possible to have the message box display only what comes after the ":", which is "test"?

Thanks

dbzfanatic
PsaltyDS

Make the group for "location:" a non-capturing group:
$pat = StringRegExp($str, "(?i:location:)([[:alnum:]]*)", 1)

;)


will88

dbzfanatic wrote: "Yeah, just don't use $str, use $results. That's why I put it in there :P."

I used $str for the MsgBox because if I use $results it tells me that $results is not declared.

$results = $results & @crlf & StringReplace($pat[$n],"Location: ", "")

It looks like it's declared to me? :S

Szhlopp

Why is everyone using the long way around?

$Text = "Random junk goes 1 h3re. And the location: 555 222 ABC"
$RegEx = StringRegExp($Text, "(?i:location:\s?)(.*)", 1)
MsgBox(0, "", $RegEx[0])

will88

Thanks!

#include <IE.au3>
$oIE = _IECreate("http://h1.ripway.com/blacknight9622/information.html",0,0,0,0)
$str = _IEBodyReadHtml($oIE)
$RegEx = StringRegExp($str, "(?i:Location:\s?)(.*)", 1)
MsgBox(0, "", $RegEx[0])

PsaltyDS

Well... no. That assumes the text wanted is at the end of the string. Brokenness demo:
$Text = "Random junk goes 1 h3re. And the location: 555 222 ABC <Stuff you don't want>"
$RegEx = StringRegExp($Text, "(?i:location:\s?)(.*)", 1)
MsgBox(0, "", $RegEx[0])

The pattern suggested by dbzfanatic stops when it hits a non-alnum character, which is good. I just updated it to use a non-capturing group for the prefix.
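
To make the difference concrete, here is a rough side-by-side sketch using the sample string from the demo. The \s* and the + in the second pattern are illustrative tweaks (an assumption, not part of the pattern posted earlier) so that it copes with the space after the colon in this sample:

```autoit
; Sample string taken from the demo; the \s* and + are illustrative tweaks.
Local $sText = "Random junk goes 1 h3re. And the location: 555 222 ABC <Stuff you don't want>"

; Greedy (.*) runs to the end of the line, so the unwanted tail is captured too.
Local $aGreedy = StringRegExp($sText, "(?i:location:\s?)(.*)", 1)

; [[:alnum:]]+ stops at the first character that is not a letter or digit.
Local $aAlnum = StringRegExp($sText, "(?i:location:\s*)([[:alnum:]]+)", 1)

If IsArray($aGreedy) And IsArray($aAlnum) Then
    MsgBox(0, "Compare", "Greedy: [" & $aGreedy[0] & "]" & @CRLF & "Alnum: [" & $aAlnum[0] & "]")
EndIf
```

With this sample the alnum version captures only "555", because the space ends the character class. If the wanted value can contain spaces, the class would need to be widened to suit the real page.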

;)


will88

It's working: it gets "Location:test" from the site, then shows a MsgBox with only the text after the ":".

But when I do a StringCompare, it tells me that test is not equal to test O.o

#include <IE.au3>
$oIE = _IECreate("http://h1.ripway.com/blacknight9622/info.html",0,0,0,0)
$read = _IEBodyReadHtml($oIE)
$cmd = StringRegExp($read, "(?i:Location:\s?)(.*)", 1)
$thetext = $cmd[0]

$comp = StringCompare($thetext, "test")

MsgBox(0, "", $comp)

So I can't do anything like:

If $thetext = 'test' Then
    ; do something
Else
    ; do something else
EndIf

dbzfanatic

will88

dbzfanatic wrote: "Maybe there's a space in the return? Try something like"

If $thetext = 'test' Or $thetext = ' test' Then
    ; do something
Else
    ; blah
EndIf

I tried that; there's no space. If I compare against "Location:test" instead of "test" it works, but then there would be no point in the code above at all.

dbzfanatic

Try using the pattern PsaltyDS modified from mine. As mentioned, the one you are using will gather everything after "Location:", so perhaps you're getting more than you need. Failing that, you can use a MsgBox to output the result before you send it to your If statement, to see what it really is and make sure you're gathering the proper data.
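
One way that check might look, as a sketch: $sCaptured stands in for $cmd[0], and the trailing @CR is only an assumed example of a hidden character, not something confirmed about the real page. Wrapping the value in brackets and showing its length makes stray whitespace visible:

```autoit
; $sCaptured stands in for $cmd[0]; the trailing @CR is an assumed example
; of a hidden character that a greedy pattern could pick up.
Local $sCaptured = "test" & @CR

; Brackets and the length make invisible characters easier to spot.
MsgBox(0, "Raw", "[" & $sCaptured & "] length=" & StringLen($sCaptured))

; Flag 3 (1 + 2) strips leading and trailing whitespace.
Local $sClean = StringStripWS($sCaptured, 3)

If $sClean = "test" Then
    MsgBox(0, "Compare", "Match after trimming")
Else
    MsgBox(0, "Compare", "Still different: [" & $sClean & "]")
EndIf
```

If the length is larger than expected even after trimming, the capture probably includes markup from _IEBodyReadHtml, and a tighter pattern would be the better fix.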

will88

Alright, I tried the one you coded earlier, but when I try to MsgBox $results it tells me that $results is not declared.

dbzfanatic

Yeah, I noticed that. Just try putting a comma and adding $results after $n in the Dim line. You need to learn to declare variables before you go trying to fill them :P.

Yeah, I know I didn't declare it either, but it was an example, not supposed to be fully functional. It was only supposed to demonstrate the pattern, nothing more.
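
For what it's worth, a declared-up-front version of the loop might look roughly like this. It is a sketch under assumptions: the sample text and the variable names ($sText, $sResults, $aAll) are made up, and flag 3 (global match) is used here instead of the flag 2 in the original example so that every occurrence lands in one flat array:

```autoit
; Hypothetical sample text; a real script would use the page body instead.
Local $sText = "Location:alpha and later location:beta"
Local $sResults = "" ; declared before the loop appends to it

; Flag 3 returns every match of the capturing group in one flat array.
Local $aAll = StringRegExp($sText, "(?i:location:)([[:alnum:]]*)", 3)

If Not @error Then
    For $i = 0 To UBound($aAll) - 1
        $sResults &= $aAll[$i] & @CRLF
    Next
EndIf

MsgBox(0, "Results", $sResults)
```

Because $sResults is initialised to an empty string, the MsgBox works even when nothing matched and the loop never runs.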

will88

Sweet, got it working. I deleted the For and Next lines.

Thanks for the help

dbzfanatic

Once again, don't use my full code. Use the pattern for StringRegExp that I gave (actually, use the one PsaltyDS modified from mine) and use the code I wrote as a guideline.
