
Reading HTML Text


will88

I'm trying to read the text of a web page. The page contains "Location:" followed by some text, which at the moment is "test". I want to grab everything to the right of the ":", store it in a variable, and show just that text in a MsgBox.

Here is some code, but I've only figured out how to match exact text:

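#include <IE.au3>

$ripway = 'http://h1.ripway.com/blacknight9622/information.html'
; Open the page in an invisible IE window without waiting for the load to finish
$oIE = _IECreate($ripway, 0, 0, 0)
Sleep(2000)
$HWND = _IEPropertyGet($oIE, "hwnd")
WinSetState($HWND, "", @SW_HIDE)
Sleep(2000)
$text = _IEBodyReadText($oIE)
$loc = 'Location:'
If StringInStr($text, $loc) Then
    ; This only shows the search string itself, not the text that follows it
    MsgBox(0, "", $loc)
EndIf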

It always finds the word "Location:".

What do I use to get only the text to the right of the ":"?

Hope you understand what I'm saying lol

Thanks

Regular expressions, my friend. There is an awesome tester in my sig =)


Try something like this.

Dim $n
$oIE = _IECreate("url", 0, 0, 0, 0)
$str = _IEBodyReadHtml($oIE)
$pat = StringRegExp($str, "(?i)location: [[:alnum:]]*", 2)
For $n = 1 To UBound($pat) - 1
    $results = $results & @CRLF & StringReplace($pat[$n], "location: ", "")
Next

Untested but should at least give you the idea.

Or better yet, use a non-capturing, case-insensitive group:

(?i:location:\s?)(.*)

Thanks for the replies... this seems way too confusing O.o

Dim $n
$oIE = _IECreate("url", 0, 0, 0, 0)
$str = _IEBodyReadHtml($oIE)
$pat = StringRegExp($str, "(?i)location: [[:alnum:]]*", 2)
For $n = 1 To UBound($pat) - 1
    $results = $results & @CRLF & StringReplace($pat[$n], "location: ", "")
Next
MsgBox(0, "", $str)

That works: the message box comes up with "Location:test", which is right. Is it possible to have the message box display only what's after the ":", which is "test"?

Thanks

Make the group for "location:" a non-capturing group:
$pat = StringRegExp($str, "(?i:location:)([[:alnum:]]*)", 1)

;)
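
A minimal sketch of how that pattern behaves with flag 1, run against a made-up string instead of the live page. GetLocation and the sample text are assumptions for illustration only, not anything from IE.au3. With flag 1, StringRegExp returns an array holding only the captured groups, so the non-captured "location:" prefix never shows up in the result.

```autoit
; GetLocation is a hypothetical helper, not part of AutoIt or IE.au3.
Func GetLocation($sText)
    ; (?i:location:) matches the prefix case-insensitively without capturing it;
    ; ([[:alnum:]]*) captures the run of letters and digits that follows.
    Local $aMatch = StringRegExp($sText, "(?i:location:)([[:alnum:]]*)", 1)
    ; StringRegExp sets @error when nothing matched.
    If @error Then Return SetError(1, 0, "")
    ; With flag 1 the array holds only the captured groups.
    Return $aMatch[0]
EndFunc

; Made-up sample text standing in for the page body.
Local $sValue = GetLocation("Name:bob Location:test Status:ok")
ConsoleWrite($sValue & @CRLF)
```

Swap the ConsoleWrite for MsgBox(0, "", $sValue) to get the pop-up asked for above. Checking @error first avoids indexing into a result that is not an array when the page has no "Location:" text.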


I used $str for the MsgBox because if I use $results it tells me that $results is not declared.

$results = $results & @crlf & StringReplace($pat[$n],"Location: ", "")

Looks like it's declared to me? :S

Why is everyone using the long way around?

$Text = "Random junk goes 1 h3re. And the location: 555 222 ABC"
$RegEx = StringRegExp($Text, "(?i:location:\s?)(.*)", 1)
MsgBox(0, "", $RegEx[0])
Thanks!

#include <IE.au3>
$oIE = _IECreate("http://h1.ripway.com/blacknight9622/information.html",0,0,0,0)
$str = _IEBodyReadHtml($oIE)
$RegEx = StringRegExp($str, "(?i:Location:\s?)(.*)", 1)
MsgBox(0, "", $RegEx[0])
Well... no. That assumes the text wanted is the end of the string. Brokenness demo:
$Text = "Random junk goes 1 h3re. And the location: 555 222 ABC <Stuff you don't want>"
$RegEx = StringRegExp($Text, "(?i:location:\s?)(.*)", 1)
MsgBox(0, "", $RegEx[0])

The pattern suggested by dbzfanatic stops when it hits a non-alnum character, which is good. I just updated it to use a non-capturing group for the prefix.

;)
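
To make the trade-off concrete, here is a rough side-by-side of the capture groups on the demo string. CaptureWith is a hypothetical helper, and the third group, a negated class that stops at the next "<", is an extra option that nobody in the thread proposed. Note that the alnum class also stops at the first space, so it suits a single-word value like "test" but not a multi-word one.

```autoit
; CaptureWith is a hypothetical helper: it applies the shared "location:" prefix
; followed by the given capture group and returns the captured text.
Func CaptureWith($sText, $sGroup)
    Local $aMatch = StringRegExp($sText, "(?i:location:\s?)" & $sGroup, 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc

Local $sDemo = "And the location: 555 222 ABC <Stuff you don't want>"

; Greedy (.*) runs to the end of the line, unwanted tail included.
ConsoleWrite("[" & CaptureWith($sDemo, "(.*)") & "]" & @CRLF)
; The alnum class stops at the first character that is not a letter or digit.
ConsoleWrite("[" & CaptureWith($sDemo, "([[:alnum:]]*)") & "]" & @CRLF)
; Assumption: a negated class that stops at the next "<" keeps the spaces.
ConsoleWrite("[" & CaptureWith($sDemo, "([^<]*)") & "]" & @CRLF)
```

The brackets in the output are only there to make leading or trailing spaces visible.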


It's working and gets the info from the site ("Location:test"), then shows a MsgBox with only the text after the ":".

But when I do a StringCompare it tells me that test is not equal to test O.o

#include <IE.au3>
$oIE = _IECreate("http://h1.ripway.com/blacknight9622/info.html",0,0,0,0)
$read = _IEBodyReadHtml($oIE)
$cmd = StringRegExp($read, "(?i:Location:\s?)(.*)", 1)
$thetext = $cmd[0]

$comp = StringCompare($thetext, "test")

MsgBox(0, "", $comp)

So I can't do anything like:

If $thetext = 'test' Then
    ; do something
Else
    ; do something else
EndIf

Try using the pattern PsaltyDS modified from mine. As mentioned, the one you are using will gather everything after "Location:", so perhaps you're getting more than you need. If nothing else, use a MsgBox to output the result before you pass it to your If statement, so you can see what it really is and make sure you're gathering the proper data.
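
A small sketch of that kind of check. The $sRaw value is made up, and its trailing space and carriage return are only a guess at what a greedy (.*) might drag in from the page source. Also worth knowing: StringCompare returns 0 when the two strings are equal, so a MsgBox showing 0 means they matched.

```autoit
; Made-up value standing in for $cmd[0]; the trailing characters are an
; assumption about what a greedy (.*) might pick up from the page source.
Local $sRaw = "test " & @CR

; Brackets and the length make invisible trailing characters obvious.
ConsoleWrite("[" & $sRaw & "] length=" & StringLen($sRaw) & @CRLF)

; Flag 3 strips leading (1) and trailing (2) whitespace.
Local $sClean = StringStripWS($sRaw, 3)

; StringCompare returns 0 when the strings are equal, non-zero otherwise.
If StringCompare($sClean, "test") = 0 Then
    ConsoleWrite("match" & @CRLF)
Else
    ConsoleWrite("no match" & @CRLF)
EndIf
```

If the length printed is larger than the number of visible characters, the capture is picking up something extra, and either trimming it or tightening the pattern should make the comparison behave.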


Alright, I tried the one you coded earlier, but when I try to MsgBox $results it tells me that $results is not declared.


Yeah, I noticed that. Just try putting a comma and adding $results after $n in the Dim line. You need to learn to declare variables before you go trying to fill them :P

Yeah, I know I didn't declare it either, but it was an example and not supposed to be fully functional. It was only supposed to demonstrate the pattern, nothing more.
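
For what it's worth, the "not declared" error comes from the right-hand side of $results = $results & ..., which reads $results before anything has been stored in it. A rough sketch of the loop with the variable declared and initialised first follows. It is not the code from earlier in the thread: the sample text and variable names are made up, and it uses flag 3 (return all matches) with the non-capturing prefix so there is no StringReplace step.

```autoit
; Made-up text with two entries, standing in for the page source.
Local $sText = "Location:test" & @CRLF & "location:demo"

; Declare and initialise before the loop so the first append has
; something to read.
Local $sResults = ""

; Flag 3 returns every captured group from every match in one flat array.
Local $aAll = StringRegExp($sText, "(?i:location:)([[:alnum:]]*)", 3)
If Not @error Then
    For $i = 0 To UBound($aAll) - 1
        $sResults &= $aAll[$i] & @CRLF
    Next
EndIf
ConsoleWrite($sResults)
```

With a single "Location:" entry on the page, flag 1 and $aMatch[0] is simpler. The loop only earns its keep when there are several entries to collect.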


Once again, don't use my full code. Use the pattern for StringRegExp that I gave (actually, use the one PsaltyDS modified from mine) and use the code I wrote as a guideline.
