SunnyDayCloud

Web page numeric & text recognition with AutoIt

21 posts in this topic

Hi guys & gals.

I'm working on a project for some friends that compares numeric and text values from a file against sites like eBay in order to recognize products and prices.

My problem is that I can't find a suitable function for finding specific numeric values on a web page based on data read from an external file.

Can someone please recommend a function I could use for this purpose?

I'm quite new to AutoIt. All I have done so far is create simple macros for browsing and interacting with the web, and create .bat files.

Thanks.




_IEBodyReadText should be able to read the text from a web page and you can use StringInStr for comparing values...


010101000110100001101001011100110010000001101001011100110010000

001101101011110010010000001110011011010010110011100100001

My Android cat and mouse game
https://play.google.com/store/apps/details?id=com.KaosVisions.WhiskersNSqueek

We're gonna need another Timmy!


Thanks for the quick answer.

But don't all the IE functions require Internet Explorer? I would rather not have to resort to IE, if possible.

I will try what you recommended and see if I can get it to work, but I would still like to know whether there is a way to do this without IE.


Perhaps you could use InetGet to download the HTML page and then do a FileRead on the downloaded file? Not sure if that would work or not.
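
A minimal sketch of that idea, assuming a placeholder URL, temp-file name, and search text (none of them from the thread). Note that this yields the raw HTML source rather than the rendered text, so markup will be mixed in with the values:

```autoit
; Sketch only: download a page with InetGet, then load it with FileRead.
; The URL, file name, and search text are placeholders for illustration.
Local $sUrl = "http://www.example.com/"
Local $sFile = @TempDir & "\page.html"

; options = 1 forces a fresh download; background = 0 waits until it finishes.
InetGet($sUrl, $sFile, 1, 0)
If @error Then
    MsgBox(0, "Error", "Could not download " & $sUrl)
    Exit
EndIf

; Read the whole file into a string, then remove the temp file.
Local $sHtml = FileRead($sFile)
FileDelete($sFile)

; The downloaded source can now be searched like any other string.
If StringInStr($sHtml, "Example Domain") Then
    MsgBox(0, "Found", "The search text is in the page source.")
Else
    MsgBox(0, "Not found", "The search text is NOT in the page source.")
EndIf
```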



I use the _INetGetSource() function (from Inet.au3) to read web page source code.
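
For illustration, a short sketch of how that might look without Internet Explorer. The URL and the search text are placeholders, and the result is the page's HTML source, not its rendered text:

```autoit
#include <Inet.au3>

; Sketch only: fetch the page source straight into a string.
; The URL and search text below are placeholders.
Local $sSource = _INetGetSource("http://www.example.com/")
If @error Then
    MsgBox(0, "Error", "Could not read the page source.")
    Exit
EndIf

; StringInStr returns the 1-based position of the match, or 0 if there is none.
Local $iPos = StringInStr($sSource, "Example Domain")
If $iPos Then
    MsgBox(0, "Found", "Search text found at position " & $iPos)
Else
    MsgBox(0, "Not found", "Search text is not in the page source.")
EndIf
```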


- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.


Hi guys.

I'm having a problem using _IEBodyReadText. Here is my code:

$mcm=("MagicCardMarket - Buy and sell Magic the Gathering Cards. - Windows Internet Explorer")
$a=_IEAttach ($mcm)
$b=_IEBodyReadText ($a)
MsgBox(0, "Body Text", $b)

The URL of the page I am trying to use _IEBodyReadText on is http://bit.ly/kw8KIm .

The one reason I can foresee for the error is that the page is too complex for the function to handle. If so, could someone please confirm this? Or could it be that there is too much text on the page to display in a MsgBox?

Meanwhile I will try _INetGetSource.
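
One possible cause, offered only as a guess since the post does not say which error occurred: by default _IEAttach matches a substring of the document title, not the window caption, so the trailing " - Windows Internet Explorer" could prevent a match. The snippet also does not show #include <IE.au3>. A sketch that checks the attach result before reading the body, assuming the page is already open in Internet Explorer:

```autoit
#include <IE.au3>

; Sketch only: attach using a substring of the document title (the default "title" mode).
Local $oIE = _IEAttach("MagicCardMarket")
If @error Then
    MsgBox(0, "Error", "No matching Internet Explorer window was found.")
    Exit
EndIf

Local $sBody = _IEBodyReadText($oIE)

; Show only the first 500 characters so the message box stays readable.
MsgBox(0, "Body Text", StringLeft($sBody, 500))
```

To match against the full window caption instead, _IEAttach accepts "windowtitle" as its second argument.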

Thanks.


Something like this?

#include <IE.au3>
#include <string.au3>
$oIE = _IECreate("http://bit.ly/kw8KIm")
$sText = _IEBodyReadText ($oIE)
$string_to_search_for = "Abandon Hope"  ;<--- This is where you can put what you want to search for
If StringInStr($sText, $string_to_search_for, 0) Then
    MsgBox(0, "Success", "The string you searched for: " & $string_to_search_for & " is on the web page")
Else
    MsgBox(0, "Unsuccessful", "The string you searched for: " & $string_to_search_for & " is NOT on the web page")
EndIf

#include <ByteMe.au3>


As for using MsgBox to display the info from the page, I tried it and it came out looking ridiculous. This worked much better:

#include <GUIConstantsEx.au3>
#include <GUIListBox.au3>
#include <WindowsConstants.au3>
#include <IE.au3>

$oIE = _IECreate("http://bit.ly/kw8KIm")
$sText = _IEBodyReadText ($oIE)

$Form1 = GUICreate("Form1", 625, 443, -1, -1)
$Edit1 = GUICtrlCreateEdit($sText, 16, 8, 593, 425)
GUISetState(@SW_SHOW)

While 1
    $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $GUI_EVENT_CLOSE
            Exit

    EndSwitch
WEnd


#9 ·  Posted (edited)

Wow! Thanks a lot for all the help, Sleepydvdr. This will make my project a lot easier =D.

Now I need to figure out how to make searches that consist of multiple terms, such as first finding a card name and then finding the quantity in stock.

Can this be done by using the & operator to append a numeric value to the string being searched for? Or can I use StringInStr("string", "substring") to first find the card name and then search for the quantity in stock? Unless I have misunderstood, the first argument defines where to begin searching and the substring is what to search for?

I will be attempting these solutions now.

And thanks again for all the help =)

Edited by SunnyDayCloud


#10 ·  Posted (edited)

Hi, and thanks for all the help so far.

I have been looking into the StringInStr function, and I'm pretty sure that it alone cannot solve my problem, since I need to be able to differentiate numeric values by making something like <6 part of the search terms.

I am pretty sure, however, that this can be done using StringRegExp. But I need to get the position of the text that StringInStr has found. I know that this is the function's return value, but being new to this language I haven't a clue how to obtain it. Would this work?

$oIE = _IECreate("http://bit.ly/kw8KIm")
$sText = _IEBodyReadText ($oIE)
$Query = "Abandone Hope" 

$prior=StringInStr($sText, $Query, 0)

StringRegExp ("$sText", <6, , "$prior",)

Also, I just realized that I need StringRegExp to give me a true or false value and to look only at the first or second match. Is this possible?
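
A hedged sketch of one way this could be put together: StringInStr returns the position, which is stored in a variable and passed to StringRegExp as its start offset; the first number after the card name is then compared with 6 to get a true/false result. The sample text is made up for illustration and does not reflect the real page layout:

```autoit
; Sketch only: made-up page text in the form "card name, quantity, price".
Local $sText = "Abandon Hope  4  0.25" & @CRLF & "Absorb  12  0.10"
Local $sQuery = "Abandon Hope"

; StringInStr returns the 1-based position of the match, or 0 if not found.
Local $iPos = StringInStr($sText, $sQuery)
If $iPos = 0 Then
    ConsoleWrite("Card name not found" & @CRLF)
Else
    ; Flag 1 returns an array holding the first match's groups;
    ; the 4th parameter starts the search just past the card name.
    Local $aMatch = StringRegExp($sText, "(\d+)", 1, $iPos + StringLen($sQuery))
    If @error Then
        ConsoleWrite("No number found after the card name" & @CRLF)
    Else
        Local $iQty = Number($aMatch[0])
        ; The "<6" condition becomes an ordinary comparison yielding True or False.
        Local $bLowStock = ($iQty < 6)
        ConsoleWrite("Quantity: " & $iQty & ", below 6: " & $bLowStock & @CRLF)
    EndIf
EndIf
```

With flag 0, StringRegExp itself returns only 1 or 0 (match or no match). Flag 3 returns every match in an array, which would give access to the second match as well.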

Edited by SunnyDayCloud


See the woodchuck example here, StringInStr.


- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.


The woodchuck example shows how to use the occurrence parameter, not how to receive the position returned by StringInStr. Unfortunately that won't help me.


Return Value

Success: Returns the position of the substring.


- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.


#14 ·  Posted (edited)

Yes, I understand that it DOES return the position of the substring, but what I don't understand is what code I use to capture and work with the return value.

Edited by SunnyDayCloud


I'm not sure exactly what you're looking for, but does this help?

$string = "Lorem ipsum dolor sit amet."
$wordtofind = "dolor"
$location = StringInStr($string, $wordtofind)
$foundword = StringMid($string, $location, StringLen($wordtofind))
ConsoleWrite($location & " - " & $foundword & @LF)

- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.


That may well do the trick, but I still need to fill $string from a web page. Would that work straight away by defining the variable as an open window?

Then, after it has located the text, I also need it to recognize the number that comes next. How would I do that? Any ideas which functions could help?

Thanks, Sunny


You mean fill $string with data from a web page? Use _INetGetSource for that. Any text or number can be recognized and located (position-wise, that is); a string is a string.


- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.


#18 ·  Posted (edited)

Okay, excellent, that's exactly what I wanted to hear =)

Now the last thing I need to know is whether StringRegExp is the only function I can use to consistently recognize a specific series of numbers after the relevant string of text?

Edited by SunnyDayCloud


It's probably not the only way. It is the easiest (and, for lack of a better word, best) way I know of, though.
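
For comparison, a sketch of one non-regex alternative using only StringInStr, StringMid, and StringIsDigit. The helper name _NumberAfter and the sample text are hypothetical and made up for illustration:

```autoit
; Sketch only: _NumberAfter is a hypothetical helper, not a built-in function.
; It returns the first run of digits that follows $sName inside $sText.
Func _NumberAfter($sText, $sName)
    Local $iPos = StringInStr($sText, $sName)
    If $iPos = 0 Then Return SetError(1, 0, 0) ; name not found

    Local $sDigits = "", $sChar
    ; Walk forward one character at a time, starting just past the name.
    For $i = $iPos + StringLen($sName) To StringLen($sText)
        $sChar = StringMid($sText, $i, 1)
        If StringIsDigit($sChar) Then
            $sDigits &= $sChar
        ElseIf StringLen($sDigits) > 0 Then
            ExitLoop ; the run of digits has ended
        EndIf
    Next

    If StringLen($sDigits) = 0 Then Return SetError(2, 0, 0) ; no number found
    Return Number($sDigits)
EndFunc

; Usage with made-up text:
Local $iQty = _NumberAfter("Abandon Hope  4 in stock", "Abandon Hope")
If Not @error Then ConsoleWrite("Quantity: " & $iQty & @CRLF)
```

This is longer than the regular-expression version, which is part of why StringRegExp tends to be the more convenient choice.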


- Bruce /*somdcomputerguy */  If you change the way you look at things, the things you look at change.
