
[Solved] RegEx Help!


I am working on a little AutoIt script for the website GasBuddy. I need my script to grab recently reported prices. I am new to using RegEx, and I'm having problems with it. I have spent hours trying to figure this out and only wanted to post as a last resort. It seems to work when I paste the code into this RegEx Generator with '\d.\d*(?=\Q<br />\E)' as my pattern, but never in my script when using _IEBodyReadHTML. Here is the HTML code that it is working with (it wouldn't post correctly here). The numbers I am trying to pull are located in the "Members Favorites" box under the input boxes for the 7-11 station.

At first I thought that maybe the HTML was too much for AutoIt, so I trimmed it down using _StringBetween, but still no success. Below is my AutoIt code for this:

#include <IE.au3>
#include <String.au3>
#include <Array.au3>
$IE = _IECreate ("http://www.tampagasprices.com/ReportGasPrices.aspx","",1)
$ReportHTML = _IEBodyReadHTML ($IE) 
$HTML = String($ReportHTML)
$HTML = _StringBetween ($HTML, "ctl00$Content$FPI$rrFavs$ctl01$txtDataID", "tl00$Content$FPI$rrFavs$ctl02$ddlTimeSpotted2")
$HTMLString = String($HTML[0])
$Array = StringRegExp($HTMLString, '\d.\d*(?=\Q<br />\E)', 1, 0)

Sorry for double post, edit button is not there for some reason... :huh2:

Anyway, working on this some more, I noticed that when AutoIt fetches the HTML, it is a little different (<br /> in Chrome is <BR> in AutoIt/IE). So I modified the code accordingly. It now grabs the price, but only the 1st price. There are up to 4 (3 on the example page) that it needs to grab. Any help with this?

Updated Code:

#include <IE.au3>
#include <String.au3>
#include <Array.au3>
$IE = _IECreate ("http://dl.dropbox.com/u/188331/HTML.htm","",1)
$ReportHTML = _IEBodyReadHTML ($IE) 
$HTML = String($ReportHTML)
$Array = StringRegExp($HTML, '\d.\d*(?=\Q<BR>\E)', 1, 0)

Hello Ned,

It's not that AutoIt fetches the data differently. It's that the browsers present the data differently.

If you take note from the helpfile, _IEBodyReadHTML() and the other IE functions utilize Internet Explorer, which is why they require you to have IE 4 or better installed. Try grabbing the source from an IE browser for your testing rather than Chrome, which shows the source code differently than IE or Firefox or most other browsers.
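
One way to follow this advice without opening IE by hand is to dump exactly what the IE functions return and test the pattern against that text. The following is only a sketch: the URL is the one from the first post (it may no longer serve the same page), and the output file name is a made-up example.

```autoit
#include <IE.au3>

; URL copied from the first post; it may no longer return the same page.
Local $oIE = _IECreate("http://www.tampagasprices.com/ReportGasPrices.aspx")
Local $sHTML = _IEBodyReadHTML($oIE)

; Hypothetical output file name, written next to the script.
Local $sDumpFile = @ScriptDir & "\ie_body.html"

; FileWrite appends when given a file name, so remove any earlier dump first.
FileDelete($sDumpFile)
FileWrite($sDumpFile, $sHTML)

_IEQuit($oIE)
```

Opening the dumped file in a text editor shows the tags in the form the script will actually see, which is the text the pattern has to match.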



...Chrome, which shows the source code differently than IE or Firefox or most other browsers.

In addition, the client browser reports its identity to the server in headers and server-side scripts can return different HTML to the client based on that and other information.
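
Since the markup can differ by browser or by server response, a pattern that tolerates both forms of the tag may be more robust than one tied to a single spelling. This is a minimal sketch, assuming prices look like "3.29" and are immediately followed by a line-break tag; the two sample strings are hypothetical stand-ins for the real page.

```autoit
; (?i) makes the match case-insensitive; \s*/? allows an optional space and slash,
; so the lookahead accepts <br>, <BR>, <br/> and <br />.
; The dot is escaped so it only matches a literal decimal point.
Local $sPattern = "(?i)\d\.\d+(?=<br\s*/?>)"

; Hypothetical snippets in the two forms mentioned in this thread.
Local $sChromeStyle = "3.29<br />"
Local $sIEStyle = "3.29<BR>"

; With the default flag, StringRegExp returns 1 for a match and 0 for no match.
ConsoleWrite("Chrome-style matched: " & StringRegExp($sChromeStyle, $sPattern) & @CRLF)
ConsoleWrite("IE-style matched: " & StringRegExp($sIEStyle, $sPattern) & @CRLF)
```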



I got it figured out. It was a combination of the HTML being different and the RegEx flag. I had it on 1 (only returning the 1st found value), then tried the others, with 3 working (returning all values found). I do not understand why 1 did not work, but nonetheless my script is working. :huh2:
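
For anyone landing here later, the difference between the two flags can be seen with a small test. This is a sketch under stated assumptions: the sample string is hypothetical, and the dot is escaped here, unlike in the pattern above. Per the helpfile, flag 1 returns an array for the first match only, while flag 3 returns an array of all global matches.

```autoit
; Hypothetical sample standing in for the HTML that IE returns.
Local $sHTML = "3.29<BR>3.35<BR>3.41<BR>"
Local $sPattern = "\d\.\d+(?=<BR>)"

; Flag 1: the search stops after the first match.
Local $aFirst = StringRegExp($sHTML, $sPattern, 1)
If Not @error Then
    ConsoleWrite("Flag 1 returned " & UBound($aFirst) & " element(s)" & @CRLF)
EndIf

; Flag 3: the search continues through the string and collects every match.
Local $aAll = StringRegExp($sHTML, $sPattern, 3)
If Not @error Then
    For $i = 0 To UBound($aAll) - 1
        ConsoleWrite("Price " & $i & ": " & $aAll[$i] & @CRLF)
    Next
EndIf
```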
