Burgs Posted March 30, 2018 Greetings, I'd like to be able to scrape data from websites, mostly official corporate websites, for name and address information. I know about the "_IE" functions that work with tables, forms, the body, and other page elements. However, it seems that nowadays many websites instead use JavaScript/PHP/SQL to post data directly onto the page from a database query, and this seems to make the "_IE" functions problematic, if not defeat them entirely. Is there an effective way of getting data off a website when these situations are encountered? Thanks in advance for any hints. Regards
Danp2 Posted March 30, 2018 Can you provide a few links to sites where the _IE* functions aren't able to retrieve the desired information? Latest Webdriver UDF Release Webdriver Wiki FAQs
Burgs Posted March 30, 2018 Hi, well, I can give you one quick example off the top of my head. Go to the official corporate website for McDonalds (www.mcdonalds.com) and click on 'locate' in the top menu bar. Then search however you want: state, zip code, and so on. The site will then list the stores in the area you searched. It seems they are not in tables or frames; they are pulled from a database, so the names and addresses for each listed store do not even appear in the 'body' tag of the HTML. There are many sites like that. Just about any corporate or transit website seems to work that way now, so how are the "_IE" functions supposed to scrape that information from them?
Danp2 Posted March 30, 2018 The elements are there for you to scrape. The data just isn't in a table that you can easily retrieve.
Burgs Posted March 30, 2018 Hmmm, well, it seems not, unless you care to explain. If you do a search on any locality, the information I want to scrape ("123 main street, anytown, anystate", for example) is not in the body of the HTML or anywhere in the source of the page, so how can I scrape it? If it ain't there...it ain't there...
Danp2 Posted March 30, 2018 Using Firefox, I right-clicked on the desired element and chose Inspect Element from the popup menu. This opened the Developer Tools' Inspector tab and highlighted the designated element. FWIW, I was able to view the address. IIRC, it was contained in a bunch of nested DIV elements. If it's visible, it's there somewhere...
Burgs Posted March 30, 2018 Well, that's what I would have figured. However, for some reason, when I searched the page source (Ctrl+F), the address info did not come up. But yes, I followed your instructions and was able to find the DIV element (h3/h4) where the address and town are listed. Now my obvious question: how can I use that process to scrape that info? It would not be feasible to send clicks and read the 'Inspect Element' output, and on that particular site there are countless DIV elements. There must be a way to automate it...correct?
Danp2 Posted March 30, 2018 21 minutes ago, Burgs said: "there must be a way to automate it...correct?" Yes... the same way you would for any other site. In this case, you could use the _IE commands to retrieve all DIV elements with class "restaurant-location__address-container". From there, retrieve the H3/H4 elements inside each one and grab their innertext values. If I were trying to automate this, I would probably invoke the jQuerify solution from @Chimp.
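The approach described above (find each container DIV by class, then read the text of the H3/H4 elements inside it) can be sketched offline. This is only a language-neutral illustration in Python's standard library, not AutoIt and not the _IE or jQuerify code itself. The class name comes from the post; the sample markup, `AddressParser`, and `extract_addresses` are made up for the example, and the real page is likely nested differently.

```python
from html.parser import HTMLParser

# Class name quoted in the thread; everything else here is illustrative.
TARGET_CLASS = "restaurant-location__address-container"

# Assumed markup, NOT copied from the real site.
SAMPLE_HTML = """
<div class="restaurant-location__address-container">
  <div><h3>123 Main Street</h3></div>
  <h4>Anytown, Anystate</h4>
</div>
<div class="other"><h3>Not an address</h3></div>
<div class="restaurant-location__address-container">
  <h3>456 Oak Avenue</h3><h4>Othertown, Anystate</h4>
</div>
"""


class AddressParser(HTMLParser):
    """Collects H3/H4 text found inside DIVs carrying TARGET_CLASS."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0       # > 0 while inside a target container
        self.in_heading = False  # True while inside an H3/H4 of a container
        self.current = []        # heading texts for the container being read
        self.addresses = []      # one joined string per container

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            # Count nested DIVs so we know when the container really closes.
            if self.div_depth or TARGET_CLASS in classes:
                self.div_depth += 1
        elif tag in ("h3", "h4") and self.div_depth:
            self.in_heading = True
            self.current.append("")

    def handle_data(self, data):
        if self.in_heading:
            self.current[-1] += data

    def handle_endtag(self, tag):
        if tag in ("h3", "h4"):
            self.in_heading = False
        elif tag == "div" and self.div_depth:
            self.div_depth -= 1
            if self.div_depth == 0:
                self.addresses.append(", ".join(t.strip() for t in self.current))
                self.current = []


def extract_addresses(html):
    parser = AddressParser()
    parser.feed(html)
    return parser.addresses


if __name__ == "__main__":
    for address in extract_addresses(SAMPLE_HTML):
        print(address)
```

Note that this only works on the rendered markup. As discussed earlier in the thread, the addresses are not in the raw page source, so the HTML has to come from the live document after the page's scripts have run.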
Burgs Posted March 30, 2018 Thanks for the reply. I will look at that link; I'm not familiar with it. This particular website (McDonalds) is only one example. I want to be able to scrape similar information (name, address, and other related info) from a great many arbitrary websites. That particular container may be fine for the McDonalds website but not valid for KFC, for example, which is likely laid out differently. It seems it may not be possible to do what I would like. As I mentioned earlier, perhaps the 'good ol' days' of data being neatly available in forms, tables, and frames are gone? Edited March 30, 2018 by Burgs
Danp2 Posted March 30, 2018 I don't know of a single AutoIt solution that will automatically grab the address from any website. Even in the "good old days" you still had to know which elements (forms, tables, frames, etc.) contained the desired data. FWIW, jQuerify is an AutoIt solution. It just gives you access to the jQuery variable so that you can invoke commands that may not be available via the standard _IE* commands.
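One way to live with the point above, that each site's elements have to be known in advance, is to keep a small table of per-site rules and run one generic extractor over it. The following is a rough sketch under assumptions, again in plain Python rather than AutoIt: `SITE_RULES`, `RuleParser`, and `scrape` are hypothetical names, the "example-chain" entry and its `store-card` class are invented, and containers are assumed to be DIVs.

```python
from html.parser import HTMLParser

# Hypothetical per-site rules: (container DIV class, tags holding the text).
# Only the first class name appears in the thread; the second is invented.
SITE_RULES = {
    "mcdonalds": ("restaurant-location__address-container", ("h3", "h4")),
    "example-chain": ("store-card", ("p",)),
}


class RuleParser(HTMLParser):
    """Extracts text from the given tags inside DIVs with the given class."""

    def __init__(self, container_class, text_tags):
        super().__init__()
        self.container_class = container_class
        self.text_tags = text_tags
        self.depth = 0          # DIV nesting depth inside a container
        self.capturing = False  # True while inside one of text_tags
        self.parts = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.depth or self.container_class in classes:
                self.depth += 1
        elif self.depth and tag in self.text_tags:
            self.capturing = True
            self.parts.append("")

    def handle_data(self, data):
        if self.capturing:
            self.parts[-1] += data

    def handle_endtag(self, tag):
        if tag in self.text_tags:
            self.capturing = False
        elif tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.records.append(", ".join(p.strip() for p in self.parts))
                self.parts = []


def scrape(site, html):
    """Apply the rule registered for `site`; unknown sites raise KeyError."""
    if site not in SITE_RULES:
        raise KeyError(f"no extraction rule for site {site!r}")
    parser = RuleParser(*SITE_RULES[site])
    parser.feed(html)
    return parser.records
```

Adding a new site then means inspecting its rendered page once and adding one entry to the table. It does not remove the need to look at each site by hand.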
Burgs Posted March 30, 2018 OK, yes, I see. I am looking at that jQuerify link now. All I meant is that the information was more readily available when it sat in an element like a table; it seemed easier to parse and track down what you were looking for. Thanks again for the information. I will study that jQuery link you supplied. Regards.