BlazerV60 Posted April 17, 2024

Hello, I'm creating a program to help me analyze stocks, and a big part of the tool is web scraping. As of yesterday, _INetGetSource seems to have stopped working for the Yahoo Finance page that shows a specific stock's info (the link is in the code below). Here is a simplified version of the code:

    #include <Inet.au3>
    ConsoleWrite(_INetGetSource('https://finance.yahoo.com/quote/AAPL'))

It's strange, because before yesterday it was pulling the source from those pages correctly. _INetGetSource still works for most other Yahoo pages, even the finance home page (finance.yahoo.com), but not for the page that shows a specific stock's info. Does anyone know why it stopped giving me the source code for those pages?
SOLVE-SMART Posted April 17, 2024

Hi @BlazerV60, what does your output look like? What exactly are you searching for? This site uses iframes in several sections, which might be a problem.

6 hours ago, BlazerV60 said: "So a big part of my tool is web scraping."

Which tool(s) do you use for the web scraping? Are you only interested in getting the whole page content (source) and then parsing the data you need? I guess it could be better to get only the expected data directly (e.g. with WebDriver (au3WebDriver project) or UIA).

6 hours ago, BlazerV60 said: "Does anyone know why it stopped giving me the source code for those pages?"

How would we know? Only the maintainer(s)/developer(s) of the page know if there were changes. I am interested in helping you, so please provide more context.

Best regards
Sven

Edited April 17, 2024 by SOLVE-SMART

==> AutoIt related: GitHub, Discord Server, Cheat Sheet, autoit-webdriver-boilerplate
BlazerV60 (Author) Posted April 18, 2024

The output comes out as weird characters, like a symbol, as shown in the attached image.

I'm trying to get the whole page content (source) and then parse the data I need. I'm just wondering if you know a way to get the source code for that page again. I could use _IECreate with _IEDocReadHTML, but that makes my tool run slower because it has to open a hidden browser, take the source code from it, and then close that hidden browser. If there's no solution for the _INetGetSource way, then I'll do it.
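One way to narrow down the "weird characters" is to look at the raw bytes that come back before they are turned into text. The sketch below is only a diagnostic under assumptions: _DescribeResponse is a hypothetical helper name, and the idea that the payload might be gzip-compressed (first two bytes 0x1F8B) is a guess to be checked, not a confirmed cause.

```autoit
#include <InetConstants.au3>
#include <StringConstants.au3>

; Hypothetical helper: fetch a URL as raw bytes and describe what came back.
Func _DescribeResponse($sUrl)
    Local $dData = InetRead($sUrl, $INET_FORCERELOAD)
    If @error Then Return "InetRead failed for " & $sUrl

    Local $iBytes = BinaryLen($dData)
    ; String() of a binary value yields its hex form, e.g. "0x1F8B".
    Local $sMagic = String(BinaryMid($dData, 1, 2))

    Local $sReport = $iBytes & " bytes, first two bytes: " & $sMagic
    ; Assumption: 0x1F8B at the start would indicate gzip-compressed data.
    If $sMagic = "0x1F8B" Then $sReport &= " (looks gzip-compressed)"
    Return $sReport
EndFunc

Local $sUrl = "https://finance.yahoo.com/quote/AAPL"
ConsoleWrite(_DescribeResponse($sUrl) & @CRLF)

; Show the start of the body decoded as UTF-8 for a quick visual check.
Local $dBody = InetRead($sUrl, $INET_FORCERELOAD)
If Not @error Then ConsoleWrite(StringLeft(BinaryToString($dBody, $SB_UTF8), 200) & @CRLF)
```

If the byte count is tiny or the first bytes are not readable HTML, the server is probably answering this request differently than it answers a browser, which would explain why other Yahoo pages still work.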
SOLVE-SMART Posted April 18, 2024

There are several questions not answered so far, @BlazerV60, so I can only answer this ...

7 hours ago, BlazerV60 said: "I'm just wondering if you know a way to get the source code for that page again."

... simply with yes: use WebDriver or try UIA. Both come up many times here on the forum if you use the search box.

7 hours ago, BlazerV60 said: "I'd do it the _IECreate /w _IEDocReadHTML method but that makes my tool run slower because it technically has to open a hidden browser and then take the source code from that and then close that hidden browser [...]"

This would also be the case with WebDriver or UIA. But in headless mode it's not that bad (quick enough, I would say).

Best regards
Sven
BlazerV60 (Author) Posted April 18, 2024

6 hours ago, SOLVE-SMART said: "There are several questions not answered so far @BlazerV60. So I have to answer to this ... simply with yes, use the WebDriver or try to use UIA. Both can be found several times here at the forum by the search box. This would also be the case for using WebDriver or UIA too. But in headless mode it's not that bad (it's quick enough I would say)."

Yeah, but my tool usually looks at over 100 stocks on any given day, so 100x the extra lag from _IECreate does make things a little slower. It looks like it's the only way for me to go about this, though.

So I guess the reason _INetGetSource stopped working on the page I referenced is that the page now uses iframes?
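Most of the per-stock lag described above comes from opening and closing a browser for every symbol. A minimal sketch of the usual workaround is to create one hidden IE instance and navigate it repeatedly. The symbol list is a placeholder, and whether IE still renders this particular page correctly is not confirmed here.

```autoit
#include <IE.au3>

; Placeholder symbols; a real run would load the full list.
Local $aSymbols[3] = ["AAPL", "MSFT", "GOOG"]

; Create ONE hidden browser (no attach, not visible) and reuse it.
Local $oIE = _IECreate("about:blank", 0, 0)
If @error Then Exit 1

For $i = 0 To UBound($aSymbols) - 1
    _IENavigate($oIE, "https://finance.yahoo.com/quote/" & $aSymbols[$i])
    If @error Then ContinueLoop ; skip symbols that fail to load

    Local $sHtml = _IEDocReadHTML($oIE)
    ConsoleWrite($aSymbols[$i] & ": " & StringLen($sHtml) & " characters" & @CRLF)
Next

; Close the browser once, after all symbols are processed.
_IEQuit($oIE)
```

With this layout the browser start-up cost is paid once instead of 100 times; only the page load itself remains per symbol.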
argumentum Posted April 18, 2024

That page is a cluster f... I've looked at it with [...] and I have no idea how you did the scraping with just InetRead().

Follow the link to my code contribution (and other things too). FAQ - Please Read Before Posting.
SOLVE-SMART Posted April 18, 2024

That's what I also thought on my first look into the DOM structure, @argumentum.

1 hour ago, BlazerV60 said: "Yeah but my tool usually looks at over 100 stocks on any given day so 100x the extra lag time on using _IECreate does make things a little slower but it looks like it's the only way for me to go about this."

I understand this, but my guess is that if you scrape only your target information instead of trying to get the whole page, it shouldn't be very slow. You can also run multiple instances of chromedriver to do the scraping in "parallel".

Once again, if you could specify which data you need from which page, we could possibly make other or better suggestions. Besides that, give the au3WebDriver project a chance. For a quick start I refer to this post.

Best regards
Sven
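For readers who have not used au3WebDriver, a headless Chrome session that reads a single element could look roughly like the sketch below. It assumes the au3WebDriver UDF files and chromedriver.exe sit next to the script, that port 9515 is free, and that the CSS selector matches the page; the selector in particular is a hypothetical placeholder, not something confirmed in this thread.

```autoit
#include "wd_core.au3"

; Assumed setup: chromedriver.exe in the script directory, port 9515 free.
_WD_Option('Driver', 'chromedriver.exe')
_WD_Option('Port', 9515)
_WD_Option('DriverParams', '--port=9515')

; Plain W3C capabilities string requesting headless Chrome.
Local $sCaps = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args": ["--headless"]}}}}'

_WD_Startup()
If @error Then Exit 1

Local $sSession = _WD_CreateSession($sCaps)
If @error Then
    _WD_Shutdown()
    Exit 1
EndIf

_WD_Navigate($sSession, 'https://finance.yahoo.com/quote/AAPL')

; Hypothetical selector: adjust after inspecting the real DOM.
Local $sSelector = 'fin-streamer[data-field="regularMarketPrice"]'
Local $sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByCSSSelector, $sSelector)
If Not @error Then
    ; Read only the element's text instead of the whole page source.
    ConsoleWrite(_WD_ElementAction($sSession, $sElement, 'text') & @CRLF)
EndIf

; Always release the session and stop the driver.
_WD_DeleteSession($sSession)
_WD_Shutdown()
```

Keeping one session open and calling _WD_Navigate per symbol avoids restarting the driver for each stock.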
BlazerV60 (Author) Posted April 18, 2024

Sure, I'm specifically only trying to grab the share price from the page, so right now it's showing around $166.

I'll look into au3WebDriver.
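Once the page source is available by any of the methods discussed, pulling out a single number such as the share price is a small regular-expression job. The sketch below runs on a made-up HTML fragment; the attribute names (data-field, value) and the helper name _ExtractPrice are assumptions for illustration and need to be checked against the real markup.

```autoit
#include <StringConstants.au3>

; Hypothetical helper: extract a price from HTML, or set @error if absent.
Func _ExtractPrice($sHtml)
    ; Assumed markup: an element with data-field="regularMarketPrice"
    ; that carries the price in a value="..." attribute.
    Local $sPattern = 'data-field="regularMarketPrice"[^>]*\bvalue="([0-9.]+)"'
    Local $aMatch = StringRegExp($sHtml, $sPattern, $STR_REGEXPARRAYMATCH)
    If @error Then Return SetError(1, 0, 0)
    Return Number($aMatch[0])
EndFunc

; Made-up sample fragment, not copied from the real page.
Local $sSample = '<span data-field="regularMarketPrice" value="166.23">166.23</span>'

Local $fPrice = _ExtractPrice($sSample)
If @error Then
    ConsoleWrite("Price not found" & @CRLF)
Else
    ConsoleWrite("Price: " & $fPrice & @CRLF)
EndIf
```

Returning through SetError keeps "price missing" distinguishable from a genuine price of 0.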
SOLVE-SMART Posted April 18, 2024

As a completely different approach, you could use an API (Yahoo Finance) to get the information you want. This article could be helpful, though I haven't verified it:

https://algotrading101.com/learn/yahoo-finance-api-guide/

I was just searching for "yahoo finance api" on Google and found several API ideas.

Best regards
Sven

Edited April 18, 2024 by SOLVE-SMART