Athos

Reading from a webpage

6 posts in this topic

Hi Guys, I'm trying to find a way to collect data from a webpage using IE.

I already know about _IEBodyReadText($oIE) but the problem is, I just want a specific instance of that text without getting all of it.

What I want to be able to do, is to read that text line by line, and when I find a specific string on the line I want, I want to be able to print out that entire line.

The problem for me is that _IEBodyReadText returns the text as one big string, so I would have to split the string to accomplish this. The trouble is that there are no line breaks to split on, so I don't know how I would write something like,

if substring = "Cluster", read until you reach ".com", and let that be the substring....

Should I tackle this problem that way, or should I find a way to get the substring I want from the HTML itself?

Thanks,

Athos

Do you have an example of the html and the text that you want? There are other _IE functions that can return specific elements of the DOM, but without knowing the specifics it is hard to recommend a solution.


#3 ·  Posted (edited)

Sure thing. This is from the html source (note I modded it again to protect the data)

<tr><td>&nbsp;</td><td><font class="f-body">This is the body of text that I want to avoid </font></td></tr>

<tr><td>&nbsp;</td><td>

<font class="f-navbar">Cluster Host = </font><font class='f-body'>

something.com</font>

I want to get "Cluster Host" and the something.com together in one string. Failing that, just the something.com would be fine.
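
For anyone who wants to try the HTML route, here is a minimal sketch that pulls the host out with StringRegExp. It works on a hard-coded copy of the snippet above so it runs without IE; on a live page the string would presumably come from _IEBodyReadHTML($oIE). The variable names and the pattern are only illustrative, and the pattern assumes the markup keeps the shape shown in this post.

```autoit
; Sketch only: $sHtml is a hard-coded copy of the sample markup from this post.
; On a real page you would fill it with _IEBodyReadHTML($oIE) instead.
Local $sHtml = '<font class="f-navbar">Cluster Host = </font><font class=''f-body''>' & @CRLF & _
        'something.com</font>'

; Match the label, skip the tags and whitespace (including the line break)
; that follow it, then capture everything up to the next tag or whitespace.
Local $sPattern = 'Cluster Host\s*=\s*</font>\s*<font[^>]*>\s*([^<\s]+)'

; Flag 1 returns an array of the captured groups; @error is set when nothing matches.
Local $aMatch = StringRegExp($sHtml, $sPattern, 1)
If @error Then
    ConsoleWrite("Cluster Host not found" & @CRLF)
Else
    ConsoleWrite("Cluster Host = " & $aMatch[0] & @CRLF)
EndIf
```

A regular expression like this is brittle if the page layout changes, so treat it as a fallback rather than the preferred approach.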

Edited by Athos


Use the function _IETableWriteToArray to retrieve the content of the table. You can then loop through the array and search for your data.
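
A rough sketch of that idea, assuming the page is already open in an IE window. The window title "Cluster" and the search text "Cluster Host" are placeholders taken from this thread, not anything the page is known to use, so adjust both to your case.

```autoit
#include <IE.au3>

; Assumption: an IE window whose title contains "Cluster" is already open.
Local $oIE = _IEAttach("Cluster", "title")
If @error Then Exit MsgBox(16, "Error", "Could not attach to the IE window")

; Get every table on the page and check them one by one.
Local $oTables = _IETableGetCollection($oIE)
For $oTable In $oTables
    ; With $bTranspose = True the array is indexed as [row][column].
    Local $aTable = _IETableWriteToArray($oTable, True)
    If @error Then ContinueLoop ; skip tables that cannot be read

    For $iRow = 0 To UBound($aTable, 1) - 1
        For $iCol = 0 To UBound($aTable, 2) - 1
            ; Print any cell that contains the label we are looking for.
            If StringInStr($aTable[$iRow][$iCol], "Cluster Host") Then
                ; Flag 7 trims both ends and collapses repeated whitespace.
                ConsoleWrite(StringStripWS($aTable[$iRow][$iCol], 7) & @CRLF)
            EndIf
        Next
    Next
Next
```

Once you know which table, row and column hold the value, you can drop the loops and read that one array element directly.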


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2017-04-18 - Version 1.4.8.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (NEW 2017-02-27 - Version 1.3.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 


#5 ·  Posted (edited)

Thanks so much water! _IETable was great, because from the array, I could just specify the column and row I wanted. :D

Edited by Athos


Glad to be of service :D



