Read Internet Explorer table by rows with link text

I'm working on pulling some data from a table in Internet Explorer and writing the contents, row by row, into a CSV file. I am able to loop through the table to an extent but have the following questions:

1. How do I determine the end of a row?

2. Sometimes the table data will have a hyperlink attached. I'd very much like to capture that URL in my CSV file.

If I use the _IETableWriteToArray function, I am able to do everything I need except for getting the URL. Does anyone know a way to pull the URL for certain cells in the table? I have tried a For...Next loop using _IETagnameGetCollection but have not been successful. Thanks in advance for any assistance! Code below:

$oTable = _IETableGetCollection($oIE, 0)
$oTRs = _IETagnameGetCollection($oTable, "TR")

For $oTR In $oTRs
    $oTDs = _IETagnameGetCollection($oTR, "TD")
    For $oTD In $oTDs
        $sRowCont = _IEPropertyGet($oTD, "innertext")
        MsgBox(0, "Cell Text:", $sRowCont)
        $oTD = _IETagnameGetCollection($oTR, "TD", 0)
        $oLink = _IETagnameGetCollection($oTD, "a", 0)
        MsgBox(0, "Link Text", $oLink)
        ;ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $blnFound = ' & $blnFound & @crlf & '>Error code: ' & @error & @crlf) ;### Debug Console
    Next
Next
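
Two things in the loop above are likely to get in the way: $oTD is reassigned to the first cell of the row on every pass, and $oLink is an object, so MsgBox cannot display it directly. A minimal sketch of one way around both, assuming $oTable is a valid table object, might look like the following. The helper names _CsvField and _TableToCsv are hypothetical and not part of IE.au3. It reads the href of the first <a> in each cell, and it treats the point where the inner loop finishes as the end of the row.

```autoit
#include <IE.au3>

; Hypothetical helper (not part of IE.au3): quote one value for CSV output.
Func _CsvField($sValue)
    Return '"' & StringReplace($sValue, '"', '""') & '"'
EndFunc   ;==>_CsvField

; Hypothetical helper (not part of IE.au3): write each table row as one CSV line.
; Every cell contributes two fields: its text and the URL of its first link ("" if none).
Func _TableToCsv($oTable, $sCsvPath)
    Local $hFile = FileOpen($sCsvPath, 2) ; 2 = open for writing, erase previous contents
    If $hFile = -1 Then Return SetError(1, 0, False)

    Local $sLine, $oLinks, $sUrl
    For $oTR In $oTable.rows
        $sLine = ""
        For $oTD In $oTR.cells
            $sUrl = ""
            $oLinks = $oTD.getElementsByTagName("a")
            If IsObj($oLinks) And $oLinks.length > 0 Then $sUrl = $oLinks.item(0).href
            If $sLine <> "" Then $sLine &= ","
            $sLine &= _CsvField($oTD.innerText) & "," & _CsvField($sUrl)
        Next
        ; The inner loop has finished, so this is the end of the current row.
        FileWriteLine($hFile, $sLine)
    Next
    FileClose($hFile)
    Return True
EndFunc   ;==>_TableToCsv

; Example call, assuming $oIE is already attached to the page and the
; output path is only a placeholder:
;   Local $oTable = _IETableGetCollection($oIE, 0)
;   _TableToCsv($oTable, @ScriptDir & "\table.csv")
```

Writing text and URL as paired fields keeps the column count the same for every row, provided the table has no merged cells.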

I would personally just modify _IETableWriteToArray() to return .innerHTML instead of .innerText.

Try the function below which is the _IETableWriteToArray function with "$a_TableCells[$col][$row] = $td.innerText" changed to "$a_TableCells[$col][$row] = $td.innerHTML".

Keep in mind, this will return all of the data in the <td> so it may return more than you want, but it's a start.

; #FUNCTION# ====================================================================================================================
; Name...........: _IETableWriteToArrayHTML
; Description ...: Reads the contents of a Table into an array (cell innerHTML instead of innerText)
; Parameters ....: $o_object    - Object variable of an InternetExplorer.Application, Table object
;                  $f_transpose - Boolean value.  If True, swap rows and columns in output array
; Return values .: On Success   - Returns a 2-dimensional array containing the contents of the Table
;                  On Failure   - Returns 0 and sets @ERROR
;                  @ERROR       - 0 ($_IEStatus_Success) = No Error
;                               - 3 ($_IEStatus_InvalidDataType) = Invalid Data Type
;                               - 4 ($_IEStatus_InvalidObjectType) = Invalid Object Type
;                  @Extended    - Contains invalid parameter number
; Author ........: Dale Hohm
; ===============================================================================================================================
Func _IETableWriteToArrayHTML(ByRef $o_object, $f_transpose = False)
    If Not IsObj($o_object) Then
        __IEErrorNotify("Error", "_IETableWriteToArrayHTML", "$_IEStatus_InvalidDataType")
        Return SetError($_IEStatus_InvalidDataType, 1, 0)
    EndIf
    If Not __IEIsObjType($o_object, "table") Then
        __IEErrorNotify("Error", "_IETableWriteToArrayHTML", "$_IEStatus_InvalidObjectType")
        Return SetError($_IEStatus_InvalidObjectType, 1, 0)
    EndIf
    ; First pass: find the widest row, counting colSpan, to size the array
    Local $i_cols = 0, $tds, $i_col
    Local $trs = $o_object.rows
    For $tr In $trs
        $tds = $tr.cells
        $i_col = 0
        For $td In $tds
            $i_col = $i_col + $td.colSpan
        Next
        If $i_col > $i_cols Then $i_cols = $i_col
    Next
    Local $i_rows = $trs.length
    Local $a_TableCells[$i_cols][$i_rows]
    ; Second pass: copy each cell's HTML; the end of the inner loop is the end of a row
    Local $col, $row = 0
    For $tr In $trs
        $tds = $tr.cells
        $col = 0
        For $td In $tds
            $a_TableCells[$col][$row] = $td.innerHTML
            $col = $col + $td.colSpan
        Next
        $row = $row + 1
    Next
    If $f_transpose Then
        Local $i_d1 = UBound($a_TableCells, 1), $i_d2 = UBound($a_TableCells, 2), $aTmp[$i_d2][$i_d1]
        For $i = 0 To $i_d2 - 1
            For $j = 0 To $i_d1 - 1
                $aTmp[$i][$j] = $a_TableCells[$j][$i]
            Next
        Next
        $a_TableCells = $aTmp
    EndIf
    Return SetError($_IEStatus_Success, 0, $a_TableCells)
EndFunc   ;==>_IETableWriteToArrayHTML
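
Since each cell now comes back as raw HTML, the URL still has to be picked out of it. A rough sketch of one way to do that with StringRegExp is below. The helper name _ExtractFirstHref is hypothetical, and the pattern assumes the href value contains no spaces or quote characters. Anything in the HTML that is entity-encoded (for example &amp;) is returned as-is.

```autoit
; Hypothetical helper (not part of IE.au3): return the first href value found
; in a cell's innerHTML, or "" when the cell contains no link.
; Assumption: the URL has no spaces or quote characters; quotes around it are optional.
Func _ExtractFirstHref($sCellHtml)
    Local $aMatch = StringRegExp($sCellHtml, '(?i)<a\b[^>]*?\bhref\s*=\s*["'']?([^"''\s>]+)', 1)
    If @error Then Return "" ; no <a href=...> in this cell
    Return $aMatch[0]
EndFunc   ;==>_ExtractFirstHref

; Example use with the array from _IETableWriteToArrayHTML (indexed [column][row]):
;   Local $aCells = _IETableWriteToArrayHTML($oTable)
;   For $iRow = 0 To UBound($aCells, 2) - 1
;       For $iCol = 0 To UBound($aCells, 1) - 1
;           ConsoleWrite(_ExtractFirstHref($aCells[$iCol][$iRow]) & @CRLF)
;       Next
;   Next
```

A regular expression is only a quick way to get at the link. Reading the href property from the DOM object itself is generally more robust if the markup is irregular.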
Edited by danwilli
  • 2 years later...

One of my signature links is good for this kind of thing.  You write an xpath, and it will return an array of matching objects.

edit: oops, posted on someone digging up a necro

Edited by jdelaney
IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.