Jump to content

Recommended Posts

_ClipGetHTML

Seeing as how I created the _ClipPutHTML() & _ClipPutHyperlink() functions, I figured why not complete the set of available functions and finish out the HTML clipboard read/set UDFs.

The below example is included in the ZIP file on my site.

Example: HTML Clipboard Monitor

#include <Misc.au3>     ; _IsPressed()

#include <_ClipGetHTML.au3>
; ================================================================================================
; <HTMLClipBoardMonitor.au3>
;
; Simple program used to Monitor and Report on HTML formatted ClipBoard data
;
; Functions:
; MemoWrite() ; from the AutoIT documentation examples
;
; Dependencies:
; <_ClipGetHTML.au3> ; _ClipGetHTML()
;
; See also:
; <_ClipPutHTML.au3>
;
; Author: Ascend4nt, and [??] (whoever coded the AutoIT Help examples with MemoWrite())
; ================================================================================================

Global $iMemo

; MemoWrite and GUI creation courtesy of AutoIT Help Examples

; Write message to memo
Func MemoWrite($sMessage = "")
    GUICtrlSetData($iMemo, $sMessage & @CRLF, 1)
EndFunc ;==>MemoWrite

Local $hGUI
Local $sHTMLStrPrev="",$aHTMLData
Local $sPlainText,$aHTMLLinks

; Create GUI
$hGUI = GUICreate("HTML ClipBoard Monitor ([F5] Forces Refresh)", 600, 400)
$iMemo = GUICtrlCreateEdit("", 2, 2, 596, 396, 0x200000)    ; $WS_VSCROLL=0x00200000
GUICtrlSetLimit($iMemo,1000000)
GUICtrlSetFont($iMemo, 9, 400, 0, "Courier New")
GUISetState()
    
Do
    $aHTMLData=_ClipGetHTML()
    If Not @error And ($aHTMLData[0]<>$sHTMLStrPrev Or _IsPressed("74")) Then
    $sPlainText=ClipGet()
    ; Clear the Edit Control
    GUICtrlSetData($iMemo,"","")
    MemoWrite("==== New HTML Data received ("&@HOUR&':'&@MIN&":"&@SEC&") ===="&@CRLF&"Version #"&$aHTMLData[1]&@CRLF& _
    "Fragment Start:"&$aHTMLData[2]&", Fragment End:"&$aHTMLData[3]&@CRLF& _
    "Selection Start (optional [-1=unavailable]):"&$aHTMLData[4]& _
    ", Selection End (optional):"&$aHTMLData[5]&@CRLF& _
    "Source URL (optional string):"&$aHTMLData[6]&@CRLF& _
    "4 characters at Fragment Start:"&StringMid($aHTMLData[0],$aHTMLData[2],4)&@CRLF& _
    "4 characters at Fragment End:"&StringMid($aHTMLData[0],$aHTMLData[3],4)&@CRLF)
    If $aHTMLData[4]<>-1 Then
    MemoWrite("4 characters at Selection Start:"&StringMid($aHTMLData[0],$aHTMLData[4],4)&@CRLF& _
    "4 characters at Selection End:"&StringMid($aHTMLData[0],$aHTMLData[5],4))
    EndIf
    MemoWrite("---- CF_HTML Header (size="&StringLen($aHTMLData[7])&") ----"&@CRLF&$aHTMLData[7]&@CRLF)
    MemoWrite("---- RAW HTML Data (UTF-8 size="&StringLen($aHTMLData[0])&") ----"&@CRLF&BinaryToString($aHTMLData[0],4)&@CRLF)
    MemoWrite("---- Plain Text Variant (size="&StringLen($sPlainText)&") ----"&@CRLF&$sPlainText)
    
    #cs     
    ; Want to put it back just the way it came? This is one approach, but
    ; the Offsets will not be placed properly
    ;_ClipPutHTML($aHTMLData[0],$sPlainText)
    
    ; This is the proper way:

    Local $sHTMLData=$aHTMLData[7]&$aHTMLData[0]
    _ClipBoard_SendHTML($sHTMLData,$sPlainText)
    #ce

    $sHTMLStrPrev=$aHTMLData[0]
    EndIf
Until _IsPressed("1B") Or GUIGetMsg()=-3    ; $GUI_EVENT_CLOSE=-3
GUIDelete($hGUI)

Get the Code at my Site

Ascend4nt's AutoIT Code License agreement:

While I provide this source code freely, if you do use the code in your projects, all I ask is that:

  • If you provide source, keep the header as I have put it, OR, if you expand it, then at least acknowledge me as the original author, and any other authors I credit
  • If the program is released, acknowledge me in your credits (it doesn't have to state which functions came from me, though again if the source is provided - see #1)
  • The source on it's own (as opposed to part of a project) can not be posted unless a link to the page(s) where the code were retrieved from is provided and a message stating that the latest updates will be available on the page(s) linked to.
  • Pieces of the code can however be discussed on the threads where Ascend4nt has posted the code without worrying about further linking.
*EDIT: added Memory Lock/Unlock (recommended and often-used way to ensure a successful grab of a Clipboard memory object), and added clarification on correct way to get memory block size Edited by Ascend4nt
Link to post
Share on other sites

..

Edited by Ascend4nt
Link to post
Share on other sites

..

Edited by Ascend4nt
Link to post
Share on other sites
  • 5 years later...

Autoit v3.3.10.2

Thanks for this excellent work!

There's a small bug in the URL extraction: url containing the z letter are truncated.
 

Character classes

A character classes defines a set of allowed (resp. disallowed) characters, which the next character in subject is expected to match (resp. not to match).
Inside a character classes, most metacharacters loose their meaning (like $ . or *) or mean something else (like ^).
 

[ ... ] Matches any character in the explicit set: "[aeiou]" matches any lowercase vowel. A contiguous (in Unicode codepoint increasing order) set can be defined by putting an hyphen between the starting and ending characters: "[a-z]" matches any lowercase ASCII letter. To include a hyphen (-) in a set, put it as the first or last character of the set or escape it (-).
Notice that the pattern "[A-z]" is not the same as "[A-Za-z]": the former is equivalent to "[A-Z[]^_`a-z]".
To include a closing bracket in a set, use it as the first character of the set or escape it: "[][]" and "[[]]" will both match either "[" or "]".
Note that in a character class, only d, D, h, H, p{}, P{}, s, Q...E, S, v, V, w, W, and x sequences retain their special meaning, while b means the backspace character (Chr(8)). [^ ... ] Matches any character not in the set: "[^0-9]" matches any non-digit. To include a caret (^) in a set, put it after the beginning of the set or escape it (^).

 

 

 

 So ^ rnz are not intepreted. I replaced :

Local $sSourceURL=StringRegExp($sHTMLHeader,"(?i)SourceURL:([^ \r\n\z]+)",1)

with:

Local $sSourceURL=StringRegExp($sHTMLHeader,"(?i)SourceURL:(.*)\r\n",1)
Edited by Silpark
Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By goku200
      I'm having an issue with my html paginated table. The script work as expected. It reads the html table and clicks on the Download button. However when it clicks on the next page its not iterating the items. instead it goes to the next URL from the spreadsheet and then iterates through the html table clicking the Download button and so on. Not sure why its doing that. I want it to click the next page and then continue iterating then after it has reached the end of the pagination go to the next url in the spreadsheet and repeat the process. Below is my script. Any help is appreciated 🙂
       
       
    • By Hermes
      Hi, I have a site that has the following elements below: 
      <div>More element here</div> <div>More element here</div> <div>More element here</div> When I do this in Auto It:
      Local $oSelectDiv = _WD_FindElement($sSession, $_WD_LOCATOR_ByCSSSelector, "div") _WD_HighlightElement($sSession, $oSelectDiv, 1) I also tried to add [3], but it doesnt seems to work:
      Local $oSelectDiv = _WD_FindElement($sSession, $_WD_LOCATOR_ByCSSSelector, "div[3]") _WD_HighlightElement($sSession, $oSelectDiv, 1) It always highlight the first one, but I am trying to highlight the 3rd in the list. Is there anyway to select the 3rd div without having to add any class/id in the divs, and without using XPATH? The structure of the elements in that site were built that way.
    • By Pured
      I am looking to create a script which refreshes/reads a webpage every few seconds. My goal is to see if the page has changed, then I will send myself a notification that the webpage has been updated.
       
      However, rather than downloading the entire webpage every single time, is there a way to check when the webpage last updated?
       
      If not, is there away to partially download/read html source until a specific tag is hit?
       
      Goal: I would like to increase my poll rate and not excessively waste data.
    • By Mr_Microphone
      Alright, I may be an idiot.
      Three years ago, I wrote a program that pushed component information to a secure site via their API. I went back to add some attributes and (here's the idiot part) ended up losing the  source code and my modified code does not quite work. I have the compiled version that works minus the new attributes, so I know that their system has not changed. I stripped the larger program down from 3,000 lines to the part that is broken, but I am stumped. This was one of my first scripts, so it heavily leverages examples and isn't as pretty as I'd like it to be.
      Be gentle. 
      The program / script creates a new records as expected, but for some reason, I cannot access information in the response, which I need for a later step.
      I use Charles, a web debugging proxy tool so I can see the request and the response and both are as expected. Also, when I write to log file, the JSON reply is exactly what I expect and need, but when I try to do anything with the http body, it seems to be blank. 
      Here is the script minus  the URL and token:
      #include <Array.au3> #include <Curl.au3> #include <MsgBoxConstants.au3> #include <json.au3>  ; this was added as an alternate way to read the data Global $WM_serial_number = "WM20745001" Global $wm_component_status_id = "10" Global $wm_manufacturer ="Multi-Tech" Global $wm_model = "MTR-LAT1-B07" Global $cellular_carrier_id = "3" Global $iccid_esn = "89010303300012345678" Global $ip_address = "192.168.2.11" Global $NewIDNumber     Local $Curl = Curl_Easy_Init()     Local $Html = $Curl ; any number as identify     Local $Header = $Curl + 1 ; any number as identify     Local $HtmlFile = "cURL_Request.html"     Local $File = FileOpen($HtmlFile, 2 + 16)     Local $Slist = Curl_Slist_Append(0, "content-type: multipart/form-data; boundary=---011000010111000001101001")     $Slist = Curl_Slist_Append($Slist, "authorization: Token token=" & $Token)     Curl_Easy_Setopt($Curl, $CURLOPT_PROXY, "127.0.0.1") ; needed to use Charles web debugging proxy     Curl_Easy_Setopt($Curl, $CURLOPT_PROXYPORT, 8888) ; needed to use Charles     Curl_Easy_Setopt($Curl, $CURLOPT_HTTPHEADER, $Slist) ;     Curl_Easy_Setopt($Curl, $CURLOPT_URL, $Server & "wireless_module" & "s")     Curl_Easy_Setopt($Curl, $CURLOPT_SSL_VERIFYPEER, 0)     Curl_Easy_Setopt($Curl, $CURLOPT_TIMEOUT, 30)     Curl_Easy_Setopt($Curl, $CURLOPT_WRITEDATA, $Html)     Curl_Easy_Setopt($Curl, $CURLOPT_WRITEFUNCTION, Curl_FileWriteCallback())     Curl_Easy_Setopt($Curl, $CURLOPT_WRITEDATA, $File)     Local $HttpPost = ""     Local $LastItem = ""         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[serial_number]", $CURLFORM_COPYCONTENTS, $WM_serial_number, $CURLFORM_END)         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[component_status_id]", $CURLFORM_COPYCONTENTS, $wm_component_status_id, $CURLFORM_END)         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[manufacturer]", $CURLFORM_COPYCONTENTS, $wm_manufacturer, $CURLFORM_END)         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[model]", $CURLFORM_COPYCONTENTS, $wm_model, $CURLFORM_END)         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[cellular_carrier_id]", $CURLFORM_COPYCONTENTS, $cellular_carrier_id, $CURLFORM_END)         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[iccid_esn]", $CURLFORM_COPYCONTENTS, $iccid_esn, $CURLFORM_END)         Curl_FormAdd($HttpPost, $LastItem, $CURLFORM_COPYNAME, "wireless_module" & "[ip_address]", $CURLFORM_COPYCONTENTS, $ip_address, $CURLFORM_END)         ; submit         Curl_Easy_Setopt($Curl, $CURLOPT_HTTPPOST, $HttpPost)         Local $Code = Curl_Easy_Perform($Curl)         If $Code = $CURLE_OK Then         ConsoleWrite("Content Type: " & Curl_Easy_GetInfo($Curl, $CURLINFO_CONTENT_TYPE) & @LF)         ConsoleWrite("Download Size: " & Curl_Easy_GetInfo($Curl, $CURLINFO_SIZE_DOWNLOAD) & @LF)         MsgBox(0, 'Html', BinaryToString(Curl_Data_Get($Html))) ; this is something I threw in for debugging, expecting to see SOMETHING. Returns nothing         MsgBox(0, 'Header', BinaryToString(Curl_Data_Get($Header))) ; this is something I threw in for debugging, expecting to see SOMETHING. Returns nothing         Local $response = Curl_Easy_GetInfo($Curl, $CURLINFO_RESPONSE_CODE)             If $response = "409" Then $response = "Failed due to a conflict."             If $response = "200" Then $response = "Was NOT created."             If $response = "201" Then $response = "Was created."             ; read the ID that was assigned and store it         $NewIDNumber = StringRight(StringLeft(BinaryToString(Curl_Data_Get($Html)), 10), 4) ; this DID work, but now it doesn't. An old compiled version still works ;~         Global $JsonObject = json_decode($Html); another debugging attempt. Did not use json functions previously and the program worked without it. ;~         Global $NewIDNumber = json_get($JsonObject, '.id')         ConsoleWrite(@CRLF &'! id:' & $NewIDNumber & @CRLF & @CRLF)    ; debugging feedback         MsgBox(0, $response, $wm_serial_number & " new ID = " & $NewIDNumber); debugging feedback         If $Code <> $CURLE_OK Then ConsoleWrite(Curl_Easy_StrError($Code) & @LF)             Local $Data = BinaryToString(Curl_Data_Get($Curl))             Curl_Easy_Cleanup($Curl)             Curl_Data_Cleanup($Curl)             Curl_Data_Cleanup($Header)             Curl_Data_Cleanup($Html)             Curl_FormFree($HttpPost)             Curl_slist_free_all($Slist)             curl_easy_reset($Curl)             FileClose($File)             ConsoleWrite(@LF)         EndIf  This is the captured request (minus the host and token)
      POST /api/v2/wireless_modules HTTP/1.1 Host: api. Accept: */* authorization: Token token= Content-Length: 942 Expect: 100-continue content-type: multipart/form-data; boundary=---011000010111000001101001; boundary=------------------------9adb0d87c7ea5061 --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[serial_number]" WM20745001 --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[component_status_id]" 10 --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[manufacturer]" Multi-Tech --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[model]" MTR-LAT1-B07 --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[cellular_carrier_id]" 3 --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[iccid_esn]" 89010303300012345678 --------------------------9adb0d87c7ea5061 Content-Disposition: form-data; name="wireless_module[ip_address]" 192.168.2.11 --------------------------9adb0d87c7ea5061-- and the captured response
      HTTP/1.1 201 Created Date: Sun, 04 Apr 2021 00:12:18 GMT Server: Apache Cache-Control: max-age=0, private, must-revalidate Access-Control-Allow-Origin: not-allowed Vary: Accept-Encoding Access-Control-Max-Age: 1728000 X-XSS-Protection: 1; mode=block X-Request-Id: 71cfcf36-6020-48a6-a822-d2b393a27b69 Access-Control-Allow-Credentials: true Access-Control-Allow-Methods: PUT, OPTIONS, GET, POST ETag: W/"25d97fe8a9387cb4b9029a9e62b0bfa2" X-Frame-Options: SAMEORIGIN, SAMEORIGIN X-Runtime: 0.344005 X-Content-Type-Options: nosniff Access-Control-Request-Method: * X-Powered-By: Phusion Passenger 5.2.1 Strict-Transport-Security: max-age=63072000; includeSubDomains; preload Location: /wireless_modules/3195 Status: 201 Created Connection: close Transfer-Encoding: chunked Content-Type: application/json; charset=utf-8 X-Charles-Received-Continue: HTTP/1.1 100 Continue {"id":3195,"model":"MTR-LAT1-B07","serial_number":"WM20745001","manufacturer":"Multi-Tech","mfg_date":null,"iccid_esn":"89010303300012345678","ip_address":"192.168.2.11","purchase_order":null,"supplier":null,"cellular_carrier_id":3,"component_status_id":10,"component_status":{"id":10,"name":"Hold","description":"Available- Held for specific use"},"custom_attributes":[{"name":"Deactivated","type":"Boolean","value":false},{"name":"Port 3001","type":"Boolean","value":false}],"comments":[]}  
      Also attached is the log file. I need to read the id value. Clearly, it is arriving back to cURL, since it is being written out to the log, but I cannot seem to get to it within the code. 
      It is established that I may be an idiot, but this idiot has wasted days in non-billable hours trying to figure out what should be a simple glitch.
      Help???
       
      cURL_Request.html
    • By Hermes
      I have an html table that displays data along with an excel spreadsheet that has the same data as the html table. I am wanting to only match the Title column in my html table with the Title column in my Excel spreadsheet. If the titles match, click on the Edit hyperlink and continue to loop to next row. The issue I'm experience is its not matching correctly. So far  i've written the codes below:
      <table border="1" class="test"> <tr> <th> UniqueID</th> <th> Title</th> <th> UserID</th> <th> Address</th> <th> Gender </th> </tr> <tr> <td> 1 </td> <td> Title1 </td> <td> 12345 </td> <td> Manila </td> <td> <span> Male </span> </td> </tr> <tr> <td align="center" colspan="5"> <a href="#" class="testlink">Edit</a> </td> </tr> <tr> <td> 2 </td> <td> Title2 </td> <td> 67891 </td> <td> Valenzuela </td> <td> <span> Female </span> </td> </tr> <tr> <td align="center" colspan="5" > <a href="#" class="testlink">Edit</a> </td> </tr> <tr> <td> 3 </td> <td> Title3 </td> <td> 88888 </td> <td> Ohio </td> <td> <span> Male </span> </td> </tr> <tr> <td align="center" colspan="5" > <a href="#" class="testlink">Edit</a> </td> </tr> <tr> <td> 4 </td> <td> Title4 </td> <td> 77777 </td> <td> California </td> <td> <span> Female </span> </td> </tr> <tr> <td align="center" colspan="5" > <a href="#" class="testlink">Edit</a> </td> </tr> <tr> <td> 5 </td> <td> Title5 </td> <td> 33333 </td> <td> Arizona </td> <td> <span> Male </span> </td> </tr> <tr> <td align="center" colspan="5" > <a href="#" class="testlink">Edit</a> </td> </tr> </table> #Include "Chrome.au3" #Include "wd_core.au3" #Include "wd_helper.au3" #Include "Excel.au3" #Include "_HtmlTable2Array.au3" #Include "Array.au3" Local $sDesiredCapabilities, $sSession SetupChrome() _WD_Startup() $sSession = _WD_CreateSession($sDesiredCapabilities) _WD_LoadWait($sSession) _WD_Navigate($sSession, "index.html") Sleep(6000) Local $oExcel = _Excel_Open() Local $oWorkbook = _Excel_BookOpen($oExcel, "test.xlsx") ; Get the table element $sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//table[@class='test']") ; Retrieve HTML $sHTML = _WD_ElementAction($sSession, $sElement, "Property", "outerHTML") ;Local $aTable = _HtmlTableGetWriteToArray($sHTML) Local $aArray1 = _Excel_RangeRead($oWorkbook,1,$oWorkbook.ActiveSheet.Usedrange.Columns("B:B")) Local $aArray2 = _HtmlTableGetWriteToArray($sHTML) ;_ArrayDisplay($aArray1) ;_ArrayDisplay($aArray2) For $i = UBound($aArray1) - 1 To 0 step - 1 For $j = UBound($aArray2) - 1 to 0 step - 1 If $aArray1[$i][1] == $aArray2[$j][1] Then _WD_WaitElement($sSession, $_WD_LOCATOR_ByXPath, "//a[contains(@class,'testlink') or contains(text(),'Edit')]") $test1 = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//a[contains(@class,'testlink') or contains(text(),'Edit')]") _WD_ElementAction($sSession, $test1, 'click') ;_ArrayDisplay($aArray1) ;_ArrayDelete($aArray1 , $i) ;exitloop EndIf Next Next _WD_Shutdown() Func SetupChrome() _WD_Option('Driver', 'chromedriver.exe') _WD_Option('Port', 9515) _WD_Option('DriverParams', '--log-path="' & @ScriptDir & '\chrome.log"') $sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args":["start-maximized","disable-infobars"]}}}}' EndFunc ;==>SetupChrome Would appreciate if anyone can provide tips, or point me in the right direction in doing it.
       
      test.xlsx
×
×
  • Create New...