Sign in to follow this  
Followers 0
Ascend4nt

Atom Table Functions (Local + Global Atoms)

4 posts in this topic

#1 ·  Posted (edited)

Atom Table UDF

Local and Global Atom Table


Since I've had this collecting dust on my computer, I figured I'd release it to see if anyone can find some use for it.

About Atom Tables - MSDN

2ytr75y.jpg

Example Global Atom Table Listing

 

Basically, a bunch of strings can be stored locally (at program level) or globally (at O/S level) with unique numerical identifiers. This UDF lets you add, find, delete, and query these atoms.

 

Description from the top of the AtomFunctions UDF:

----------------------------------------------------------------

Functions for accesing Local and Global Atom Tables, which contain strings up to 255 characters long

The Atom tables are used for multiple types of Windows data including Window Classes (RegisterClass/Ex), Clipboard formats (RegisterClipboardFormat), Hotkeys (RegisterHotKey), and Dynamic Data Exchange (DDE) which is a form of interprocess communication. (Some of the above may use a Local rather than the Global Atom table)

Other uses can include storing program strings to lookup by a 16-bit value and interprocess communication

 "RegisterHotKey function" @ MSDN states this:

 "An application must specify an id value in the range 0x0000 through 0xBFFF.

  A shared DLL must specify a value in the range 0xC000 through 0xFFFF

  (the range returned by the GlobalAddAtom function).

  To avoid conflicts with hot-key identifiers defined by other shared DLLs,

  a DLL should use the GlobalAddAtom function to obtain the hot-key identifier."

 Kernel Note: The Atom Table is a separate entity from Kernel Objects like Events, Mutexes, Files, etc.  However, the Atom Table is (separately) accessible from within the Kernel, so its like a Kernish object?

Notes on Atom Tables:

  • Stringlimit is 255 characters NOT including a null-terminator. This is unfortunately not   enough for a MAX_PATH string, which is 259+null-term. However, stripping off root-drive prefix get you within 1 character of the length (259 - 3 ["C:"] = 256] which may be enough - but using IPC, one can store 4 - 6 (8-12 in x64) extra chars in wParam/lParam..   See <AtomExample_IPC.au3> for an implementation of this

     

  • There is both a Local and Global Atom Table. The Local one is local to this Process,   and the Global one is available/accessible to/from all Processes.

     

  • Atoms numbered 0 - 49151 (0xBFFF) are not actually part of the Atom Table.   Querying these values will return a string representation of the number as an unsigned integer

       Example: 1234 is translated to "#1234"

     

  • Adding atom strings that start with a "#" will return an unsigned integer if the numbers   following "#" are an integer less than 49152 (1 - 49151 [0xBFFF]).

        Examples: "#1" => 1, "#49151" => 49151

     

  • Atoms numbered from 49152 - 65535 (from 0xC000 - 0xFFFF) DO reference the Atom Table strings

     

  • Maximum Atom string length is 255 characters (not including a null-terminator)

 

Functions:

;  Local:
;    _AtomTableInit()        ; Initializes Local Atom Table. Optional [37 hash buckets allocated by default]
;    _AtomAddLocal()            ; Adds a string to the Local Atom table, returns # identifier [increments reference if already exists]
;    _AtomGetNameLocal()        ; Gets the Local Atom string associated with a numerical identifier
;    _AtomDeleteLocal()        ; Decrements the reference count of Local Atom, deletes when reaches 0
;    _AtomFindLocal()        ; Finds # identifier for a string in the Local Atom table
;    _AtomGetAllLocal()        ; Returns all found Local atoms in an array
; Global:
;    _AtomAddGlobal()        ; Adds a string to the Global Atom table, returns # identifier [increments reference if already exists]
;    _AtomGetNameGlobal()    ; Gets the Global Atom string associated with a numerical identifier
;    _AtomDeleteGlobal()        ; Decrements the reference count of Global Atom, deletes when reaches 0
;    _AtomFindGlobal()        ; Finds # identifier for a string in the Global Atom table
;    _AtomGetAllGlobal()        ; Returns all found Global atoms in an array
; 'Undocumented':
;    _AtomGetGlobalTableUD() ; Using 'undocumented' function calls, gets info on all Global Atoms
;    _AtomGetInfoUD()        ; Returns reference count, pinned status, as well as Atom Name

----------------------------------------------------------------

There are a few 'undocumented' functions used for getting information on the Atom Table included (I can't help myself).

In addition to the core UDF, there's 2 example files:

  • AtomTableExample.au3 - This demonstrates various use - adding, deleting, and querying information in the Local and Global atom tables. (Extended info comes from 'undocumented' functions)

     

  • AtomExample_IPC.au3 - This demonstrates how the Atom Table could be used in InterProcess Communication. A rather rough sketch, but shows how you could squeeze MAX_PATH pathnames into IPC messages combined with Atom tables.

It's all a bit light on the documentation, but I just wanted to prevent this thing from collecting more dust ;)  Might prove useful to someone!

History::

Update 2014-08-17:

- Added some documentation to all functions

- made the ugly 'AtomTableExample' script into something half-decent.

- IPC script modified to change how messages are reported

- Added array-length return in @extended in 'GetAll' functions

AtomTablesUDF.zip ~prev Downloads: 29

 

Edited by Ascend4nt
2 people like this

Share this post


Link to post
Share on other sites



Something new everyday! Thanks for sharing.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

So I just looked at the script today and thought it was too much of a mess.  I cleaned it up a little so now the examples make a little more sense :)  See 1st post for updated version

Update 2014-08-17:

- Added some documentation to all functions

- made the ugly 'AtomTableExample' script into something half-decent.

- IPC script modified to change how messages are reported

- Added array-length return in @extended in 'GetAll' functions

Share this post


Link to post
Share on other sites

Very cool


_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 04/09/2015

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0

  • Similar Content

    • KimberlyJillPereira
      By KimberlyJillPereira
      I could only extract the first 20 from table into Microsoft Excel by using Array Extract but I want to extract until the end what I mean is until the second page. How to do it? Please revert. Thanks.



    • TheDcoder
      By TheDcoder
      Hello everyone, I discovered a bug yesterday and I posted it at the bug tracker:
      I also made a simple script which can be used to reproduce the bug:
      CreateVariable() ConsoleWrite($sGlobalVariable & @CRLF) Func CreateVariable() Global $sGlobalVariable = "Foobar" EndFunc The bug was closed by @BrewManNH:
      While I partially agree with the above statement, My code was not practical enough... so @mLipok advised me to create a thread on the forums with practical code (Thanks!). That is the point of this thread, I am going to provide the code where I experience this bug/problem .
      I discovered this bug when I was working on one of my projects called "ProxAllium". When the main script finishes execution, Au3Check throws a nasty warning about "variable possibly used before declaration":

      As you can see, the variable is indeed being used after calling the function in which the variable is declared... The warning won't appear if I declare the function ABOVE the variable. As @BrewManNH said, Au3Check reads line by line... I think this should be changed, Au3Check should not throw warnings if the interpreter is able to run the code, at least most of the time anyway!
      So what do you guys think? Is this a valid bug?... and I request those who participate in the discussion not to discuss the code being "poor", that is another thing/thread in itself
      P.S I had already written this once but the forum editor decided to mess up and when I undid (Ctrl + Z) something... This is a poorly written version of that article, I was very frustrated while writing this!
    • czardas
      By czardas
      Test instructions are easy.
      1. Run the following code in post #46
      2. Click the advanced tab
      3. Select a start date and an end date
      4. Put a tick in the checkbox where it says 'Include Days Of The Week'
      5. Click the Okay button
      You should get an array with a column of dates in your local format and a second column with day names in your language. It has a current range limit of just under 89 years but this range can appear anywhere between the years 1601 and 9999. The code is untidy and unfinished, but I would still like to know if it works outside the UK. No extra includes are needed. If it works in China, that would make me happy.
      Latest version in post #46
      https://www.autoitscript.com/forum/topic/184849-please-test-this-in-your-language-region/?do=findComment&comment=1327947
      #include <Array.au3> #include <GUIConstants.au3> #include <GuiEdit.au3> #include <Misc.au3> #include <GuiTab.au3> #include <Date.au3> #include <APILocaleConstants.au3> #include <WinAPILocale.au3> ;#include <ArrayWorkshop.au3> ; required functions are already included in this demo NewTable() Func NewTable(); ($hParent, $idListView, $hListView) ; [missing parent window] Local $sTitle = "New Table", _ $iStyle = BitOR($WS_CAPTION, $WS_POPUP, $WS_SYSMENU), _ $iExStyle = BitOR($WS_EX_MDICHILD, $WS_EX_TOOLWINDOW, $WS_EX_TOPMOST), _ $iWidth = 250, _ $iHeight = 292 Local $hChild = GUICreate($sTitle, $iWidth, $iHeight, Default + 100, Default + 100); , $iStyle, $iExStyle, $hParent) Local $idTab = GUICtrlCreateTab(2, 2, $iWidth -2, $iHeight -32) GUICtrlCreateTabItem("Standard") GUICtrlCreateLabel("Table Dimensions", 14, 35, 120, 20) GUICtrlSetColor(-1, 0x008000) GUICtrlCreateLabel("Number Of Columns", 14, 60, 94, 18) Local $hStandardCols = GUICtrlCreateInput("0", 104 +14, 55, 50, 20, BitOR($WS_TABSTOP, $ES_CENTER, $ES_NUMBER)) GUICtrlSetFont(-1, 10) GUICtrlCreateLabel("Number Of Rows", 14, 83, 90, 18) Local $hStandardRows = GUICtrlCreateInput("0", 104 +14, 81, 50, 20, BitOR($WS_TABSTOP, $ES_CENTER, $ES_NUMBER)) GUICtrlSetFont(-1, 10) GUICtrlCreateLabel("Table Indices", 14, 114, 120, 20) GUICtrlSetColor(-1, 0x008000) Local $hIndices = GUICtrlCreateCheckbox(" Enumerate First Column", 14, 137, 160, 20) ; REMEMBER YES, SETTINGS NO Local $hPadding = GUICtrlCreateCheckbox(" Include Leading Zeros", 14, 161, 124, 20) ; REMEMBER YES, SETTINGS NO GUICtrlCreateTabItem("Advanced") GUICtrlCreateLabel("Column 1 = Date Range", 14, 35, 120, 20) GUICtrlSetColor(-1, 0x008000) GUICtrlCreateLabel("Start Date", 14, 35 +20, 100, 20) Local $idDate1 = GUICtrlCreateDate("", 14, 56 +20, 105, 20, $WS_TABSTOP) GUICtrlCreateLabel("End Date", $iWidth - 118, 35 +20, 100, 20) Local $idDate2 = GUICtrlCreateDate("", $iWidth - 118, 56 +20, 105, 20, BitOR($WS_TABSTOP, $DTS_RIGHTALIGN)) GUICtrlCreateLabel("Column 2 = Days Of The Week (Optional)", 14, 108, 220, 20) GUICtrlSetColor(-1, 0x008000) Local $hCheckBox = GUICtrlCreateCheckbox(" Include Days Of The Week ", 14, 128, 190, 20) ; REMEMBER YES, SETTINGS NO Local $hRadio1 = GUICtrlCreateRadio(" Use Short Name", 14, 151) Local $hRadio2 = GUICtrlCreateRadio(" Use Long Name", $iWidth - 118, 151) GUICtrlSetState(-1, $GUI_CHECKED) GUICtrlSetState($hRadio1, $GUI_DISABLE) ; check settings later GUICtrlSetState($hRadio2, $GUI_DISABLE) ; check settings later GUICtrlCreateLabel("Add More Columns", 14, 181, 222, 20) GUICtrlSetColor(-1, 0x008000) GUICtrlCreateLabel("Total Number Of Columns", 14, 182 +23, 129, 18) Local $hColNum = GUICtrlCreateInput("1", 130 +14, 182 +21, 50, 20, BitOR($WS_TABSTOP, $ES_CENTER, $ES_NUMBER)) ; REMEMBER YES, SETTINGS NO GUICtrlSetFont(-1, 10) GUICtrlCreateTabItem("") ; end tabitem definition Local $hCancel = GUICtrlCreateButton("Cancel", $iWidth -143, $iHeight -25, 66, 20) Local $hOkay = GUICtrlCreateButton("OK", $iWidth -69, $iHeight -25, 66, 20) GUISetState(@SW_SHOW) Local $msg2, $bDisable = True, $iMaxFields = 65000, $bStandardError = False, $iRows = 0, $iCols = 0, _ $sMonitorAdv = GUICtrlRead($idDate1) & GUICtrlRead($idDate2) & GUICtrlRead($hColNum), $sDate1, $sDate2, $sColsAdv, $iDiff, $bErrorAdv = False While 1 $msg2 = GUIGetMsg() If $msg2 = $hCancel Or $msg2 = $GUI_EVENT_CLOSE Then ExitLoop If BitAND(GUICtrlRead($hCheckBox), $GUI_CHECKED) == $GUI_CHECKED Then If $bDisable Then GUICtrlSetState($hRadio1, $GUI_ENABLE) GUICtrlSetState($hRadio2, $GUI_ENABLE) $bDisable = False EndIf ElseIf Not $bDisable Then GUICtrlSetState($hRadio1, $GUI_DISABLE) GUICtrlSetState($hRadio2, $GUI_DISABLE) $bDisable = True EndIf $sDate1 = GUICtrlRead($idDate1) $sDate2 = GUICtrlRead($idDate2) $sColsAdv = GUICtrlRead($hColNum) If $sDate1 & $sDate2 & $sColsAdv <> $sMonitorAdv Then $iDiff = DateRange(LocaleDateToYMD($sDate1), LocaleDateToYMD($sDate2)) ; _DateDiff() is too limited If $iDiff * $sColsAdv > $iMaxFields Then If Not $bErrorAdv Then GUICtrlSetBkColor($hColNum, 0xFFA090) $bErrorAdv = True EndIf ElseIf $bErrorAdv Then GUICtrlSetBkColor($hColNum, 0xFFFFFF) $bErrorAdv = False EndIf $sMonitorAdv = $sDate1 & $sDate2 & $sColsAdv EndIf $iCols = GUICtrlRead($hStandardCols) $iRows = GUICtrlRead($hStandardRows) If Not $iRows Then GUICtrlSetData($hStandardRows, 0) If Not $iCols Then GUICtrlSetData($hStandardCols, 0) If ControlGetFocus($hChild) <> "Edit2" And StringRegExp($iRows, '\A0+[1-9]') Then GUICtrlSetData($hStandardRows, StringRegExpReplace($iRows, '(\A0+)([1-9])(\d+)*', '$2$3')) If ControlGetFocus($hChild) <> "Edit1" And StringRegExp($iCols, '\A0+[1-9]') Then GUICtrlSetData($hStandardCols, StringRegExpReplace($iCols, '(\A0+)([1-9])(\d+)*', '$2$3')) If $iRows > $iMaxFields Or $iCols > $iMaxFields Or GUICtrlRead($hStandardRows) * GUICtrlRead($hStandardCols) > $iMaxFields Then If Not $bStandardError Then GUICtrlSetBkColor($hStandardRows, 0xFFA090) GUICtrlSetBkColor($hStandardCols, 0xFFA090) $bStandardError = True EndIf ElseIf $bStandardError Then GUICtrlSetBkColor($hStandardRows, 0xFFFFFF) GUICtrlSetBkColor($hStandardCols, 0xFFFFFF) $bStandardError = False EndIf $iCols = GUICtrlRead($hColNum) If ControlGetFocus($hChild) <> "Edit3" And StringRegExp($iCols, '\A0+[1-9]') Then GUICtrlSetData($hColNum, StringRegExpReplace($iCols, '(\A0+)([1-9])(\d+)*', '$2$3')) $iCols = GUICtrlRead($hColNum) If $iCols < 1 Then If BitAND(GUICtrlRead($hCheckBox), $GUI_CHECKED) == $GUI_CHECKED And ControlGetFocus($hChild) <> "Edit3" Then GUICtrlSetData($hColNum, 2) Else GUICtrlSetData($hColNum, 1) EndIf If ControlGetFocus($hChild) = "Edit3" Then _GUICtrlEdit_SetSel($hColNum, 0, -1) EndIf If BitAND(GUICtrlRead($hCheckBox), $GUI_CHECKED) == $GUI_CHECKED Then If $iCols < 2 And ControlGetFocus($hChild) <> "Edit3" Then If ControlGetFocus($hChild) = "Button3" Then While _IsPressed('01') Sleep(20) WEnd EndIf If BitAND(GUICtrlRead($hCheckBox), $GUI_CHECKED) == $GUI_CHECKED Then GUICtrlSetData($hColNum, 2) Else GUICtrlSetData($hColNum, 1) EndIf EndIf Else If $iCols < 1 Then GUICtrlSetData($hColNum, 1) If ControlGetFocus($hChild) = "Edit3" Then _GUICtrlEdit_SetSel($hColNum, 0, -1) EndIf EndIf If $msg2 = $hOkay Then If $bErrorAdv Then MsgBox(262160, "uh-uh!", "Out of Range") ContinueLoop EndIf Local $sDelim = '-' ; read from settings ??? [maybe] If _GUICtrlTab_GetCurFocus($idTab) = 1 Then Local $iName, $sFormat = Default ; read from settings [ini] If BitAND(GUICtrlRead($hCheckBox), $GUI_CHECKED) == $GUI_CHECKED Then $iName = (BitAND(GUICtrlRead($hRadio1), $GUI_CHECKED) == $GUI_CHECKED) ? 3 : 2 Else $iName = -1 EndIf ; $sStartDate, $sStopDate, $sDelim = Default, $sFormat = Default, $iDayName = -1) GetDateArray(GUICtrlRead($idDate1), GUICtrlRead($idDate2), $sDelim, $sFormat, $iName) ; US [, '$3-$5-$1']) EndIf EndIf WEnd GUIDelete($hChild) EndFunc ;==> NewTable Func LocaleDateToYMD($sDate, $sDelim = '/') Local $iID = _WinAPI_GetUserDefaultLCID(), _ $aDateSplit = StringRegExp(_WinAPI_GetLocaleInfo($iID, $LOCALE_SSHORTDATE), '\w+', 3), _ $vTest = True ; [floating variable] If IsArray($aDateSplit) And UBound($aDateSplit) = 3 Then For $i = 0 To 2 ; determine the locale ymd order If StringInStr($aDateSplit[$i], 'y') Then $aDateSplit[$i] = 1 ; year ElseIf StringInStr($aDateSplit[$i], 'm') Then $aDateSplit[$i] = 2 ; month ElseIf StringInStr($aDateSplit[$i], 'd') Then $aDateSplit[$i] = 3 ; day EndIf Next $vTest = $aDateSplit[0] & $aDateSplit[1] & $aDateSplit[2] If StringInStr($vTest, '1') And StringInStr($vTest, '2') And StringInStr($vTest, '3') Then $vTest = False ; success EndIf If $vTest Then ; failure with the previous method [both methods work on Win7 in the UK] Local $sLongDate = _WinAPI_GetDateFormat(0, 0, $DATE_LONGDATE) $aDateSplit = StringRegExp($sLongDate, '(*UCP)\w+', 3) If Not IsArray($aDateSplit) And UBound($aDateSplit) <> 3 Then Return SetError(1) ; undetermined error? For $i = 0 To 2 ; determine the locale ymd order If $aDateSplit[$i] = @YEAR Then ; hopefully this method will work for all international regions $aDateSplit[$i] = 1 ; year ElseIf $aDateSplit[$i] = _DateToMonth(@MON, $DMW_LOCALE_LONGNAME) Then $aDateSplit[$i] = 2 ; month Else $aDateSplit[$i] = 3 ; day EndIf Next $vTest = $aDateSplit[0] & $aDateSplit[1] & $aDateSplit[2] If Not (StringInStr($vTest, '1') And StringInStr($vTest, '2') And StringInStr($vTest, '3')) Then Return SetError(2) ; undetermined error? EndIf Local $iCount = 1, $sYMD = '' Do For $i = 0 To 2 If $aDateSplit[$i] = $iCount Then $sYMD &= '$'& ($i*2 +1) & ($iCount <> 3 ? $sDelim : '') $iCount += 1 ExitLoop EndIf Next Until $iCount = 4 Return StringRegExpReplace($sDate, '(\d+)(\D)(\d+)(\D)(\d+)', $sYMD) EndFunc Func GetDateArray($sStartDate, $sStopDate, $sDelim = Default, $sFormat = Default, $iDayName = -1) Local $iID = _WinAPI_GetUserDefaultLCID(), _ $aDateSplit = StringRegExp(_WinAPI_GetLocaleInfo($iID, $LOCALE_SSHORTDATE), '\w+', 3) If IsArray($aDateSplit) And UBound($aDateSplit) = 3 Then For $i = 0 To 2 ; determine the locale ymd order If StringInStr($aDateSplit[$i], 'y') Then $aDateSplit[$i] = 1 ; year ElseIf StringInStr($aDateSplit[$i], 'm') Then $aDateSplit[$i] = 2 ; month ElseIf StringInStr($aDateSplit[$i], 'd') Then $aDateSplit[$i] = 3 ; day EndIf Next Else ; let's try an alternative method [both methods work on Win7 in the UK] Local $sLongDate = _WinAPI_GetDateFormat(0, 0, $DATE_LONGDATE) $aDateSplit = StringRegExp($sLongDate, '(*UCP)\w+', 3) If Not IsArray($aDateSplit) And UBound($aDateSplit) <> 3 Then Return SetError(1) ; undetermined error? For $i = 0 To 2 ; determine the locale ymd order If $aDateSplit[$i] = @YEAR Then ; hopefully this method will work for all international regions $aDateSplit[$i] = 1 ; year ElseIf $aDateSplit[$i] = _DateToMonth(@MON, $DMW_LOCALE_LONGNAME) Then $aDateSplit[$i] = 2 ; month Else $aDateSplit[$i] = 3 ; day EndIf Next EndIf ; check the array contains numbers 1 to 3 Local $sTest = $aDateSplit[0] & $aDateSplit[1] & $aDateSplit[2] If Not (StringInStr($sTest, '1') And StringInStr($sTest, '2') And StringInStr($sTest, '3')) Then Return SetError(2) ; undetermined error? If $sDelim = Default Then $sDelim = StringRegExp($sStartDate, '\D+', 3)[0] Local $iCount = 1, $sYMD = '' Do For $i = 0 To 2 If $aDateSplit[$i] = $iCount Then $sYMD &= '$'& ($i*2 +1) & ($iCount <> 3 ? $sDelim : '') $iCount += 1 ExitLoop EndIf Next Until $iCount = 4 $sStartDate = StringRegExpReplace($sStartDate, '(\d+)(\D)(\d+)(\D)(\d+)', $sYMD) $sStopDate = StringRegExpReplace($sStopDate, '(\d+)(\D)(\d+)(\D)(\d+)', $sYMD) Local $vReverse = False If $sStartDate > $sStopDate Then $vReverse = $sStartDate $sStartDate = $sStopDate $sStopDate = $vReverse EndIf ; internal format = yyyy/mm/dd Local $iStartYear = Number(StringRegExp($sStartDate, '\d+', 3)[0]), $iStopYear = Number(StringRegExp($sStopDate, '\d+', 3)[0]) Local $aTimeLine[(($iStopYear - $iStartYear +1) *366)] $iCount = 0 For $iYear = $iStartYear To $iStopYear For $iMon = 1 To 12 Switch $iMon Case 1, 3, 5, 7, 8, 10, 12 For $iDay = 1 To 31 ; this section of code can be shortened $aTimeLine[$iCount] = $iYear & $sDelim & StringFormat('%02i', $iMon) & $sDelim & StringFormat('%02i', $iDay) If $aTimeLine[$iCount] = $sStopDate Then ExitLoop 3 $iCount += 1 Next Case 4, 6, 9, 11 For $iDay = 1 To 30 $aTimeLine[$iCount] = $iYear & $sDelim & StringFormat('%02i', $iMon) & $sDelim & StringFormat('%02i', $iDay) If $aTimeLine[$iCount] = $sStopDate Then ExitLoop 3 $iCount += 1 Next Case Else For $iDay = 1 To IsLeapYear($iYear) ? 29 : 28 $aTimeLine[$iCount] = $iYear & $sDelim & StringFormat('%02i', $iMon) & $sDelim & StringFormat('%02i', $iDay) If $aTimeLine[$iCount] = $sStopDate Then ExitLoop 3 $iCount += 1 Next EndSwitch Next Next ReDim $aTimeLine[$iCount +1] ; get rid of unused elements For $i = 0 To UBound($aTimeLine) -1 If $aTimeLine[$i] = $sStartDate Then _DeleteRegion($aTimeLine, 1, 0, $i) ExitLoop EndIf Next _Predim($aTimeLine, 2) ; add 2nd dimension ; adding a day name column only works for start dates up to christmas day 2999 [millennium bug], is it worth fixing? ==> [FIXED] If $iDayName <> -1 Then ; $iDayName set to -1 because _DateDayOfWeek() format values range from 0 to 3 ReDim $aTimeLine[UBound($aTimeLine)][2] Local $aNextDay, $iDayNum If $sStartDate <= ('2999' & $sDelim & '12' & $sDelim & '25') Then For $i = 0 To 6 If $i = UBound($aTimeLine) Then ExitLoop $aNextDay = StringRegExp($aTimeLine[$i][0], '\d+', 3) $iDayNum = _DateToDayOfWeek (Number($aNextDay[0]), Number($aNextDay[1]), Number($aNextDay[2])) $aTimeLine[$i][1] = _DateDayOfWeek($iDayNum, $iDayName) Next For $i = 7 To UBound($aTimeLine) -1 $aTimeLine[$i][1] = $aTimeLine[Mod($i, 7)][1] Next Else Local $aDayName[7] For $i = 25 To 31 $aNextDay = StringRegExp('2999/12/' & $i, '\d+', 3) $iDayNum = _DateToDayOfWeek (Number($aNextDay[0]), Number($aNextDay[1]), Number($aNextDay[2])) $aDayName[$i -25] = _DateDayOfWeek($iDayNum, $iDayName) Next Local $iDateDiff = DateRange('2999/12/25', StringReplace($aTimeLine[0][0], $sDelim, '/')) For $i = 0 to 6 If $i = UBound($aTimeLine) Then ExitLoop $aTimeLine[$i][1] = $aDayName[Mod($iDateDiff + $i -1, 7)] Next For $i = 7 To UBound($aTimeLine) -1 $aTimeLine[$i][1] = $aTimeLine[Mod($i, 7)][1] Next EndIf EndIf If $sFormat = Default Then ; [regexp pattern stored in ini or settings] $aDateSplit = StringRegExp($sYMD, '\d', 3) _Predim($aDateSplit, 2) ReDim $aDateSplit[3][2] $aDateSplit[0][1] = 1 $aDateSplit[1][1] = 3 $aDateSplit[2][1] = 5 _ArraySort($aDateSplit) $sFormat = '$' & $aDateSplit[0][1] & $sDelim & '$' & $aDateSplit[1][1] & $sDelim & '$' & $aDateSplit[2][1] EndIf For $i = 0 To UBound($aTimeLine) -1 $aTimeLine[$i][0] = StringRegExpReplace($aTimeLine[$i][0], '(\d+)(\D)(\d+)(\D)(\d+)', $sFormat) Next If $vReverse Then _ReverseArray($aTimeLine) _ArrayDisplay($aTimeLine) Return $aTimeLine EndFunc ;==> GetDateArray Func IsLeapYear($iYear) If Mod($iYear, 4) Then Return False ElseIf Mod($iYear, 100) Then Return True ElseIf Mod($iYear, 400) Then Return False Else Return True EndIf EndFunc ;==> IsLeapYear Func DateRange($sDate1, $sDate2) ; must be yyyy/mm/dd Local $vTemp If $sDate1 > $sDate2 Then $vTemp = $sDate1 $sDate1 = $sDate2 $sDate2 = $vTemp EndIf Local $aArray1 = StringRegExp($sDate1, '\d+', 3), $aArray2 = StringRegExp($sDate2, '\d+', 3) For $i = 0 To 2 $aArray1[$i] = Number($aArray1[$i]) $aArray2[$i] = Number($aArray2[$i]) Next Local $iCount = 0 For $i = $aArray1[0] +1 To $aArray2[0] -1 If Not IsLeapYear($i) Then $iCount += 365 Else $iCount += 366 EndIf Next Local $iFirstMonth = 0 Switch $aArray1[1] Case 1, 3, 5, 7, 8, 10, 12 $iFirstMonth = 32 - $aArray1[2] Case 4, 6, 9, 11 $iFirstMonth = 31 - $aArray1[2] Case Else $iFirstMonth = (IsLeapYear($aArray1[0]) ? 30 : 29) - $aArray1[2] EndSwitch If $aArray1[0] < $aArray2[0] Then $iCount += $iFirstMonth For $i = $aArray1[1] +1 to 12 Switch $i Case 1, 3, 5, 7, 8, 10, 12 $iCount += 31 Case 4, 6, 9, 11 $iCount += 30 Case Else $iCount += (IsLeapYear($aArray1[0]) ? 29 : 28) EndSwitch Next $iCount += $aArray2[2] ; add remaining days in final month If $aArray2[1] > 1 Then For $i = 1 To $aArray2[1] -1 Switch $i Case 1, 3, 5, 7, 8, 10, 12 $iCount += 31 Case 4, 6, 9, 11 $iCount += 30 Case Else $iCount += (IsLeapYear($aArray2[0]) ? 29 : 28) EndSwitch Next EndIf ElseIf $aArray1[1] < $aArray2[1] Then ; same year, different month $iCount += $iFirstMonth For $i = $aArray1[1] +1 to $aArray2[1] -1 Switch $i Case 1, 3, 5, 7, 8, 10, 12 $iCount += 31 Case 4, 6, 9, 11 $iCount += 30 Case Else $iCount += (IsLeapYear($aArray1[0]) ? 29 : 28) EndSwitch Next $iCount += $aArray2[2] Else ; same month and year $iCount = $aArray2[2] - $aArray1[2] +1 EndIf Return $iCount EndFunc ;==> DateRange ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Functions from ArrayWorkshop.au3 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; #Au3Stripper_Off Func _ReverseArray(ByRef $aArray, $iDimension = 1, $iStart = 0, $iEnd = -1) If Not IsArray($aArray) Or UBound($aArray, 0) > 9 Then Return SetError(1) ; not a valid array Local $aBound = __GetBounds($aArray) If @error Then Return SetError(2) ; array contains zero elements $iDimension = ($iDimension = Default) ? 1 : Int($iDimension) If $iDimension < 1 Or $iDimension > $aBound[0] Then Return SetError(3) ; dimension does not exist $iStart = ($iStart = Default) ? 0 : Int($iStart) If $iStart < 0 Or $iStart > $aBound[$iDimension] - 2 Then Return SetError(4) ; meaningless $iStart value $iEnd = ($iEnd = -1 Or $iEnd = Default) ? $aBound[$iDimension] - 1 : Int($iEnd) If $iEnd <= $iStart Or $iEnd >= $aBound[$iDimension] Then Return SetError(5) ; meaningless $iEnd value If $aBound[0] = 1 Then ___Reverse1D($aArray, $iStart, $iEnd) Else $aBound[$iDimension] = 1 Local $aRegion = ___NewArray($aBound) ; to store extracted regions For $i = 1 To $aBound[0] $aBound[$i] -= 1 Next Local $sIndices = __HiddenIndices($aBound[0], $iDimension), $fnFloodFill = __FloodFunc()[$aBound[0]], _ $sTransfer = '$aSource' & $sIndices ; array syntax While $iEnd > $iStart $fnFloodFill($aRegion, $aBound, $iDimension, 0, $iStart, $aArray, $sTransfer) ; extract the current start region $sTransfer = '$aTarget' & $sIndices $fnFloodFill($aArray, $aBound, $iDimension, $iStart, $iEnd, '', $sTransfer) ; overwrite the current start region $sTransfer = '$aSource' & $sIndices $fnFloodFill($aArray, $aBound, $iDimension, $iEnd, 0, $aRegion, $sTransfer) ; overwrite the current end region $iStart += 1 $iEnd -= 1 WEnd EndIf EndFunc ;==>_ReverseArray Func _DeleteRegion(ByRef $aArray, $iDimension = 1, $iSubIndex = 0, $iRange = 1) If Not IsArray($aArray) Then Return SetError(1) Local $aBound = __GetBounds($aArray) ; get the bounds of each dimension If @error Then Return SetError(4) ; $aArray must contain at least one element If $aBound[0] > 9 Then Return SetError(2) ; nine dimension limit $iDimension = ($iDimension = Default) ? 1 : Int($iDimension) If $iDimension > $aBound[0] Or $iDimension < 1 Then Return SetError(3) ; out of bounds dimension $iSubIndex = ($iSubIndex = Default) ? 0 : Int($iSubIndex) If $iSubIndex < 0 Or $iSubIndex > $aBound[$iDimension] - 1 Then Return SetError(5) ; sub-index does not exist in the dimension $iRange = ($iRange = Default) ? 1 : Int($iRange) If $iRange < 1 Then Return SetError(6) ; range must be greater than zero $iRange = ($iSubIndex + $iRange < $aBound[$iDimension]) ? $iRange : $aBound[$iDimension] - $iSubIndex ; corrects for overflow If $iRange = $aBound[$iDimension] Then Return SetError(6) ; deleting the whole region is not currently supported [give reason] $aBound[$iDimension] -= $iRange ; the size of the dimension in the new array If $aBound[0] = 1 Then For $iNext = $iSubIndex To $aBound[$iDimension] - 1 $aArray[$iNext] = $aArray[$iNext + $iRange] Next ReDim $aArray[$aBound[$iDimension]] Return EndIf Local $iMaxIndex = $aBound[$iDimension] - 1 For $i = 1 To $aBound[0] $aBound[$i] -= 1 Next $aBound[$iDimension] = 0 ; set to loop once [one region at a time] Local $iFrom, $sTransfer = '$aTarget' & __HiddenIndices($aBound[0], $iDimension), $fnFloodFill = __FloodFunc()[$aBound[0]] For $iNext = $iSubIndex To $iMaxIndex $iFrom = $iNext + $iRange $fnFloodFill($aArray, $aBound, $iDimension, $iNext, $iFrom, '', $sTransfer) ; overwrite the final [untouched] region Next $aBound[$iDimension] = $iMaxIndex For $i = 1 To $aBound[0] $aBound[$i] += 1 Next __ResetBounds($aArray, $aBound) ; delete remaining indices EndFunc ;==>_DeleteRegion Func _PreDim(ByRef $aArray, $iDimensions, $bPush = False) If Not IsArray($aArray) Then Return SetError(1) $iDimensions = Int($iDimensions) If $iDimensions < 1 Or $iDimensions > 9 Then Return SetError(2) Local $iPreDims = UBound($aArray, 0) ; current number of dimensions If $iPreDims = $iDimensions Then Return ; no change If $iPreDims > 9 Then Return SetError(3) ; too many dimensions Local $aBound = __GetBounds($aArray) ; get the size of each original dimension If @error Then Return SetError(4) ; $aArray must contain at least one element $aBound[0] = $iDimensions ; overwrite this value with the new number of dimensions Local $sTransfer = '[$a[1]][$a[2]][$a[3]][$a[4]][$a[5]][$a[6]][$a[7]][$a[8]][$a[9]]' ; array syntax to be sent to the remote loop region If $bPush Then ; prefix dimensions, or delete from the left Local $iOffset = Abs($iDimensions - $iPreDims) If $iPreDims > $iDimensions Then ; lower dimensions get deleted For $i = 1 To $iDimensions ; shift elements to lower indices $aBound[$i] = $aBound[$i + $iOffset] Next $sTransfer = '$aSource' & StringLeft('[0][0][0][0][0][0][0][0]', $iOffset * 3) & StringLeft($sTransfer, $iDimensions * 7) Else ; lower dimensions are created ReDim $aBound[$iDimensions + 1] ; make space for more dimensions For $i = $iDimensions To $iOffset + 1 Step -1 ; shift elements to higher indices $aBound[$i] = $aBound[$i - $iOffset] Next For $i = 1 To $iOffset ; assign the size of each additional dimension [1][1][1]... etc... $aBound[$i] = 1 Next $sTransfer = '$aSource' & StringMid($sTransfer, 1 + $iOffset * 7, $iPreDims * 7) EndIf Else ; Default behaviour = append dimensions, or delete from the right ReDim $aBound[$iDimensions + 1] ; modify the number of dimensions [according to the new array] For $i = $iPreDims + 1 To $iDimensions ; assign the size of each new dimension ...[1][1][1] etc... $aBound[$i] = 1 Next $sTransfer = '$aSource' & StringLeft($sTransfer, $iPreDims * 7) EndIf ; add or remove dimensions Local $aNewArray = ___NewArray($aBound) For $i = 1 To $iDimensions $aBound[$i] -= 1 ; convert elements to the maximum index value within each dimension Next ; access the remote loop region Local $iSubIndex = 0, $aFloodFill = __FloodFunc() $aFloodFill[$iDimensions]($aNewArray, $aBound, 0, $iSubIndex, '', $aArray, $sTransfer) $aArray = $aNewArray EndFunc ;==>_PreDim Func __GetBounds($aArray, $iHypothetical = 0) Local $iMaxDim = UBound($aArray, 0) Local $aBound[($iHypothetical ? $iHypothetical : $iMaxDim) + 1] ; [or ==> Local $aBound[9]] $aBound[0] = $iMaxDim For $i = 1 To $iMaxDim $aBound[$i] = UBound($aArray, $i) If $aBound[$i] = 0 Then Return SetError(1) Next If $iHypothetical Then For $i = $iMaxDim + 1 To $iHypothetical $aBound[$i] = 1 ; imaginary dimensions Next EndIf Return $aBound EndFunc ;==>__GetBounds Func __HiddenIndices($iBound, $iDimension) Local $sSyntax = '' ; to access elements at their original indices For $i = 1 To $iBound If $i <> $iDimension Then $sSyntax &= '[$a[' & $i & ']]' ; default ==> '$aSource[$iFrom][$a[2]][$a[3]][$a[4]][$a[5]] etc...' Else $sSyntax &= '[$iFrom]' EndIf Next Return $sSyntax EndFunc ;==>__HiddenIndices Func ___NewArray($aBound) Switch $aBound[0] Case 1 Local $aArray[$aBound[1]] Case 2 Local $aArray[$aBound[1]][$aBound[2]] Case 3 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]] Case 4 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]] Case 5 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]] Case 6 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]] Case 7 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]][$aBound[7]] Case 8 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]][$aBound[7]][$aBound[8]] Case 9 Local $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]][$aBound[7]][$aBound[8]][$aBound[9]] EndSwitch Return $aArray EndFunc ;==>___NewArray Func ___Reverse1D(ByRef $aArray, $iStart, $iStop) Local $vTemp While $iStop > $iStart $vTemp = $aArray[$iStart] $aArray[$iStart] = $aArray[$iStop] $aArray[$iStop] = $vTemp $iStart += 1 $iStop -= 1 WEnd EndFunc ;==>___Reverse1D Func __ResetBounds(ByRef $aArray, $aBound) Switch $aBound[0] Case 1 ReDim $aArray[$aBound[1]] Case 2 ReDim $aArray[$aBound[1]][$aBound[2]] Case 3 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]] Case 4 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]] Case 5 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]] Case 6 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]] Case 7 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]][$aBound[7]] Case 8 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]][$aBound[7]][$aBound[8]] Case 9 ReDim $aArray[$aBound[1]][$aBound[2]][$aBound[3]][$aBound[4]][$aBound[5]][$aBound[6]][$aBound[7]][$aBound[8]][$aBound[9]] EndSwitch EndFunc ;==>__ResetBounds Func __FloodFunc() ; [modified for this demo] Local $aFloodFunc = ['', ___Flood1D, ___Flood2D] ; , ___Flood3D, ___Flood4D, ___Flood5D, ___Flood6D, ___Flood7D, ___Flood8D, ___Flood9D] Return $aFloodFunc EndFunc ;==>__FloodFunc Func ___Flood1D(ByRef $aTarget, $aBound, $iDimension, $iSubIndex, $iFrom, $aSource, $sTransfer) ; [still experimental] #forceref $iDimension, $iFrom, $aSource ; $iDimension would normally not apply here (special case) Local $a[10] = ['', 0, 0, 0, 0, 0, 0, 0, 0, 0] ; loop iteration count [or indices of higher dimensions within the source array] For $a[1] = $iSubIndex To $aBound[1] ; from the start to the bounds of the 1st dimension (special case) ; only one operation is needed in this special case $aTarget[$a[1]] = Execute($sTransfer) ; hidden parameters may appear in the code being executed Next EndFunc ;==>___Flood1D Func ___Flood2D(ByRef $aTarget, $aBound, $iDimension, $iSubIndex, $iFrom, $aSource, $sTransfer) #forceref $iFrom, $aSource ; hidden parameters Local $a[10] = ['', 0, 0, 0, 0, 0, 0, 0, 0, 0] ; loop iteration count [or indices of higher dimensions within the source array] For $a[2] = 0 To $aBound[2] For $a[1] = 0 To $aBound[1] $a[$iDimension] = $iSubIndex ; override the iteration count (fast method) - $a[0] has no influence $aTarget[$a[1]][$a[2]] = Execute($sTransfer) ; hidden parameters may appear in the code being executed Next Next EndFunc ;==>___Flood2D #Au3Stripper_On  
    • PELock
      By PELock
      I'm trying to use #OnAutoItStartRegister to modify the Global variable, but it seems it doesn't work, is that on purpose, that those callback functions cannot modify anything except in their own scope?
    • Chimp
      By Chimp
      This is for extraction of data from HTML tables to an array.
      It uses an raw html source file as input, and does not relies on any browser.
      You can get the source of the html using commands like InetGet(), InetRead(), _INetGetSource(), _IEDocReadHTML() for example, or load an html file from disc as well.
      It also takes care of the data position in the table due to rowspan and colspan trying to keep the same layout in the generated array.
      It has the option to fill the cells in the array corresponding with the "span" zones all with the same value of the first "span" cell of the corresponding area.
      ; save this as _HtmlTable2Array.au3 #include-once #include <array.au3> ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetList ; Description ...: Finds and enumerates all the html tables contained in an html listing (even if nested). ; if the optional parameter $i_index is passed, then only that table is returned ; Syntax ........: _HtmlTableGetList($sHtml[, $i_index = -1]) ; Parameters ....: $sHtml - A string value containing an html page listing ; $i_index - [optional] An integer value indicating the number of the table to be returned (1 based) ; with the default value of -1 an array with all found tables is returned ; Return values .: Success; Returns an 1D 1 based array containing all or single html table found in the html. ; element [0] (and @extended as well) contains the number of tables found (or 0 if no tables are returned) ; if an error occurs then an ampty string is returned and the following @error code is setted ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetList($sHtml, $i_index = -1) Local $aTables = _ParseTags($sHtml, "<table", "</table>") If @error Then Return SetError(@error, 0, "") ElseIf $i_index = -1 Then Return SetError(0, $aTables[0], $aTables) Else If $i_index > 0 And $i_index <= $aTables[0] Then Local $aTemp[2] = [1, $aTables[$i_index]] Return SetError(0, 1, $aTemp) Else Return SetError(4, 0, "") ; bad index EndIf EndIf EndFunc ;==>_HtmlTableGetList ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableWriteToArray ; Description ...: It writes values from an html table to a 2D array. It tries to take care of the rowspan and colspan formats ; Syntax ........: _HtmlTableWriteToArray($sHtmlTable[, $bFillSpan = False[, $iFilter = 0]]) ; Parameters ....: $sHtmlTable - A string value containing the html code of the table to be parsed ; $bFillSpan - [optional] Default is False. If span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: Success: 2D array containing data from the html table ; Faillure: An empty strimg and sets @error as following: ; @error: 1 - no table content is present in the passed HTML ; 2 - error while parsing rows and/or columns, (opening and closing tags are not balanced) ; 3 - error while parsing rows and/or columns, (open/close mismatch error) ; =============================================================================================================================== Func _HtmlTableWriteToArray($sHtmlTable, $bFillSpan = False, $iFilter = 0) $sHtmlTable = StringReplace(StringReplace($sHtmlTable, "<th", "<td"), "</th>", "</td>") ; th becomes td ; rows of the wanted table Local $iError, $aTempEmptyRow[2] = [1, ""] Local $aRows = _ParseTags($sHtmlTable, "<tr", "</tr>") ; $aRows[0] = nr. of rows If @error Then Return SetError(@error, 0, "") Local $aCols[$aRows[0] + 1], $aTemp For $i = 1 To $aRows[0] $aTemp = _ParseTags($aRows[$i], "<td", "</td>") $iError = @error If $iError = 1 Then ; check if it's an empty row $aTemp = $aTempEmptyRow ; Empty Row Else If $iError Then Return SetError($iError, 0, "") EndIf If $aCols[0] < $aTemp[0] Then $aCols[0] = $aTemp[0] ; $aTemp[0] = max nr. of columns in table $aCols[$i] = $aTemp Next Local $aResult[$aRows[0]][$aCols[0]], $iStart, $iEnd, $aRowspan, $aColspan, $iSpanY, $iSpanX, $iSpanRow, $iSpanCol, $iMarkerCode, $sCellContent Local $aMirror = $aResult For $i = 1 To $aRows[0] ; scan all rows in this table $aTemp = $aCols[$i] ; <td ..> xx </td> ..... For $ii = 1 To $aTemp[0] ; scan all cells in this row $iSpanY = 0 $iSpanX = 0 $iY = $i - 1 ; zero base index for vertical ref $iX = $ii - 1 ; zero based indexes for horizontal ref ; following RegExp kindly provided by SadBunny in this post: ; http://www.autoitscript.com/forum/topic/167174-how-to-get-a-number-located-after-a-name-from-within-a-string/?p=1222781 $aRowspan = StringRegExp($aTemp[$ii], "(?i)rowspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of rowspan If IsArray($aRowspan) Then $iSpanY = $aRowspan[0] - 1 If $iSpanY + $iY > $aRows[0] Then $iSpanY -= $iSpanY + $iY - $aRows[0] + 1 EndIf EndIf ; $aColspan = StringRegExp($aTemp[$ii], "(?i)colspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of colspan If IsArray($aColspan) Then $iSpanX = $aColspan[0] - 1 ; $iMarkerCode += 1 ; code to mark this span area or single cell If $iSpanY Or $iSpanX Then $iX1 = $iX For $iSpY = 0 To $iSpanY For $iSpX = 0 To $iSpanX $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf $iSpanCol = $iX1 + $iSpX If $iSpanCol > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf ; While $aMirror[$iSpanRow][$iX1 + $iSpX] ; search first free column $iX1 += 1 ; $iSpanCol += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf WEnd Next Next EndIf ; $iX1 = $iX ; following RegExp kindly provided by mikell in this post: ; http://www.autoitscript.com/forum/topic/167309-how-to-remove-from-a-string-all-between-and-pairs/?p=1224207 $sCellContent = StringRegExpReplace($aTemp[$ii], '<[^>]+>', "") If $iFilter Then $sCellContent = _HTML_Filter($sCellContent, $iFilter) For $iSpX = 0 To $iSpanX For $iSpY = 0 To $iSpanY $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf While $aMirror[$iSpanRow][$iX1 + $iSpX] $iX1 += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][$iX1 + $iSpX + 1] ReDim $aMirror[$aRows[0]][$iX1 + $iSpX + 1] EndIf WEnd $aMirror[$iSpanRow][$iX1 + $iSpX] = $iMarkerCode ; 1 If $bFillSpan Then $aResult[$iSpanRow][$iX1 + $iSpX] = $sCellContent Next $aResult[$iY][$iX1] = $sCellContent Next Next Next ; _ArrayDisplay($aMirror, "Debug") Return SetError(0, $aResult[0][0], $aResult) EndFunc ;==>_HtmlTableWriteToArray ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetWriteToArray ; Description ...: extract the html code of the required table from the html listing and copy the data of the table to a 2D array ; Syntax ........: _HtmlTableGetWriteToArray($sHtml[, $iWantedTable = 1[, $bFillSpan = False[, $iFilter = 0]]]) ; Parameters ....: $sHtml - A string value containing the html listing ; $iWantedTable - [optional] An integer value. The nr. of the table to be parsed (default is first table) ; $bFillSpan - [optional] Default is False. If all span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: success: 2D array containing data from the wanted html table. ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetWriteToArray($sHtml, $iWantedTable = 1, $bFillSpan = False, $iFilter = 0) Local $aSingleTable = _HtmlTableGetList($sHtml, $iWantedTable) If @error Then Return SetError(@error, 0, "") Local $aTableData = _HtmlTableWriteToArray($aSingleTable[1], $bFillSpan, $iFilter) If @error Then Return SetError(@error, 0, "") Return SetError(0, $aTableData[0][0], $aTableData) EndFunc ;==>_HtmlTableGetWriteToArray ; #FUNCTION# ==================================================================================================================== ; Name ..........: _ParseTags ; Description ...: searches and extract all portions of html code within opening and closing tags inclusive. ; Returns an array containing a collection of <tag ...... </tag> lines. one in each element (even if are nested) ; Syntax ........: _ParseTags($sHtml, $sOpening, $sClosing) ; Parameters ....: $sHtml - A string value containing the html listing ; $sOpening - A string value indicating the opening tag ; $sClosing - A string value indicating the closing tag ; Return values .: success: an 1D 1 based array containing all the portions of html code representing the element ; element [0] af the array (and @extended as well) contains the counter of found elements ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _ParseTags($sHtml, $sOpening, $sClosing) ; example: $sOpening = '<table', $sClosing = '</table>' ; it finds how many of such tags are on the HTML page StringReplace($sHtml, $sOpening, $sOpening) ; in @xtended nr. of occurences Local $iNrOfThisTag = @extended ; I assume that opening <tag and closing </tag> tags are balanced (as should be) ; (so NO check is made to see if they are actually balanced) If $iNrOfThisTag Then ; if there is at least one of this tag ; $aThisTagsPositions array will contain the positions of the ; starting <tag and ending </tag> tags within the HTML Local $aThisTagsPositions[$iNrOfThisTag * 2 + 1][3] ; 1 based (make room for all open and close tags) ; 2) find in the HTML the positions of the $sOpening <tag and $sClosing </tag> tags For $i = 1 To $iNrOfThisTag $aThisTagsPositions[$i][0] = StringInStr($sHtml, $sOpening, 0, $i) ; start position of $i occurrence of <tag opening tag $aThisTagsPositions[$i][1] = $sOpening ; it marks which kind of tag is this $aThisTagsPositions[$i][2] = $i ; nr of this tag $aThisTagsPositions[$iNrOfThisTag + $i][0] = StringInStr($sHtml, $sClosing, 0, $i) + StringLen($sClosing) - 1 ; end position of $i^ occurrence of </tag> closing tag $aThisTagsPositions[$iNrOfThisTag + $i][1] = $sClosing ; it marks which kind of tag is this Next _ArraySort($aThisTagsPositions, 0, 1) ; now all opening and closing tags are in the same sequence as them appears in the HTML Local $aStack[UBound($aThisTagsPositions)][2] Local $aTags[Ceiling(UBound($aThisTagsPositions) / 2)] ; will contains the collection of <tag ..... </tag> from the html For $i = 1 To UBound($aThisTagsPositions) - 1 If $aThisTagsPositions[$i][1] = $sOpening Then ; opening <tag $aStack[0][0] += 1 ; nr of tags in html $aStack[$aStack[0][0]][0] = $sOpening $aStack[$aStack[0][0]][1] = $i ElseIf $aThisTagsPositions[$i][1] = $sClosing Then ; a closing </tag> was found If Not $aStack[0][0] Or Not ($aStack[$aStack[0][0]][0] = $sOpening And $aThisTagsPositions[$i][1] = $sClosing) Then Return SetError(3, 0, "") ; Open/Close mismatch error Else ; pair detected (the reciprocal tag) ; now get coordinates of the 2 tags ; 1) extract this tag <tag ..... </tag> from the html to the array $aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]] = StringMid($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0], 1 + $aThisTagsPositions[$i][0] - $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0]) ; 2) remove that tag <tag ..... </tag> from the html $sHtml = StringLeft($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0] - 1) & StringMid($sHtml, $aThisTagsPositions[$i][0] + 1) ; 3) adjust the references to the new positions of remaining tags For $ii = $i To UBound($aThisTagsPositions) - 1 $aThisTagsPositions[$ii][0] -= StringLen($aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]]) Next $aStack[0][0] -= 1 ; nr of tags still in html EndIf EndIf Next If Not $aStack[0][0] Then ; all tags where parsed correctly $aTags[0] = $iNrOfThisTag Return SetError(0, $iNrOfThisTag, $aTags) ; OK Else Return SetError(2, 0, "") ; opening and closing tags are not balanced EndIf Else Return SetError(1, 0, "") ; there are no of such tags on this HTML page EndIf EndFunc ;==>_ParseTags ; #============================================================================= ; Name ..........: _HTML_Filter ; Description ...: Filter for strings ; AutoIt Version : V3.3.0.0 ; Syntax ........: _HTML_Filter(ByRef $sString[, $iMode = 0]) ; Parameter(s): .: $sString - String to filter ; $iMode - Optional: (Default = 0) : removes nothing ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return Value ..: Success - Filterd String ; Failure - Input String ; Author(s) .....: Thorsten Willert, Stephen Podhajecki {gehossafats at netmdc. com} _ConvertEntities ; Date ..........: Wed Jan 27 20:49:59 CET 2010 ; modified ......: by Chimp Removed a double "&nbsp;" entities declaration, ; replace it with char(160) instead of chr(32), ; declaration of the $aEntities array as Static instead of just Local ; ============================================================================== Func _HTML_Filter(ByRef $sString, $iMode = 0) If $iMode = 0 Then Return $sString ;16 simple HTML tag / entities converter If $iMode >= 16 And $iMode < 32 Then Static Local $aEntities[95][2] = [["&quot;", 34],["&amp;", 38],["&lt;", 60],["&gt;", 62],["&nbsp;", 160] _ ,["&iexcl;", 161],["&cent;", 162],["&pound;", 163],["&curren;", 164],["&yen;", 165],["&brvbar;", 166] _ ,["&sect;", 167],["&uml;", 168],["&copy;", 169],["&ordf;", 170],["&not;", 172],["&shy;", 173] _ ,["&reg;", 174],["&macr;", 175],["&deg;", 176],["&plusmn;", 177],["&sup2;", 178],["&sup3;", 179] _ ,["&acute;", 180],["&micro;", 181],["&para;", 182],["&middot;", 183],["&cedil;", 184],["&sup1;", 185] _ ,["&ordm;", 186],["&raquo;", 187],["&frac14;", 188],["&frac12;", 189],["&frac34;", 190],["&iquest;", 191] _ ,["&Agrave;", 192],["&Aacute;", 193],["&Atilde;", 195],["&Auml;", 196],["&Aring;", 197],["&AElig;", 198] _ ,["&Ccedil;", 199],["&Egrave;", 200],["&Eacute;", 201],["&Ecirc;", 202],["&Igrave;", 204],["&Iacute;", 205] _ ,["&Icirc;", 206],["&Iuml;", 207],["&ETH;", 208],["&Ntilde;", 209],["&Ograve;", 210],["&Oacute;", 211] _ ,["&Ocirc;", 212],["&Otilde;", 213],["&Ouml;", 214],["&times;", 215],["&Oslash;", 216],["&Ugrave;", 217] _ ,["&Uacute;", 218],["&Ucirc;", 219],["&Uuml;", 220],["&Yacute;", 221],["&THORN;", 222],["&szlig;", 223] _ ,["&agrave;", 224],["&aacute;", 225],["&acirc;", 226],["&atilde;", 227],["&auml;", 228],["&aring;", 229] _ ,["&aelig;", 230],["&ccedil;", 231],["&egrave;", 232],["&eacute;", 233],["&ecirc;", 234],["&euml;", 235] _ ,["&igrave;", 236],["&iacute;", 237],["&icirc;", 238],["&iuml;", 239],["&eth;", 240],["&ntilde;", 241] _ ,["&ograve;", 242],["&oacute;", 243],["&ocirc;", 244],["&otilde;", 245],["&ouml;", 246],["&divide;", 247] _ ,["&oslash;", 248],["&ugrave;", 249],["&uacute;", 250],["&ucirc;", 251],["&uuml;", 252],["&thorn;", 254]] $sString = StringRegExpReplace($sString, '(?i)<p.*?>', @CRLF & @CRLF) $sString = StringRegExpReplace($sString, '(?i)<br>', @CRLF) Local $iE = UBound($aEntities) - 1 For $x = 0 To $iE $sString = StringReplace($sString, $aEntities[$x][0], Chr($aEntities[$x][1]), 0, 2) Next For $x = 32 To 255 $sString = StringReplace($sString, "&#" & $x & ";", Chr($x)) Next $iMode -= 16 EndIf ;8 Tag filter If $iMode >= 8 And $iMode < 16 Then ;$sString = StringRegExpReplace($sString, '<script.*?>.*?</script>', "") $sString = StringRegExpReplace($sString, "<[^>]*>", "") $iMode -= 8 EndIf ; 4 remove all double cr, lf If $iMode >= 4 And $iMode < 8 Then $sString = StringRegExpReplace($sString, "([ \t]*[\n\r]+[ \t]*)", @CRLF) $sString = StringRegExpReplace($sString, "[\n\r]+", @CRLF) $iMode -= 4 EndIf ; 2 remove all double withespaces If $iMode = 2 Or $iMode = 3 Then $sString = StringRegExpReplace($sString, "[[:blank:]]+", " ") $sString = StringRegExpReplace($sString, "\n[[:blank:]]+", @CRLF) $sString = StringRegExpReplace($sString, "[[:blank:]]+\n", "") $iMode -= 2 EndIf ; 1 remove all non ASCII (remove all chars with ascii code > 127) If $iMode = 1 Then $sString = StringRegExpReplace($sString, "[^\x00-\x7F]", " ") EndIf Return $sString EndFunc ;==>_HTML_Filter This simple demo allow to test those functions, showing what it can extract from the html tables in a web page of your choice or loading the html file from the disc.
      ; #include <_HtmlTable2Array.au3> ; <--- udf already included (hard coded) at bottom of this demo #include <GUIConstantsEx.au3> #include <EditConstants.au3> #include <WindowsConstants.au3> #include <File.au3> ; needed for _FileWriteFromArray() #include <array.au3> #include <IE.au3> Local $oIE1 = _IECreateEmbedded(), $oIE2 = _IECreateEmbedded(), $iFilter = 0 Local $sHtml_File, $iIndex, $aTable, $aMyArray, $sFilePath GUICreate("Html tables to array demo", 1000, 450, (@DesktopWidth - 1000) / 2, (@DesktopHeight - 450) / 2 _ , $WS_OVERLAPPEDWINDOW + $WS_CLIPSIBLINGS + $WS_CLIPCHILDREN) GUICtrlCreateObj($oIE1, 010, 10, 480, 360) ; left browser GUICtrlCreateTab(500, 10, 480, 360) GUICtrlCreateTabItem("view table") GUICtrlCreateObj($oIE2, 502, 33, 474, 335) ; right browser GUICtrlCreateTabItem("view html") Local $idLabel_HtmlTable = GUICtrlCreateInput("", 502, 33, 474, 335, $ES_MULTILINE + $ES_AUTOVSCROLL) GUICtrlSetFont(-1, 10, 0, 0, "Courier new") GUICtrlCreateTabItem("") Local $idInputUrl = GUICtrlCreateInput("", 10, 380, 440, 20) Local $idButton_Go = GUICtrlCreateButton("Go", 455, 380, 25, 20) Local $idButton_Load = GUICtrlCreateButton("Load html from disk", 10, 410, 480, 30) Local $idButton_Prev = GUICtrlCreateButton("Prev <-", 510, 375, 50, 30) Local $idLabel_NunTable = GUICtrlCreateLabel("00 / 00", 570, 375, 40, 30) GUICtrlSetFont(-1, 9, 700) Local $idButton_Next = GUICtrlCreateButton("Next ->", 620, 375, 50, 30) GUICtrlCreateGroup("Fill Span", 680, 370, 80, 40) Local $iFillSpan = GUICtrlCreateCheckbox("", 715, 388, 15, 15) GUICtrlCreateGroup("", -99, -99, 1, 1) ;close group Local $idButton_Array0 = GUICtrlCreateButton("Preview array", 770, 375, 100, 30) Local $idButton_Array1 = GUICtrlCreateButton("Write array to file", 880, 375, 100, 30) ; options for filtering GUICtrlCreateGroup("Filters", 510, 410, 470, 35) Local $iFilter01 = GUICtrlCreateCheckbox("non ascii", 520, 425, 85, 15) Local $iFilter02 = GUICtrlCreateCheckbox("double spaces", 610, 425, 85, 15) Local $iFilter04 = GUICtrlCreateCheckbox("double @LF", 700, 425, 85, 15) Local $iFilter08 = GUICtrlCreateCheckbox("html-tags", 790, 425, 85, 15) Local $iFilter16 = GUICtrlCreateCheckbox("tags to entities", 880, 425, 85, 15) GUICtrlCreateGroup("", -99, -99, 1, 1) ;close group GUISetState(@SW_SHOW) ;Show GUI ; _IEDocWriteHTML($oIE2, "<HTML></HTML>") GUICtrlSetData($idInputUrl, "http://www.danshort.com/HTMLentities/") ; GUICtrlSetData($idInputUrl, "http://www.mojotoad.com/sisk/projects/HTML-TableExtract/tables.html") ; example page ControlClick("", "", $idButton_Go) ; _IEAction($oIE1, "stop") Do; Waiting for user to close the window $iMsg = GUIGetMsg() Select Case $iMsg = $idButton_Go _IENavigate($oIE1, GUICtrlRead($idInputUrl)) ; _IEAction($oIE1, "stop") $aTables = _HtmlTableGetList(_IEBodyReadHTML($oIE1)) If Not @error Then ; _ArrayDisplay($aTables, "Tables contained in this html") $iIndex = 1 _IEBodyWriteHTML($oIE2, "<html>" & $aTables[$iIndex] & "</html>") ControlClick("", "", $idButton_Prev) _IEAction($oIE2, "stop") Else MsgBox(0, 0, "@error " & @error) EndIf Case $iMsg = $idButton_Load ConsoleWrite("$idButton_Load" & @CRLF) $sHtml_File = FileOpenDialog("Choose an html file", @ScriptDir & "\", "html page (*.htm;*.html)") If Not @error Then GUICtrlSetData($idInputUrl, $sHtml_File) ControlClick("", "", $idButton_Go) EndIf Case $iMsg = $idButton_Next If IsArray($aTables) Then $iIndex += $iIndex < $aTables[0] GUICtrlSetData($idLabel_NunTable, "Table" & @CRLF & $iIndex & " / " & $aTables[0]) GUICtrlSetData($idLabel_HtmlTable, $aTables[$iIndex]) _IEBodyWriteHTML($oIE2, "<html>" & $aTables[$iIndex] & "</html>") _IEAction($oIE2, "stop") EndIf Case $iMsg = $idButton_Prev If IsArray($aTables) Then $iIndex -= $iIndex > 1 GUICtrlSetData($idLabel_NunTable, "Table" & @CRLF & $iIndex & " / " & $aTables[0]) GUICtrlSetData($idLabel_HtmlTable, $aTables[$iIndex]) _IEBodyWriteHTML($oIE2, "<html>" & $aTables[$iIndex] & "</html>") _IEAction($oIE2, "stop") EndIf Case $iMsg = $idButton_Array0 ; Preview Array If IsArray($aTables) Then $iFilter = 1 * _IsChecked($iFilter01) + 2 * _IsChecked($iFilter02) + 4 * _IsChecked($iFilter04) + 8 * _IsChecked($iFilter08) + 16 * _IsChecked($iFilter16) $aMyArray = _HtmlTableWriteToArray($aTables[$iIndex], _IsChecked($iFillSpan), $iFilter) If Not @error Then _ArrayDisplay($aMyArray) EndIf Case $iMsg = $idButton_Array1 ; Saves the array in a csv file of your choice If IsArray($aTables) Then $iFilter = 1 * _IsChecked($iFilter01) + 2 * _IsChecked($iFilter02) + 4 * _IsChecked($iFilter04) + 8 * _IsChecked($iFilter08) + 16 * _IsChecked($iFilter16) $aMyArray = _HtmlTableWriteToArray($aTables[$iIndex], _IsChecked($iFillSpan), $iFilter) If Not @error Then $sFilePath = FileSaveDialog("Choose a file to save to", @ScriptDir, "(*.csv)") If $sFilePath <> "" Then If Not _FileWriteFromArray($sFilePath, $aMyArray, 0, Default, ",") Then MsgBox(0, "Error on file write", "Error code is " & @error & @CRLF & @CRLF & "@error meaning:" & @CRLF & _ "1 - Error opening specified file" & @CRLF & _ "2 - $aArray is not an array" & @CRLF & _ "3 - Error writing to file" & @CRLF & _ "4 - $aArray is not a 1D or 2D array" & @CRLF & _ "5 - Start index is greater than the $iUbound parameter") EndIf EndIf EndIf EndIf EndSelect Until $iMsg = $GUI_EVENT_CLOSE GUIDelete() ; returns 1 if CheckBox is checked Func _IsChecked($idControlID) ; $GUI_CHECKED = 1 Return GUICtrlRead($idControlID) = $GUI_CHECKED EndFunc ;==>_IsChecked ; ------------------------------------------------------------------------ ; Following code should be included by the #include <_HtmlTable2Array.au3> ; hard coded here for easy load an run to try the example ; ------------------------------------------------------------------------ #include-once #include <array.au3> ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetList ; Description ...: Finds and enumerates all the html tables contained in an html listing (even if nested). ; if the optional parameter $i_index is passed, then only that table is returned ; Syntax ........: _HtmlTableGetList($sHtml[, $i_index = -1]) ; Parameters ....: $sHtml - A string value containing an html page listing ; $i_index - [optional] An integer value indicating the number of the table to be returned (1 based) ; with the default value of -1 an array with all found tables is returned ; Return values .: Success; Returns an 1D 1 based array containing all or single html table found in the html. ; element [0] (and @extended as well) contains the number of tables found (or 0 if no tables are returned) ; if an error occurs then an ampty string is returned and the following @error code is setted ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetList($sHtml, $i_index = -1) Local $aTables = _ParseTags($sHtml, "<table", "</table>") If @error Then Return SetError(@error, 0, "") ElseIf $i_index = -1 Then Return SetError(0, $aTables[0], $aTables) Else If $i_index > 0 And $i_index <= $aTables[0] Then Local $aTemp[2] = [1, $aTables[$i_index]] Return SetError(0, 1, $aTemp) Else Return SetError(4, 0, "") ; bad index EndIf EndIf EndFunc ;==>_HtmlTableGetList ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableWriteToArray ; Description ...: It writes values from an html table to a 2D array. It tries to take care of the rowspan and colspan formats ; Syntax ........: _HtmlTableWriteToArray($sHtmlTable[, $bFillSpan = False[, $iFilter = 0]]) ; Parameters ....: $sHtmlTable - A string value containing the html code of the table to be parsed ; $bFillSpan - [optional] Default is False. If span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: Success: 2D array containing data from the html table ; Faillure: An empty strimg and sets @error as following: ; @error: 1 - no table content is present in the passed HTML ; 2 - error while parsing rows and/or columns, (opening and closing tags are not balanced) ; 3 - error while parsing rows and/or columns, (open/close mismatch error) ; =============================================================================================================================== Func _HtmlTableWriteToArray($sHtmlTable, $bFillSpan = False, $iFilter = 0) $sHtmlTable = StringReplace(StringReplace($sHtmlTable, "<th", "<td"), "</th>", "</td>") ; th becomes td ; rows of the wanted table Local $iError, $aTempEmptyRow[2] = [1, ""] Local $aRows = _ParseTags($sHtmlTable, "<tr", "</tr>") ; $aRows[0] = nr. of rows If @error Then Return SetError(@error, 0, "") Local $aCols[$aRows[0] + 1], $aTemp For $i = 1 To $aRows[0] $aTemp = _ParseTags($aRows[$i], "<td", "</td>") $iError = @error If $iError = 1 Then ; check if it's an empty row $aTemp = $aTempEmptyRow ; Empty Row Else If $iError Then Return SetError($iError, 0, "") EndIf If $aCols[0] < $aTemp[0] Then $aCols[0] = $aTemp[0] ; $aTemp[0] = max nr. of columns in table $aCols[$i] = $aTemp Next Local $aResult[$aRows[0]][$aCols[0]], $iStart, $iEnd, $aRowspan, $aColspan, $iSpanY, $iSpanX, $iSpanRow, $iSpanCol, $iMarkerCode, $sCellContent Local $aMirror = $aResult For $i = 1 To $aRows[0] ; scan all rows in this table $aTemp = $aCols[$i] ; <td ..> xx </td> ..... For $ii = 1 To $aTemp[0] ; scan all cells in this row $iSpanY = 0 $iSpanX = 0 $iY = $i - 1 ; zero base index for vertical ref $iX = $ii - 1 ; zero based indexes for horizontal ref ; following RegExp kindly provided by SadBunny in this post: ; http://www.autoitscript.com/forum/topic/167174-how-to-get-a-number-located-after-a-name-from-within-a-string/?p=1222781 $aRowspan = StringRegExp($aTemp[$ii], "(?i)rowspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of rowspan If IsArray($aRowspan) Then $iSpanY = $aRowspan[0] - 1 If $iSpanY + $iY > $aRows[0] Then $iSpanY -= $iSpanY + $iY - $aRows[0] + 1 EndIf EndIf ; $aColspan = StringRegExp($aTemp[$ii], "(?i)colspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of colspan If IsArray($aColspan) Then $iSpanX = $aColspan[0] - 1 ; $iMarkerCode += 1 ; code to mark this span area or single cell If $iSpanY Or $iSpanX Then $iX1 = $iX For $iSpY = 0 To $iSpanY For $iSpX = 0 To $iSpanX $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf $iSpanCol = $iX1 + $iSpX If $iSpanCol > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf ; While $aMirror[$iSpanRow][$iX1 + $iSpX] ; search first free column $iX1 += 1 ; $iSpanCol += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf WEnd Next Next EndIf ; $iX1 = $iX ; following RegExp kindly provided by mikell in this post: ; http://www.autoitscript.com/forum/topic/167309-how-to-remove-from-a-string-all-between-and-pairs/?p=1224207 $sCellContent = StringRegExpReplace($aTemp[$ii], '<[^>]+>', "") If $iFilter Then $sCellContent = _HTML_Filter($sCellContent, $iFilter) For $iSpX = 0 To $iSpanX For $iSpY = 0 To $iSpanY $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf While $aMirror[$iSpanRow][$iX1 + $iSpX] $iX1 += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][$iX1 + $iSpX + 1] ReDim $aMirror[$aRows[0]][$iX1 + $iSpX + 1] EndIf WEnd $aMirror[$iSpanRow][$iX1 + $iSpX] = $iMarkerCode ; 1 If $bFillSpan Then $aResult[$iSpanRow][$iX1 + $iSpX] = $sCellContent Next $aResult[$iY][$iX1] = $sCellContent Next Next Next ; _ArrayDisplay($aMirror, "Debug") Return SetError(0, $aResult[0][0], $aResult) EndFunc ;==>_HtmlTableWriteToArray ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetWriteToArray ; Description ...: extract the html code of the required table from the html listing and copy the data of the table to a 2D array ; Syntax ........: _HtmlTableGetWriteToArray($sHtml[, $iWantedTable = 1[, $bFillSpan = False[, $iFilter = 0]]]) ; Parameters ....: $sHtml - A string value containing the html listing ; $iWantedTable - [optional] An integer value. The nr. of the table to be parsed (default is first table) ; $bFillSpan - [optional] Default is False. If all span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: success: 2D array containing data from the wanted html table. ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetWriteToArray($sHtml, $iWantedTable = 1, $bFillSpan = False, $iFilter = 0) Local $aSingleTable = _HtmlTableGetList($sHtml, $iWantedTable) If @error Then Return SetError(@error, 0, "") Local $aTableData = _HtmlTableWriteToArray($aSingleTable[1], $bFillSpan, $iFilter) If @error Then Return SetError(@error, 0, "") Return SetError(0, $aTableData[0][0], $aTableData) EndFunc ;==>_HtmlTableGetWriteToArray ; #FUNCTION# ==================================================================================================================== ; Name ..........: _ParseTags ; Description ...: searches and extract all portions of html code within opening and closing tags inclusive. ; Returns an array containing a collection of <tag ...... </tag> lines. one in each element (even if are nested) ; Syntax ........: _ParseTags($sHtml, $sOpening, $sClosing) ; Parameters ....: $sHtml - A string value containing the html listing ; $sOpening - A string value indicating the opening tag ; $sClosing - A string value indicating the closing tag ; Return values .: success: an 1D 1 based array containing all the portions of html code representing the element ; element [0] af the array (and @extended as well) contains the counter of found elements ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _ParseTags($sHtml, $sOpening, $sClosing) ; example: $sOpening = '<table', $sClosing = '</table>' ; it finds how many of such tags are on the HTML page StringReplace($sHtml, $sOpening, $sOpening) ; in @xtended nr. of occurences Local $iNrOfThisTag = @extended ; I assume that opening <tag and closing </tag> tags are balanced (as should be) ; (so NO check is made to see if they are actually balanced) If $iNrOfThisTag Then ; if there is at least one of this tag ; $aThisTagsPositions array will contain the positions of the ; starting <tag and ending </tag> tags within the HTML Local $aThisTagsPositions[$iNrOfThisTag * 2 + 1][3] ; 1 based (make room for all open and close tags) ; 2) find in the HTML the positions of the $sOpening <tag and $sClosing </tag> tags For $i = 1 To $iNrOfThisTag $aThisTagsPositions[$i][0] = StringInStr($sHtml, $sOpening, 0, $i) ; start position of $i occurrence of <tag opening tag $aThisTagsPositions[$i][1] = $sOpening ; it marks which kind of tag is this $aThisTagsPositions[$i][2] = $i ; nr of this tag $aThisTagsPositions[$iNrOfThisTag + $i][0] = StringInStr($sHtml, $sClosing, 0, $i) + StringLen($sClosing) - 1 ; end position of $i^ occurrence of </tag> closing tag $aThisTagsPositions[$iNrOfThisTag + $i][1] = $sClosing ; it marks which kind of tag is this Next _ArraySort($aThisTagsPositions, 0, 1) ; now all opening and closing tags are in the same sequence as them appears in the HTML Local $aStack[UBound($aThisTagsPositions)][2] Local $aTags[Ceiling(UBound($aThisTagsPositions) / 2)] ; will contains the collection of <tag ..... </tag> from the html For $i = 1 To UBound($aThisTagsPositions) - 1 If $aThisTagsPositions[$i][1] = $sOpening Then ; opening <tag $aStack[0][0] += 1 ; nr of tags in html $aStack[$aStack[0][0]][0] = $sOpening $aStack[$aStack[0][0]][1] = $i ElseIf $aThisTagsPositions[$i][1] = $sClosing Then ; a closing </tag> was found If Not $aStack[0][0] Or Not ($aStack[$aStack[0][0]][0] = $sOpening And $aThisTagsPositions[$i][1] = $sClosing) Then Return SetError(3, 0, "") ; Open/Close mismatch error Else ; pair detected (the reciprocal tag) ; now get coordinates of the 2 tags ; 1) extract this tag <tag ..... </tag> from the html to the array $aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]] = StringMid($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0], 1 + $aThisTagsPositions[$i][0] - $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0]) ; 2) remove that tag <tag ..... </tag> from the html $sHtml = StringLeft($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0] - 1) & StringMid($sHtml, $aThisTagsPositions[$i][0] + 1) ; 3) adjust the references to the new positions of remaining tags For $ii = $i To UBound($aThisTagsPositions) - 1 $aThisTagsPositions[$ii][0] -= StringLen($aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]]) Next $aStack[0][0] -= 1 ; nr of tags still in html EndIf EndIf Next If Not $aStack[0][0] Then ; all tags where parsed correctly $aTags[0] = $iNrOfThisTag Return SetError(0, $iNrOfThisTag, $aTags) ; OK Else Return SetError(2, 0, "") ; opening and closing tags are not balanced EndIf Else Return SetError(1, 0, "") ; there are no of such tags on this HTML page EndIf EndFunc ;==>_ParseTags ; #============================================================================= ; Name ..........: _HTML_Filter ; Description ...: Filter for strings ; AutoIt Version : V3.3.0.0 ; Syntax ........: _HTML_Filter(ByRef $sString[, $iMode = 0]) ; Parameter(s): .: $sString - String to filter ; $iMode - Optional: (Default = 0) : removes nothing ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return Value ..: Success - Filterd String ; Failure - Input String ; Author(s) .....: Thorsten Willert, Stephen Podhajecki {gehossafats at netmdc. com} _ConvertEntities ; Date ..........: Wed Jan 27 20:49:59 CET 2010 ; modified ......: by Chimp Removed a double "&nbsp;" entities declaration, ; replace it with char(160) instead of chr(32), ; declaration of the $aEntities array as Static instead of just Local ; ============================================================================== Func _HTML_Filter(ByRef $sString, $iMode = 0) If $iMode = 0 Then Return $sString ;16 simple HTML tag / entities converter If $iMode >= 16 And $iMode < 32 Then Static Local $aEntities[95][2] = [["&quot;", 34],["&amp;", 38],["&lt;", 60],["&gt;", 62],["&nbsp;", 160] _ ,["&iexcl;", 161],["&cent;", 162],["&pound;", 163],["&curren;", 164],["&yen;", 165],["&brvbar;", 166] _ ,["&sect;", 167],["&uml;", 168],["&copy;", 169],["&ordf;", 170],["&not;", 172],["&shy;", 173] _ ,["&reg;", 174],["&macr;", 175],["&deg;", 176],["&plusmn;", 177],["&sup2;", 178],["&sup3;", 179] _ ,["&acute;", 180],["&micro;", 181],["&para;", 182],["&middot;", 183],["&cedil;", 184],["&sup1;", 185] _ ,["&ordm;", 186],["&raquo;", 187],["&frac14;", 188],["&frac12;", 189],["&frac34;", 190],["&iquest;", 191] _ ,["&Agrave;", 192],["&Aacute;", 193],["&Atilde;", 195],["&Auml;", 196],["&Aring;", 197],["&AElig;", 198] _ ,["&Ccedil;", 199],["&Egrave;", 200],["&Eacute;", 201],["&Ecirc;", 202],["&Igrave;", 204],["&Iacute;", 205] _ ,["&Icirc;", 206],["&Iuml;", 207],["&ETH;", 208],["&Ntilde;", 209],["&Ograve;", 210],["&Oacute;", 211] _ ,["&Ocirc;", 212],["&Otilde;", 213],["&Ouml;", 214],["&times;", 215],["&Oslash;", 216],["&Ugrave;", 217] _ ,["&Uacute;", 218],["&Ucirc;", 219],["&Uuml;", 220],["&Yacute;", 221],["&THORN;", 222],["&szlig;", 223] _ ,["&agrave;", 224],["&aacute;", 225],["&acirc;", 226],["&atilde;", 227],["&auml;", 228],["&aring;", 229] _ ,["&aelig;", 230],["&ccedil;", 231],["&egrave;", 232],["&eacute;", 233],["&ecirc;", 234],["&euml;", 235] _ ,["&igrave;", 236],["&iacute;", 237],["&icirc;", 238],["&iuml;", 239],["&eth;", 240],["&ntilde;", 241] _ ,["&ograve;", 242],["&oacute;", 243],["&ocirc;", 244],["&otilde;", 245],["&ouml;", 246],["&divide;", 247] _ ,["&oslash;", 248],["&ugrave;", 249],["&uacute;", 250],["&ucirc;", 251],["&uuml;", 252],["&thorn;", 254]] $sString = StringRegExpReplace($sString, '(?i)<p.*?>', @CRLF & @CRLF) $sString = StringRegExpReplace($sString, '(?i)<br>', @CRLF) Local $iE = UBound($aEntities) - 1 For $x = 0 To $iE $sString = StringReplace($sString, $aEntities[$x][0], Chr($aEntities[$x][1]), 0, 2) Next For $x = 32 To 255 $sString = StringReplace($sString, "&#" & $x & ";", Chr($x)) Next $iMode -= 16 EndIf ;8 Tag filter If $iMode >= 8 And $iMode < 16 Then ;$sString = StringRegExpReplace($sString, '<script.*?>.*?</script>', "") $sString = StringRegExpReplace($sString, "<[^>]*>", "") $iMode -= 8 EndIf ; 4 remove all double cr, lf If $iMode >= 4 And $iMode < 8 Then $sString = StringRegExpReplace($sString, "([ \t]*[\n\r]+[ \t]*)", @CRLF) $sString = StringRegExpReplace($sString, "[\n\r]+", @CRLF) $iMode -= 4 EndIf ; 2 remove all double withespaces If $iMode = 2 Or $iMode = 3 Then $sString = StringRegExpReplace($sString, "[[:blank:]]+", " ") $sString = StringRegExpReplace($sString, "\n[[:blank:]]+", @CRLF) $sString = StringRegExpReplace($sString, "[[:blank:]]+\n", "") $iMode -= 2 EndIf ; 1 remove all non ASCII (remove all chars with ascii code > 127) If $iMode = 1 Then $sString = StringRegExpReplace($sString, "[^\x00-\x7F]", " ") EndIf Return $sString EndFunc ;==>_HTML_Filter Any error reports or suggestions for enhancements are welcome