MaoMao Posted January 21, 2012 Posted January 21, 2012 (edited) Performing statistic for data from internet. The sources are format in tables.Need help to read some spacific columns and made summation calculation.Example: http://www.bts.gov/publications/national...rtation_statistics/html/table_Used to save html pages then read to excel then back to Autoit array for processing.However for large amount of data excel is too slow.Try to find direct way to read web page table data to array for calculation.How to read column 2009 to an array and calculate the sum. (i.e. the 9,671,776 vertically);$oIE = _IECreate($Url, 0, 0) ; Working but very slow ;$oIE = InetRead($Url) ; not getting the table?? ;$oIE = _INetGetSource($Url) ; fast but not getting the table $oIE = _INetGetSource($Url, 'True') ; fast but not getting the table Edited January 22, 2012 by MaoMao
water Posted January 21, 2012 Posted January 21, 2012 Please check functions _IETableGetCollection and _IETableWriteToArray found in the IE UDF which comes with AutoIt. My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
MaoMao Posted January 21, 2012 Author Posted January 21, 2012 Thanks. How come it is not displaying the table in array. Any where to verify and work on specific column and rows? #include <IE.au3> #include <Array.au3> $oIE = _IE_Example ("http://www.bts.gov/publications/national_transportation_statistics/html/table_01_37.html") $oTable = _IETableGetCollection ($oIE, 0) $aTableData = _IETableWriteToArray ($oTable) $iNumTables = @extended MsgBox(0, "Table Info", "There are " & $iNumTables & " tables on the page") _ArrayDisplay($aTableData)
water Posted January 21, 2012 Posted January 21, 2012 Replace _IE_Example with _IECreate and give it a try. My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
MaoMao Posted January 21, 2012 Author Posted January 21, 2012 OK. Get the table. For web site with multiple tables, it is only getting the top table. Any trick to make it collect the bottom table?
water Posted January 21, 2012 Posted January 21, 2012 Something like that: #include <IE.au3> #include <Array.au3> $oIE = _IECreate("http://www.bts.gov/publications/national_transportation_statistics/html/table_01_37.html", 0, 0) $oTable = _IETableGetCollection ($oIE) ; Get number of tables $iNumTables = @extended MsgBox(0, "Table Info", "There are " & $iNumTables & " tables on the page") $oTable = _IETableGetCollection ($oIE, $iNumTables-1) ; Get last table $aTableData = _IETableWriteToArray ($oTable) _ArrayDisplay($aTableData) My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
MaoMao Posted January 21, 2012 Author Posted January 21, 2012 Thank you. For large site data it seems the IETableGetCollection able to array display 230 columns. Are there maximum rows and columns the IETableGetCollection able to read adn display? (Help file seems didn't tell the limit) Need to read around 1600 rows. Are there also way to swap the rows and columns?
Robjong Posted January 21, 2012 Posted January 21, 2012 _IETableWriteToArray--------------------------------------------------------------------------------Reads the contents of a Table into an array.$o_object Object variable of an InternetExplorer.Application, Table object$f_transpose [optional] Boolean value specifying whether to swap the rows and columns in the output array
MaoMao Posted January 21, 2012 Author Posted January 21, 2012 (edited) Yes.The _IETableWriteToArray is able to write the table data to array. Seems there is 230 column limit when display.Trying to write to Excel using the following to verify and output result but seems not writing any into Excel._ExcelWriteSheetFromArray($oExcel, $aTableData, 1, 1, 0, 0) (error writing to first row to 250 cell)$oExcel = $aTableData ( didn't write any cell )$oExcel.Range("A1") = $aTableData ( write A1 cell Only)How to write the whole array to excel? Edited January 21, 2012 by MaoMao
MaoMao Posted January 21, 2012 Author Posted January 21, 2012 An early post by Water seem solve the problem. _ExcelWriteSheetFromArray($oExcel, $aTableData, 1, 1, 0, 0) and array display seems have difficulity writing cell with large columns. $aArray2=$oExcel.transpose($aArray) _ExcelWriteSheetFromArray($oExcel, $aTableData, 1, 1, 0, 0) and array display fine. Now trying to find way to calculate the sum of a column.
water Posted January 21, 2012 Posted January 21, 2012 If you want to write it to the end of each column calculate the last used row, add 1 and then write a formula into the cell using _ExcelWriteFormula.How to calculate the last row can be found My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
MaoMao Posted January 21, 2012 Author Posted January 21, 2012 (edited) Working with array may faster. But got problem in calculation. Got the Web value in Array : 108,326,000 95,585,808 90,883,640 90,880,000 86,986,500 Sum is only calculating the first few digits before the "," ( comma ) For $i= 1 to 10;UBound($aArray2) -1 $Sum = $aArray2[$i][3]+ $Sum Next MsgBox(0, "Table Info", "There are Sub Total " & $Sum & " sum on the page") Result = 108 + 95 + 90 +90 +86 = 469 Are there function to correct this? Edited January 22, 2012 by MaoMao
water Posted January 21, 2012 Posted January 21, 2012 (edited) Yes, remove the "," and then change the string to an integer. $a = Int(StringReplace("108,326,042", "," , "")) ConsoleWrite($a & @CRLF) Edited January 21, 2012 by water My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
MaoMao Posted January 22, 2012 Author Posted January 22, 2012 Good. Sum is calcualted. The _IECreate is very slow. Not sure why open with IE for the url. But FireFox open very fast on the computer. Tried alternative but not working. ;$oIE = _IECreate($Url, 0, 0) ; Working but very slow ;$oIE = InetRead($Url) ; not getting the table?? ;$oIE = _INetGetSource($Url) ; fast but not getting the table $oIE = _INetGetSource($Url, 'True') ; fast but not getting the table Any advice?
raziel2109 Posted September 24, 2012 Posted September 24, 2012 I guys, I need help for something related to your problems, as described in this post. My problem is that the array was fullfilled with informations in the table cells, but what it's in thata array is not what is in the html code but what i see in the page in IE. the difference is that in the IE Page the tabel cell contain "my pc is a..." and the html code contain "my pc is able to fly". What was in the array is the first one, "my pc is a..." but i need the second one, "my pc is able to fly"... Some help? I think this is caused by a javascript code in html page, limiting the lenght of what to display in a cell... but how can this be bypassed? thanks a lot...
guinness Posted September 24, 2012 Posted September 24, 2012 Stick to your own topic please. UDF List: _AdapterConnections() • _AlwaysRun() • _AppMon() • _AppMonEx() • _ArrayFilter/_ArrayReduce • _BinaryBin() • _CheckMsgBox() • _CmdLineRaw() • _ContextMenu() • _ConvertLHWebColor()/_ConvertSHWebColor() • _DesktopDimensions() • _DisplayPassword() • _DotNet_Load()/_DotNet_Unload() • _Fibonacci() • _FileCompare() • _FileCompareContents() • _FileNameByHandle() • _FilePrefix/SRE() • _FindInFile() • _GetBackgroundColor()/_SetBackgroundColor() • _GetConrolID() • _GetCtrlClass() • _GetDirectoryFormat() • _GetDriveMediaType() • _GetFilename()/_GetFilenameExt() • _GetHardwareID() • _GetIP() • _GetIP_Country() • _GetOSLanguage() • _GetSavedSource() • _GetStringSize() • _GetSystemPaths() • _GetURLImage() • _GIFImage() • _GoogleWeather() • _GUICtrlCreateGroup() • _GUICtrlListBox_CreateArray() • _GUICtrlListView_CreateArray() • _GUICtrlListView_SaveCSV() • _GUICtrlListView_SaveHTML() • _GUICtrlListView_SaveTxt() • _GUICtrlListView_SaveXML() • _GUICtrlMenu_Recent() • _GUICtrlMenu_SetItemImage() • _GUICtrlTreeView_CreateArray() • _GUIDisable() • _GUIImageList_SetIconFromHandle() • _GUIRegisterMsg() • _GUISetIcon() • _Icon_Clear()/_Icon_Set() • _IdleTime() • _InetGet() • _InetGetGUI() • _InetGetProgress() • _IPDetails() • _IsFileOlder() • _IsGUID() • _IsHex() • _IsPalindrome() • _IsRegKey() • _IsStringRegExp() • _IsSystemDrive() • _IsUPX() • _IsValidType() • _IsWebColor() • _Language() • _Log() • _MicrosoftInternetConnectivity() • _MSDNDataType() • _PathFull/GetRelative/Split() • _PathSplitEx() • _PrintFromArray() • _ProgressSetMarquee() • _ReDim() • _RockPaperScissors()/_RockPaperScissorsLizardSpock() • _ScrollingCredits • _SelfDelete() • _SelfRename() • _SelfUpdate() • _SendTo() • _ShellAll() • _ShellFile() • _ShellFolder() • _SingletonHWID() • _SingletonPID() • _Startup() • _StringCompact() • _StringIsValid() • _StringRegExpMetaCharacters() • _StringReplaceWholeWord() • _StringStripChars() • _Temperature() • _TrialPeriod() • _UKToUSDate()/_USToUKDate() • _WinAPI_Create_CTL_CODE() • _WinAPI_CreateGUID() • _WMIDateStringToDate()/_DateToWMIDateString() • Au3 script parsing • AutoIt Search • AutoIt3 Portable • AutoIt3WrapperToPragma • AutoItWinGetTitle()/AutoItWinSetTitle() • Coding • DirToHTML5 • FileInstallr • FileReadLastChars() • GeoIP database • GUI - Only Close Button • GUI Examples • GUICtrlDeleteImage() • GUICtrlGetBkColor() • GUICtrlGetStyle() • GUIEvents • GUIGetBkColor() • Int_Parse() & Int_TryParse() • IsISBN() • LockFile() • Mapping CtrlIDs • OOP in AutoIt • ParseHeadersToSciTE() • PasswordValid • PasteBin • Posts Per Day • PreExpand • Protect Globals • Queue() • Resource Update • ResourcesEx • SciTE Jump • Settings INI • SHELLHOOK • Shunting-Yard • Signature Creator • Stack() • Stopwatch() • StringAddLF()/StringStripLF() • StringEOLToCRLF() • VSCROLL • WM_COPYDATA • More Examples... Updated: 22/04/2018
DanielTyrkiel Posted September 24, 2012 Posted September 24, 2012 Have you tried requesting the excel file from the statistics office? If this is your country they ought to be able to help you... Just an optimistic thought regards Dan
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now