Jump to content

[solved] extract data from website and export to Excel - (Moved)


Recommended Posts

Hi all,

I haven't used AutoIt in more than 10 years and I am sure a lot has improved since that long time. I hope you can give me some suggestions on my approach.

Task: I need to extract user data (for around 1700 users) from a website tool. That tool shows an output in a table on the website. However, no export feature is available and I need the data in an Excel file, such as:

username, serial number (of a laptop), ID number (of laptop) and some more

 

With my knowledge from 2009 I would do this:

1) use _IEextract with each username in the url to get the whole source code of the website with the user's data summary

2) Work with lots of regexpressions to extract each data piece, save them into variables/array

3) Write variable values into an Excel file

4) rinse repeat 1700 times

 

The relevant line for step 3 looks like this:

<td class="resultcell"><span class="new">2021-03-23 11:05:00</span></td><td class="resultcell">Hostname-1234</td><td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=serial%20numbers&criteria=123456">123456</a></td><td class="resultcell">0987654/td><td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=usernames&criteria=myusername">myusername</a>

and so on.. so here it would be Hostname-1234, 0987654 and myusername that I would need to extract.


Although this may work it does not appear very efficient and would take a while. So I am happy for an alternate approach. Preferably, without using additional exe binary files due to company policies besides AutoIt itself.

Edited by Automania
Link to post
Share on other sites

Based on your description, I would think you could do something like --

  • Use _IENavigate to load the webpage for a given user
  • Use _IETableWriteToArray to retrieve the user data
  • Write the desired contents to an Excel spreadsheet using the Excel UDF
  • Rinse and repeat 😅

Another option would be to use InetRead to retrieve the raw HTML, but then you're back to parsing the contents manually.

Can you tell us more about the website? Do they offer an API?

Link to post
Share on other sites
  • Moderators

Moved to the appropriate forum, as the Developer General Discussion forum very clearly states:

Quote

General development and scripting discussions.


Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums.

Moderation Team

Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Link to post
Share on other sites
Posted (edited)

Thanks for moving, sorry for posting this in the wrong area.

53 minutes ago, Danp2 said:

Based on your description, I would think you could do something like --

  • Use _IENavigate to load the webpage for a given user
  • Use _IETableWriteToArray to retrieve the user data
  • Write the desired contents to an Excel spreadsheet using the Excel UDF
  • Rinse and repeat 😅

Another option would be to use InetRead to retrieve the raw HTML, but then you're back to parsing the contents manually.

Can you tell us more about the website? Do they offer an API?

It's an internal tool. I am not certain as I do not really have a clue about API. I suppose if they have one I wouldn't get access. It uses a form where I can put in single usernames and then it outputs associated hardware data in a table. My knowledge goes only so far that I can manipulate the url to input the username in the url already instead of putting it into the form field. Hence my initial idea of using IEextract. If there is anything more you'd like to know I'll be happy to answer (as far as I know).

_IETableWriteToArray sounds very interesting, thank you! Will be interesting to find out if the function is able to identify each cell entry. That would save me a lot of trial'n'error regexpression work!

 

edit: I just tested_IETableWriteToArray with my use case. Works like a charm! Thank you so much, Danp2!!

Edited by Automania
Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By DannyJ
      $sCommands1 = 'powershell.exe Get-ChildItem' $iPid = run($sCommands1   , @WorkingDir , @SW_SHOW , 0x2) $sOutput = ""  While 1     $sOutput &= StdoutRead($iPID)         If @error Then             ExitLoop         EndIf  WEnd ;~ msgbox(0, '' , $sOutput) ConsoleWrite("$sOutput") ConsoleWrite($sOutput) ConsoleWrite(@CRLF) $aOutput = stringsplit($sOutput ,@LF , 2) For $i=0 To  UBound($aOutput) - 1 Step 1     ConsoleWrite($aOutput[$i]) Next The script above reads the whole directory into a one dimensional array, but I need to work with the array, so I need to split the array into multiple dimensions.
      I have already read some forum answers here, and I have already tried these commands:
       
      Are there any way to use the $aOutput variable like in PowerShell:
      PowerShell:
      $a = Get-ChildItem $a.Mode I imagine this in AutoIt  $aOutput
      ConsoleWrite($aOutput[i].Mode) Or if I split this command into 2 dimension like:
      For $i To UBound($aOutput)-1 Step 1 ConsoleWrite($aOutput[$i][1]) ConsoleWrite($aOutput[$i][2]) Next  
    • By DannyJ
      If I run this code, it works perfectly
      $CmdPid = Run("C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -noexit " & 'Get-ChildItem',@DesktopDir, @SW_SHOW) But this code
      $CmdPid = Run("C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -noexit " & 'Get-RDUserSession',@DesktopDir, @SW_SHOW) I get this error:
      Get-RDUserSession : The term 'Get-RDUserSession' is not recognized as the name of a cmdlet, function, script file, or o perable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try aga in. If I try run the command Get-RDUserSession  in normal PowerShell (started from windows start menu) the command works perfectly.
      But If I run with AutoIt I get the above mentioned error .
      Any ideas?
    • By SEuBo
      Hi!
      I am just getting started with C and C++. I have created a pretty simple C code which is calling a dll function.
      When I compile and run, I get the appropriate Output. So it works fine.

       
      Now I would want to transform that to AutoIt. -> I would like to call the "RfcOpenConnection" function from AutoIt - but whatever I try with DLLCall, I can not get it to work. 
      Can someone point me in the right direction? DLL, C Sourcecode and compiled exe are attached too large to be attached, so they're uploaded here: 
      https://drive.google.com/file/d/12CUSsISl0mojiMCNxKjps1Sdoox3JlCX/view?usp=sharing
       
      Thanks a bunch!
    • By sudeepjd
      I am looking for a way to pull up a Child GUI Window that users can enter information into and return that information to the main for loop which is running off an array. However, I have been unable to do so because the For loop continues even though the child window is open. If I put in another while loop inside the child window function, I am not able to poll the windows for the events looking for the close button getting clicked.
      I have put together a simple test application that shows this. Any help with holding the main loop while the child window is open and returning when the Close button is clicked is appreciated.
      In the below example, the child window contains a single text box, however, on my main application the Child GUI is much more complex with multiple pieces of information being returned.
      #include <GUIConstantsEx.au3> #include <WindowsConstants.au3> #include <Array.au3> AutoItSetOption("GUIOnEventMode", 1) Global $hGUI, $hChild Global $childClose = False Global $childValue OpenMainGUI() While Sleep(100) ConsoleWrite("Main Loop") WEnd Func OpenMainGUI() ; Create a GUI with various controls. Local $hGUI = GUICreate("Example", 400, 100) $btnMain = GUICtrlCreateButton("Open", 10, 10, 100, 30) ; Display the GUI. GUISetState(@SW_SHOW, $hGUI) ; Events GUISetOnEvent($GUI_EVENT_CLOSE, "_ExitMain", $hGUI) GUICtrlSetOnEvent($btnMain, "createChildren") EndFunc ;==>Example Func createChildren() For $i = 1 to 5 createChild($i) ConsoleWrite($i & " - " & $childValue & @CRLF) Next EndFunc Func createChild($valInp) $childClose = False $hChild = GUICreate("Child GUI", 210, 72, -1, -1, -1, -1, $hGUI) $txtOperation = GUICtrlCreateInput($valInp, 10, 10, 100, 20) $btnCloseChild = GUICtrlCreateButton("Close", 10, 40, 100, 30) GUICtrlSetOnEvent($btnCloseChild, "_ExitChild") ; Display Child GUISetState(@SW_SHOW) ;Wait here till Close button is clicked While $childClose = False ConsoleWrite("Stuck in this loop..." & @LF) Sleep(100) $aMsg = GUIGetMsg(1) If $aMsg[0] >0 Then _ArrayDisplay($aMsg) Wend EndFunc Func _ExitMain() Exit EndFunc Func _ExitChild() ConsoleWrite("Exit Child Called") $childValue = GUICtrlRead($txtOperation) $childClose = True GUIDelete($hChild) EndFunc Thanks in advance for any help offered.
      My other alternative is to create a separate EXE for the child window and use ShellExecuteWait to wait for the child window to close before the loop continues, but I am hoping to avoid doing that.
    • By sudeepjd
      I am building an application which needs a child panel in the GUI Control that needs to be scrolled as it contains controls that extend beyond the panel height. But I also need to have the users to be able to Tab through those controls. I don't seem to be able to to both working together. 
      The Tabstops can be allowed on children by using $WS_EX_CONTROLPARENT and the Scrollbar creation using the GUIScrollBar.au3.
      If I set the $WS_EX_CONTROLPARENT it drags the entire child and does not allow the scroll, but if I remove it, the Scroll works but the tabstops don't. Please see the below sample application that can help reproduce this problem.
       
      #include <GUIConstants.au3> #include <GUIScrollbars.au3> #include <WindowsConstants.au3> Opt("GUIOnEventMode", 1) ; Change to OnEvent mode Global $width=500, $height=500, $titel="Tabtest in Childwindow" ; create the parentwindow $mainwindow = GUICreate($titel, $width, $height, -1, -1) GUISetBkColor(0x00ffff) ; show the parentwindow GUISetState(@SW_SHOW,$mainwindow) ; check on screen sleep(1000) ; create the childwindow with the scrollbars active ->TABSTOPS Dont work $childwindo1 = GUICreate("child", 220,$height, 10 ,0,$WS_CHILD, -1 ,$mainwindow) _GUIScrollBars_Init($childwindo1, 100, 100) GUISetBkColor(0xff0000, $childwindo1) GUISetState(@SW_SHOW, $childwindo1) $input_1 = GUICtrlCreateInput("Scroll Works",10,10) $input_2 = GUICtrlCreateInput("Tab Does Not",10,40) ; create the childwindow with the scrollbars active ->TABSTOPS Work, Scroll does not because window moves $childwindo2 = GUICreate("child", 220, $height, 240 ,0,$WS_CHILD, $WS_EX_CONTROLPARENT ,$mainwindow) _GUIScrollBars_Init($childwindo2, 100, 100) GUISetBkColor(0xff0000, $childwindo2) GUISetState(@SW_SHOW, $childwindo2) $input_3 = GUICtrlCreateInput("Tab Works",10,10) $input_4 = GUICtrlCreateInput("Scroll Does Not",10,40) GUIRegisterMsg($WM_VSCROLL, "WM_VSCROLL") ; register close GUISetOnEvent($GUI_EVENT_CLOSE, "close_it",$mainwindow) ;loop While 1 Sleep(100) ; Idle around WEnd Func close_it() ; exit application GUIDelete($mainwindow) exit EndFunc $childwindo = GUICreate("child",$width,$height,0,0,$WS_CHILD,-1,$mainwindow) GUISetBkColor(0xff0000) GUISetState(@SW_SHOW,$childwindo) Func WM_VSCROLL($hWnd, $iMsg, $wParam, $lParam) #forceref $iMsg, $wParam, $lParam Local $iScrollCode = BitAND($wParam, 0x0000FFFF) Local $iIndex = -1, $iCharY, $iPosY Local $iMin, $iMax, $iPage, $iPos, $iTrackPos For $x = 0 To UBound($__g_aSB_WindowInfo) - 1 If $__g_aSB_WindowInfo[$x][0] = $hWnd Then $iIndex = $x $iCharY = $__g_aSB_WindowInfo[$iIndex][3] ExitLoop EndIf Next If $iIndex = -1 Then Return 0 ; Get all the vertial scroll bar information Local $tSCROLLINFO = _GUIScrollBars_GetScrollInfoEx($hWnd, $SB_VERT) $iMin = DllStructGetData($tSCROLLINFO, "nMin") $iMax = DllStructGetData($tSCROLLINFO, "nMax") $iPage = DllStructGetData($tSCROLLINFO, "nPage") ; Save the position for comparison later on $iPosY = DllStructGetData($tSCROLLINFO, "nPos") $iPos = $iPosY $iTrackPos = DllStructGetData($tSCROLLINFO, "nTrackPos") Switch $iScrollCode Case $SB_TOP ; user clicked the HOME keyboard key DllStructSetData($tSCROLLINFO, "nPos", $iMin) Case $SB_BOTTOM ; user clicked the END keyboard key DllStructSetData($tSCROLLINFO, "nPos", $iMax) Case $SB_LINEUP ; user clicked the top arrow DllStructSetData($tSCROLLINFO, "nPos", $iPos - 1) Case $SB_LINEDOWN ; user clicked the bottom arrow DllStructSetData($tSCROLLINFO, "nPos", $iPos + 1) Case $SB_PAGEUP ; user clicked the scroll bar shaft above the scroll box DllStructSetData($tSCROLLINFO, "nPos", $iPos - $iPage) Case $SB_PAGEDOWN ; user clicked the scroll bar shaft below the scroll box DllStructSetData($tSCROLLINFO, "nPos", $iPos + $iPage) Case $SB_THUMBTRACK ; user dragged the scroll box DllStructSetData($tSCROLLINFO, "nPos", $iTrackPos) EndSwitch ; // Set the position and then retrieve it. Due to adjustments ; // by Windows it may not be the same as the value set. DllStructSetData($tSCROLLINFO, "fMask", $SIF_POS) _GUIScrollBars_SetScrollInfo($hWnd, $SB_VERT, $tSCROLLINFO) _GUIScrollBars_GetScrollInfo($hWnd, $SB_VERT, $tSCROLLINFO) ;// If the position has changed, scroll the window and update it $iPos = DllStructGetData($tSCROLLINFO, "nPos") If ($iPos <> $iPosY) Then _GUIScrollBars_ScrollWindow($hWnd, 0, $iCharY * ($iPosY - $iPos)) $iPosY = $iPos EndIf Return $GUI_RUNDEFMSG EndFunc ;==>WM_VSCROLL  
      Any help to get both working together is appreciated.
      Thanks,
      Sudeep.
×
×
  • Create New...