[solved] extract data from website and export to Excel - (Moved)


Hi all,

I haven't used AutoIt in more than 10 years, and I am sure a lot has improved since then. I hope you can give me some suggestions on my approach.

Task: I need to extract user data (for around 1700 users) from a website tool. That tool shows an output in a table on the website. However, no export feature is available and I need the data in an Excel file, such as:

username, serial number (of a laptop), ID number (of laptop) and some more

 

With my knowledge from 2009 I would do this:

1) Use _IEextract with each username in the URL to get the full source code of the page containing the user's data summary

2) Use lots of regular expressions to extract each piece of data and save the results into variables or an array

3) Write the variable values into an Excel file

4) Rinse and repeat 1700 times

 

The relevant line for step 2 looks like this:

<td class="resultcell"><span class="new">2021-03-23 11:05:00</span></td><td class="resultcell">Hostname-1234</td><td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=serial%20numbers&criteria=123456">123456</a></td><td class="resultcell">0987654</td><td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=usernames&criteria=myusername">myusername</a>

and so on. So here it would be Hostname-1234, 0987654 and myusername that I would need to extract.
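For reference, the regular-expression step described above could look roughly like the sketch below. It is only an illustration run against the sample row from this post: the closing </td> on the username cell is an assumption (the sample is cut off there), and the real page may be structured differently.

```autoit
#include <StringConstants.au3>

; Sample row from the post. The final </td> is assumed, since the sample is truncated.
Local $sRow = '<td class="resultcell"><span class="new">2021-03-23 11:05:00</span></td>' & _
        '<td class="resultcell">Hostname-1234</td>' & _
        '<td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=serial%20numbers&criteria=123456">123456</a></td>' & _
        '<td class="resultcell">0987654</td>' & _
        '<td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=usernames&criteria=myusername">myusername</a></td>'

; Capture the inner HTML of every result cell (non-greedy; (?s) lets . match newlines).
Local $aCells = StringRegExp($sRow, '(?s)<td class="resultcell">(.*?)</td>', $STR_REGEXPARRAYGLOBALMATCH)
If @error Then
    ConsoleWrite("No result cells found" & @CRLF)
    Exit 1
EndIf

For $i = 0 To UBound($aCells) - 1
    ; Remove nested tags such as <span> or <a> so only the visible text remains.
    $aCells[$i] = StringRegExpReplace($aCells[$i], '<[^>]+>', '')
    $aCells[$i] = StringStripWS($aCells[$i], $STR_STRIPLEADING + $STR_STRIPTRAILING)
    ConsoleWrite($i & ": " & $aCells[$i] & @CRLF)
Next
```

Capturing whole cells and then stripping tags avoids writing one pattern per field, but it still depends on the class name and markup staying exactly as shown.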


Although this may work, it does not seem very efficient and would take a while to build, so I would be happy to hear about an alternative approach. Preferably one that needs no executables other than AutoIt itself, due to company policy.

Edited by Automania

Based on your description, I would think you could do something like --

  • Use _IENavigate to load the webpage for a given user
  • Use _IETableWriteToArray to retrieve the user data
  • Write the desired contents to an Excel spreadsheet using the Excel UDF
  • Rinse and repeat 😅
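A minimal sketch of those steps, assuming Internet Explorer and Excel are installed. The base URL, the two usernames, the table index (0) and the output file name are all placeholders made up for illustration, not details of the actual tool.

```autoit
#include <IE.au3>
#include <Excel.au3>

; Hypothetical inputs: replace with the real user list and URL pattern.
Local $aUsers[2] = ["user1", "user2"]
Local $sBaseUrl = "http://tool.example/?Search=Search&field=usernames&criteria="

Local $oIE = _IECreate("about:blank", 0, 0) ; 0, 0 = do not attach, keep the window hidden
Local $oExcel = _Excel_Open()
Local $oBook = _Excel_BookNew($oExcel, 1)
Local $iNextRow = 1

For $i = 0 To UBound($aUsers) - 1
    _IENavigate($oIE, $sBaseUrl & $aUsers[$i])

    ; Assumption: the results are in the first table on the page (index 0).
    Local $oTable = _IETableGetCollection($oIE, 0)
    If @error Then ContinueLoop

    ; True transposes the result so the array is [row][column].
    Local $aData = _IETableWriteToArray($oTable, True)
    If @error Then ContinueLoop

    ; Append this user's rows below whatever has been written so far.
    _Excel_RangeWrite($oBook, Default, $aData, "A" & $iNextRow)
    $iNextRow += UBound($aData)
Next

_Excel_BookSaveAs($oBook, @ScriptDir & "\users.xlsx", $xlWorkbookDefault, True)
_Excel_Close($oExcel)
_IEQuit($oIE)
```

If the table has a header row, it will be written once per user in this sketch. Skipping the first array row for every user after the first would avoid that.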

Another option would be to use InetRead to retrieve the raw HTML, but then you're back to parsing the contents manually.
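A rough sketch of the InetRead route, for comparison. The URL is a placeholder and UTF-8 is an assumed page encoding. Whether an internal tool that requires a login will answer a plain InetRead request is also something that would need testing.

```autoit
#include <InetConstants.au3>
#include <StringConstants.au3>

; Hypothetical URL: the real tool's address and query string will differ.
Local $sUrl = "http://tool.example/?Search=Search&field=usernames&criteria=myusername"

; InetRead returns binary data, so it has to be converted to a string.
Local $dData = InetRead($sUrl, $INET_FORCERELOAD)
If @error Then
    ConsoleWrite("Download failed for " & $sUrl & @CRLF)
    Exit 1
EndIf

Local $sHtml = BinaryToString($dData, $SB_UTF8) ; assumes the page is UTF-8 encoded
ConsoleWrite("Received " & StringLen($sHtml) & " characters of HTML" & @CRLF)
```

From here $sHtml is just text, so every field still has to be located with string functions or regular expressions.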

Can you tell us more about the website? Do they offer an API?

  • Moderators

Moved to the appropriate forum, as the Developer General Discussion forum very clearly states:

Quote

General development and scripting discussions.


Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums.

Moderation Team


Thanks for moving, sorry for posting this in the wrong area.

53 minutes ago, Danp2 said:

Based on your description, I would think you could do something like --

  • Use _IENavigate to load the webpage for a given user
  • Use _IETableWriteToArray to retrieve the user data
  • Write the desired contents to an Excel spreadsheet using the Excel UDF
  • Rinse and repeat 😅

Another option would be to use InetRead to retrieve the raw HTML, but then you're back to parsing the contents manually.

Can you tell us more about the website? Do they offer an API?

It's an internal tool. I am not certain, as I don't really know much about APIs, and I suppose that if there is one I wouldn't get access to it. The tool uses a form where I can enter a single username, and it then outputs the associated hardware data in a table. My knowledge only goes far enough to put the username directly into the URL instead of typing it into the form field, hence my initial idea of using IEextract. If there is anything more you'd like to know, I'll be happy to answer (as far as I can).

_IETableWriteToArray sounds very interesting, thank you! It will be interesting to find out whether the function can identify each cell entry. That would save me a lot of trial-and-error work with regular expressions!

 

Edit: I just tested _IETableWriteToArray with my use case. Works like a charm! Thank you so much, Danp2!!

Edited by Automania
