
[solved] extract data from website and export to Excel - (Moved)



Hi all,

I haven't used AutoIt in more than 10 years, and I am sure a lot has improved since then. I hope you can give me some suggestions on my approach.

Task: I need to extract user data (for around 1700 users) from a website tool. That tool shows an output in a table on the website. However, no export feature is available and I need the data in an Excel file, such as:

username, serial number (of a laptop), ID number (of laptop) and some more

 

With my knowledge from 2009 I would do this:

1) use _IEextract with each username in the url to get the whole source code of the website with the user's data summary

2) Use lots of regular expressions to extract each piece of data and save the results into variables or an array

3) Write the variable values into an Excel file

4) Rinse and repeat 1700 times

 

The relevant line for step 2 looks like this:

<td class="resultcell"><span class="new">2021-03-23 11:05:00</span></td><td class="resultcell">Hostname-1234</td><td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=serial%20numbers&criteria=123456">123456</a></td><td class="resultcell">0987654</td><td class="resultcell"><a href="?&Search=Search&result=summarized%20history&field=usernames&criteria=myusername">myusername</a>

And so on. Here it would be Hostname-1234, 0987654 and myusername that I need to extract.


Although this may work, it does not seem very efficient and would take a while, so I would welcome an alternative approach. Because of company policy, I would prefer one that needs no additional executables besides AutoIt itself.

Edited by Automania

Based on your description, I would think you could do something like --

  • Use _IENavigate to load the webpage for a given user
  • Use _IETableWriteToArray to retrieve the user data
  • Write the desired contents to an Excel spreadsheet using the Excel UDF
  • Rinse and repeat 😅
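A rough sketch of those steps, untested against your tool. The host name, the users.txt input file, the export.xlsx output file, and the assumption that the result table is the first table on the page are all placeholders of mine; only the query string is taken from the links in your sample row.

```autoit
#include <IE.au3>
#include <Excel.au3>

; Hypothetical values: adjust the host, table index, and file paths to the real tool.
Local Const $sBaseUrl = "http://intranet.example/tool?Search=Search&result=summarized%20history&field=usernames&criteria="

; users.txt is assumed to hold one username per line.
Local $aUsers = FileReadToArray(@ScriptDir & "\users.txt")
If @error Then
    MsgBox(16, "Error", "Could not read users.txt")
    Exit 1
EndIf

Local $oIE = _IECreate("about:blank", 0, 0) ; third parameter 0 = invisible window
Local $oExcel = _Excel_Open(False)          ; Excel hidden while the script runs
Local $oBook = _Excel_BookNew($oExcel)
Local $iRow = 1                             ; next free row in the worksheet

For $sUser In $aUsers
    _IENavigate($oIE, $sBaseUrl & $sUser)   ; waits for the page to finish loading

    ; Assumption: the result table is the first table (index 0) on the page.
    Local $oTable = _IETableGetCollection($oIE, 0)
    If @error Then ContinueLoop

    ; True transposes the result so that table rows end up in the first dimension.
    Local $aData = _IETableWriteToArray($oTable, True)
    If @error Then ContinueLoop

    ; Write the 2D array below whatever has been written so far.
    _Excel_RangeWrite($oBook, Default, $aData, "A" & $iRow)
    $iRow += UBound($aData)
Next

_Excel_BookSaveAs($oBook, @ScriptDir & "\export.xlsx")
_Excel_Close($oExcel)
_IEQuit($oIE)
```

If each page repeats a header row, you would probably want to skip it for every user after the first before writing to the sheet.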

Another option would be to use InetRead to retrieve the raw HTML, but then you're back to parsing the contents manually.
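To give an idea of what that manual parsing might look like, here is a minimal sketch that runs on a hard-coded sample row modelled on the one you posted (with shortened links), so it needs no network access. In practice $sHtml would come from something like BinaryToString(InetRead($sUrl)), where $sUrl is whatever address your tool uses.

```autoit
#include <StringConstants.au3>

; Sample row modelled on the one quoted in the first post (links shortened).
Local $sHtml = '<td class="resultcell">Hostname-1234</td>' & _
        '<td class="resultcell"><a href="?criteria=123456">123456</a></td>' & _
        '<td class="resultcell">0987654</td>'

; Capture the inner HTML of every result cell (non-greedy, all matches).
Local $aCells = StringRegExp($sHtml, '(?s)<td class="resultcell">(.*?)</td>', $STR_REGEXPARRAYGLOBALMATCH)
If @error Then
    ConsoleWrite("No cells found" & @CRLF)
    Exit 1
EndIf

For $sCell In $aCells
    ; Strip nested tags such as <a> or <span>, leaving only the cell text.
    ConsoleWrite(StringRegExpReplace($sCell, '<[^>]+>', '') & @CRLF)
Next
```

This kind of pattern is brittle if the markup changes, which is why reading the table through the IE UDF is usually less work.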

Can you tell us more about the website? Do they offer an API?


  • Moderators

Moved to the appropriate forum, as the Developer General Discussion forum very clearly states:

Quote

General development and scripting discussions.


Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums.

Moderation Team

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind



Thanks for moving, sorry for posting this in the wrong area.

53 minutes ago, Danp2 said:

Based on your description, I would think you could do something like --

  • Use _IENavigate to load the webpage for a given user
  • Use _IETableWriteToArray to retrieve the user data
  • Write the desired contents to an Excel spreadsheet using the Excel UDF
  • Rinse and repeat 😅

Another option would be to use InetRead to retrieve the raw HTML, but then you're back to parsing the contents manually.

Can you tell us more about the website? Do they offer an API?

It's an internal tool. I am not certain, as I don't really know much about APIs, and I suppose that if they have one I wouldn't get access. It uses a form where I can enter a single username, and it then outputs the associated hardware data in a table. My knowledge only goes far enough that I can put the username directly into the URL instead of typing it into the form field, hence my initial idea of using IEextract. If there is anything more you'd like to know, I'll be happy to answer (as far as I can).

_IETableWriteToArray sounds very interesting, thank you! I am curious to find out whether the function is able to identify each cell entry. That would save me a lot of trial-and-error regex work!

 

Edit: I just tested _IETableWriteToArray with my use case. Works like a charm! Thank you so much, Danp2!!

Edited by Automania
