
Loading Large Table for lookup



I have a table that I'd like to look up things in from my script based on the input.  Searching this forum I found some old posts about _ExcelReadSheetToArray().  After not getting it to work, I realized that it is no longer in the UDF and _Excel_RangeRead is to be used instead.  I also read that _Excel_RangeFind is another method to look up data in a table.  That method seems to have Excel run the search, whereas _Excel_RangeRead loads the entire range as an array and AutoIt does the work.

I'm looking at loading a 30k-row CSV with 3 columns.  Does anyone know whether either of these methods is better with this amount of data?  Pros/cons?  I'm leaning towards _Excel_RangeFind so Excel can just run in the background and be the "database" vs. my script holding all that data in a massive array.  Or maybe there's a completely different method?
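For context, here is roughly what I mean by the two approaches, as a minimal sketch (the file name lookup.xlsx and the key-in-column-1 layout are just placeholders for my data):

#include <Excel.au3>
#include <Array.au3>

Local $oExcel = _Excel_Open(False)                       ; run Excel invisibly
Local $oBook = _Excel_BookOpen($oExcel, @ScriptDir & "\lookup.xlsx", True) ; open read-only

; Option 1: pull the whole used range into an AutoIt array and search it there
Local $aData = _Excel_RangeRead($oBook)                  ; one COM call, returns a 2D array
Local $sKey = "SomeValue", $sResult = ""
For $i = 0 To UBound($aData) - 1
    If $aData[$i][0] = $sKey Then
        $sResult = $aData[$i][2]                         ; value from the third column of the matching row
        ExitLoop
    EndIf
Next
ConsoleWrite("Array lookup: " & $sResult & @CRLF)

; Option 2: let Excel do the searching
Local $aFound = _Excel_RangeFind($oBook, $sKey)          ; returns a 2D array of matches
If Not @error Then _ArrayDisplay($aFound, "RangeFind matches")

_Excel_BookClose($oBook)
_Excel_Close($oExcel)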

Let me know your thoughts!


An alternative would be to load the data into an SQLite database only once and query it in SQL.
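For example, a minimal sketch of that idea, assuming sqlite3.dll is available for the SQLite UDF and using placeholder names (data.csv, lookup.db, a lookup table with columns key/col2/col3) plus a naive comma split that assumes no embedded commas:

#include <SQLite.au3>
#include <File.au3>

_SQLite_Startup()                                        ; needs sqlite3.dll in the search path
Local $hDB = _SQLite_Open(@ScriptDir & "\lookup.db")     ; file database, built once and reused

_SQLite_Exec($hDB, "CREATE TABLE IF NOT EXISTS lookup (key TEXT, col2 TEXT, col3 TEXT);")
_SQLite_Exec($hDB, "CREATE INDEX IF NOT EXISTS idx_key ON lookup(key);")

; One-time import of the CSV, wrapped in a transaction for speed
Local $aLines
_FileReadToArray(@ScriptDir & "\data.csv", $aLines)
_SQLite_Exec($hDB, "BEGIN;")
For $i = 1 To $aLines[0]
    Local $aField = StringSplit($aLines[$i], ",")
    If $aField[0] < 3 Then ContinueLoop
    _SQLite_Exec($hDB, "INSERT INTO lookup VALUES (" & _SQLite_Escape($aField[1]) & "," & _
            _SQLite_Escape($aField[2]) & "," & _SQLite_Escape($aField[3]) & ");")
Next
_SQLite_Exec($hDB, "COMMIT;")

; Every later lookup is a single indexed query
Local $aRow
_SQLite_QuerySingleRow($hDB, "SELECT col2, col3 FROM lookup WHERE key = " & _SQLite_Escape("SomeValue") & ";", $aRow)
If Not @error Then ConsoleWrite($aRow[0] & " / " & $aRow[1] & @CRLF)

_SQLite_Close($hDB)
_SQLite_Shutdown()

The import runs once inside a transaction, and every later lookup is a single indexed query instead of a scan over a 30k-element array.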




I often work with large volumes of spreadsheet data.

After many years I finally made the decision to move all data to SQL ASAP and then work in SQL. I retain the original spreadsheets for backup and reference purposes, but it is just so much easier to work with data in SQL. 

A typical workflow would be:

 

  • Prep the spreadsheets for import: change empty cells to a blank space or zero, unless I want to retain default NULLs.
  • Save spreadsheet as CSV.
  • Create a temp SQL table that looks like the data.
  • Load CSV into SQL.
  • Do quality control checks.
  • Proceed in SQL.

SQLite has several benefits, including dynamically typed data columns (similar to a spreadsheet) and support for CLI dot-command imports (.import).
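A rough sketch of that dot-import route driven from AutoIt, assuming sqlite3.exe is on the PATH and using placeholder names (data.csv, lookup.db, a table called lookup); in csv mode, .import builds the table from the CSV's header row when the table does not already exist:

#include <AutoItConstants.au3>

; Bulk-import the CSV with the CLI dot-commands, then row-count it as a quick QC check
Local $sCmd = 'sqlite3.exe -cmd ".mode csv" -cmd ".import data.csv lookup" lookup.db "SELECT COUNT(*) FROM lookup;"'
Local $iPID = Run($sCmd, @ScriptDir, @SW_HIDE, $STDOUT_CHILD)
ProcessWaitClose($iPID)
ConsoleWrite("Rows imported: " & StdoutRead($iPID) & @CRLF)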

SQLite Commands - SQL Docs

Extract, transform, load - Wikipedia

For your specific situation, SQL would provide an additional layer of data security, in the sense that casual, unintentional data modification or deletion would not be a major concern.



Either solve it in Excel or in au3. Why are both needed?

30k rows is not large for Excel or au3.

In general there is always a faster algorithm around. Sequential or indexed searches can help, but for 30k rows a sequential search will work fine.
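For example, a plain sequential lookup straight from the CSV in au3, with the file name and column layout assumed (key in column 1, result in column 3, no embedded commas):

#include <File.au3>

Local $aLines
_FileReadToArray(@ScriptDir & "\data.csv", $aLines)      ; $aLines[0] holds the line count

ConsoleWrite(_Lookup($aLines, "SomeValue") & @CRLF)

Func _Lookup(ByRef $aLines, $sKey)
    For $i = 1 To $aLines[0]
        Local $aField = StringSplit($aLines[$i], ",")    ; naive split: assumes no quoted fields
        If $aField[0] >= 3 And $aField[1] = $sKey Then Return $aField[3]
    Next
    Return SetError(1, 0, "")                            ; not found
EndFunc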

Is your input CSV or XLS?

If it is CSV, you could also deal with it in PowerShell.

What are your requirements and use cases?

 
