Sign in to follow this  
Followers 0
skippynz

Autoit crashing while reading 10 millions rows from a database

10 posts in this topic

Hi all,

Its been a long time since I've posted as I can generally find the answers to my issues but I'm stuck on this one....

As noted in the title of this post - I'm trying to read in a database with over 10 million rows of data in it.  I get the following error in SciTE "AutoIt3.exe ended.rc:-1073741819" and this is the content of the "AutoIt v3 script has stopped working" message 

Problem signature:
  Problem Event Name:   APPCRASH
  Application Name: autoit3_x64.exe
  Application Version:  3.3.8.1
  Application Timestamp:    4f25bafd
  Fault Module Name:    autoit3_x64.exe
  Fault Module Version: 3.3.8.1
  Fault Module Timestamp:   4f25bafd
  Exception Code:   c0000005
  Exception Offset: 0000000000076268
  OS Version:   6.1.7601.2.1.0.256.4
  Locale ID:    5129
  Additional Information 1: 8398
  Additional Information 2: 83986c6c9a3edb8b92b4f434b9cdbd82
  Additional Information 3: f32f
  Additional Information 4: f32fff672878279da2fd64dbcbbaa64e

Read our privacy statement online:
  http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409

If the online privacy statement is not available, please read our privacy statement offline:
  C:\Windows\system32\en-US\erofflps.txt

I can read in 5 million rows ok but as soon as i try to read in the whole database the memory rises to over 1.5Gb and then the crash occurs

Here is my coding for the database read.

Func backup_database_contents()
    $backup_folder = FileSelectFolder('Please select the folder you would like to save the database backup to.', @DesktopDir, 7)
    $objquery = $objConn.Execute("SELECT column1,column2 from my_tbl")
    With $objquery
        $SQL_Result = .GetRows()
        ConsoleWrite($SQL_Result)
        If IsArray($SQL_Result) Then
            $iColumns = UBound($SQL_Result, 2)
            $iRows = UBound($SQL_Result)
            ReDim $SQL_Result[$iRows + 1][$iColumns];Adjust the array to fit the column names and move all data down 1 row
            For $x = $iRows To 1 Step -1
                For $y = 0 To $iColumns - 1
                    $SQL_Result[$x][$y] = $SQL_Result[$x - 1][$y]
                Next
            Next
;~          ;Add the coloumn names
            For $i = 0 To $iColumns - 1 ;get the column names and put into 0 array element
                $SQL_Result[0][$i] = .Fields($i).Name
            Next
        EndIf
    EndWith
    $rowcount = $iRows - 1
    $backup_file = $backup_folder & '\Database_Backup_' & @MDAY & '-' & @MON & '-' & @YEAR & '_' & @HOUR & @MIN & @SEC & '.txt'
    For $v = 1 To $rowcount
        FileWriteLine($backup_file, $SQL_Result[$v][0] & '*' & $SQL_Result[$v][1])
        GUICtrlSetData($progressbar, $v / $rowcount * 100)
    Next

    MsgBox(64, 'Backup complete', 'The database backup to the file "' & $backup_file & '" is complete.')
    GUICtrlSetData($progressbar, 0)
EndFunc   ;==>backup_database_contents

I have narrowed it down to this line - $SQL_Result = .GetRows() - but not sure if there is a way of fixing it so any help would be appreciated.

Cheers

Skip

Share this post


Link to post
Share on other sites



There must be limits in the arrays that can be created.  Why not just skip the 'middle man' of transporting all the data to an array, and instead go straight from the results object to the filewrite?

That will save a ton of execution time also.


IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites

i would if i knew how :)

any chance you can put me on the right track

Share this post


Link to post
Share on other sites

#4 ·  Posted (edited)

Maybe start by writing your column names on the first line, and then in the second loop, where you are writing to the array...change that to write the line in the file?

edit - unless it hangs on the .getrows()?

edit2 - if that is the case, can you pull the data twice, when you select the table? maybe pull half of it and then the next time pull the rest, by starting at row X?

Edited by nitekram

All by me:

"Sometimes you have to go back to where you started, to get to where you want to go." 

"Everybody catches up with everyone, eventually" 

"As you teach others, you are really teaching yourself."

From my dad

"Do not worry about yesterday, as the only thing that you can control is tomorrow."

 

WindowsError.gif

WIKI | Tabs; | Arrays; | Strings | Wiki Arrays | How to ask a Question | Forum Search | FAQ | Tutorials | Original FAQ | ONLINE HELP | UDF's Wiki | AutoIt PDF

AutoIt Snippets | Multple Guis | Interrupting a running function | Another Send

StringRegExp | StringRegExp Help | RegEXTester | REG TUTOR | Reg TUTOT 2

AutoItSetOption | Macros | AutoIt Snippets | Wrapper | Autoit  Docs

SCITE | SciteJump | BB | MyTopics | Programming | UDFs | AutoIt 123 | UDFs Form | UDF

Learning to script | Tutorials | Documentation | IE.AU3 | Games? | FreeSoftware | Path_Online | Core Language

Programming Tips

Excel Changes

ControlHover.UDF

GDI_Plus

Draw_On_Screen

GDI Basics

GDI_More_Basics

GDI Rotate

GDI Graph

GDI  CheckExistingItems

GDI Trajectory

Replace $ghGDIPDll with $__g_hGDIPDll

DLL 101?

Array via Object

GDI Swimlane

GDI Plus French 101 Site

GDI Examples UEZ

GDI Basic Clock

GDI Detection

Ternary operator

Share this post


Link to post
Share on other sites

How about doing a DUMP directly from the database instead of iterating through it with AutoIt?


Whenever someone says "pls" because it's shorter than "please", I say "no" because it's shorter than "yes".

Share this post


Link to post
Share on other sites

AutoIt arrays are limited to 16 million cells.

Now I've no idea about which SQL engine you're interfacing but there must exist a more robust backup solution than fetching raw data and storing it in a flat file.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

Here is a simple script to loop through results...

of course, you have to plug in the connection string, and query:

$sConnectionString = ""
$sSQLStatement = "select * from YOURTABLE"

$sSeperatorChar = "|"

$ado = ObjCreate("ADODB.Connection")
If @error = 1 Then
    MsgBox(1,1,"connection object failed")
    Exit
EndIf
$adors = ObjCreate("ADODB.RecordSet")
If @error = 1 Then
    MsgBox(1,1,"recordset object failed")
    Exit
EndIf

$ado.Open($sConnectionString)
$adors.Open($sSQLStatement, $ado)

ConsoleWrite($adors.Fields.item(0).name)
For $i = 1 To $adors.Fields.Count - 1
    ConsoleWrite($sSeperatorChar & $adors.Fields.item($i).name)
Next
ConsoleWrite(@CRLF)

While Not $adors.EOF
    ConsoleWrite($adors.Fields.item(0).value)
    For $i = 1 To $adors.Fields.Count - 1
        ConsoleWrite($sSeperatorChar & $adors.Fields.item($i).value)
    Next
    $adors.MoveNext
    ConsoleWrite(@CRLF)
WEnd

$adors.Close
$adors = 0

IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites

#8 ·  Posted (edited)

Thanks for the replies everyone.

 

Maybe start by writing your column names on the first line, and then in the second loop, where you are writing to the array...change that to write the line in the file?

edit - unless it hangs on the .getrows()?

edit2 - if that is the case, can you pull the data twice, when you select the table? maybe pull half of it and then the next time pull the rest, by starting at row X?

 

.getrows() is where the crash is happening.  Splitting the command into multiple chunks might be the only way that i can do it.  I will have to put the thinking cap on...

 

How about doing a DUMP directly from the database instead of iterating through it with AutoIt?

 

I dont control the database server, or the backups of it.  By using AutoIt i can import content from text files etc and the hope was to create a backup of the database before importing files.

 

AutoIt arrays are limited to 16 million cells.

Now I've no idea about which SQL engine you're interfacing but there must exist a more robust backup solution than fetching raw data and storing it in a flat file.

 

If arrays are limited to 16 million cells then that could explain the issue. i have over 10 million rows with 2 columns so that would equate to over 20 million cells i guess.....the database being used is MS SQL - ideally i would just setup a backup from the server but without access to it and the IT guys being less than friendly about our access i wanted another way of keeping our data safe...

Edited by skippynz

Share this post


Link to post
Share on other sites

#9 ·  Posted (edited)

Breaking up the query reminds me of a popular interview question... how do you get the second highest paid employee from a table.

"select top 1 from YOURTABLE where id in (select top 2 id from YOURTABLE order by pay desc) order by pay asc"

just need to change it up to your tables, and break outs.

Example of how to create your queries (where 2000050 is the return of a query that queries the count() of records on the table):

$igetvalue = 2000050

$iJumps = 500000
For $i = 500000 To $igetvalue Step 500000
    ConsoleWrite($i & @CRLF)
    If $igetvalue-$i < $iJumps Then
        $iJumps = $iJumps+$igetvalue-$i
        $iTotal = $igetvalue
    Else
        $iTotal = $i
    EndIf
    $sSQLStatement = "select top " & $iJumps & " from YOURTABLE where id in (select top " & $iTotal & " from YOURTABLE order by id asc) order by id desc"
    ConsoleWrite($sSQLStatement & @CRLF)
Next

outputs:

500000
select top 500000 from YOURTABLE where id in (select top 500000 from YOURTABLE order by id asc) order by id desc
1000000
select top 500000 from YOURTABLE where id in (select top 1000000 from YOURTABLE order by id asc) order by id desc
1500000
select top 500000 from YOURTABLE where id in (select top 1500000 from YOURTABLE order by id asc) order by id desc
2000000
select top 500050 from YOURTABLE where id in (select top 2000050 from YOURTABLE order by id asc) order by id desc

Edited by jdelaney

IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites

Thanks for that code Universalist - i need to work on the sql code but that should get me started...

cheers

Skip

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0