Fastest way to import data in SQLite DB


Good morning community! :)

I am working on a script that reads a text file (.txt) and imports all of its content into a SQLite3 DB, so that I can run queries that would be difficult to execute on a text file.
I am looking for something very fast, because the file could be very large (I don't know exactly how big it can become, but it is a log file, so it will have a lot of rows).
I found the "Import method", but I don't know if I can use it in a query ( @jchd, it's your turn! :D )
Do you know of any method I can implement in my script to import thousands and thousands of rows into a SQLite3 DB very quickly? :)

Thanks a lot :)

Francesco

Can you post an example of your input file format, its field types, along with the full DB schema?

Also, is it a one-time process or something that will run often?

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Good morning @jchd! :) I was hoping you would reply! :D

The process could run once a day or more, but the file could become very large...
Lines and lines of text are stored every day... And this could go on for years...

The .txt file has this format:
 

"Time_ms"   "MsgProc"   "StateAfter"    "MsgClass"  "MsgNumber" "Var1"  "Var2"  "Var3"  "Var4"  "Var5"  "Var6"  "Var7"  "Var8"  "TimeString"    "MsgText"   "PLC"   "Checksum"
42864651050.3009    1   1   3   70018                                   "2017-05-09 15:37:31"   "Importazione gestione utenti terminata senza errori."      rltVew

The fields are separated by tabs...

I also wrapped all the INSERTs in a BEGIN TRANSACTION at the start and a COMMIT at the end...
So, the script should run this:
BEGIN TRANSACTION;
     INSERT 1...
     INSERT 2...
     INSERT 3...
     INSERT N...
COMMIT;

This should run faster, shouldn't it? :)

Thank you dear!

Francesco

EDIT:

I read the .txt file with _FileReadToArray, then, in a For...Next loop, I remove the tabs, insert some ";" separators, and split the text in order to build each INSERT...
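The approach described here (read the file, split each line, run every INSERT inside one transaction) could look roughly like the following. This is only a minimal sketch: the file names, the table name "Alarms" and the helper _RowToValues are hypothetical, and the table is assumed to exist already with as many columns as the file has fields. Every field is inserted as text, and empty fields become empty strings rather than NULL.

```autoit
#include <SQLite.au3>
#include <StringConstants.au3>

; Hypothetical paths and table name, chosen for illustration only.
Global Const $sLogFile = @ScriptDir & "\mydata.txt"
Global Const $sDbFile = @ScriptDir & "\mydatabase.db"

_SQLite_Startup()
If @error Then Exit MsgBox(16, "SQLite", "sqlite3.dll could not be loaded")
Global $hDb = _SQLite_Open($sDbFile)
If @error Then Exit MsgBox(16, "SQLite", "Cannot open the database")

; Assumption: the table "Alarms" already exists with one column per field.
Global $aLines = FileReadToArray($sLogFile)
If @error Then Exit MsgBox(16, "Import", "Cannot read " & $sLogFile)

_SQLite_Exec($hDb, "BEGIN;") ; one transaction around all the INSERTs
For $i = 1 To UBound($aLines) - 1 ; index 0 holds the column names
    If StringStripWS($aLines[$i], $STR_STRIPALL) = "" Then ContinueLoop ; skip blank lines
    _SQLite_Exec($hDb, "INSERT INTO Alarms VALUES (" & _RowToValues($aLines[$i]) & ");")
Next
_SQLite_Exec($hDb, "COMMIT;")

_SQLite_Close($hDb)
_SQLite_Shutdown()

; Hypothetical helper: turns one tab-separated line into a list of SQL string literals.
Func _RowToValues($sLine)
    Local $aFields = StringSplit($sLine, @TAB, $STR_NOCOUNT)
    Local $sValues = ""
    For $j = 0 To UBound($aFields) - 1
        ; Strip double quotes surrounding the field, then quote it for SQL.
        $sValues &= "," & _SQLite_FastEscape(StringRegExpReplace($aFields[$j], '^"(.*)"$', '$1'))
    Next
    Return StringTrimLeft($sValues, 1) ; drop the leading comma
EndFunc
```

Without the BEGIN/COMMIT pair, SQLite commits after every single INSERT, which is usually what makes large imports slow.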
 

16 minutes ago, AspirinJunkie said:

sqlite3.exe has an .import command that works very fast (more explanation).
For this you need the data in CSV syntax.
If your data does not meet this requirement, you can use AutoIt to convert it into CSV style.
Then you can import the data using the _SQLite_SQLiteExe() function.

Thanks for your reply @AspirinJunkie:)

Is this method faster than the normal INSERT with transactions?

And can I use .import in a query executed with _SQLite_Exec, or do I have to do it through CMD?

Thank you :) 

47 minutes ago, FrancescoDiMuro said:

Is this method faster than the normal INSERT with transactions?

My last tests with this were a long time ago, but as far as I remember, .import was much faster than INSERT statements, even inside a transaction,
especially with a large number of inserts.

47 minutes ago, FrancescoDiMuro said:

And can I use .import in a query executed with _SQLite_Exec, or do I have to do it through CMD?

_SQLite_SQLiteExe() is a wrapper for sqlite3.exe. It doesn't use sqlite3.dll like the other _SQLite_* functions do.
So make sure that the function can find a proper sqlite3.exe; it can then do everything that sqlite3.exe can do.

@AspirinJunkie, thanks for the reply :)

But now, a question arises...

How can I, in the fastest way possible, replace @TAB with ";" and put double quotes around the fields of the text file, so that SQLite3's .import recognizes them? :)
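One way to do the conversion asked about here is a small helper that splits each line on @TAB and rejoins the fields with ";" and double quotes. This is only a sketch: _TabLineToCsv is a hypothetical name, and it assumes that no field contains a tab as data.

```autoit
#include <StringConstants.au3>

; Hypothetical helper: turns one tab-separated line into a semicolon-separated
; line with every field wrapped in double quotes.
Func _TabLineToCsv($sLine)
    Local $aFields = StringSplit($sLine, @TAB, $STR_NOCOUNT)
    Local $sCsv = ""
    For $i = 0 To UBound($aFields) - 1
        ; Drop double quotes that already surround the field.
        Local $sField = StringRegExpReplace($aFields[$i], '^"(.*)"$', '$1')
        ; CSV rule: double quotes inside a field are doubled.
        $sField = StringReplace($sField, '"', '""')
        $sCsv &= ';"' & $sField & '"'
    Next
    Return StringTrimLeft($sCsv, 1) ; drop the leading separator
EndFunc

ConsoleWrite(_TabLineToCsv('70018' & @TAB & '"2017-05-09 15:37:31"' & @TAB & 'rltVew') & @CRLF)
; prints "70018";"2017-05-09 15:37:31";"rltVew"
```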

Thank you! :) 

Why do you want to replace the @TABs? As shown in the example for _SQLite_SQLiteExe, there is a .separator command with which you can define that the separator for the data fields is a @TAB instead of a ;

Then the double quotes should only be necessary if a field contains a @TAB as data.

@AspirinJunkie,

Aren't the double quotes needed for sqlite3.exe to recognize the file as CSV?

Example:

; This should be the format of a CSV:

"Data1";"Data2";"Data3";"DataN";     ; Keyboard and Regional Settings are Italian

If my data doesn't have those double quotes, how can I make sqlite3.exe recognize the CSV format?

Thank you :) 

If you set the separator to @TAB, then you only need double quotes around a data field if the data field contains tabs as data.
If the file structure is really as you described (@TAB as separator, column names in the first row), then the import for this file is quite simple:

#include <SQLite.au3>

Global $s_InputFile_Path = @ScriptDir & "\mydata.txt"
Global $s_DataBase_Path = @ScriptDir & "\mydatabase.db"
Global $s_Out

_SQLite_SQLiteExe($s_DataBase_Path, _
    ".separator \t" & @CRLF & _
    ".import '" & $s_InputFile_Path & "' testtable" & @CRLF _
    , $s_Out)

Just put a valid sqlite3.exe (maybe from here) in the directory of your script.
The script should then create a new database with a new table named "testtable" filled with your data from "mydata.txt".
You can remove all double quotes, because it seems that your data fields don't contain the separator as data.

I just tried this directly in the sqlite3.exe, and it returned this message:

Quote

Error: multi-character column separators not allowed for import

Should the .import command create the table, or do I have to create it before the import?

Thanks :) 

I don't know exactly what you did.
I gave you an example. Did you try it?

35 minutes ago, FrancescoDiMuro said:

Should the .import command create the table, or do I have to create it before the import?

It's your choice.
If the table already exists when you use the .import command, then you have to delete the first row of the file, because otherwise it would be treated as a data row and not as the column names.
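Deleting the header row can be done in AutoIt before calling .import. The following is a minimal sketch: the helper _DropFirstLine and the two file names are hypothetical, and file encoding is not handled explicitly.

```autoit
; Hypothetical helper: returns the text without its first line.
Func _DropFirstLine($sText)
    ; \A anchors at the very start, so only the first line and its line break are removed.
    Return StringRegExpReplace($sText, "\A[^\r\n]*(?:\r\n|\r|\n)?", "")
EndFunc

; Hypothetical file names, for illustration only.
Global $sIn = @ScriptDir & "\mydata.txt"
Global $sOut = @ScriptDir & "\mydata_noheader.txt"

FileDelete($sOut) ; FileWrite with a file name appends to an existing file
FileWrite($sOut, _DropFirstLine(FileRead($sIn)))
```

The file without the header can then be passed to .import for a table that was created beforehand.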

Yes, I tested your code, and it does nothing 'cause it can't find the "sqlite3.exe", even if I set it in the function:

_SQLite_SQLiteExe($sUser_DB, _
        ".separator \t" & @CRLF & _
        ".import '" & $sUser_TXT_File & "' ALLARMI" & @CRLF, _
        $sOutput, _
        @ScriptDir & "\SQLite\sqlite3.exe")

I also tested this directly in sqlite3.exe, and it returns the error I posted in post #13...

What am I doing wrong?

Thanks :) 

19 minutes ago, FrancescoDiMuro said:

it does nothing 'cause it can't find the "sqlite3.exe"

How do you know that's the reason?
Because of the @error value?

Did you try downloading sqlite3.exe from the source I gave you and copying it directly into the script directory?

You can significantly speed up the process by using bulk inserts, e.g. grouping dozens of inserts into one statement:

insert into T values (<values for fields of row1>), (<values for fields of row2>), (<values for fields of row3>), ..., (<values for fields of rowN>);
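Such a multi-row statement can be built from the lines of the input file. The sketch below is illustrative only: _BuildBulkInsert is a hypothetical helper, every field is inserted as text, and the multi-row VALUES syntax needs SQLite 3.7.11 or later, so an older sqlite3.dll would reject it.

```autoit
#include <SQLite.au3>
#include <StringConstants.au3>

; Hypothetical helper: builds one multi-row INSERT for elements $iFirst..$iLast
; of $aLines, where each element is one tab-separated line.
Func _BuildBulkInsert($sTable, ByRef $aLines, $iFirst, $iLast)
    Local $sSql = ""
    For $i = $iFirst To $iLast
        Local $aFields = StringSplit($aLines[$i], @TAB, $STR_NOCOUNT)
        Local $sRow = ""
        For $j = 0 To UBound($aFields) - 1
            $sRow &= "," & _SQLite_FastEscape($aFields[$j]) ; quoted SQL literal
        Next
        $sSql &= ",(" & StringTrimLeft($sRow, 1) & ")"
    Next
    Return "INSERT INTO " & $sTable & " VALUES " & StringTrimLeft($sSql, 1) & ";"
EndFunc

Global $aDemo[2] = ["1" & @TAB & "a", "2" & @TAB & "it's"]
ConsoleWrite(_BuildBulkInsert("T", $aDemo, 0, 1) & @CRLF)
; prints INSERT INTO T VALUES ('1','a'),('2','it''s');
```

Each generated statement would be passed to _SQLite_Exec, ideally in batches of a few dozen rows and still inside a single BEGIN/COMMIT pair.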

Finally you can use the SQLite3 CLI (command line interpreter) like so:

C:\Users\jc\Documents\AutoMAT\tmp>sqlite3 testimport.sq3
SQLite version 3.18.0 2017-03-28 18:48:43
Enter ".help" for usage hints.
sqlite> .separator \t
sqlite> .import tstin.dsv ImportedData
Error: cannot open "tstin.dsv"
sqlite> .import tstin.csv ImportedData
sqlite> select count(*) from ImportedData;
11183
sqlite> select * from ImportedData limit 5;
42864651050.3009        1       1       3       70018
                2017-05-09 15:37:31     Importazione gestione utenti terminata senza errori.
rltVew
42864651050.3009        1       1       3       70018
                2017-05-09 15:37:31     Importazione gestione utenti terminata senza errori.
rltVew
42864651050.3009        1       1       3       70018
                2017-05-09 15:37:31     Importazione gestione utenti terminata senza errori.
rltVew
42864651050.3009        1       1       3       70018
                2017-05-09 15:37:31     Importazione gestione utenti terminata senza errori.
rltVew
42864651050.3009        1       1       3       70018
                2017-05-09 15:37:31     Importazione gestione utenti terminata senza errori.
rltVew
sqlite> .quit

C:\Users\jc\Documents\AutoMAT\tmp>

Sorry for being late to the party: I started composing the answer much earlier but had to move urgently.

20 minutes ago, AspirinJunkie said:

How do you know that's the reason?
Because of the @error value?

Did you try downloading sqlite3.exe from the source I gave you and copying it directly into the script directory?

I downloaded your sqlite3.exe even though I already had one, and put it in @ScriptDir...
I error-checked the function, and it returns @error = 2, which means that it can't find sqlite3.exe (according to the help file).

I don't know why...

Thanks for your help :) 

17 minutes ago, jchd said:

You can significantly speed up the process by using bulk inserts, e.g. grouping dozens of inserts into one statement:

insert into T values (<values for fields of row1>), (<values for fields of row2>), (<values for fields of row3>), ..., (<values for fields of rowN>);

I'm not following you, sorry...

EDIT:

I'll try to understand you...
Is it better to do something like:

"INSERT INTO Sample ( columns ) VALUES ( values of columns, a dozen )"

than a For...Next loop with N inserts? N = number of rows in the text file - 1.

Thanks @jchd :) 

1 hour ago, FrancescoDiMuro said:

I downloaded your sqlite3.exe even though I already had one, and put it in @ScriptDir...
I error-checked the function, and it returns @error = 2, which means that it can't find sqlite3.exe (according to the help file).

That needs to be analysed, to make sure it's not a bug.
Put sqlite3.exe in the same directory as the script (no subfolder or anything else).
Then change only the value of $s_InputFile_Path and leave everything else in the script as is.
Then run the script, post the output of the MsgBox, and tell us whether mydatabase.db is correctly created and filled:
 

#include <SQLite.au3>

Global $s_InputFile_Path = @ScriptDir & "\mydata.txt"
Global $s_DataBase_Path = @ScriptDir & "\mydatabase.db"
Global $s_SQLITE3EXE_Path = @ScriptDir & "\sqlite3.exe"
Global $s_Out

MsgBox(0,"", StringFormat("$s_SQLITE3EXE_Path = %s\nFileExists = %s\nWorkingDir = %s" , $s_SQLITE3EXE_Path, FileExists($s_SQLITE3EXE_Path), @WorkingDir))

_SQLite_SQLiteExe($s_DataBase_Path, _
        ".mode tabs" & @CRLF & _
        ".import '" & $s_InputFile_Path & "' testtable" & @CRLF _
        , $s_Out, _
        $s_SQLITE3EXE_Path)
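Whether the database was filled can also be checked from AutoIt by counting the imported rows through the DLL-based functions. This is a minimal sketch that assumes the database and table names from the script above and a sqlite3.dll that _SQLite_Startup can locate.

```autoit
#include <SQLite.au3>

_SQLite_Startup()
If @error Then Exit MsgBox(16, "SQLite", "sqlite3.dll could not be loaded")
Global $hDb = _SQLite_Open(@ScriptDir & "\mydatabase.db")
If @error Then Exit MsgBox(16, "SQLite", "Cannot open the database")

; Count the rows that .import wrote into the table.
Global $aRow
If _SQLite_QuerySingleRow($hDb, "SELECT count(*) FROM testtable;", $aRow) = $SQLITE_OK Then
    MsgBox(0, "Import check", "Rows in testtable: " & $aRow[0])
Else
    MsgBox(16, "Import check", "Query failed: " & _SQLite_ErrMsg($hDb))
EndIf

_SQLite_Close($hDb)
_SQLite_Shutdown()
```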

 
