BryanWall Posted January 17, 2024 (Author)

@jchd Aside from not wiping out measurements, I ended up with something similar. The update lines are:

```autoit
;Update first chunk with #ofsamples and bytes per row
$HexInj = Hex($Samples) & $BytePerRow
$sql = "Update measurement " & _
       "set samples=X'" & $HexInj & "' " & _
       "where drive_link_id = " & $Drive_link_id & " and chunk_num = " & $Chunk & " "
If _SQLite_Exec($DBhndl, $sql) <> $SQLITE_OK Then
    wLog("Error running: " & $sql)
    Exit
EndIf

For $x = 0 To $TCLineCount
    ;read TC line and convert text to hex
    ;Update blob with hex converted line
    $sql = "Update measurement " & _
           "set samples=samples || X'" & $HexInj & "' " & _
           "where drive_link_id = " & $Drive_link_id & " and chunk_num = " & $Chunk & " "
    If _SQLite_Exec($DBhndl, $sql) <> $SQLITE_OK Then
        wLog("Error. Loop:" & $x & ", SQL: " & $sql)
    EndIf
    ;Logic to change chunk and stuff when needed
Next
```

The main issue I'm having now is that it is SSLLLOOOWWWW. After 25 minutes, I was only 4.26% through updating data for one file, and I have two more to go. So while it appears to be working, my next step is to go through the links @zedna was kind enough to provide for doing large blob updates, hoping I can understand enough to apply them in AutoIt.
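The post's code is AutoIt, but the slowness is a SQLite pattern, not a language one: each per-line UPDATE runs in its own implicit transaction and rewrites the growing blob. A minimal Python sketch of the alternative (table and column names taken from the post; the sample data is invented) accumulates the chunk's bytes in memory and issues one parameterized UPDATE per chunk inside a single transaction:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Schema names as used in the thread; column types are assumptions.
con.execute("CREATE TABLE measurement (drive_link_id INTEGER, chunk_num INTEGER, samples BLOB)")
con.execute("INSERT INTO measurement VALUES (1, 0, X'')")

# Stand-ins for the converted text lines.
rows = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]

# Accumulate the whole chunk in memory instead of appending row by row in SQL.
chunk = bytearray()
for r in rows:
    chunk += r

with con:  # one transaction for the whole chunk, not one per source line
    con.execute(
        "UPDATE measurement SET samples = ? WHERE drive_link_id = ? AND chunk_num = ?",
        (bytes(chunk), 1, 0),
    )

blob = con.execute("SELECT samples FROM measurement").fetchone()[0]
```

The same idea carries over to AutoIt: build the chunk first, then touch the row once.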
jchd Posted January 17, 2024

If you're inserting bulk data, use large transactions and multi-row inserts (like in my insert example). Using bindings from AutoIt won't make it faster; the bottleneck is in using DllCalls. What are the volume and precise format of your data source?
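jchd's advice about large transactions and batched inserts can be sketched language-neutrally in Python (the table and data here are illustrative, not from the thread): wrap the whole bulk load in one explicit transaction and let a prepared statement run over all the rows, rather than paying a commit per INSERT.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER, val BLOB)")

# 10,000 illustrative rows.
data = [(i, bytes([i % 256])) for i in range(10_000)]

with con:  # one large transaction instead of 10,000 implicit ones
    # executemany reuses one prepared statement for every row; a multi-row
    # "INSERT ... VALUES (...), (...), ..." achieves a similar effect with
    # fewer statement executions.
    con.executemany("INSERT INTO t VALUES (?, ?)", data)

count = con.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```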
BryanWall Posted January 17, 2024 (Author)

@jchd Basic values: one file has 1,576,991 text rows, for example. Some of those columns are converted to hex and stored in blobs with a max "chunk" size of 8,388,648 bytes, or 16,777,296 hex characters.

I had code that "worked" using an update line like `Set samples = samples || X'hexdata'`, but that was 1) insanely slow and 2) didn't technically work, because even though the result looked the same, the || apparently adds something that caused problems. I'm guessing it adds a CR- or LF-type character, but I'm not sure.

What I ended up doing that is working: I compile one long hex string, write it to a file every 1000 loops to avoid the huge slowdown that `$hex = $hex & $newHex` causes, then read that file directly into the update statement, and magically it works. I thought from someone's prior post that it wouldn't work because of the 1 million SQL size limit, but it does. Relevant code I ended up with:

```autoit
;delete hex temp file if it exists
FileDelete($ProjDir & "\DR\339-" & $Chunk & ".txt")

;Loop for each line in _TC.txt
For $x = 0 To $TCLineCount
    ;read .txt
    ;convert txt to hex
    $HexInj = $HexInj & $Hexline

    ;occasionally output $HexInj to file and clear it
    If StringRight($x, 3) = "000" Then
        FileWrite($ProjDir & "\DR\339-" & $Chunk & ".txt", $HexInj)
        $HexInj = ""
    EndIf

    ;if end of chunk
    If $x = ($Samples * ($Chunk + 1)) Then
        wLog("end of chunk")
        FileWrite($ProjDir & "\DR\339-" & $Chunk & ".txt", $HexInj)
        wLog("file written")
        $sql = "Update measurement " & _
               "set samples = x'" & FileRead($ProjDir & "\DR\339-" & $Chunk & ".txt") & "' " & _
               "where drive_link_id = " & $Drive_link_id & " and chunk_num = " & $Chunk & " "
        If _SQLite_Exec($DBhndl, $sql) <> $SQLITE_OK Then
            wLog("Error running: " & $sql)
        Else
            wLog("Rows written")
            ;delete hex temp file
            FileDelete($ProjDir & "\DR\339-" & $Chunk & ".txt")
        EndIf
    EndIf
    ;more stuff to transition to next chunk
Next
```

Log:

1/17/2024 5:22:52 PM Start
1/17/2024 5:23:19 PM end of chunk
1/17/2024 5:23:19 PM file written
1/17/2024 5:23:19 PM Rows written

So far it looks like it only takes ~30 seconds per chunk. I'll play around with some things to see if I can speed it up: smaller write parts, and changing things so it isn't doing math on each loop, since I assume that will be quicker.
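The approach above embeds a ~16 MB hex literal in the SQL text. A bound blob parameter carries the same data at half the size (no hex doubling) and never passes through the SQL statement parser at all, which is why statement-length limits stop mattering. A Python sketch of the difference (schema names from the thread; the 8 MiB payload is illustrative, close to the post's 8,388,648-byte chunk size):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE measurement (drive_link_id INTEGER, chunk_num INTEGER, samples BLOB)")
con.execute("INSERT INTO measurement VALUES (1, 0, NULL)")

# 8 MiB of raw chunk data. As an X'..' literal this would be ~16 MiB of SQL
# text; as a bound parameter it travels as raw bytes.
payload = bytes(range(256)) * (8 * 1024 * 1024 // 256)

con.execute(
    "UPDATE measurement SET samples = ? WHERE drive_link_id = ? AND chunk_num = ?",
    (payload, 1, 0),
)
size = con.execute("SELECT length(samples) FROM measurement").fetchone()[0]
```

In AutoIt the binding goes through DllCall (the overhead jchd mentions), but for a handful of multi-megabyte chunks per file that cost is paid once per chunk, not once per line.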
jchd Posted January 18, 2024

Sorry for being slow/heavy, but I don't get it. You say you have files with about 1.5 million lines of text. Good.

What is the text encoding: some ANSI codepage, UTF-8, UTF-16, ...?
What is the typical/maximal byte size of a single line of text?
Why do you convert to hex and store as a blob some part of the text in each line? Why not just store the text as a string?

1 hour ago, BryanWall said:
"Set samples = samples || X'hexdata'"

Above you ask to concatenate two strings: the previous content of the column samples converted to text, and a hex blob which also has to be converted to a string for || to operate. The result is of course a string, probably not what you expect. Forcing SQLite to convert largish blobs to strings wastes CPU cycles and can pose encoding issues.

About size limits: https://www.sqlite.org/limits.html

Quote:

Maximum length of a string or BLOB

The maximum number of bytes in a string or BLOB in SQLite is defined by the preprocessor macro SQLITE_MAX_LENGTH. The default value of this macro is 1 billion (1 thousand million or 1,000,000,000). You can raise or lower this value at compile-time using a command-line option like this: -DSQLITE_MAX_LENGTH=123456789

The current implementation will only support a string or BLOB length up to 2^31-1 or 2147483647. And some built-in functions such as hex() might fail well before that point. In security-sensitive applications it is best not to try to increase the maximum string and blob length. In fact, you might do well to lower the maximum string and blob length to something more in the range of a few million if that is possible.

During part of SQLite's INSERT and SELECT processing, the complete content of each row in the database is encoded as a single BLOB. So the SQLITE_MAX_LENGTH parameter also determines the maximum number of bytes in a row.

The maximum string or BLOB length can be lowered at run-time using the sqlite3_limit(db, SQLITE_LIMIT_LENGTH, size) interface.

Also, when you have a column expected to hold long (byte-size) values, it's preferable to declare that column last, so that other columns holding short values can be accessed without having to load a huge amount of data.
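jchd's point about || can be checked directly: SQLite's || is string concatenation, so joining two blob literals yields TEXT, while a bound bytes value keeps its BLOB type. A small Python sketch (illustrative values only):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# X'..' || X'..' goes through string concatenation, so the result is TEXT,
# not BLOB -- the implicit conversion jchd describes.
concat_type = con.execute("SELECT typeof(X'41' || X'42')").fetchone()[0]

# A bound bytes parameter stays a BLOB, no conversion involved.
bound_type = con.execute("SELECT typeof(?)", (b"\x41\x42",)).fetchone()[0]
```

This is the likely source of the "it adds something" problem mentioned earlier in the thread: the stored value silently became text.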
BryanWall Posted January 18, 2024 (Author)

Quote: What is the text encoding: some ANSI codepage, UTF8, UTF16, ...?

UTF-8.

Quote: What is the typical/maximal bytesize of a single line of text?

Depending on format it is either 32 or 24 bytes.

Quote: Why do you convert to hex and store as blob some part of the text in each line? Why not just store the text as string?

Because my purpose is automating a larger process, and this piece bypasses opening the GUI and importing the file into the database. So instead of manually importing the new file, deleting the old file, then setting other bits of info back to what they were for ancillary items, I am removing the contents of the prior hex blob and converting/updating with the new data.

As for why it was decided to store these values (time, longitude, latitude, and dBm values) as hex, I was not involved in that. I'm guessing faster and/or smaller size? It definitely drastically reduces the number of rows in the database, but that usually doesn't matter much IME.
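One plausible rationale for the packed-blob design: fixed-width binary records parse without string scanning, and packing many samples per row is exactly what collapses 1.5 million text rows into a few blob chunks. A hypothetical Python sketch of such a record layout (the thread never states the actual field layout or precisions; the `<dddf` format and values below are invented for illustration):

```python
import struct

# Hypothetical fixed-width record: time, longitude, latitude as float64,
# dBm as float32. This layout is an assumption, not the thread's format.
record = struct.Struct("<dddf")  # 8 + 8 + 8 + 4 = 28 bytes per sample

packed = record.pack(1_705_536_000.0, -97.7431, 30.2672, -61.5)

# Every sample occupies exactly record.size bytes, so the Nth sample in a
# chunk blob sits at offset N * record.size -- constant-time lookup with
# no per-line parsing.
```

Whether this actually saves space versus the 24-32 byte text lines depends on the chosen precisions; the bigger win is the fixed offsets and the drastically reduced row count the post mentions.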