Jump to content
Sign in to follow this  
MrChiliCheese

[SQLITE] Rearranging table

Recommended Posts

MrChiliCheese

Hello folks!

I have an massive amount of data and I started to sort it with Autoit, but it started to be quite slow.

After I realized how awesome SQLITE is, I started using it within Autoit.

As I didn't found a good forum in the WWW I thought I could get some help here.

post-38066-0-89591800-1359998469_thumb.p

The picture show the strucute of an array like I receive it from SQL. I want to transform it with SQL or Autoit and I'm searching for a good (and fast) way to do it.

Please note - I don't want somebody to write a script for me. Just some hints how I could do it in a good way.

I started to get the amount of different "VarNames" ans then searching for the Time and adding it to the array. But this is really slow.

Any better idea?

Share this post


Link to post
Share on other sites
jdelaney

If you query the data correctly (inner joins if the vars are always present, or outer joins if not (such as nulls exist)), you won't need to format the returned data, just print it.


IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites
MrChiliCheese

Thanks! Could you give me a push into the right direction, so that I can start doing so?

Share this post


Link to post
Share on other sites
jdelaney

I would need the actual values [table column names] to do that...something like this:

select t.time, t1.datapoint1, t2.datapoint2, t3.datapoint3, t4.datapoint4

from table t

Inner Join table t2 on t2.time = t.time

Inner Join table t3 on t3.time = t.time

Inner Join table t4 on t4.time = t.time

...

This may or may not work, depends on the data constraints.

Edited by jdelaney

IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites
MrChiliCheese

Please don't think I want you to do the job, but attached the SQLITE DB with the format.

I tried to understand the query you posted.

  • "t.", "t1.", "t2." is a "pointer" to a table within the database.

  • datapoint1 etc. is so to say a column?

  • inner join table will create the "result" table where within table "t2" the column "time" fits the "time" of the table "t"
So far correct?

EDIT: with attachment now ^^

Processdata.db.txt

Edited by MrChiliCheese

Share this post


Link to post
Share on other sites
jdelaney

So, the query is taking the single table you have, and joining it's self onto it's self, where the time is equal...that way, you can have one record set for each time, rather than one record set for each variable for a time. Without knowing the values of the colored 'vars' you have, i'm not able to get more specific, or even tell you if, this is a viable option to proceed with.

I'm assuming these are kind of like key value pairs? Where the similar colored 'VarName' are equal, right?

select t.time, t.datapoint1, t2.datapoint2, t3.datapoint3, t4.datapoint4

from table t

Inner Join table t2 on t2.time = t.time and t2.varname = 'something2'

Inner Join table t3 on t3.time = t.time and t3.varname = 'something3'

Inner Join table t4 on t4.time = t.time and t4.varname = 'something4'

Yes, the .datapoint is the column name, so it will always be t#.varvalue (if that is the column name)

Edited by jdelaney

IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites
MrChiliCheese

So, the query is taking the single table you have, and joining it's self onto it's self, where the time is equal...that way, you can have one record set for each time, rather than one record set for each variable for a time. Without knowing the values of the colored 'vars' you have, i'm not able to get more specific, or even tell you if, this is a viable option to proceed with.

I like the way, you're going to handle this.

I'm assuming these are kind of like key value pairs? Where the similar colored 'VarName' are equal, right?

That is right!

select t.time, t.datapoint1, t2.datapoint2, t3.datapoint3, t4.datapoint4

from table t

Inner Join table t2 on t2.time = t.time and t2.varname = 'something2'

Inner Join table t3 on t3.time = t.time and t3.varname = 'something3'

Inner Join table t4 on t4.time = t.time and t4.varname = 'something4'

Yes, the .datapoint is the column name, so it will always be t#.varvalue (if that is the column name)

i'll try and play a little bit with my DB :-)

Share this post


Link to post
Share on other sites
jdelaney

I updated the above, to allow for the proper joins, replace the 'something#' with your key name


IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites
MrChiliCheese

Thank you. So i updated your suggestion to this:

SELECT logdata.time_ms, logdata.VarName, t2.VarName, t3.VarName, t4.VarName FROM TABLE logdata
INNER JOIN TABLE t2 ON t2.time_ms = t.time_ms AND t2.VarName = 'HMI_SEND_DB.VALUES.VOLTAGE_CATENARY'
INNER JOIN TABLE t3 ON t3.time_ms = t.time_ms AND t3.VarName = 'HMI_SEND_DB.STATUS.TRAVEL_DIRECTION'
INNER JOIN TABLE t4 ON t4.time_ms = t.time_ms AND t4.VarName = 'HMI_SEND_DB.INVERTER.TRAVEL_DRIVE[1].ASS_INTERVENTION'

I guess, I need to create those tables t2...t4, or?

Edited by MrChiliCheese

Share this post


Link to post
Share on other sites
jchd

You want a kind of pivotal operation. Bare SQL isn't the right tool for that, it shouldn't be confused with a spreadsheet.

You can still obtain a result pretty close to what you want:

select 'Pivoted Array,' || group_concat(distinct "names") from mytable group by names order by rowid; -- this your result array row 0

select timestamp || ',' || group_concat("values") from mytable group by timestamp order by timestamp; -- this is the rest of your array

This assumes that columns 1..3 of mytable are named "timestamp", "names", "values".

Then use stringsplit on the rows of the result to build your final result array.

EDIT: I was distracted while composing this and discovered other posts just now.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites
jdelaney

in the select statement, you wan to return the varvalue, not the name.

SELECT logdata.time_ms, logdata.VarValue, t2.VarValue, t3.VarValue, t4.VarValue

FROM TABLE logdata t

(also, add 't' after the table name above, since that is how you define how to link the table)

Also, i forgot to put in a where statement, so you only return one record:

where t.varname = 'first var name'

so the join of t2 will be where t2.varname = 'second var name'

got it?


IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites
MrChiliCheese

I think I'm getting close with understanding what you are writing.

Dim $aVar, $iRows, $iColumns
_SQLite_Startup()
$sqldb = _SQLite_Open($sDB_Path)
_SQLite_GetTable2d($sqldb, 'SELECT logdata.time_ms, logdata.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM TABLE logdata t', $aVar, $iRows, $iColumns)

_SQLite_Close($sDB_Path)
_SQLite_Shutdown()

That is what I'm using (running the current beta of AutoIt). I guess instead of _SQLite_GetTable2d I should use _SQLite_Query?

But anyway I'm getting an error in Scite running the SELECT line above:

!   SQLite.au3 Error
--> Function: _SQLite_Query
--> Query:    SELECT logdata.time_ms, logdata.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM TABLE logdata t
--> Error:    near "TABLE": syntax error

If I remove then "table", scite reports that this:

!   SQLite.au3 Error
--> Function: _SQLite_Query
--> Query:    SELECT logdata.time_ms, logdata.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM logdata t
--> Error:    no such column: logdata.time_ms

You want a kind of pivotal operation. Bare SQL isn't the right tool for that, it shouldn't be confused with a spreadsheet.

You can still obtain a result pretty close to what you want:

select 'Pivoted Array,' || group_concat(distinct "names") from mytable group by names order by rowid; -- this your result array row 0

select timestamp || ',' || group_concat("values") from mytable group by timestamp order by timestamp; -- this is the rest of your array

This assumes that columns 1..3 of mytable are named "timestamp", "names", "values".

Then use stringsplit on the rows of the result to build your final result array.

EDIT: I was distracted while composing this and discovered other posts just now.

I'll try this one also :-)

Share this post


Link to post
Share on other sites
jchd

Prefer _SQLite_GetTable2d over *_Query and friends.

Group_concat will work regardless how many VarNames exist.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites
jdelaney

The groups are probably the way to go, but anyways, to fix for my example:

SELECT logdata.time_ms, logdata.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM logdata t

should be:

SELECT t.time_ms, t.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM logdata t


IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Share this post


Link to post
Share on other sites
MrChiliCheese

The groups are probably the way to go, but anyways, to fix for my example:

SELECT logdata.time_ms, logdata.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM logdata t

should be:

SELECT t.time_ms, t.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM logdata t

I think I am to stupid (or tired)...

This is the query, right?

SELECT t.time_ms, t.VarValue, t2.VarValue, t3.VarValue, t4.VarValue FROM logdata t
INNER JOIN TABLE t2 ON t2.time_ms = t.time_ms AND t2.VarName = 'HMI_SEND_DB.VALUES.VOLTAGE_CATENARY'
INNER JOIN TABLE t3 ON t3.time_ms = t.time_ms AND t3.VarName = 'HMI_SEND_DB.STATUS.TRAVEL_DIRECTION'
INNER JOIN TABLE t4 ON t4.time_ms = t.time_ms AND t4.VarName = 'HMI_SEND_DB.INVERTER.TRAVEL_DRIVE[1].ASS_INTERVENTION'

How do I put this into the AutoIt line?

Share this post


Link to post
Share on other sites
jchd

After adjusting names, here's what you should try:

select 'Pivoted Array,' || group_concat(distinct varname) from logdata order by rowid; -- this your result array row 0

select time_ms || ',' || group_concat(varvalue) from logdata group by time_ms order by time_ms; -- this is the rest of your array

But realize that the resulting array isn't anymore anything close to SQLish.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites
MrChiliCheese

@jdelaney: I guess right now I'll go with the solution of jchd. Seems easier and more handy to me. But please keep me posted how to put the query in AutoIt!

@jchd: That is working very fine with me. I'm just testing whith some data to verify that it is correct. Sometimes I don't have all VarValues for each time_ms.

Is there a way to include this in the query? Something like this:

"SELECT Time_ms || ',' || GROUP_CONCAT(VarValue) FROM logdata_new WHERE DISTINCT VarValue = 13 GROUP BY Time_ms ORDER BY Time_ms"

Share this post


Link to post
Share on other sites
jchd

The added clause (underlined bold above) is not valid SQL and doesn't make sense. Do you want to include only VarValues being 13 or exclude them? In both cases, remove the DISTINCT keyword and use the operator you need, = or !=.

This is unrelated to the question you ask above: "Sometimes I don't have all VarValues for each time_ms."

Does this mean that those VarValues are set to NULL or that the corresponding rows don't exist?

Anyway the current DB schema doesn't seem to match the way you need to handle the data, at least from what you've said up to now. In all cases, there is way too much duplication of the same datum which drives you to use really exotic queries. How many different VarNames are you going to have and are they fixed once for all?


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites
MrChiliCheese

The added clause (underlined bold above) is not valid SQL and doesn't make sense. Do you want to include only VarValues being 13 or exclude them? In both cases, remove the DISTINCT keyword and use the operator you need, = or !=.

This is unrelated to the question you ask above: "Sometimes I don't have all VarValues for each time_ms."

Does this mean that those VarValues are set to NULL or that the corresponding rows don't exist?

I know that is not a valid SQL expression, I just tried to state what I meant.

In my case sometimes the complete row is missing. So I thought that it might be possible to exclude this from the query.

Anyway the current DB schema doesn't seem to match the way you need to handle the data, at least from what you've said up to now. In all cases, there is way too much duplication of the same datum which drives you to use really exotic queries. How many different VarNames are you going to have and are they fixed once for all?

The problem is, that I receive the database as it is, and I can't change anything on the output. The amount of VarNames can be up to 40, and is fixed. Maybe later it will change, but this is not something that is happening dynamically!

Share this post


Link to post
Share on other sites
jchd

What I'd do if I were in your place is design your own DB (let's call it MainDB)using a better and more flexible schema. When you receive a new DB (let's call it Input.db), perform an ATTACH Input.db as InputDB to have both DBs inline within the same connection. Then massage your input data to fit it in your MainDB.

That's not hard to do and will make the queries more stable. Relying on group_concat to output stuff in the "correct" order can be tricky and isn't guaranteed to keep on working in future SQLite releases. Remember that contrary to a spreadsheet, SQL tables don't have any inherent order unless you explicitely force one in your SELECTs, so that may not always work over grouping. SQL doesn't even guarantee that issuing "select * from mytable" twice in a row will give the results in the same order, so it's hard to seriously rely on undocumented implementation details subject to change anytime without prior notice.

I'm willing to help you getting on the right tracks but doing so depends on your actual needs. Which queries are you going to require and what constraints do apply? There are a couple of columns that seem to be unused for now (in your sample DB). Will you need them?

I keep note that the Varnames are essentially fixed, that's a good point.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×