Jump to content

Best way to make array search faster?


Recommended Posts

I'm currently trying to write a script that will read a 3D graphics file and convert it to another format. The 3D file format i'm talking about is SMD, and it stores information about triangles in this fashion:

mat

A

B

C

mat

D

E

F

mat

B

C

G

...

where letters A,B,C,etc are vertices. In the file, xyz positions, texture coordinates, etc. are actually written in their place at each line.

So each triangle is 4 lines: 1st one is the texture used by it, and next 3 its 3 points. This file format is dumb because it doesn't index the vertices, it simply rewrites all the information of a vertex (xyz position, normal vector, etc) for every triangle even if that vertex has been used before by a previous triangle.

So i'm trying to write a script that will go through this file line by line and make an index of vertices using an array, and at the same time make an array of triangles whose elements will refer to the array with the vertices in it. That way the vertex array will be v=[A,B,C,D,E,F,G,...] and the triangle array will be t=[0,1,2,3,4,5,1,2,6,...] if i follow the example above. The problem with this is that for every line read, I have to do an array search on v to see if that particular vertex already exists or not, and if doesn't, i can add it to the end of the array. Furthermore the array search must check that 4 values match: the x,y,z position and the texture name every search (i wrote my own array search function for this, simply using a For loop). For 3D files that contain thousands of vertices, just the process of putting everything into arrays can take hours. How would I make this run faster?

Link to comment
Share on other sites

I'm sorry, but your verbal description isn't clear to me. The speed of the search will depend on the way you populate your array(s). Are you using an array of strings with delimeters (to be split later)? What is the xyz position? Are some of these values meant to be 3D coordinates? If you already have a working example, then show us.

Link to comment
Share on other sites

I'm sorry, but your verbal description isn't clear to me. The speed of the search will depend on the way you populate your array(s). Are you using an array of strings with delimeters (to be split later)? What is the xyz position? Are some of these values meant to be 3D coordinates? If you already have a working example, then show us.

yeah..the file i'm reading is like this:

submesh0

0 -0.050707 -0.04374 0.136470 0.352282 0.866719 -0.353123 0.636740 0.285197 0

0 -0.050707 -0.04374 0.136470 0.352282 0.866719 -0.353123 0.636740 0.285197 0

0 -0.039387 -0.038175 0.168731 0.065637 -0.046263 -0.996771 0.636742 0.277249 0

submesh0

0 -0.050707 -0.04374 0.136470 0.352282 0.866719 -0.353123 0.636740 0.285197 0

0 -0.039387 -0.038175 0.168731 0.065637 -0.046263 -0.996771 0.636742 0.277249 0

0 -0.078713 -0.031276 0.132812 0.588935 0.728068 -0.350817 0.658227 0.285197 0

submesh0

0 -0.078713 -0.031276 0.132812 0.588935 0.728068 -0.350817 0.658227 0.285197 0

0 -0.039387 -0.038175 0.168731 0.065637 -0.046263 -0.996771 0.636742 0.277249 0

0 -0.06478 -0.026463 0.166590 0.111458 -0.160764 -0.980679 0.658230 0.277249 0

etc...

"submesh0" is the texture image used, the 3 lines under it are the 3 vertex that make up a triangle. Each vertex line starts with 0, then the next 3 numbers are the xyz positions, the next 3 are the ijk values of the normal vector, the next two are the uv texture coordinates, and each line ends with 0.

So here's what i basically have (function _Array2D_PutValue is in Array2D.au3):

#include <Array.au3>
#include <Array2D.au3>

$SMDfile = FileOpen($SMDFilePath,0)

Local $v[1][4] 
Local $triangles[1]
Local $currentMat
Local $lineID = 0
While 1
    $currentLine = FileReadLine($SMDfile)
    $SplitLine = StringSplit($currentLine, " ", 1)
    If @error = -1 Then ExitLoop
    
    If $lineID = 0 Then
        $currentMat = $currentLine
        $lineID = 1
    Else
        $VsearchResult = _VSearch($v, $SplitLine[2], $SplitLine[3], $SplitLine[4], $currentMat)
        If $VsearchResult = -1 Then
            _Array2D_PutValue($v, $SplitLine[2] , UBound($v,1)-1, Default, 0)
            _Array2D_PutValue($v, $SplitLine[3] , UBound($v,1)-1, Default, 1)
            _Array2D_PutValue($v, $SplitLine[4] , UBound($v,1)-1, Default, 2)
            _Array2D_PutValue($v, $currentMat, UBound($v,1)-1, Default, 3)
            Redim $v[UBound($v,1)+1][4]
            $vcount = $vcount + 1
            
            $triangles[UBound($triangles)-1] = $vcount-1
        Else
            $triangles[UBound($triangles)-1] = $VsearchResult
        EndIf
        Redim $triangles[UBound($triangles)+1]
        
        If $lineID = 3 Then
            $lineID = 0
        Else
            $lineID = $lineID + 1
        EndIf
    EndIf
Wend


Func _VSearch($array, $x, $y, $z, $t)
    Local $Result = -1
    Local $i
    For $i = 0 to UBound($array,1)-1
        If $a[$i][0] = $x AND $array[$i][1] = $y, AND $array[$i][2] = $z AND $array[$i][3] = $t Then 
            $Result = $i
            ExitLoop
        EndIf
    Next
    Return $Result
EndFunc

This runs very slowly

Edited by sloth85
Link to comment
Share on other sites

I can't get this code to run at all. I downloaded Array2D.au3 from and tried to run your code. I get too many errors. Firstly $vcount not declared, then $a not declared, then _StringSplit undefined function. I don't know why you need this UDF. I took a quick look inside and I thought it looked overly complicated.

Because of these problems, I'm still not clear about the search criteria. Would the search have to match all of the following lines for an exact duplicate?

submesh0

0 -0.050707 -0.04374 0.136470 0.352282 0.866719 -0.353123 0.636740 0.285197 0

0 -0.050707 -0.04374 0.136470 0.352282 0.866719 -0.353123 0.636740 0.285197 0

0 -0.039387 -0.038175 0.168731 0.065637 -0.046263 -0.996771 0.636742 0.277249 0

How do you want the final data to be formatted?

Edit

I dunno, there's different options. Have you thought of using _FileReadToArray and testing every fourth line using For ... To ... Step 4? That could speed up the seach.

Edited by czardas
Link to comment
Share on other sites

Think the best for this kind and amount of data to compare is to use SQLite(UDF) to speed up things. (Or switch to a faster language.)

There are some nice topics on the subject. (Link up search to user "jchd" to limit the junk.)

Edited by iEvKI3gv9Wrkd41u

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 

Link to comment
Share on other sites

Here's a possible schema design for use with SQLite:

CREATE TABLE [Textures] (

[id] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,

[Name] CHAR NOT NULL COLLATE NOCASE DEFAULT ('default'));

CREATE UNIQUE INDEX [ixTexture] ON [Textures] ([Name] COLLATE NOCASE);

CREATE TABLE [Vertices] (

[id] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,

[X] FLOAT NOT NULL,

[Y] FLOAT NOT NULL,

[Z] FLOAT NOT NULL);

CREATE UNIQUE INDEX [ixVertices] ON [Vertices] ([X], [Y], [Z]);

CREATE TABLE [Triangles] (

[id] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,

[A] INTEGER NOT NULL CONSTRAINT [fkVertexA] REFERENCES [Vertices]([id]) ON DELETE CASCADE ON UPDATE CASCADE NOT DEFERRABLE INITIALLY DEFERRED,

INTEGER NOT NULL CONSTRAINT [fkVertexB] REFERENCES [Vertices]([id]) ON DELETE CASCADE ON UPDATE CASCADE NOT DEFERRABLE INITIALLY DEFERRED,

[C] INTEGER NOT NULL CONSTRAINT [fkVertexC] REFERENCES [Vertices]([id]) ON DELETE CASCADE ON UPDATE CASCADE NOT DEFERRABLE INITIALLY DEFERRED,

FLOAT NOT NULL,

[J] FLOAT NOT NULL,

[K] FLOAT NOT NULL,

[Texture] INTEGER NOT NULL CONSTRAINT [fkTexture] REFERENCES [Textures]([id]) ON DELETE SET DEFAULT ON UPDATE CASCADE NOT DEFERRABLE INITIALLY DEFERRED,

FLOAT NOT NULL,

[V] FLOAT NOT NULL,

CONSTRAINT [ukTriangle] UNIQUE([A], , [C]) ON CONFLICT REPLACE);

I'm unsure about the usage of individual normal vectors for _each_ vertex (to me a normal vector is unique for a given triangle).

I'm also unsure about the unique constraint I placed on triangles.

Apart from that, you should be able to insert and identify easily every new vertex/triangle using this schema.

If you're not familiar with SQL, I should be able to help you figure out how to proceed.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...