marko001 Posted September 22, 2011 Share Posted September 22, 2011 (edited) Hi all,I have this script, inside my program, that : a) take a source file (.csv) (around 5.000 lines), internal.csvb ) converts it so that I can separate each single part (sometimes the first item is i.e. '"Legend of, Zelda",PC,..'. so that there is an extra "," to consider)b1) "Normalizes" the names found so that they can be compared to a .xml structurec) create a string for every line in .xml format (<z:row c0='aaaaa' It2='UZ' Avail='No' fold='E...)d) checks the string with another "arrayed" file (50.000 lines), $mydbe) if _arraysearch returns OK then add to an array the complete string (all the parts of b ) and 2 fields of the .xml, c7=... and c8=...)f) creates a file and uploads it to a server with the code shown below it takes aroud 15 mins to finish. Is there a way to optimize the code to reduce this time? Thanks, Marco P.s. The code works, so no problem if some var/const is missing, probably they have been called by the main .au3 that has all the include needed. expandcollapse popup#include <FTPEx.au3> $exportfile = @ScriptDir & "\DB\internal.csv" Dim $array2db[1] Dim $mydb Global $file2ftp = @TempDir & "\temp2db.txt" Func _exp2db($user) Local $user2db Dim $csv_list[1][8] Dim $csv_array If $user= "" Then $user2db = "Test_User" Else $user2db = $user EndIf _status(0, "Converting now .csv") _FileReadToArray($exportfile, $csv_array) For $i = 2 To UBound($csv_array) - 1 If StringInStr($csv_array[$i], ", ,") Then $string = StringReplace($csv_array[$i], ", ,", ",@,", 1) Else $string = $csv_array[$i] EndIf If StringInStr($string, ", ") Then $string = StringReplace($string, ", ", "^", 1) $name_s = _StringBetween($string, "", ",") $name = _cleanfordb($name_s[0]) ;name cleaned (remove " and ^ I used before and replace , if exists) $string = StringReplace($string, ",", "*", 1) ; comma to get the next item $string = StringReplace($string, ",", "#", 1) ; comma to get the next item $item2 = _StringBetween($string, "*", "#") ;another item $item3 = _StringBetween($string, "#", ",") ;another item $string = StringReplace($string, ",", "_", 1) ; comma to get the next item $item4 = _StringBetween($string, "_", ",") ;another item $item5 = _StringBetween($string, ",", ",") ;another item If $item4[0] <> "TRASH" And $item4[0] <> "PC" And $item4[0] <> "PSX" And $item4[0] <> "WII" And $item4[0] <> "DS" Then $item5[0] = "TRASH" ; $item4[0] = "X" EndIf If $item4[0] = "TRASH" Then $item5[0] = $item4[0] $item4[0] = "None" EndIf $item6= StringRight($string, 2) ;another item If $item6= "es" Then $item6= "Yes" ReDim $csv_list[$i + 1][8] $csv_list[$i][0] = $name $csv_list[$i][1] = $item4[0] If $item5[0] <> "@" Then $csv_list[$i][2] = $item5[0] $csv_list[$i][3] = $item6 $csv_list[$i][4] = $item2[0] $csv_list[$i][5] = $item3[0] _FileReadToArray($originalDB, $mydb) $string = "<z:row c0='" & _normalize_db($csv_list[$i][0]) & "' c1='" & $csv_list[$i][2] & "' c2='" & $csv_list[$i][3] & "' c3='" & $csv_list[$i][1] [b]$check= _ArraySearch($mydb, $string, 39, UBound($mydb), 0, 1)[/b] if $check <> -1 Then $csv_list[$i][6] = _find1($check_1) $csv_list[$i][7] = _find2($check_1) EndIf $string2db = $user2db & "|" & $csv_list[$i][0] & "|" & $csv_list[$i][1] & "|" & $csv_list[$i][2] & "|" & $csv_list[$i][3] & "|" & $csv_list[$i][4] & "|" & $csv_list[$i][5] & "|" & $csv_list[$i][6] & "|" & $csv_list[$i][7] _ArrayAdd($array2db, $string2db) Next _FileWriteFromArray($file2ftp, $array2db) _status(0, "Uploading now") _upload2server($user2db) EndFunc ;==>_exp2db Func _cleanfordb($a) If StringInStr($a, "^") Then $a= StringReplace($a, "^", ", ") ; ^ If StringInStr($a, '"') Then $a= StringMid($a, 2, StringLen($a) - 2) Return $a EndFunc ;==>_cleanfordb Func _upload2server($us) Local $server = "myserver.com" Local $username = "mylogin" Local $password = "mypw" $i_Passive = '1' $port = '21' $open = _FTP_Open("MyFTP") $connect = _FTP_Connect($open, $server, $username, $password, $i_Passive, $port) If $connect = 0 Then _status(1, "Unable to connect") Else $send = _FTP_FilePut($connect, $file2ftp, "/in/" & $us & "_db.txt") If $send = 0 Then _status(1, "Error in uploading file") Else _status(0, "Updated") EndIf EndIf $close = _FTP_Close($open) ; FileDelete($file2ftp) EndFunc ;==>_upload2server and Func _normalize_db($name) If StringInStr($name, "&") Then $name= StringReplace($name, "&", "&") ; & If StringInStr($name, "'") Then $name= StringReplace($name, "'", "'") ; ' If StringInStr($name, '"') Then $name= StringReplace($name, '"', """) ; " Return $name EndFunc ;==>_normalize_db Func _find1($cc) $p = _StringBetween($mydb[$cc], "c4='", "'") If $p[0] = 0 Then $p = _StringBetween($mydb[$cc], "c5='", "'") Return $p[0] EndFunc ;==> _find1 Func _find2($cc) $p = _StringBetween($mydb[$cc], "c6='", "'") If $p[0] = 0 Then $p = _StringBetween($mydb[$cc], "c7='", "'") Return $p[0] EndFunc ;==>_find2 Edited September 22, 2011 by marko001 Link to comment Share on other sites More sharing options...
BrewManNH Posted September 22, 2011 Share Posted September 22, 2011 Can the array you're searching be sorted without causing problems with the data in it? If you can sort the array without it causing issues with your program, use _ArrayBinarySearch after running _ArraySort on the array, it searches a lot faster. Also, don't use _ArrayAdd to add something to your array as it uses ReDim which can cause speed issues. If you are looping through 5000 times and adding one line to the array at a time, this can take a considerable amount of time. Dimension your array to be able to hold the contents of your csv file, and add the lines you need to one at a time by manually adding them yourself and then ReDim the $mydb array after you're done to make it so that it's just the size of the original information, plus the added strings from the csv file. If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag GudeHow to ask questions the smart way! I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from. Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays. - ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script. - Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label. - _FileGetProperty - Retrieve the properties of a file - SciTE Toolbar - A toolbar demo for use with the SciTE editor - GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI. - Latin Square password generator Link to comment Share on other sites More sharing options...
marko001 Posted September 22, 2011 Author Share Posted September 22, 2011 Nope, the DB, in xml format cannot be sorted (at least... don't know what index to use to sort it, it stars with <z:row c0='... and the first 39 lines are the usual xml header (that's for i start from 39) the final array "should" be the same dimension of the initial unless some _arraysearch() return 0. You mean to add with a $array[$i] = $var ? Link to comment Share on other sites More sharing options...
BrewManNH Posted September 22, 2011 Share Posted September 22, 2011 The $mydb array can be sorted, you're not writing it back to the disk, you're just using it to compare the strings from the .csv file if I'm reading your script right. _ArrayBinarySearch will definitely help speed up some of the process. If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag GudeHow to ask questions the smart way! I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from. Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays. - ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script. - Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label. - _FileGetProperty - Retrieve the properties of a file - SciTE Toolbar - A toolbar demo for use with the SciTE editor - GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI. - Latin Square password generator Link to comment Share on other sites More sharing options...
Spiff59 Posted September 22, 2011 Share Posted September 22, 2011 (edited) I'm surprised to not see a StringSplit() in a script processing a comma-delimited file. I'd think you could replace all those StringReplace() and StringBetween() statements with something like "$item = StringSplit($csv_array[$i] , ",") ? Do you need to be reading in the original database for every line of the csv file ( _FileReadToArray($originalDB, $mydb)) ? Edit: oops, quoted the worng line of code PS - I'd be interested in hearing how much of the 15 minutes is taken up by the upload portion... Edited September 22, 2011 by Spiff59 Link to comment Share on other sites More sharing options...
marko001 Posted September 22, 2011 Author Share Posted September 22, 2011 but _arraybinarysearch can't do a partial search, isnt'it? the $mydb "constructed" doesn't have the last c6=...c7=... so I need to partial search for them in the big file Link to comment Share on other sites More sharing options...
BrewManNH Posted September 22, 2011 Share Posted September 22, 2011 Here's a way of doing it that should speed up part of the process. I didn't change anything in regards to the reformatting of the input lines, just the searching and adding to the array parts. I don't know what the lines you're generating look like, or what's in the $mydb array so the _ArrayBinarySearch section probably won't work without tweaking to match the data in both areas so you can search on them. Trust me in this, once you see how fast ArrayBinarySearch is over ArraySearch on a large array, you will figure out a way to get the search to work for you. Look for the comments in the code for changes. expandcollapse popup#include <FTPEx.au3> $exportfile = @ScriptDir & "\DB\internal.csv" ;~ Dim $array2db[1] Dim $mydb Global $file2ftp = @TempDir & "\temp2db.txt" Func _exp2db($user) Local $user2db, $array2db[10000] Dim $csv_list[1][8] Dim $csv_array If $user = "" Then $user2db = "Test_User" Else $user2db = $user EndIf _status(0, "Converting now .csv") _FileReadToArray($exportfile, $csv_array) _FileReadToArray($originalDB, $mydb) ; moved out of the loop _ArraySort($mydb) ; sort the array Local $iCounter = 0 For $i = 2 To UBound($csv_array) - 1 If StringInStr($csv_array[$i], ", ,") Then $string = StringReplace($csv_array[$i], ", ,", ",@,", 1) Else $string = $csv_array[$i] EndIf If StringInStr($string, ", ") Then $string = StringReplace($string, ", ", "^", 1) $name_s = _StringBetween($string, "", ",") $name = _cleanfordb($name_s[0]) ;name cleaned (remove " and ^ I used before and replace , if exists) $string = StringReplace($string, ",", "*", 1) ; comma to get the next item $string = StringReplace($string, ",", "#", 1) ; comma to get the next item $item2 = _StringBetween($string, "*", "#") ;another item $item3 = _StringBetween($string, "#", ",") ;another item $string = StringReplace($string, ",", "_", 1) ; comma to get the next item $item4 = _StringBetween($string, "_", ",") ;another item $item5 = _StringBetween($string, ",", ",") ;another item If $item4[0] <> "TRASH" And $item4[0] <> "PC" And $item4[0] <> "PSX" And $item4[0] <> "WII" And $item4[0] <> "DS" Then $item5[0] = "TRASH" ; $item4[0] = "X" EndIf If $item4[0] = "TRASH" Then $item5[0] = $item4[0] $item4[0] = "None" EndIf $item6 = StringRight($string, 2) ;another item If $item6 = "es" Then $item6 = "Yes" ReDim $csv_list[$i + 1][8] $csv_list[$i][0] = $name $csv_list[$i][1] = $item4[0] If $item5[0] <> "@" Then $csv_list[$i][2] = $item5[0] $csv_list[$i][3] = $item6 $csv_list[$i][4] = $item2[0] $csv_list[$i][5] = $item3[0] $string = "<z:row c0='" & _normalize_db($csv_list[$i][0]) & "' c1='" & $csv_list[$i][2] & "' c2='" & $csv_list[$i][3] & "' c3='" & $csv_list[$i][1] $check = _ArrayBinarySearch($mydb, StringLeft($string, 100), 39, 0) ; depending on how many characters you need to check against $mydb, change the 100 appropriately If $check <> -1 Then $csv_list[$i][6] = _find1($check_1) $csv_list[$i][7] = _find2($check_1) EndIf $string2db = $user2db & "|" & $csv_list[$i][0] & "|" & $csv_list[$i][1] & "|" & $csv_list[$i][2] & "|" & $csv_list[$i][3] & "|" & $csv_list[$i][4] & "|" & $csv_list[$i][5] & "|" & $csv_list[$i][6] & "|" & $csv_list[$i][7] $array2db[$iCounter] = $string2db ; add string to the array $iCounter += 1 ; increment the counter by 1 If $iCounter = UBound($array2db) - 2 Then ReDim $array2db[UBound($array2db) * 2) ; if the counter = Ubound($array2db) double the size of the array Next ReDim $array2db[$iCounter] _FileWriteFromArray($file2ftp, $array2db) _status(0, "Uploading now") _upload2server($user2db) EndFunc ;==>_exp2db Func _cleanfordb($a) If StringInStr($a, "^") Then $a = StringReplace($a, "^", ", ") ; ^ If StringInStr($a, '"') Then $a = StringMid($a, 2, StringLen($a) - 2) Return $a EndFunc ;==>_cleanfordb Func _upload2server($us) Local $server = "myserver.com" Local $username = "mylogin" Local $password = "mypw" $i_Passive = '1' $port = '21' $open = _FTP_Open("MyFTP") $connect = _FTP_Connect($open, $server, $username, $password, $i_Passive, $port) If $connect = 0 Then _status(1, "Unable to connect") Else $send = _FTP_FilePut($connect, $file2ftp, "/in/" & $us & "_db.txt") If $send = 0 Then _status(1, "Error in uploading file") Else _status(0, "Updated") EndIf EndIf $close = _FTP_Close($open) ; FileDelete($file2ftp) EndFunc ;==>_upload2server If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag GudeHow to ask questions the smart way! I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from. Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays. - ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script. - Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label. - _FileGetProperty - Retrieve the properties of a file - SciTE Toolbar - A toolbar demo for use with the SciTE editor - GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI. - Latin Square password generator Link to comment Share on other sites More sharing options...
marko001 Posted September 22, 2011 Author Share Posted September 22, 2011 (edited) Mainly this is the thing:FIle A: is a .csv that contains (real example:) Line1: Fifa2011,PC,2011,Y,1,2007LineN: "Zeldá, Legend of", WII,2009,N,2,2008 And this is the .xml (from line 39 and without the last 2 lines that close the .xml) Line 45: <z:row c0='Fifa2011' C1='2011' C2='PC' C3='Y' c4='35' c5='43' c6='26' c7='33'/>Line 133: <z:row c0='Zeldâ, Legend of' C1='2009' C2='Wii' C3='N' c4='0' c5='11' c6='0' c7='23'/> What I need to have is a file that contains this: Line1: Fifa2011|PC,2011|Y|1|35|33...LineN: Zeldá, Legend of|WII|2009|N|2|11|23 (if c4=0 i get c5 and if c6=0 i get c7 values) .csv is around 5/6.000 lines and .xml can be around 45.000 Upload is nothing, around 10 secs, that's not the problem. M. Edited September 22, 2011 by marko001 Link to comment Share on other sites More sharing options...
marko001 Posted September 22, 2011 Author Share Posted September 22, 2011 [00:28:25] Converting now .csv [00:31:31] Uploading now to Server [00:31:36] Updated to server! 00:03:06 - Good result (and I just put the _FileReadToArray() outside the loop...bad mistake I did!) I'm keeping on with arraysort now. Link to comment Share on other sites More sharing options...
BrewManNH Posted September 23, 2011 Share Posted September 23, 2011 That's a lot better result than 15 minutes. You probably don't need the binary searching in this case. If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag GudeHow to ask questions the smart way! I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from. Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays. - ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script. - Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label. - _FileGetProperty - Retrieve the properties of a file - SciTE Toolbar - A toolbar demo for use with the SciTE editor - GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI. - Latin Square password generator Link to comment Share on other sites More sharing options...
Beege Posted September 23, 2011 Share Posted September 23, 2011 P.s. The code works, so no problem if some var/const is missing, probably they have been called by the main .au3 that has all the include needed. After reading that and then seeing this, Im wonding if $check_1 is a var in the main .au3? if so, ignore the rest this post. $check= _ArraySearch($mydb, $string, 39, UBound($mydb), 0, 1) if $check <> -1 Then $csv_list[$i][6] = _find1($check_1) $csv_list[$i][7] = _find2($check_1) EndIf As an alternative to _arraysearch(), a dictionary object (PERL associative array) might work too. They can be very fast. After you read your originaldb to an array, build the dictionary with the db strings as keys, and the array index as items. Global $o_mydb = ObjCreate("Scripting.Dictionary") _FileReadToArray($originalDB, $mydb) For $i = 1 to $mydb[0] If Not $o_$mydb.Exists($mydb[$i]) Then $o_$mydb.Add($mydb[$i], $i) Next Then instead of using _arraysearch(): $string = "<z:row c0='" & _normalize_db($csv_list[$i][0]) & "' c1='" & $csv_list[$i][2] & "' c2='" & $csv_list[$i][3] & "' c3='" & $csv_list[$i][1] If $o_mydb.exists($string) Then $csv_list[$i][6] = _find1($o_mydb.item($string)) $csv_list[$i][7] = _find2($o_mydb.item($string)) EndIf Assembly Code: fasmg . fasm . BmpSearch . Au3 Syntax Highlighter . Bounce Multithreading Example . IDispatchASMUDFs: Explorer Frame . ITaskBarList . Scrolling Line Graph . Tray Icon Bar Graph . Explorer Listview . Wiimote . WinSnap . Flicker Free Labels . iTunesPrograms: Ftp Explorer . Snipster . Network Meter . Resistance Calculator Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now