myspacee Posted March 8, 2010

Hello to all,

Using StringReplace I ran into a strange problem. I have thousands of files to check in order to create a report. All of these files have some ASCII control codes at the head (proprietary format), e.g.:

    NULNULSTXDC2 some text

There are also numbers, and then text again. Is it possible to:

- open the file
- store everything in a variable
- keep only text/numbers (remove all symbols, ASCII control codes, etc.)

Thank you for reading and for any info,
m.

PS: I can post some files if needed.
Melba23 (Moderators) Posted March 8, 2010

myspacee,

Not sure you need an SRE for this:

    ; Create a "binary" string as you would get from a file read in Binary format
    $sText = "0x"
    For $i = 0 To 127
        $sText &= Hex($i, 2)
    Next
    MsgBox(0, "Binary String", $sText)

    ; Move through the "binary" string and remove all characters below 32
    $sNewText = "0x"
    For $i = 0 To 127
        $sChar = BinaryMid($sText, 1 + $i, 1)
        ConsoleWrite($sChar & @CRLF)
        If $sChar > 31 Then $sNewText &= StringTrimLeft($sChar, 2)
    Next
    MsgBox(0, "", $sNewText)

If you want to remove other symbols, just change the If in the second loop to a Switch and use as many Case statements as you need.

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind.

My UDFs:
- ArrayMultiColSort: Sort arrays on multiple columns
- ChooseFileFolder: Single and multiple selections from specified path treeview listing
- Date_Time_Convert: Easily convert date/time formats, including the language used
- ExtMsgBox: A highly customisable replacement for MsgBox
- GUIExtender: Extend and retract multiple sections within a GUI
- GUIFrame: Subdivide GUIs into many adjustable frames
- GUIListViewEx: Insert, delete, move, drag, sort, edit and colour ListView items
- GUITreeViewEx: Check/clear parent and child checkboxes in a TreeView
- Marquee: Scrolling tickertape GUIs
- NoFocusLines: Remove the dotted focus lines from buttons, sliders, radios and checkboxes
- Notify: Small notifications on the edge of the display
- Scrollbars: Automatically sized scrollbars with a single command
- StringSize: Automatically size controls to fit text
- Toast: Small GUIs which pop out of the notification area
martin Posted March 8, 2010

On 3/8/2010 at 6:15 PM, myspacee said:
> Hello to all,
> Using StringReplace I ran into a strange problem. I have thousands of files to check in order to create a report. All of these files have some ASCII control codes at the head (proprietary format), e.g.:
>     NULNULSTXDC2 some text
> There are also numbers, and then text again. Is it possible to:
> - open the file
> - store everything in a variable
> - keep only text/numbers (remove all symbols, ASCII control codes, etc.)
> Thank you for reading and for any info,
> m.
> PS: I can post some files if needed.

I'm not sure what you want to remove, since an ASCII code might represent a character you want to keep. Anyway, the easiest way might be to decide what you want to keep. Supposing you want to keep all numbers, all letters a to z, spaces and any vertical or horizontal whitespace character, you could remove everything else from the string $s like this:

    $sStripped = StringRegExpReplace($s,"[^0-9,a-z,A-Z, ,\h,\v]","")

Signature:
- Serial port communications UDF: Includes functions for binary transmission and reception.
- Printing UDF: Useful for graphs, forms, labels, reports etc.
- Add User Call Tips to SciTE for functions in UDFs not included with AutoIt and for your own scripts.
- Functions with parameters in OnEvent mode and for Hot Keys: One function replaces GuiSetOnEvent, GuiCtrlSetOnEvent and HotKeySet.
- UDF IsConnected2 for notification of status of connected state of many urls or IPs, without slowing the script.
dani Posted March 8, 2010

@martin

Actually, your pattern now includes the comma. You cannot use commas to separate the ranges inside a character class; a comma inside [...] is matched literally. Just omit them and it will work. Compare:

    $s = "123a,bc*^@),."
    $sStripped_1 = StringRegExpReplace($s,"[^0-9,a-z,A-Z, ,\h,\v]","")
    ; Space is included in \h btw -- \h = tabs & spaces so I left " " out.
    ; Also, as far as I know \h\v == \s
    $sStripped_2 = StringRegExpReplace($s,"[^0-9a-zA-Z\s]","")
    ConsoleWrite($sStripped_1 & @CR)
    ConsoleWrite($sStripped_2 & @CR)

Edited March 8, 2010 by dani
myspacee (Author) Posted March 8, 2010

Thank you for the replies, but I can't solve this and I'm going mad. Here is a small extract of my script:

    #include <Array.au3>
    #include <File.au3>

    ;Gather files list into an array
    $fileList = _FileListToArray(@ScriptDir, "*.", 1)

    If @error = 0 Then ;if some files exist
        ;Loop through array from 1 to last file
        For $X = 1 To $fileList[0]
            ToolTip("", 0, 0)
            ;read file
            $foo = FileOpen($fileList[$X], 0)
            $bar = FileRead($foo)
            MsgBox(0, $fileList[$X], $bar)
            FileClose($foo)
        Next
    EndIf

I have posted a zipped folder with 204 'txt' files for testing:
http://www.webalice.it/t.bavaro/pvv47p.zip

I can't solve this riddle...
m.
martin Posted March 8, 2010

@dani
Yes, thanks for pointing out my mistake.

@myspacee
I expect it does seem a bit strange, but the files you are reading start with ASCII code 00, which is used to mark the end of a string, so the string you try to display will be "". Try this:

    #include <Array.au3>
    #include <File.au3>

    ;Gather files list into an array
    $fileList = _FileListToArray(@ScriptDir, "*.", 1)

    If @error = 0 Then ;if some files exist
        ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)
        ;Loop through array from 1 to last file
        For $X = 1 To $fileList[0]
            ToolTip("", 0, 0)
            ;strip everything except digits, letters and whitespace
            $bar = StringRegExpReplace(FileRead($fileList[$X]), "[^0-9a-zA-Z \h\v]", "")
            MsgBox(0, $fileList[$X], $bar)
        Next
    EndIf
myspacee (Author) Posted March 8, 2010

Thank you Martin,

Incredible, but it doesn't work with the posted files... I see ASCII code 00 and can't find a way to avoid it; if I remove it manually (e.g. in Notepad) it works. I can't do that for the whole mass of files, there are too many. I must find a solution...

m.
MvGulik Posted March 8, 2010

whatever

Edited February 7, 2011 by MvGulik

Signature:
- "Straight_and_Crooked_Thinking": A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
- "The Secrets of Quantum Physics": New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)
- "Believing what you know ain't so" ... Knock Knock ...
martin Posted March 8, 2010

On 3/8/2010 at 7:57 PM, myspacee said:
> Thank you Martin,
> Incredible, but it doesn't work with the posted files... I see ASCII code 00 and can't find a way to avoid it; if I remove it manually (e.g. in Notepad) it works. I can't do that for the whole mass of files, there are too many. I must find a solution...
> m.

This works for me with the posted files, which I checked before I posted the last time.

    #include <Array.au3>
    #include <File.au3>

    DirCreate(@ScriptDir & "\Atemp")

    ;Gather files list into an array
    $fileList = _FileListToArray(@ScriptDir, "*.", 1)

    If @error = 0 Then ;if some files exist
        ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)
        ;Loop through array from 1 to last file
        For $X = 1 To $fileList[0]
            ToolTip($x, 0, 0)
            $bar = StringRegExpReplace(FileRead($fileList[$X]), "[^0-9a-zA-Z \h\v]", "")
            ;write the cleaned text to a file of the same name in Atemp
            FileWrite("Atemp\" & $fileList[$X], $bar)
            ; MsgBox(0,$fileList[$X],$bar)
        Next
    EndIf

It converts the posted files in about 2 seconds.
myspacee (Author) Posted March 8, 2010

Thank you again Martin,

but in the folder that the last script creates I find zeroed files... Could the language setting influence the results? Is it an ANSI/Unicode/UTF issue?

I get some binary result when using FileOpen to force binary (byte) reading:

    #include <Array.au3>
    #include <File.au3>

    DirCreate(@ScriptDir & "\Atemp")

    ;Gather files list into an array
    $fileList = _FileListToArray(@ScriptDir, "*.", 1)

    If @error = 0 Then ;if some files exist
        ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)
        ;Loop through array from 1 to last file
        For $X = 1 To $fileList[0]
            ToolTip($x, 0, 0)
            ConsoleWrite("file = " & $fileList[$X] & @CRLF)
            ;mode 16 = force binary (byte) reading
            $foo = FileOpen(@ScriptDir & "\" & $fileList[$X], 16)
            $bar = StringRegExpReplace(FileRead($foo), "[^0-9a-zA-Z \h\v]", "")
            If @error Then MsgBox(0, "a", @error)
            FileWrite("Atemp\" & $fileList[$X], $bar)
            If @error Then MsgBox(0, "b", @error)
            FileClose($foo)
        Next
    EndIf

!?

m.

Edited March 8, 2010 by myspacee
martin Posted March 8, 2010

On 3/8/2010 at 10:26 PM, myspacee said:
> Thank you again Martin,
> but in the folder that the last script creates I find zeroed files... Could the language setting influence the results? Is it an ANSI/Unicode/UTF issue?
> I get some binary result when using FileOpen to force binary (byte) reading:
>     #include <Array.au3>
>     #include <File.au3>
>     DirCreate(@ScriptDir & "\Atemp")
>     ;Gather files list into an array
>     $fileList = _FileListToArray(@ScriptDir, "*.", 1)
>     If @error = 0 Then ;if some files exist
>         ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)
>         ;Loop through array from 1 to last file
>         For $X = 1 To $fileList[0]
>             ToolTip($x, 0, 0)
>             ConsoleWrite("file = " & $fileList[$X] & @CRLF)
>             $foo = FileOpen(@ScriptDir & "\" & $fileList[$X], 16)
>             $bar = StringRegExpReplace(FileRead($foo), "[^0-9a-zA-Z \h\v]", "")
>             If @error Then MsgBox(0, "a", @error)
>             FileWrite("Atemp\" & $fileList[$X], $bar)
>             If @error Then MsgBox(0, "b", @error)
>             FileClose($foo)
>         Next
>     EndIf
> !?
> m.

There is probably something I don't understand. When I use the script I posted to get new files in Atemp, I get files that I can open and read in SciTE or Notepad. Here is a screenshot showing the binary values and the text for the first converted file. Does that look like what you get? (I forgot to make sure the cursor wasn't in the screenshot.)

EDIT: But you aren't using the code I posted! What happens if you do use the code I posted?

Edited March 8, 2010 by martin
myspacee (Author) Posted March 9, 2010

Martin,

I used the code you posted and it returns a directory with zeroed files. I added some error checking and the FileOpen/FileClose calls, but I always get zeroed files until I force binary (byte) reading.

When I'm in the office I'll post a zipped directory with your code and my files so you can test your script with my 'setting'.

Thank you again for your time,
m.
martin Posted March 9, 2010

On 3/9/2010 at 6:54 AM, myspacee said:
> Martin,
> I used the code you posted and it returns a directory with zeroed files. I added some error checking and the FileOpen/FileClose calls, but I always get zeroed files until I force binary (byte) reading.
> When I'm in the office I'll post a zipped directory with your code and my files so you can test your script with my 'setting'.
> Thank you again for your time,
> m.

OK, but if it doesn't work when you try with the posted files and it does work when I try with the same files, then I don't know how to make any progress. If you run exactly the code I posted on exactly the files you gave, and the first file created doesn't look exactly like the result I showed, then I'm a bit lost.

The code I posted removes NULs, so I don't understand the need to read binary. If you are going to read the whole file then FileRead is sufficient; FileOpen, FileRead, FileClose is not needed.

I'm using AutoIt version 3.3.6.0 and Beta 3.3.5.6. Both seem to give the same results.
trancexx Posted March 9, 2010

There was a change in RegExp behavior regarding the NULL character. I'm not sure this is documented. Some time ago I made a ticket regarding NULL and RegExp but that went nowhere. Nevertheless, it seems it was not for nothing after all.

myspacee should really say what version of AutoIt she/he is using. It makes no sense to help with something that is outdated.

♡♡♡ . eMyvnE
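Editor's note: a quick way to see which AutoIt version is running and how its regex engine treats NUL is sketched below. It is only an illustration, not code from the thread: the sample string and the variable names are assumptions, and the result on any given version is not guaranteed here. If the regex handles NUL as the posters with newer versions describe, the cleaned string should be 7 characters long ("abc 123").

```autoit
; Minimal check, assuming the same pattern used elsewhere in this thread.
; $sSample and $sClean are hypothetical names introduced for this sketch.
Local $sSample = Chr(0) & Chr(2) & "abc 123" ; NUL, STX, then 7 printable characters
Local $sClean = StringRegExpReplace($sSample, "[^0-9a-zA-Z\h\v]", "")

; Report the interpreter version and the length of the cleaned string.
ConsoleWrite("AutoIt version: " & @AutoItVersion & @CRLF)
ConsoleWrite("Length after cleanup: " & StringLen($sClean) & @CRLF)
```

Posting the two output lines along with a question makes it easier for others to tell whether a version difference is involved.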
martin Posted March 9, 2010

On 3/9/2010 at 8:04 AM, trancexx said:
> There was a change in RegExp behavior regarding the NULL character. I'm not sure this is documented. Some time ago I made a ticket regarding NULL and RegExp but that went nowhere. Nevertheless, it seems it was not for nothing after all.
> myspacee should really say what version of AutoIt she/he is using. It makes no sense to help with something that is outdated.

Thanks trancexx, that could explain it. Let's see what myspacee says.
myspacee (Author) Posted March 9, 2010

Upgrading from 3.3.0.0 to 3.3.6.0 solves the problem.

It is scary to upgrade the AutoIt version: this upgrade solves the RegExp problem but breaks my FTP_Ex.au3.

    D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(10,40) : ERROR: $GENERIC_READ previously declared as a 'Const'
    Global Const $GENERIC_READ = 0x80000000
    D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(11,41) : ERROR: $GENERIC_WRITE previously declared as a 'Const'
    Global Const $GENERIC_WRITE = 0x40000000
    D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(22,268) : ERROR: $tagWIN32_FIND_DATA previously declared as a 'Const'
    Global Const $tagWIN32_FIND_DATA = "DWORD dwFileAttributes; dword ftCreationTime[2]; dword ftLastAccessTime[2]; dword ftLastWriteTime[2]; DWORD nFileSizeHigh; DWORD nFileSizeLow; dword dwReserved0; dword dwReserved1; CHAR cFileName[260]; CHAR cAlternateFileName[14];"
    D:\Prj\myScript\103_indagine_territoriale\FTP_20.au3 - 3 error(s), 0 warning(s)

Rollback.

m.

Edited March 9, 2010 by myspacee
trancexx Posted March 9, 2010

If you insist (no matter how stupid that is) on using the old version of AutoIt then change the related code to:

    $bar = StringReplace(FileRead ($fileList[$X]), Chr(0), "")
    $bar = StringRegExpReplace($bar,"[^0-9a-zA-Z \h\v]","")
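Editor's note: the two-step workaround above can be wrapped in a small helper so the same call is used regardless of AutoIt version. This is a sketch only; the function name _KeepTextOnly and the sample data are assumptions made for illustration and do not come from the thread.

```autoit
; Sample input mimicking the header described in the first post:
; NUL NUL STX DC2 followed by readable text (values 0, 0, 2, 18).
Local $sSample = Chr(0) & Chr(0) & Chr(2) & Chr(18) & "some text 42"
ConsoleWrite(_KeepTextOnly($sSample) & @CRLF)

; Hypothetical helper: returns $sRaw with NULs and all other
; non-alphanumeric, non-whitespace characters removed.
Func _KeepTextOnly($sRaw)
    ; Step 1: drop NUL characters with a plain string replace, so the
    ; result does not depend on how the regex engine treats NUL.
    Local $sNoNul = StringReplace($sRaw, Chr(0), "")
    ; Step 2: remove everything that is not a digit, an ASCII letter,
    ; or horizontal/vertical whitespace.
    Return StringRegExpReplace($sNoNul, "[^0-9a-zA-Z \h\v]", "")
EndFunc
```

Inside the file loop from the earlier posts, the call would look like `$bar = _KeepTextOnly(FileRead($fileList[$X]))`.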
myspacee (Author) Posted March 9, 2010

On 3/9/2010 at 10:27 AM, trancexx said:
> If you insist (no matter how stupid that is) on using the old version of AutoIt then change the related code to:
>     $bar = StringReplace(FileRead ($fileList[$X]), Chr(0), "")
>     $bar = StringRegExpReplace($bar,"[^0-9a-zA-Z \h\v]","")

trancexx,

Stupid or not, I have some AutoIt scripts in production, and not only in my office. The development -> test -> production chain can't be broken in my case. Rolling back to 3.3.0.0 solves that problem; I chose the lesser evil.

Thank you again,
m.
martin Posted March 9, 2010

On 3/9/2010 at 10:04 AM, myspacee said:
> Upgrading from 3.3.0.0 to 3.3.6.0 solves the problem.
> It is scary to upgrade the AutoIt version: this upgrade solves the RegExp problem but breaks my FTP_Ex.au3.
>     D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(10,40) : ERROR: $GENERIC_READ previously declared as a 'Const'
>     Global Const $GENERIC_READ = 0x80000000
>     D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(11,41) : ERROR: $GENERIC_WRITE previously declared as a 'Const'
>     Global Const $GENERIC_WRITE = 0x40000000
>     D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(22,268) : ERROR: $tagWIN32_FIND_DATA previously declared as a 'Const'
>     Global Const $tagWIN32_FIND_DATA = "DWORD dwFileAttributes; dword ftCreationTime[2]; dword ftLastAccessTime[2]; dword ftLastWriteTime[2]; DWORD nFileSizeHigh; DWORD nFileSizeLow; dword dwReserved0; dword dwReserved1; CHAR cFileName[260]; CHAR cAlternateFileName[14];"
>     D:\Prj\myScript\103_indagine_territoriale\FTP_20.au3 - 3 error(s), 0 warning(s)
> Rollback.
> m.

I'm glad you fixed your problem, but I am dismayed that you would rather roll back to 3.3.0.0 than add 3 semicolons to comment out the constants that are now already defined for you.
MvGulik Posted March 9, 2010

whatever

Edited February 7, 2011 by MvGulik