Posted

Hello to all,

using StringReplace I ran into a strange problem.

I have a thousand files to check in order to create a report.

All these files have some ASCII control codes at the head (proprietary format) :mellow:

e.g.:

NULNULSTXDC2 some text find also number then again text

Is it possible to:

- open the file

- store everything in a variable

- keep only text/numbers (remove all symbols/ASCII control codes/etc.)

Thank you for reading and for any info,

m.

PS: I can post some files if needed.

  • Moderators
Posted

myspacee,

Not sure you need an SRE for this:

; Create a "binary" string as you would get from a file read in Binary format
$sText = "0x"
For $i = 0 To 127
    $sText &= Hex($i, 2)
Next
MsgBox(0,"Binary String", $sText)

; Move through the "binary" string and remove all characters below 32
$sNewText = "0x"
For $i = 0 to 127
    $sChar = BinaryMid($sText, 1 + $i, 1)
    ConsoleWrite($sChar & @CRLF)
    If $sChar > 31 Then $sNewText &= StringTrimLeft($sChar, 2)
Next
MsgBox(0, "", $sNewText)

If you want to remove other symbols, just change the If in the second loop to a Switch and use as many Case statements as you need. :mellow:

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Posted

  On 3/8/2010 at 6:15 PM, 'myspacee said:

- keep only text/numbers (remove all symbol/ascii codes/etc)

I'm not sure what you want to remove, since an ASCII code might represent a character you want to keep.

Anyway, the easiest way to do it might be to decide what you want to keep.

Suppose you want to keep all digits, all letters a to z, spaces, and any vertical or horizontal whitespace character. Then you could remove everything else from the string $s like this:

$sStripped = StringRegExpReplace($s,"[^0-9,a-z,A-Z, ,\h,\v]","")
Serial port communications UDF Includes functions for binary transmission and reception.printing UDF Useful for graphs, forms, labels, reports etc.Add User Call Tips to SciTE for functions in UDFs not included with AutoIt and for your own scripts.Functions with parameters in OnEvent mode and for Hot Keys One function replaces GuiSetOnEvent, GuiCtrlSetOnEvent and HotKeySet.UDF IsConnected2 for notification of status of connected state of many urls or IPs, without slowing the script.
Posted (edited)

@martin

Actually, your pattern now keeps the comma as well :mellow: Inside a character class you cannot use commas as separators; a comma there is just another literal character. Just omit them and it will work.

Compare:

$s = "123a,bc*^@),."
$sStripped_1 = StringRegExpReplace($s, "[^0-9,a-z,A-Z, ,\h,\v]", "") ; commas are kept
$sStripped_2 = StringRegExpReplace($s, "[^0-9a-zA-Z\s]", "") ; commas are removed
; \h already covers tabs and spaces, so the literal " " is left out.
; As far as I know, \h\v is roughly equivalent to \s for plain ASCII whitespace.

ConsoleWrite($sStripped_1 & @CR)
ConsoleWrite($sStripped_2 & @CR)
Edited by dani
Posted

Thank you for the replies,

but I can't solve it and I'm going mad.

Here is a small extract of my script:

#include <Array.au3>
#include <File.au3>

; Gather the file list into an array
$fileList = _FileListToArray(@ScriptDir, "*.", 1)
If @error = 0 Then ; if some files exist
    ; Loop through the array from 1 to the last file
    For $X = 1 To $fileList[0]
        ToolTip("", 0, 0)

        ; Read the file
        $foo = FileOpen($fileList[$X], 0)
        $bar = FileRead($foo)

        MsgBox(0, $fileList[$X], $bar)

        FileClose($foo)
    Next
EndIf

I am posting a zipped folder with 204 'txt' files for testing.

http://www.webalice.it/t.bavaro/pvv47p.zip

I can't solve this riddle...

m.

Posted

@dani. Yes, thanks for pointing out my mistake.

@myspacee

It may seem a bit strange, but the files you are reading start with ASCII code 00 (NUL), which is used to mark the end of a string, so the string you try to display will be "".

Try this:

#include <Array.au3>
#include <File.au3>

; Gather the file list into an array
$fileList = _FileListToArray(@ScriptDir, "*.", 1)
If @error = 0 Then ; if some files exist
    ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)

    ; Loop through the array from 1 to the last file
    For $X = 1 To $fileList[0]
        ToolTip("", 0, 0)

        $bar = StringRegExpReplace(FileRead($fileList[$X]), "[^0-9a-zA-Z \h\v]", "")

        MsgBox(0, $fileList[$X], $bar)
    Next
EndIf
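
To confirm that a file really begins with NUL bytes, a small diagnostic along these lines may help. It is only a sketch: `example_input` is a placeholder file name (not one of the posted files), and it assumes FileOpen mode 16, the binary (byte) mode, so the bytes come back unconverted.

```autoit
; Sketch: print the first 16 bytes of a file as hex to reveal leading control codes.
; "example_input" is a placeholder name; replace it with one of your own files.
Local $hFile = FileOpen(@ScriptDir & "\example_input", 16) ; 16 = binary (byte) mode
If $hFile = -1 Then
    ConsoleWrite("Could not open file" & @CRLF)
Else
    Local $bHead = FileRead($hFile, 16) ; read at most 16 bytes
    FileClose($hFile)
    ; String() casts the binary value to its hex representation, e.g. 0x0000...
    ConsoleWrite(String($bHead) & @CRLF)
EndIf
```

If the output starts with 0x00, the file begins with a NUL byte, and any step that treats NUL as the end of a string will see an empty string.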
Posted

Thank you Martin,

incredible, but it doesn't work with the posted files...

I see ASCII code 00 and can't find a way to avoid it.

If I remove it manually (e.g. in Notepad) it works.

I can't do that for the whole mass of files; there are too many.

I must find a solution...

m.

Posted

  On 3/8/2010 at 7:57 PM, 'myspacee said:

incredible, but it doesn't work with the posted files...

This works for me with the posted files, which I checked before I posted the last time.

#include <Array.au3>
#include <File.au3>

DirCreate(@ScriptDir & "\Atemp")
; Gather the file list into an array
$fileList = _FileListToArray(@ScriptDir, "*.", 1)
If @error = 0 Then ; if some files exist
    ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)

    ; Loop through the array from 1 to the last file
    For $X = 1 To $fileList[0]
        ToolTip($X, 0, 0)

        $bar = StringRegExpReplace(FileRead($fileList[$X]), "[^0-9a-zA-Z \h\v]", "")
        FileWrite("Atemp\" & $fileList[$X], $bar)
        ; MsgBox(0, $fileList[$X], $bar)
    Next
EndIf

It converts the posted files in about 2 seconds.

Posted (edited)

Thank you again Martin,

but in the folder that the last script creates, I find zeroed files...

Maybe the language settings can influence the results?

An ANSI/Unicode/UTF issue?

I get some binary result when using FileOpen to force binary (byte) reading:

#include <Array.au3>
#Include <File.au3>

dircreate(@scriptDir & "\Atemp")
;Gather files list into an array
$fileList = _FileListToArray(@ScriptDir, "*.", 1)
if @Error = 0 then ;if some files exist
    ConsoleWrite("No. of files = " & $fileList[0] & @CRLF)

    ;Loop through array from 1 to last file
    For $X = 1 to $fileList[0]
        ToolTip($x,0,0)
        ConsoleWrite("file = " & $fileList[$X] & @CRLF)
        
        
        $foo = FileOpen (@ScriptDir & "\" & $fileList[$X], 16)
        $bar = StringRegExpReplace(FileRead ($foo),"[^0-9a-zA-Z \h\v]","")
        if @Error then msgbox(0,"a",@Error)
            
        filewrite("Atemp\" & $fileList[$X], $bar )
        if @Error then msgbox(0,"b",@Error)



        fileclose($foo)
    next
EndIf

!?

m.

Edited by myspacee
Posted (edited)

  On 3/8/2010 at 10:26 PM, 'myspacee said:

but in the folder that the last script creates, I find zeroed files...

I get some binary result when using FileOpen to force binary (byte) reading:

There is probably something I don't understand. When I use the script I posted to create new files in Atemp, I get files that I can open and read in SciTE or Notepad.

Here is a screenshot showing the binary values and the text for the first converted file.

[Screenshot: post-3602-12680889017326_thumb.png]

Does that look like what you get? (I forgot to make sure the cursor wasn't in the screenshot.)

EDIT: But you aren't using the code I posted! What happens if you do use the code I posted?

Edited by martin
Posted

Martin,

I used the code you posted and it returns a directory with zeroed files.

I added some error checking and the FileOpen/FileClose calls,

but I always get zeroed files until I force binary (byte) reading.

When I'm in the office I will post a zipped directory with your code and my files, so you can test your script with my 'settings'.

Thank you again for your time,

m.

Posted

  On 3/9/2010 at 6:54 AM, 'myspacee said:

I used the code you posted and it returns a directory with zeroed files.

OK, but if it doesn't work when you try with the posted files and it does work when I try with the same files, then I don't know how to make any progress. If you run exactly the code I posted on exactly the files you gave, and the first file created doesn't look exactly like the result I showed, then I'm a bit lost. The code I posted removes NULs, so I don't understand the need to read in binary mode. If you are going to read the whole file, FileRead on its own is sufficient; FileOpen, FileRead, FileClose is not needed.

I'm using AutoIt version 3.3.6.0 and Beta 3.3.5.6. Both seem to give the same results.

Posted

There was a change in RegExp behavior regarding the NULL character. I'm not sure this is documented. Some time ago I opened a ticket about NULL and RegExp, but that went nowhere. Nevertheless, it seems it was not for nothing after all.

myspacee should really say what version of AutoIt she/he is using. It makes no sense to help with something that is outdated.


Posted

  On 3/9/2010 at 8:04 AM, 'trancexx said:

There was a change in RegExp behavior regarding the NULL character.

Thanks trancexx, that could explain it. Let's see what myspacee says.

Posted (edited)

Upgrading from 3.3.0.0 to 3.3.6.0 solves the problem.

It is scary to upgrade the AutoIt version: this upgrade fixes the RegExp problem

but breaks my FTP_Ex.au3.

D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(10,40) : ERROR: $GENERIC_READ previously declared as a 'Const'
Global Const $GENERIC_READ = 0x80000000
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(11,41) : ERROR: $GENERIC_WRITE previously declared as a 'Const'
Global Const $GENERIC_WRITE = 0x40000000
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
D:\Prj\myScript\103_indagine_territoriale\FTP_Ex.au3(22,268) : ERROR: $tagWIN32_FIND_DATA previously declared as a 'Const'
Global Const $tagWIN32_FIND_DATA = "DWORD dwFileAttributes; dword ftCreationTime[2]; dword ftLastAccessTime[2]; dword ftLastWriteTime[2]; DWORD nFileSizeHigh; DWORD nFileSizeLow; dword dwReserved0; dword dwReserved1; CHAR cFileName[260]; CHAR cAlternateFileName[14];"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
D:\Prj\myScript\103_indagine_territoriale\FTP_20.au3 - 3 error(s), 0 warning(s)

Rollback :mellow:

m.

Edited by myspacee
Posted

  On 3/9/2010 at 10:27 AM, 'trancexx said:

If you insist (no matter how stupid that is) on using the old version of AutoIt then change related code to:

$bar = StringReplace(FileRead ($fileList[$X]), Chr(0), "")
$bar = StringRegExpReplace($bar,"[^0-9a-zA-Z \h\v]","")

trancexx,

stupid or not, I have some AutoIt scripts in production, and not only in my office.

The development -> test -> production chain can't be broken in my case.

Rolling back to 3.3.0.0 solves the problem; I choose the lesser evil.

Thank you again,

m.
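
For anyone who has to stay on the older version, trancexx's two-step workaround can be wrapped in a small function. This is only a sketch: `_StripToText` and the file name `example_input` are hypothetical names chosen for illustration, and it assumes StringReplace handles Chr(0) on your version in the way the quoted workaround relies on.

```autoit
; Sketch only: _StripToText is a hypothetical helper name, not part of AutoIt.
Func _StripToText($sText)
    ; Step 1: remove NUL characters first, so the pattern never has to deal with them.
    $sText = StringReplace($sText, Chr(0), "")
    ; Step 2: keep digits, letters, spaces and horizontal/vertical whitespace,
    ; using the same pattern as martin's script.
    Return StringRegExpReplace($sText, "[^0-9a-zA-Z \h\v]", "")
EndFunc   ;==>_StripToText

; Usage: clean one file and write the result to the console.
; "example_input" is a placeholder file name.
Local $sClean = _StripToText(FileRead(@ScriptDir & "\example_input"))
ConsoleWrite($sClean & @CRLF)
```

Keeping the NUL removal separate from the pattern means the result should not depend on how a given AutoIt version treats NUL inside regular expressions.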

Posted

  On 3/9/2010 at 10:04 AM, 'myspacee said:

This upgrade fixes the RegExp problem but breaks my FTP_Ex.au3.

Rollback :mellow:

I'm glad you fixed your problem but I am dismayed that you would rather roll back to 3.3.0.0 than add 3 semicolons to comment out the constants that are now already defined for you.
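
As a sketch of what that change could look like, the three declarations named in the error output (lines 10, 11 and 22 of FTP_Ex.au3) would each get a leading semicolon. This assumes the definitions shipped with the newer AutoIt are compatible with how FTP_Ex.au3 uses them, which is worth checking for $tagWIN32_FIND_DATA in particular.

```autoit
; FTP_Ex.au3: duplicate declarations commented out with a leading semicolon,
; so the constants already defined by the newer AutoIt version are used instead.
;Global Const $GENERIC_READ = 0x80000000
;Global Const $GENERIC_WRITE = 0x40000000
;Global Const $tagWIN32_FIND_DATA = "DWORD dwFileAttributes; dword ftCreationTime[2]; dword ftLastAccessTime[2]; dword ftLastWriteTime[2]; DWORD nFileSizeHigh; DWORD nFileSizeLow; dword dwReserved0; dword dwReserved1; CHAR cFileName[260]; CHAR cAlternateFileName[14];"
```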