Chimp

how to "clean" a string by regexp?


Good evening,
is there an (easy?) way to do something like this:
pass two strings to a regular expression:
first string: an input string
second string: an arbitrarily chosen set of characters

The returned string should be the first string, either keeping only or removing only the chars included in the second string.
In one case the second string should be treated as the allowed characters, and in the other case as the illegal characters.

example (first case):
input string "% 987,(465) -/abc\- [788.9100]*"
second string "1234567890.," ; allowed chars
returned string "987,465788.9100" ; kept only allowed chars

and also the opposite:
the returned string should be the first string with the disallowed chars listed in the second string removed

example (second case):
input string  "% 987,(465) -/abc\- [788.9100]*"
second string "1234567890.," ; discard chars
returned string "% () -/abc\- []*" ; disallowed chars removed

thanks to those who want to help :)

a little template for testing:

Local $sOriginal = "% 987,(465) -/abc\- [788.9100]*"
Local $sArbitrary = "1234567890.,"

MsgBox(0, "Kept", _StringKeep($sOriginal, $sArbitrary)) ; should show --> 987,465788.9100

MsgBox(0, "discarded", _StringDiscard($sOriginal, $sArbitrary)) ; should show --> % () -/abc\- []*

Func _StringKeep($sInput, $sAllow)
    Local $sFiltered = '????' ; how to keep ?
    Return $sFiltered
EndFunc   ;==>_StringKeep

Func _StringDiscard($sInput, $sDisallow)
    Local $sFiltered = '????' ; how to discard ?
    Return $sFiltered
EndFunc   ;==>_StringDiscard

 


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....


I think this should work

Local $sOriginal = "% 987,(465) -/abc\- [788.9100]*"
Local $sArbitrary = "1234567890.,"

MsgBox(0, "Kept", _StringKeep($sOriginal, $sArbitrary)) ; should show --> 987,465788.9100

MsgBox(0, "discarded", _StringDiscard($sOriginal, $sArbitrary)) ; should show --> % () -/abc\- []*


Func _StringDiscard($sInput, $sDiscard, $sRegExp = "")

    Return StringRegExpReplace($sInput, "[\Q" & $sDiscard & "\E" & $sRegExp & "]", "")

EndFunc

Func _StringKeep($sInput, $sKeep, $sRegExp = "")

    Return StringRegExpReplace($sInput, "[^\Q" & $sKeep & "\E" & $sRegExp & "]", "")

EndFunc

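Edit: \Q and \E start and end a literal (quoted) sequence, so that something like \d in the character list is matched as the two characters \ and d instead of being expanded into "any digit". If you would rather have such classes expanded, pass them in the optional $sRegExp parameter, which is appended after the \E.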
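
A small sketch of the difference, in case it helps. The test strings are made up for this example, and _StringKeep is repeated from above only so the snippet runs on its own:

```autoit
; Sketch: how \Q...\E changes the meaning of "\d" inside the character class.
; The sample strings below are made up for illustration.
Local $sInput = "a1\d2"

; Unquoted: "[^\d]" removes everything that is not a digit.
ConsoleWrite(StringRegExpReplace($sInput, "[^\d]", "") & @CRLF) ; expected: 12

; Quoted: "[^\Q\d\E]" keeps only the literal characters "\" and "d".
ConsoleWrite(_StringKeep($sInput, "\d") & @CRLF) ; expected: \d

; Optional third parameter: keep "." and "," literally, plus any digit via \d.
ConsoleWrite(_StringKeep("% 987,(465) -/abc\- [788.9100]*", ".,", "\d") & @CRLF) ; expected: 987,465788.9100

Func _StringKeep($sInput, $sKeep, $sRegExp = "")
    Return StringRegExpReplace($sInput, "[^\Q" & $sKeep & "\E" & $sRegExp & "]", "")
EndFunc
```

One caveat I have not handled here: if the character list itself contains the sequence \E, the quoting ends early, so this is only safe for lists without it.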

Edited by seadoggie01

All my code provided is Public Domain... but it may not work. ;) Use it, change it, break it, whatever you want.

$in =  "% 987,(465) -/abc\- [788.9100]*"
$out = "1234567890.," ; match chars


msgbox(0, '' , StringRegExpReplace($in , "[" & $out & "]" , ""))  ;disallow

msgbox(0, '' , StringRegExpReplace($in , "[^" & $out & "]" , ""))  ;only allow

 

edit: y'all are fast, and I suppose the literals (\Q...\E) are probably better if the list may contain regex metacharacters
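
To show what I mean, here is a rough sketch of one way the unquoted version can go wrong without any error. The input string and the character list are made up for this example:

```autoit
; Sketch: an unquoted character list can silently change meaning.
; The sample strings are made up for illustration.
Local $sIn = "1^2=3"
Local $sList = "^12" ; intended meaning: discard "^", "1" and "2"

; Unquoted: the leading "^" negates the class, so "[^12]" removes
; everything EXCEPT 1 and 2.
ConsoleWrite(StringRegExpReplace($sIn, "[" & $sList & "]", "") & @CRLF) ; expected: 12

; Quoted: inside \Q...\E the "^" is just another character to discard.
ConsoleWrite(StringRegExpReplace($sIn, "[\Q" & $sList & "\E]", "") & @CRLF) ; expected: =3
```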

Edited by iamtheky

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

4 minutes ago, iamtheky said:

edit: y'all are fast, and I suppose the literals (\Q...\E) are probably better if the list may contain regex metacharacters

I was nervous I would see a "<username> has replied. Click to show" message

5 minutes ago, Chimp said:

So simple and so useful, cute!

Thanks! I was super proud of myself for that :D


17 minutes ago, iamtheky said:
$in =  "% 987,(465) -/abc\- [788.9100]*"
$out = "1234567890.," ; match chars


msgbox(0, '' , StringRegExpReplace($in , "[" & $out & "]" , ""))  ;disallow

msgbox(0, '' , StringRegExpReplace($in , "[^" & $out & "]" , ""))  ;only allow

 

edit: y'all are fast, and I suppose the literals (\Q...\E) are probably better if the list may contain regex metacharacters

..even simpler!? nice!

Thanks @iamtheky  :)


1 hour ago, iamtheky said:

and I suppose the literals (\Q...\E) are probably better if the list may contain regex metacharacters

Your supposition is right ;).

Local $sOriginal, $sArbitrary

$sOriginal = "% 987,(465) -/abc\- ?[788.9100]?*??"
$sArbitrary = "1234567890.,?"
ConsoleWrite("> -------------------------------------------- " & @CRLF)
ConsoleWrite("+ $sOriginal --> " & $sOriginal & @CRLF)
ConsoleWrite("+ $sArbitrary -> " & $sArbitrary & @CRLF)
ConsoleWrite("> --- (V1=@seadoggie01  /  V2=@iamtheky) : --- " & @CRLF)
ConsoleWrite("+ Allow V1 ----> " & _StringKeep($sOriginal, $sArbitrary) & @CRLF)
ConsoleWrite("+ Allow V2 ----> " & StringRegExpReplace($sOriginal , "[^" & $sArbitrary & "]" , "") & @CRLF)
ConsoleWrite("< Disallow V1 -> " & _StringDiscard($sOriginal, $sArbitrary) & @CRLF)
ConsoleWrite("< Disallow V2 -> " & StringRegExpReplace($sOriginal , "[" & $sArbitrary & "]" , "") & @CRLF)


$sArbitrary = "1234567890.,?\"  ; ==> added a backslash at the end
ConsoleWrite("> -------------------------------------------- " & @CRLF)
ConsoleWrite("+ $sOriginal --> " & $sOriginal & @CRLF)
ConsoleWrite("+ $sArbitrary -> " & $sArbitrary & @CRLF)
ConsoleWrite("> --- (V1=@seadoggie01  /  V2=@iamtheky) : --- " & @CRLF)
ConsoleWrite("+ Allow V1 ----> " & _StringKeep($sOriginal, $sArbitrary) & @CRLF)
ConsoleWrite("+ Allow V2 ----> " & StringRegExpReplace($sOriginal , "[^" & $sArbitrary & "]" , "") & @CRLF)
ConsoleWrite("< Disallow V1 -> " & _StringDiscard($sOriginal, $sArbitrary) & @CRLF)
ConsoleWrite("< Disallow V2 -> " & StringRegExpReplace($sOriginal , "[" & $sArbitrary & "]" , "") & @CRLF)

Func _StringKeep($sInput, $sKeep)
    Return StringRegExpReplace($sInput, "[^\Q" & $sKeep & "\E]", "")
EndFunc
Func _StringDiscard($sInput, $sDiscard)
    Return StringRegExpReplace($sInput, "[\Q" & $sDiscard & "\E]", "")
EndFunc
Output:

> --------------------------------------------
+ $sOriginal --> % 987,(465) -/abc\- ?[788.9100]?*??
+ $sArbitrary -> 1234567890.,?
> --- (V1=@seadoggie01  /  V2=@iamtheky) : ---
+ Allow V1 ----> 987,465?788.9100???
+ Allow V2 ----> 987,465?788.9100???

< Disallow V1 -> % () -/abc\- []*
< Disallow V2 -> % () -/abc\- []*

> --------------------------------------------
+ $sOriginal --> % 987,(465) -/abc\- ?[788.9100]?*??
+ $sArbitrary -> 1234567890.,?\
> --- (V1=@seadoggie01  /  V2=@iamtheky) : ---
+ Allow V1 ----> 987,465\?788.9100???
+ Allow V2 ----> % 987,(465) -/abc\- ?[788.9100]?*??
< Disallow V1 -> % () -/abc- []*
< Disallow V2 -> % 987,(465) -/abc\- ?[788.9100]?*??
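
My reading of the second run: with the trailing backslash, the V2 pattern ends in \] which escapes the closing bracket, so the character class is never closed and the pattern is invalid. As far as I know from the help file, StringRegExpReplace then returns the original string and sets @error to 2, with @extended holding the offset of the error in the pattern. A minimal sketch to make that failure visible instead of silently getting the input back:

```autoit
; Sketch: detect an invalid pattern instead of silently getting the input back.
Local $sOriginal = "% 987,(465) -/abc\- ?[788.9100]?*??"
Local $sArbitrary = "1234567890.,?\" ; trailing backslash, as in the second run

Local $sResult = StringRegExpReplace($sOriginal, "[^" & $sArbitrary & "]", "")
; Save both values right away, since later function calls reset them.
Local $iErr = @error, $iOffset = @extended

If $iErr Then
    ; Per the help file: @error = 2 means the pattern is invalid and
    ; @extended is the offset of the error in the pattern.
    ConsoleWrite("! Invalid pattern, @error = " & $iErr & ", offset = " & $iOffset & @CRLF)
Else
    ConsoleWrite("+ Result: " & $sResult & @CRLF)
EndIf
```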

Edited by Musashi


"In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move."


indeed, kind friend.  'suppose' was probably the wrong word as there was little uncertainty.  I was more just reaffirming my motto:

Quote

I cant write you better code, but I can write it slower.

 

