Nubie

How to detect known and unknown strings with StringRegExp?

22 posts in this topic

#1 ·  Posted (edited)

Sorry, I don't understand StringRegExp at all. I have an example to ask about.

String: "???123???456???"

??? are unknown strings of unknown length. 123 and 456 are already known strings.

How can I replace 123 with another string of my choice? How can I detect where 123 sits in the example string, so that I can replace the unknown strings before or after it? (Not just detecting 123, because 123 may repeat anywhere in the unknown strings.)

Thanks

Edited by Nubie


I don't get it. You say 123 may be found elsewhere in unknown strings, so the question is: "How would you process the following string?"

aaaaaa123bbb456cccc123dddddd123456eeeeee123fffffff456zzzzz

Do you want to replace all occurrences of the substrings 123 and 456 with, say, 777 and 888 respectively? If so, StringReplace is all you need.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#3 ·  Posted (edited)

No. In my example string "???123???456???" there must be something unknown in between. The ??? parts are unknown; they may themselves be 123 or 123456.

If I use StringInStr or StringReplace, I have a problem with the example "123123123456123", where the result I want is "123abc123456123". Or, with another example, "a123bb456ccc789123567456321", the result I want is "aabcbb456ccc789abc567456321" (abc is the replacement string).

 

Edited by Nubie


Then please explain the strict rules governing this processing in plain English, clearly and rigorously.

Because the two examples in your last post seem contradictory: in the first you seem to replace only the first occurrence of 123 that appears after the first non-empty string not containing 123, but in the second "example" you perform two replacements.


#5 ·  Posted (edited)

I said this:

Quote

??? are unknown strings of unknown length. 123 and 456 are already known strings.

and this:

Quote

(Not just detecting 123, because 123 may repeat anywhere in the unknown strings.)

The ??? parts may be anything; they cannot be known in advance.

Edited by Nubie


Try this.

Local $String = "123123123456123" & @CRLF & _ ; "123abc123456123"
        "a123bb456ccc789123567456321" ; "aabcbb456ccc789abc567456321"
        
ConsoleWrite(StringRegExpReplace($String, "(123)(.{1,5}456)", "abc$2") & @LF) ; Characters between "123" and "456" - minimum one, maximum five.

#cs ; Returns:-
123abc123456123
aabcbb456ccc789abc567456321

Note: If R.E. pattern were "(123)(.{1,6}456)", then the first test string would return "abc123123456123" because there are six characters between "123" and "456".
#ce

 


#7 ·  Posted (edited)

Yes, it works just the way I want. But about this:

Quote
.{1,6}

As I understand it, this checks for 1 to 6 unknown characters. How can I check for an unlimited number of characters this way?

And how can I detect where that 123 is, so I can find the unknown strings before or after it?

Edited by Nubie


Giving up.


2 hours ago, Nubie said:

As I understand it, this checks for 1 to 6 unknown characters. How can I check for an unlimited number of characters this way?

Quantifiers (or repetition specifiers)
1 or more (infinity) is {1,} or +
0 or more (infinity) is {0,} or *
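
As a minimal sketch of these quantifiers (the variable name $sTest is only for illustration), the following compares {1,} with + and shows where * differs by also accepting zero characters:

```autoit
Local $sTest = "123123123456123"

; {1,} and + are two spellings of the same quantifier: one or more, no upper limit.
ConsoleWrite(StringRegExpReplace($sTest, "(123)(.{1,}456)", "abc$2") & @CRLF) ; abc123123456123
ConsoleWrite(StringRegExpReplace($sTest, "(123)(.+456)", "abc$2") & @CRLF)    ; abc123123456123

; * also accepts zero characters between "123" and "456"; + does not.
ConsoleWrite(StringRegExpReplace("123456", "(123)(.*456)", "abc$2") & @CRLF) ; abc456
ConsoleWrite(StringRegExpReplace("123456", "(123)(.+456)", "abc$2") & @CRLF) ; 123456 (no match, string unchanged)
```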

2 hours ago, Nubie said:

And how can I detect where that 123 is, so I can find the unknown strings before or after it?

A regular expression (RE) pattern is matched against the test string from left to right.
Take the test string "123123123456123" and the RE pattern "(123)(.{1,6}456)" or "(123)(.+456)".
Starting at the far left of the test string, the first "123" matches the start of the RE pattern. Continuing to the right, if 1 to 6 characters (or, with ".+", any number of characters) come before a "456", the entire RE pattern matches.

When the RE pattern "(123)(.{1,5}456)" is used, the first "123" does not lead to a match of the entire RE pattern, because no "456" follows within 1 to 5 characters after it. So the search moves on to the right along the test string and the RE pattern tries to match again. The second "123" matches, and there are 3 characters before "456": a complete RE pattern match. The search then continues after the "6" in "456" and finds no further complete match (only 3 characters are left).

So the data in the test string that matches the first capture group of the RE pattern, "(123)", is replaced with "abc". The data that matches the second capture group, "(.{1,5}456)", is back-referenced as "$2", "${2}" or "\2", which for this test string is "123456".
The position of a capture group's opening bracket, counted from left to right in the RE pattern, defines its back-reference number.

Example: in the RE pattern "(123)(.{1,5}456)", the "(.{1,5}456)" part has the second opening bracket, "(", when reading left to right. So whatever "(.{1,5}456)" matches in the test string is what "$2" refers to, namely "123456".
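
To see what each capture group holds, StringRegExp can return the groups instead of 1 or 0. A small sketch, assuming flag 1 (an array holding the groups of the first match); $sTest and $aGroups are illustrative names:

```autoit
Local $sTest = "123123123456123"

; Flag 1 returns the capture groups of the first complete match as a 0-based array.
Local $aGroups = StringRegExp($sTest, "(123)(.{1,5}456)", 1)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ConsoleWrite("Group 1: " & $aGroups[0] & @CRLF) ; Group 1: 123
    ConsoleWrite("Group 2: " & $aGroups[1] & @CRLF) ; Group 2: 123456
EndIf
```

Array element 0 corresponds to "$1" and element 1 to "$2" in a replacement string.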

 

 


@Jos

Maybe, but I'd still like a clear, rigorous, plain-English explanation of how the OP wants the thing to work. I don't know of any black-magic pattern element in PCRE.



Clear, and I'm not able to play dentist either (as in, it feels like pulling teeth).
It's just "funny" sometimes how people aren't able to clearly define what they want.

Jos


Visit the SciTE4AutoIt3 Download page for the latest versions        Beta files                                                          Forum Rules
 
Live for the present,
Dream of the future,
Learn from the past.
  :)


:)


Please have a look at this regex site:

https://regex101.com/

I have had to search text files for "dates" in the format dd-mm-yycc, extract them, convert them to "ccyymmdd" and write them to a file. Regex is the way to go, especially if you do not know where the text will be. @Malkey gave a very nice exposition.
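
As a rough illustration of that kind of date rewrite, assuming the dates have a four-digit year and look like dd-mm-yyyy (the sample text and variable names are made up), one StringRegExpReplace call can reorder the parts:

```autoit
; Hypothetical sample text; the dates may sit anywhere in it.
Local $sText = "Invoice dated 25-12-2016, paid 03-01-2017."

; Three capture groups: day, month, year. The replacement emits them as year, month, day.
Local $sOut = StringRegExpReplace($sText, "\b(\d{2})-(\d{2})-(\d{4})\b", "${3}${2}${1}")

ConsoleWrite($sOut & @CRLF) ; Invoice dated 20161225, paid 20170103.
```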


Skysnake

Why is the snake in the sky?


#15 ·  Posted (edited)

@Malkey, thanks very much, I'm trying to learn from your tutorial. I tried to detect the unknown strings before the known strings as you described, but the result is not what I want. Maybe I went wrong somewhere.

Local $String = "123123123456123" & "a123bbbbbb456ccc789123567456321"
MsgBox(0, "", StringLeft((StringRegExp($String, "(123)(.+456)")), 1))

@Skysnake, thanks, but I only want to learn how to write the code in AutoIt.

@all: sorry, my English isn't good. Maybe what I write is hard to understand or causes misunderstanding; I'm sorry. My example is just a simple one. My project works with hex data, which always has unknown strings and repetition.

Edited by Nubie


all comments good

9 hours ago, Nubie said:

....

Local $String = "123123123456123" & "a123bbbbbb456ccc789123567456321"
MsgBox(0, "", StringLeft((StringRegExp($String, "(123)(.+456)")), 1))

....

From your example, it appears you are not aware that you are using the default Flag parameter of the StringRegExp function.

Also, if you are going to use ".+", it would be best to understand the use of the "?" in ".+?".

The AutoIt help and the comments in the following example are there to help.

Local $String = "123123123456123" & "a123bbbbbb456ccc789123567456321" ; A previously posted example had the 2 strings separated
;   by "@CRLF", which is a newline sequence. In the RE pattern, the dot, ".", by default matches all characters except newline characters.
;   Two strings concatenated, or joined by an ampersand, "&" (a logogram, or symbol for the word "and"), simply give you one big string.

; See "StringRegExp" in AutoIt help. The third parameter, "Flag", when absent, defaults to zero, "0".
; The zero Flag returns a "1" for a match, or, a "0" for a no match of the RE pattern in the string.
; StringRegExp with a zero Flag can be used as an expression in "If...Then" statements because True (1) or False (0) is returned.
MsgBox(0, 'StringRegExp($String, "(123)(.+456)")', StringRegExp($String, "(123)(.+456)")) ; Returns a "1"
MsgBox(0, 'StringLeft((StringRegExp($String, "(123)(.+456)")), 1)', StringLeft((StringRegExp($String, "(123)(.+456)")), 1)) ; Returns a "1"


; Also under "StringRegExp" in AutoIt help, see the mention of greedy and lazy.  Here are examples of each - without question mark (greedy),
; and with question mark, "?" (lazy).
MsgBox(0, 'StringRegExpReplace($String, "(123)(.+456)","abc$2")', "Greedy" & @CRLF & _
        StringRegExpReplace($String, "(123)(.+456)", "abc$2")) ; ".+" is greedy by default (takes as many characters as possible,
;       including intermediate "456"s - will find the last "456"). $2 = "123123456123a123bbbbbb456ccc789123567456"

MsgBox(0, 'StringRegExpReplace($String, "(123)(.+?456)","abc$2")', "Lazy" & @CRLF & _
        StringRegExpReplace($String, "(123)(.+?456)", "abc$2")) ; ".+?" is lazy (not greedy - will find the first occurrence of "456")
;       $2 = "123123456" and "a123bbbbbb456" and "567456" (3 complete RE pattern matches found in string)
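
To get the matched text itself rather than 1 or 0, flag 3 returns every capture group of every match in one flat array. A sketch using the same lazy pattern ($aAll is an illustrative name):

```autoit
Local $String = "123123123456123" & "a123bbbbbb456ccc789123567456321"

; Flag 3: all matches in the string, with the capture groups flattened into one 0-based array.
Local $aAll = StringRegExp($String, "(123)(.+?456)", 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ; The pattern has two groups, so elements come in pairs: group 1, then group 2, for each match.
    For $i = 0 To UBound($aAll) - 1 Step 2
        ConsoleWrite($aAll[$i] & " | " & $aAll[$i + 1] & @CRLF)
    Next
EndIf
; Prints:
; 123 | 123123456
; 123 | a123bbbbbb456
; 123 | 567456
```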

 


:/ I think you miss the point. AutoIt uses a built-in regex library. Even the StringRegExp you reference in your example is a regex function.

See here:

https://www.autoitscript.com/autoit3/docs/tutorials/regexp/regexp.htm and

https://www.autoitscript.com/autoit3/docs/functions/StringRegExp.htm

Regex is a tool. The link is simply to a site which makes learning easier. Your comment is akin to saying "I want to learn to speak a language, but without learning the pronunciation." If you want AutoIt to work for you, make sure to unleash all its power.


Skysnake


#19 ·  Posted (edited)

Hmmm, I'm still unclear about the StringRegExp magic and confused. I'll edit my topic title to state my meaning correctly.

OK, here is my practice case: editing a binary...

The basics: find every 85 C0 74 1A 68 and change it to 85 C0 EB 1A 68

My way, which works the way I want:

$engine = @ScriptDir & "\engine.dll"
$hex_read = FileOpen($engine, 16)
$data = FileRead($hex_read)
FileClose($hex_read)
$moded = StringReplace($data, "85C0741A68", "85C0EB1A68")
$hex_read = FileOpen($engine, 18)
FileWrite($hex_read, Binary($moded))
FileClose($hex_read)

 

Next, harder: find every 85 C0 74 1A 68 ?? ?? ?? 10 68 ?? ?? ?? 10 68 ?? ?? ?? 10 8B C8 E8 (?? are unknown values) and change it to 85 C0 EB 1A 68 ?? ?? ?? 10 68 ?? ?? ?? 10 68 ?? ?? ?? 10 8B C8 E8

My way, which works the way I want:

$moded = StringRegExpReplace($data, "(85C0741A68)(.{6}1068)(.{6}1068)(.{6}108BC8E8)", "85C0EB1A68$2$3$4") ; $2$3$4 put all three wildcard groups back; with "$2" alone, groups 3 and 4 would be deleted
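
A replacement string only puts back the groups it names, so every wildcard group has to appear in it or the data it matched is dropped. An alternative sketch, assuming a lookahead is acceptable here: consume only the 85C074 prefix and merely check the rest with (?=...), so nothing after the patched byte needs to be re-emitted. The sample data and variable names are made up:

```autoit
; Hypothetical hex text with one occurrence surrounded by filler ("00" and "FF").
Local $sData = "0085C0741A68AAAAAA1068BBBBBB1068CCCCCC108BC8E8FF"

; (85C0) is kept via ${1}; "74" is replaced by "EB"; the lookahead (?=...) must match but is not consumed.
Local $sPatched = StringRegExpReplace($sData, "(85C0)74(?=1A68.{6}1068.{6}1068.{6}108BC8E8)", "${1}EB")

ConsoleWrite($sPatched & @CRLF) ; 0085C0EB1A68AAAAAA1068BBBBBB1068CCCCCC108BC8E8FF
```

Because the tail is never consumed, the output keeps the same length as the input.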

Next, harder still, and I'm stuck, I don't know how to do it: find 85 C0 74 1A 68 ?? ?? ?? 10 68 ?? ?? ?? 10 68 ?? ?? ?? 10 8B C8 E8, but change it to 85 C0 EB 1A 68 ?? ?? ?? 10 68 ?? ?? ?? 10 68 ?? ?? ?? 10 8B C8 E8 only at the 3rd occurrence (this sequence repeats more than 3 times).

$moded = StringRegExp($data, "(85C0741A68)(.{6}1068)(.{6}1068)(.{6}108BC8E8)")

; loop  ???

 

Edited by Nubie


#20 ·  Posted (edited)

#include<array.au3>

$data = "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8" & _
        "85C0741A68??????1068??????1068??????108BC8E8"


$aRex = StringRegExp($data, "(85C0741A68.{6}1068.{6}1068.{6}108BC8E8)" , 3)

_ArrayDisplay($aRex)

; Patch every 3rd match: elements 2, 5, 8, ... (the array is 0-based)
For $i = 2 To UBound($aRex) - 1 Step 3
    $aRex[$i] = "85C0EB1A68" & StringMid($aRex[$i], 11)
Next

_ArrayDisplay($aRex)

 

This does it EVERY 3rd time.  If you only want it done at the 3rd time then remove the loop and only edit $aRex[2]
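
The array approach edits copies of the matches, not the original data. To patch only the Nth occurrence inside the data itself, one option is to walk the matches with the offset parameter of StringRegExp. This is a sketch under assumptions: flag 2 returns the full match in element 0, and @extended holds the 1-based position just past the match (which is how the help file example for the offset parameter uses it). The sample data and variable names are made up:

```autoit
; Hypothetical sample: four occurrences separated by a filler byte "FF".
Local $sHit = "85C0741A68AAAAAA1068BBBBBB1068CCCCCC108BC8E8"
Local $sData = $sHit & "FF" & $sHit & "FF" & $sHit & "FF" & $sHit
Local $sPattern = "85C0741A68.{6}1068.{6}1068.{6}108BC8E8"
Local $iWanted = 3 ; patch only the 3rd occurrence

Local $iOffset = 1, $iFound = 0, $aMatch
While 1
    ; Flag 2: array whose element 0 is the full match; the search starts at $iOffset.
    $aMatch = StringRegExp($sData, $sPattern, 2, $iOffset)
    If @error Then ExitLoop ; no further match
    $iOffset = @extended ; assumed: position just past this match
    $iFound += 1
    If $iFound = $iWanted Then
        Local $iStart = $iOffset - StringLen($aMatch[0]) ; 1-based start of this match
        ; Keep "85C0" (4 characters), write "EB" over "74", then append everything after "74".
        $sData = StringLeft($sData, $iStart + 3) & "EB" & StringMid($sData, $iStart + 6)
        ExitLoop
    EndIf
WEnd

ConsoleWrite($sData & @CRLF) ; only the third block should now contain EB
```

Changing $iWanted selects a different occurrence; if fewer matches exist, the data is left untouched.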

Edited by iamtheky

