
StringRegExp - find two words with exact number of lines between them


Zedna

I have a text file and I want to find two words with an exact number of lines between them.

#include <Array.au3>

; first text
$txt = '111' & @CRLF & '222' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'

; second text
;$txt = '111' & @CRLF & '222' & @CRLF & 'x' & @CRLF & 'y' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'

$txt = StringRegExp($txt, '(?s)111(.*\n{1,2}.*)333', 3)

;~ _ArrayDisplay($txt)

For $i = 0 To UBound($txt) - 1
    ConsoleWrite($i & ': ' & $txt[$i] & @CRLF)
Next

Here I try to find 111 and 333 and everything between them, but only when there are 1 or 2 @CRLF (\r\n) between them.

So in the first text it should find a match,

and in the second text (uncomment it) it shouldn't.

But I'm not a RegExp guru, so my pattern is wrong: it matches my text regardless of the number of @CRLFs.
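
A minimal sketch of why this happens, assuming the pattern was meant to contain `\n`: with (?s) the dot also matches line breaks, so `.*` can swallow any number of them and `\n{1,2}` only demands that at least one line feed exists somewhere between the two words. The variable names are illustrative and not part of the original script.

```autoit
; Sketch: with (?s) the dot also matches line breaks, so .* can span any
; number of lines and \n{1,2} no longer limits anything.
; $sFirst / $sSecond mirror the two sample texts; the names are illustrative.
Local $sFirst = '111' & @CRLF & '222' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'
Local $sSecond = '111' & @CRLF & '222' & @CRLF & 'x' & @CRLF & 'y' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'
Local $sPattern = '(?s)111(.*\n{1,2}.*)333'

; Flag 0 (the default) returns 1 on a match and 0 otherwise.
ConsoleWrite('first:  ' & StringRegExp($sFirst, $sPattern) & @CRLF)  ; 1
ConsoleWrite('second: ' & StringRegExp($sSecond, $sPattern) & @CRLF) ; 1, although four line breaks separate 111 and 333
```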

Of course I tried many combinations like

(?s)111(.*)\n{1,2}(.*)333

(?s)111.*\n{1,2}.*333

(?s)111(.*\r\n{1,2}.*)333

...

but I need help from some RegExp guru with this ...

Edited by Zedna

I'm pretty sure that those numbers are not what you are really trying to find so it's a bit difficult to test but this returns what you asked for.

"(?sU)111.+\v{1,2}.+333"

If you don't want to show what you are actually working with you can PM it to me and I'll test it for you.

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


"(?sU)111.+\v{1,2}.+333"

Thanks.

Anyway, it doesn't work for me, because when you uncomment

$txt = '111' & @CRLF & '222' & @CRLF & 'x' & @CRLF & 'y' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'

then StringRegExp should find nothing, because there are more than two line ends between 111 and 333,

but your pattern returns

111

222

x

y

333

Can you briefly explain what this is for?

(?U) Invert greediness of quantifiers.
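
As a rough illustration of that help-file line: under (?U) a plain `.*` behaves like the lazy `.*?` and stops at the first possible end, while `.*?` becomes greedy. The sample string below is made up for the demonstration.

```autoit
; Sketch: (?U) makes quantifiers lazy by default. The sample string is made up.
Local $sSample = 'a1b a2b'

; Flag 3 returns an array of all matches (whole matches, since there are no groups).
Local $aGreedy = StringRegExp($sSample, 'a.*b', 3)    ; one match: 'a1b a2b'
Local $aLazy = StringRegExp($sSample, '(?U)a.*b', 3)  ; two matches: 'a1b' and 'a2b'

ConsoleWrite('greedy: ' & UBound($aGreedy) & ' match(es), first = ' & $aGreedy[0] & @CRLF)
ConsoleWrite('lazy:   ' & UBound($aLazy) & ' match(es), first = ' & $aLazy[0] & @CRLF)
```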

My $txt with numbers is just a simple example. At work I need to find problematic parts of text in our source files (PowerBuilder) by keywords. Today I found a bug in some of our sources, and I want to go through all our sources to find possible similar bugs (two keywords not too far from each other).

I can post a piece of real source here later ...

Edited by Zedna

If you send me some of the actual file I can test better but give this a try.

"(?U)111\v{1,2}.+\v{1,2}333"
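
A quick way to try this pattern against both sample texts from the first post, as a sketch: it assumes the token in the pattern is `\v` (vertical whitespace), and the variable names are illustrative. Because there is no (?s) here, `.+` stays on one line, so only the two `\v{1,2}` groups can cross a line break.

```autoit
; Sketch: without (?s), .+ cannot cross a line break, so only the two
; \v{1,2} groups can, and each covers at most one @CRLF (two characters).
Local $sFirst = '111' & @CRLF & '222' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'
Local $sSecond = '111' & @CRLF & '222' & @CRLF & 'x' & @CRLF & 'y' & @CRLF & '333' & @CRLF & '444' & @CRLF & '555'
Local $sPattern = '(?U)111\v{1,2}.+\v{1,2}333'

ConsoleWrite('first:  ' & StringRegExp($sFirst, $sPattern) & @CRLF)  ; 1
ConsoleWrite('second: ' & StringRegExp($sSecond, $sPattern) & @CRLF) ; 0
```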

George


If you send me some of the actual file I can test better but give this a try.

"(?U)111\v{1,2}.+\v{1,2}333"

YES! :D

This one works great for me with just a little modification:

$txt = StringRegExp($txt, '(?U)111\v{0,2}.+\v{0,2}333', 3)

Many thanks for help.

EDIT:

In fact, there is another small modification that suits my needs better:

$txt = StringRegExp($txt, '(?U)111.*\v{1,2}.*\v{0,2}.*333', 3)
Edited by Zedna

If you send me some of the actual file I can test better but give this a try.

*** bad code ***

update admin.table_name
   set column_name = :variable_name
where key_column_name = :key_parameter ;
if SQLCA.SqlCode <> 0 then
    rollback;
    MessageBox('Error', 'Error in UPDATE~n' + SQLCA.SqlErrText, Exclamation!)
    return
end if

*** good code ***

update admin.table_name
   set column_name = :variable_name
where key_column_name = :key_parameter ;
if SQLCA.SqlCode <> 0 then
    ls_error = SQLCA.SqlErrText
    rollback;
    MessageBox('Error', 'Error in UPDATE~n' + ls_error, Exclamation!)
    return
end if

I need to find each piece of bad code where "SQLCA.SqlErrText" appears after "rollback" (but no more than 1 or 2 lines away).

My final pattern for this is:

(?iU)rollback.*\v{1,2}.*\v{0,2}.*SQLCA.SqlErrText
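
A small test harness for this pattern, as a sketch rather than the actual scan script: it assumes the token in the pattern is `\v`, uses shortened copies of the bad and good snippets, and adds a made-up third case ($sFar) where SqlErrText sits three lines below rollback. All variable names are hypothetical.

```autoit
; Sketch only: variable names and the $sFar sample are made up for illustration.
Local $sPattern = '(?iU)rollback.*\v{1,2}.*\v{0,2}.*SQLCA.SqlErrText'

; "Bad" code: SqlErrText is read on the line right after rollback.
Local $sBad = "    rollback;" & @CRLF & _
        "    MessageBox('Error', 'Error in UPDATE~n' + SQLCA.SqlErrText, Exclamation!)"

; "Good" code: the message is saved before rollback.
Local $sGood = "    ls_error = SQLCA.SqlErrText" & @CRLF & _
        "    rollback;" & @CRLF & _
        "    MessageBox('Error', 'Error in UPDATE~n' + ls_error, Exclamation!)"

; Hypothetical case: SqlErrText appears three lines after rollback.
Local $sFar = "    rollback;" & @CRLF & _
        "    return" & @CRLF & _
        "end if" & @CRLF & _
        "ls_error = SQLCA.SqlErrText"

; Flag 0 (the default) returns 1 on a match and 0 otherwise.
ConsoleWrite('bad:  ' & StringRegExp($sBad, $sPattern) & @CRLF)  ; 1
ConsoleWrite('good: ' & StringRegExp($sGood, $sPattern) & @CRLF) ; 0
ConsoleWrite('far:  ' & StringRegExp($sFar, $sPattern) & @CRLF)  ; 0
```

Note that the dot in `SQLCA.SqlErrText` is unescaped, so it matches any single character there; writing `SQLCA\.SqlErrText` would make it match a literal dot only.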

I'm just curious about (?U) in the pattern, and also why (?s) isn't needed: the pattern works correctly across multiple lines without it.
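
On the (?s) part, a short sketch may help. Without (?s) the dot does not match a line break, so a `.*` never leaves its line; the pattern crosses lines only through the explicit `\v` tokens, which match vertical whitespace such as @CR and @LF. That is what keeps the number of lines bounded. The test string below is made up.

```autoit
; Sketch: how the dot and \v treat a line break. The test string is made up.
Local $sTwoLines = '111' & @CRLF & '333'

ConsoleWrite(StringRegExp($sTwoLines, '111.*333') & @CRLF)      ; 0: the dot does not cross the line break
ConsoleWrite(StringRegExp($sTwoLines, '(?s)111.*333') & @CRLF)  ; 1: (?s) lets the dot match line breaks
ConsoleWrite(StringRegExp($sTwoLines, '111\v{1,2}333') & @CRLF) ; 1: \v matches @CR and @LF explicitly
```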

EDIT: it seems I can use (?iU) instead of (?i)(?U)

Once more thanks for your help Geosoft.

Edited by Zedna