AndyS01

Regular expression absorbs @CRLFs

I have a regular expression that I use in a StringRegExpReplace() function that replaces CRLFs if the matching text is the last text on a line.
Here is my example script:

test()
Exit (1)

Func test()
    Local $sfor, $pat, $sTextBefore = "", $sTextFixed

    $sTextBefore &= "Line 1 MF Midget vs NE Midget" & @CR
    $sTextBefore &= "Line 2 Midget  1 vs VYY Stars" & @CR

    $sfor = "Midget"

    $pat = "(?i)Midget\s{0,}[s]{0,1}\s{0,}[0-9]?+"
    $sTextFixed = StringRegExpReplace($sTextBefore, $pat, "+++")

    ConsoleWrite("+++: $pat ====>" & $pat & "<==" & @CRLF)
    ConsoleWrite("+++: $sTextBefore ==>" & @CRLF & $sTextBefore & @CRLF)
    ConsoleWrite("+++: $sTextFixed ===>" & @CRLF & $sTextFixed & @CRLF)
EndFunc

The output looks like this:

Quote

+++: $pat ====>(?i)Midget\s{0,}[s]{0,1}\s{0,}[0-9]?+<==
+++: $sTextBefore ==>
Line 1 MF Midget vs NE Midget
Line 2 Midget  1 vs VYY Stars

+++: $sTextFixed ===>
Line 1 MF +++vs NE +++Line 2 +++ vs VYY Stars
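
A note on why the lines get joined: in PCRE, `\s` matches any whitespace, and that includes @CR and @LF. When "Midget" is the last word on a line, `\s{0,}` can consume the line break along with it. The sketch below compares the posted pattern with a hypothetical variant (`$sPatHorizWs` is an illustration, not something from the thread) that uses `\h`, which only matches horizontal whitespace such as spaces and tabs.

```autoit
; Sketch: \s (any whitespace, including @CR and @LF) versus \h (horizontal whitespace only).
Local $sText = "Line 1 MF Midget vs NE Midget" & @CR & _
        "Line 2 Midget  1 vs VYY Stars" & @CR

; Pattern from the post: \s{0,} can consume the @CR after the last "Midget" on line 1.
Local $sPatAnyWs = "(?i)Midget\s{0,}[s]{0,1}\s{0,}[0-9]?+"

; Hypothetical variant: \h* never matches a line break.
Local $sPatHorizWs = "(?i)Midget\h*s?\h*[0-9]?"

; Turn every surviving @CR into a visible marker so the result does not depend on the console.
ConsoleWrite(StringReplace(StringRegExpReplace($sText, $sPatAnyWs, "+++"), @CR, "<CR>" & @CRLF))
ConsoleWrite(StringReplace(StringRegExpReplace($sText, $sPatHorizWs, "+++"), @CR, "<CR>" & @CRLF))
```

The first call should show the two lines joined with only one `<CR>` marker left; the second should keep both markers. Both patterns still swallow the blanks that directly follow "Midget".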
 

 


Almost.  So adding '?' after a quantifier prevents the regexp from being greedy?  The concept of greedy, lazy and possessive is quite confusing.  I looked at some online tutorials and got my head swimming.

My example didn't supply a plural for Midget.  I need to match 'Midget', 'Midgets', 'Midget 1', 'Midgets 1':

    $sTextBefore &= "Line 1 MF Midgets vs NE Midgets" & @CR
    $sTextBefore &= "Line 2 Midgets 1 vs VYY Stars" & @CR

When I do that, I only match 'Midget', not 'Midgets', like so:

test()
Exit (1)

Func test()
    Local $sfor, $pat, $sTextBefore = "", $sTextFixed

    $sTextBefore &= "Line 1 MF Midgets vs NE Midgets" & @CR
    $sTextBefore &= "Line 2 Midgets  1 vs VYY Stars" & @CR

    $sfor = "Midget"

    $pat = "(?i)Midget\s{0,}?[s]{0,1}?\s{0,}?[0-9]?+"
    $sTextFixed = StringRegExpReplace($sTextBefore, $pat, "+++")

    ConsoleWrite("+++: $pat ====>" & $pat & "<==" & @CRLF)
    ConsoleWrite("+++: $sTextBefore ==>" & @CRLF & $sTextBefore & @CRLF)
    ConsoleWrite("+++: $sTextFixed ===>" & @CRLF & $sTextFixed & @CRLF)
EndFunc

The output is this (notice the 's' characters remaining in the output):
 

Quote

Line 1 MF +++s vs NE +++s
Line 2 +++s  1 vs VYY Stars
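
The leftover 's' is what lazy quantifiers do: `[s]{0,1}?` prefers to match nothing, and since nothing after it forces a longer match, the 's' is never taken. Here is a minimal sketch of greedy, lazy and possessive behaviour on one short subject. The subject string and the patterns are made up for illustration.

```autoit
; Sketch: how greedy, lazy and possessive quantifiers differ on one short subject.
Local $sSubject = "Midgets 12"

; Greedy "s?" takes the optional "s" when it is there.
ConsoleWrite(StringRegExpReplace($sSubject, "Midgets?", "+++") & @CRLF)   ; +++ 12

; Lazy "s??" prefers zero repetitions, so the "s" is left behind.
ConsoleWrite(StringRegExpReplace($sSubject, "Midgets??", "+++") & @CRLF)  ; +++s 12

; Greedy "\d+" first takes "12", then gives back the "2" so the literal 2 can match.
ConsoleWrite(StringRegExpReplace($sSubject, "\d+2", "#") & @CRLF)         ; Midgets #

; Possessive "\d++" takes "12" and never gives anything back, so the literal 2 fails.
ConsoleWrite(StringRegExpReplace($sSubject, "\d++2", "#") & @CRLF)        ; Midgets 12
```

For an optional plural, the plain greedy `s?` is usually the form you want.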

 


Even though it doesn't catch the second instance of Midget in Line 1?

Quote

+++: $pat ====>(?i)(?m)Midgets*[[:blank:]]+[0-9]*[[:blank:]]*<==


+++: $sTextBefore ==>
Line 1 MF Midget vs NE Midget Line 2 Midget  1 vs VYY Stars


+++: $sTextFixed ===>
Line 1 MF +++vs NE Midget Line 2 +++vs VYY Stars

 

Here's one without the regex; it just hammers the pieces into place:

test()
Exit (1)

Func test()
    Local $sfor, $pat, $sTextBefore = "", $sTextFixed

    $sTextBefore &= "Line 1 MF Midget vs NE Midget" & @CR
    $sTextBefore &= "Line 2 Midget  1 vs VYY Stars" & @CR

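    ; Replace the plural first so no stray 's' is left, then the singular.
    ; Note that StringStripCR() removes every @CR, so the two lines end up joined.
    ConsoleWrite(StringReplace(StringStripCR(StringReplace($sTextBefore, "Midgets", "+++")), "Midget", "+++"))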

EndFunc

 


Uh-oh, I see what you mean, iamtheky. This didn't work:

test()
Exit (1)

Func test()
    Local $sfor, $pat, $sTextBefore = "", $sTextFixed

    $sTextBefore &= "Line 1 MF Midget vs NE Midgets" & @CR
    $sTextBefore &= "Line 2 Midgets  1 vs VYY Stars" & @CR

    $sfor = "Midget"

    $pat = "(?i)(?m)Midgets*[[:blank:]]+[0-9]*[[:blank:]]*"
    $sTextFixed = StringRegExpReplace($sTextBefore, $pat, "+++")

    ConsoleWrite("+++: $pat ====>" & $pat & "<==" & @CRLF)
    ConsoleWrite("+++: $sTextBefore ==>" & @CRLF & $sTextBefore & @CRLF)
    ConsoleWrite("+++: $sTextFixed ===>" & @CRLF & $sTextFixed & @CRLF)
EndFunc

The output was:

Quote

+++: $pat ====>(?i)(?m)Midgets*[[:blank:]]+[0-9]*[[:blank:]]*<==
+++: $sTextBefore ==>
Line 1 MF Midget vs NE Midgets
Line 2 Midgets  1 vs VYY Stars

+++: $sTextFixed ===>
Line 1 MF +++vs NE Midgets
Line 2 +++vs VYY Stars
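
The miss at the end of line 1 is consistent with `[[:blank:]]+` demanding at least one space or tab after the word: the character after the final "Midgets" is @CR, which is not a blank. A rough sketch follows, where `$sPatOptionalBlank` is a hypothetical variant and not a pattern from the thread.

```autoit
; Sketch: required blanks ("+") versus optional blanks ("*") after the word.
Local $sText = "Line 1 MF Midget vs NE Midgets" & @CR & _
        "Line 2 Midgets  1 vs VYY Stars" & @CR

; Pattern from the post: "[[:blank:]]+" needs at least one space or tab after the word.
Local $sPatNeedsBlank = "(?i)(?m)Midgets*[[:blank:]]+[0-9]*[[:blank:]]*"

; Hypothetical variant: "s?" allows one optional "s", "[[:blank:]]*" makes the blanks optional.
Local $sPatOptionalBlank = "(?i)Midgets?[[:blank:]]*[0-9]*[[:blank:]]*"

; Convert @CR to @CRLF only for display.
ConsoleWrite(StringReplace(StringRegExpReplace($sText, $sPatNeedsBlank, "+++"), @CR, @CRLF))
ConsoleWrite(StringReplace(StringRegExpReplace($sText, $sPatOptionalBlank, "+++"), @CR, @CRLF))
```

With the variant, the "Midgets" at the end of line 1 should be replaced as well, and the line break stays in place because `[[:blank:]]` never matches @CR or @LF. Also note that `s*` in the posted pattern accepts any number of 's' characters, not just one.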

 


The 's' in 'Midgets' is optional.  As I said before, I want to replace "Midget", "Midgets", "Midget 1", "Midgets 1" with the "+++" string.  Also, the '1' could be any digit [0-9].


Then that should work for you:

_test()
Exit (1)

Func _test()
    Local $sfor, $pat, $sTextBefore = "", $sTextFixed

    $sTextBefore &= "Line 1 MF Midget vs NE Midgets" & @CR
    $sTextBefore &= "Line 2 Midgets  1 vs VYY Stars" & @CR

    $sfor = "Midget"

    $pat = "(?im)Midgets?(?: *\d*)?"
    $sTextFixed = StringRegExpReplace($sTextBefore, $pat, "+++")

    ConsoleWrite("+++: $pat ====>" & $pat & "<==" & @CRLF)
    ConsoleWrite("+++: $sTextBefore ==>" & @CRLF & $sTextBefore & @CRLF)
    ConsoleWrite("+++: $sTextFixed ===>" & @CRLF & $sTextFixed & @CRLF)
EndFunc
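
Reading the pattern piece by piece: `Midgets?` matches the word with an optional 's', and `(?: *\d*)?` is an optional non-capturing group of zero or more spaces followed by zero or more digits. A quick sketch to check it against the four requested forms is below. The second pattern, `$sPatKeepSpace`, is a hypothetical variant added here for comparison, not part of the suggestion above.

```autoit
; Sketch: run the suggested pattern over the four forms named earlier in the thread.
Local $aForms[4] = ["Midget", "Midgets", "Midget 1", "Midgets 1"]
Local $sPat = "(?im)Midgets?(?: *\d*)?"

For $i = 0 To UBound($aForms) - 1
    ; Each form should collapse to "+++" because one match consumes the whole string.
    ConsoleWrite($aForms[$i] & " => " & StringRegExpReplace($aForms[$i], $sPat, "+++") & @CRLF)
Next

; Hypothetical variant: "(?: *\d)?" only takes the spaces when a digit follows them,
; so "Midget vs" keeps its space ("+++ vs") instead of becoming "+++vs".
Local $sPatKeepSpace = "(?i)Midgets?(?: *\d)?"
ConsoleWrite(StringRegExpReplace("Midget vs NE Midgets 1", $sPat, "+++") & @CRLF)
ConsoleWrite(StringRegExpReplace("Midget vs NE Midgets 1", $sPatKeepSpace, "+++") & @CRLF)
```

Whether the trailing spaces should be swallowed depends on what the replaced text is used for, so treat the variant as an option rather than a correction.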

 


2 hours ago, mikell said:

An alternative way allowing variations

$sTextBefore = "Line 1 MF Midget vs NE Midgets" & @CR & _
        "Line 2 Midgets -1a vs VYY Stars" & @CR

$pat = "(?im)Midget.*?(?=\hvs|$)"
Msgbox(0,"", StringRegExpReplace($sTextBefore, $pat, "+++"))

 

In real-life production, the 'vs' might not be there.

 

5 hours ago, AndyS01 said:

In real-life production, the 'vs' might not be there.

So the question was badly asked. No way to guess that "real life" might be different than the provided example  :ermm:


True enough.  The original question was why the CRLF was being absorbed, joining two lines.  The eventual fix was the (?m) part of the RE.
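
One caveat on that summary: `(?m)` only changes where `^` and `$` are allowed to match. The patterns that stopped joining lines also replaced `\s` with a literal space or `[[:blank:]]`, neither of which matches @CR or @LF, so that swap looks like the part that made the difference. A small sketch to check this, reusing the sample text and pattern from earlier in the thread:

```autoit
; Sketch: the same pattern with and without (?m) on the thread's sample text.
Local $sText = "Line 1 MF Midget vs NE Midgets" & @CR & _
        "Line 2 Midgets  1 vs VYY Stars" & @CR

; Neither pattern contains ^ or $, which are the only constructs (?m) affects.
Local $sWithM = StringRegExpReplace($sText, "(?im)Midgets?(?: *\d*)?", "+++")
Local $sWithoutM = StringRegExpReplace($sText, "(?i)Midgets?(?: *\d*)?", "+++")

; "==" is a case-sensitive string comparison; both results are expected to be identical.
ConsoleWrite("Same result with and without (?m): " & ($sWithM == $sWithoutM) & @CRLF)
```

`(?m)` does matter for patterns that use `$`, such as the lookahead `(?=\hvs|$)` suggested above.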
