ChrisN

Regular Expressions :(

ChrisN

:idea:Someone should figure out an easier way to match things than using regular expressions. :unsure:

I need some help matching things -- I can't seem to get my regular expressions to work. So here is what I am trying to do:

(***************)
N100 (FINISH PASS)
(150MM SAW BLADE)
M6 T0 S12000 M3
D11
( - - - - - - - - - - - - -)
(STARTMOP)
some
text
goes
here
123
456
(ENDMOP)
(***************)
N100 (Roughing PASS)
(12MM Rougher)
M6 T0 S12000 M3
D11
( - - - - - - - - - - - - -)
(STARTMOP)
some
more
text
goes
here
123
456
789
asdf
(ENDMOP)

Basically, I want to extract just the highlighted parts. I am trying to use "N100 (", "(STARTMOP)" and "(ENDMOP)" to mark what to extract, but I can't get a regular expression that matches anything - I just get errors :( Can anyone help?

(BTW, I am using GEOSoft's PCRE Toolkit to test my regular expressions)

DicatoroftheUSA

It would help if you posted what you tried. For example, are you loading the entire file into one string? Did you set the regex option so that it does not test each line individually? Also, regular expressions check for things that are ... regular; what you want to collect looks somewhat random to me.
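
To illustrate the point about loading the whole file into one string, here is a minimal sketch. The file name program.nc and the variable names are hypothetical, and the pattern only pulls out the (STARTMOP) ... (ENDMOP) blocks, so treat it as a starting point rather than a finished solution.

```autoit
; Sketch only: "program.nc" is a hypothetical file name, not from this thread.
Local $sFile = @ScriptDir & "\program.nc"

; FileRead with just a file name returns the whole file as one string,
; so a single pattern can span several lines.
Local $sText = FileRead($sFile)
If @error Then
    ConsoleWrite("Could not read " & $sFile & @CRLF)
    Exit 1
EndIf

; (?s) lets "." match line breaks too; flag 3 returns all matches in one array.
Local $aBlocks = StringRegExp($sText, "(?s)\(STARTMOP\).*?\(ENDMOP\)", 3)
If @error Then
    ConsoleWrite("No (STARTMOP) ... (ENDMOP) block found" & @CRLF)
Else
    For $i = 0 To UBound($aBlocks) - 1
        ConsoleWrite($aBlocks[$i] & @CRLF & "----" & @CRLF)
    Next
EndIf
```

Without (?s) the "." stops at the end of each line, which is one common reason a multi-line pattern matches nothing.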

jdelaney

gotta be better ways, but this works for the given example

$string = "(***************)" & @CRLF & _
"N100 (FINISH PASS)" & @CRLF & _
"(150MM SAW BLADE)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"(ENDMOP)" & @CRLF & _
"(***************)" & @CRLF & _
"N100 (Roughing PASS)" & @CRLF & _
"(12MM Rougher)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"more" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"789" & @CRLF & _
"asdf" & @CRLF & _
"(ENDMOP)" & @CRLF

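; Flag 3 returns every capture of every match in one flat array:
; the tool name (which must start with a digit), then the (STARTMOP) ... (ENDMOP) block.
$array = StringRegExp($string,"(?U)\((\d.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",3)
For $i = 0 To UBound ($array)-1
    ConsoleWrite($array[$i] & @CRLF)
Next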

Or, group the captures per match by using flag 4:

$array = StringRegExp($string,"(?U)\((\d.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",4)

For $i = 0 To UBound ($array)-1
    $aTemp = $array[$i]
    For $j = 1 To UBound ($aTemp) - 1
        ConsoleWrite("MatchGroup=[" & $i+1 & "], subGroup=[" & $j & "]: " & $aTemp[$j] & @CRLF)
    Next
Next
ChrisN

@jdelaney: That works for my example - I'll have to test it at work tomorrow & see if it works on more things. Thanks!

Edit: It doesn't work if the tool name (150MM SAW BLADE) doesn't start with a number. I modified it and it works now:

$array = StringRegExp($string,"(?U)N100 (.*)\s+\((.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",3)
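
With flag 3 this pattern hands back its three captures per match in one flat array, so the results can be read back in steps of three. Below is a rough sketch of that, with the pattern written out with its backslash escapes. The shortened sample data and the variable names $sData, $sPattern and $aHits are made up for illustration.

```autoit
; Shortened sample in the same layout as the data in the first post.
Local $sData = "N100 (FINISH PASS)" & @CRLF & _
        "(SAW BLADE)" & @CRLF & _
        "D11" & @CRLF & _
        "(STARTMOP)" & @CRLF & _
        "123" & @CRLF & _
        "(ENDMOP)" & @CRLF

Local $sPattern = "(?U)N100 (.*)\s+\((.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))"
Local $aHits = StringRegExp($sData, $sPattern, 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ; Flag 3 returns one flat array: three captures per matched operation.
    For $i = 0 To UBound($aHits) - 3 Step 3
        ConsoleWrite("Pass:  " & $aHits[$i] & @CRLF)
        ConsoleWrite("Tool:  " & $aHits[$i + 1] & @CRLF)
        ConsoleWrite("Block: " & @CRLF & $aHits[$i + 2] & @CRLF)
    Next
EndIf
```

Note that the first capture keeps its brackets, because the brackets after "N100 " are part of the captured text rather than escaped literals around the group.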

PhoenixXL

This looks more promising in this case

Have a look

#include <Array.au3>

;Splitting the pattern into its key parts keeps it readable, although it may need more regexes.

;How it works - if a line starts with (STARTMOP), match everything up to (ENDMOP);
; otherwise capture the text between the brackets, without the brackets themselves.
; If the brackets are missing, or contain anything other than word characters and spaces, the match fails.

$string = "(***************)" & @CRLF & _
"N100 (FINISH PASS)" & @CRLF & _
"(150MM SAW BLADE)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"(ENDMOP)" & @CRLF & _
"(***************)" & @CRLF & _
"N100 (Roughing PASS)" & @CRLF & _
"(12MM Rougher)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"more" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"789" & @CRLF & _
"asdf" & @CRLF & _
"(ENDMOP)" & @CRLF

Local $Condition = "(?sm)^(?:\(STARTMOP\).*?\(ENDMOP\)" & _ ;Either (STARTMOP) ... (ENDMOP)
        "|" & "\(([\w ]+)\))" ;or (SOME OTHER TEXT)

ConsoleWrite( "Pattern: " & $Condition & @CRLF & "+--------------------------------------------" & @CRLF )

$array = StringRegExp($string,  $Condition , 3) ;Global Match

For $i = 0 To UBound($array) - 1
    ConsoleWrite( $array[$i] & @CR )
Next

Output:
150MM SAW BLADE
(STARTMOP)
some
text
goes
here
123
456
(ENDMOP)
12MM Rougher
(STARTMOP)
some
more
text
goes
here
123
456
789
asdf
(ENDMOP)
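
The comment about splitting the pattern up could also be taken one step further by running two small patterns instead of one alternation. This is only a sketch of that idea, not the code from this post. The negative lookahead that skips the two markers, the shortened sample data and the variable names are all additions made for illustration.

```autoit
; Shortened sample in the same layout as the data in the first post.
Local $sData = "(***************)" & @CRLF & _
        "N100 (FINISH PASS)" & @CRLF & _
        "(150MM SAW BLADE)" & @CRLF & _
        "( - - - -)" & @CRLF & _
        "(STARTMOP)" & @CRLF & _
        "123" & @CRLF & _
        "(ENDMOP)" & @CRLF

; Pattern 1: bracketed text at the start of a line, skipping the two markers.
; The negative lookahead (?!...) is an addition of this sketch.
Local $aTools = StringRegExp($sData, "(?m)^\((?!STARTMOP\)|ENDMOP\))([\w ]+)\)", 3)

; Pattern 2: everything from (STARTMOP) up to the nearest (ENDMOP).
Local $aBlocks = StringRegExp($sData, "(?s)\(STARTMOP\).*?\(ENDMOP\)", 3)

If IsArray($aTools) And IsArray($aBlocks) Then
    For $i = 0 To UBound($aTools) - 1
        ConsoleWrite("Tool: " & $aTools[$i] & @CRLF)
    Next
    For $i = 0 To UBound($aBlocks) - 1
        ConsoleWrite($aBlocks[$i] & @CRLF)
    Next
Else
    ConsoleWrite("One of the patterns found nothing" & @CRLF)
EndIf
```

The trade-off is that the first pattern would also pick up a bracketed line inside a (STARTMOP) ... (ENDMOP) block, which the single alternation avoids by consuming the whole block first.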