Regular Expressions :(

ChrisN

Someone should figure out an easier way to match things than using regular expressions.

I need some help matching things -- I can't seem to get my regular expressions to work. So here is what I am trying to do:

(***************)
N100 (FINISH PASS)
(150MM SAW BLADE)
M6 T0 S12000 M3
D11
( - - - - - - - - - - - - -)
(STARTMOP)
some
text
goes
here
123
456
(ENDMOP)
(***************)
N100 (Roughing PASS)
(12MM Rougher)
M6 T0 S12000 M3
D11
( - - - - - - - - - - - - -)
(STARTMOP)
some
more
text
goes
here
123
456
789
asdf
(ENDMOP)

Basically, I want to have just the highlighted parts extracted. I am trying to use "N100 (", "(STARTMOP)" and "(ENDMOP)" to define what to extract, but I am having trouble writing a regular expression that matches anything - I just get errors :( Can anyone help?

(BTW, I am using GEOSoft's PCRE Toolkit to test my regular expressions)

DicatoroftheUSA

It would help if you posted what you tried. For example, are you loading the entire file into one string? Did you set the regex option so that it does not test each line individually? Also, regular expressions check for things that are ... regular; what you want to collect looks somewhat random to me.
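
A minimal sketch of the single-string approach asked about above. The file name "program.nc" is a hypothetical placeholder and the pattern is only one possible way to grab the blocks; FileRead, StringRegExp, UBound and ConsoleWrite are standard AutoIt functions.

```autoit
; Sketch only: "program.nc" is a hypothetical file name, not from the original post.
Local $sFile = @ScriptDir & "\program.nc"

; FileRead with a file name returns the whole file as one string.
Local $sText = FileRead($sFile)
If @error Then
    ConsoleWrite("Could not read " & $sFile & @CRLF)
    Exit 1
EndIf

; (?s) lets "." match line breaks, so a single match can span several lines.
; .*? is lazy, so each match stops at the first (ENDMOP) that follows.
Local $aBlocks = StringRegExp($sText, "(?s)\(STARTMOP\).*?\(ENDMOP\)", 3)
If @error Then
    ConsoleWrite("No (STARTMOP)...(ENDMOP) block found" & @CRLF)
Else
    For $i = 0 To UBound($aBlocks) - 1
        ConsoleWrite("Block " & ($i + 1) & ":" & @CRLF & $aBlocks[$i] & @CRLF)
    Next
EndIf
```

Without (?s), "." stops at each line break, which is one common reason a multi-line pattern never matches.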

jdelaney

gotta be better ways, but this works for the given example

$string = "(***************)" & @CRLF & _
"N100 (FINISH PASS)" & @CRLF & _
"(150MM SAW BLADE)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"(ENDMOP)" & @CRLF & _
"(***************)" & @CRLF & _
"N100 (Roughing PASS)" & @CRLF & _
"(12MM Rougher)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"more" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"789" & @CRLF & _
"asdf" & @CRLF & _
"(ENDMOP)" & @CRLF

$array = StringRegExp($string,"(?U)\((\d.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",3)
For $i = 0 To UBound($array) - 1
    ConsoleWrite($array[$i] & @CRLF)
Next

or, group them using 4:

$array = StringRegExp($string,"(?U)\((\d.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",4)

For $i = 0 To UBound($array) - 1
    $aTemp = $array[$i]
    For $j = 1 To UBound($aTemp) - 1
        ConsoleWrite("MatchGroup=[" & $i + 1 & "], subGroup=[" & $j & "]: " & $aTemp[$j] & @CRLF)
    Next
Next
ChrisN

@jdelaney: That works for my example - I'll have to test it at work tomorrow & see if it works on more things. Thanks!

Edit: It doesn't work if the tool name (150MM SAW BLADE) doesn't start with a number. I modified it and it works now:

$array = StringRegExp($string,"(?U)N100 (.*)\s+\((.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",3)
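
With flag 3 the captures of every match come back in one flat array, so a pattern with three groups can be read back in steps of three. The sketch below shows that idea on a shortened copy of the sample. The pattern is a variation written for this illustration (it uses (?s) with lazy quantifiers and [^)]* for the bracketed names), not the exact expression above, and the variable names are made up.

```autoit
; Shortened copy of the sample program from this thread.
Local $sText = "N100 (FINISH PASS)" & @CRLF & _
        "(150MM SAW BLADE)" & @CRLF & _
        "M6 T0 S12000 M3" & @CRLF & _
        "(STARTMOP)" & @CRLF & _
        "some" & @CRLF & _
        "text" & @CRLF & _
        "(ENDMOP)" & @CRLF & _
        "N100 (Roughing PASS)" & @CRLF & _
        "(12MM Rougher)" & @CRLF & _
        "M6 T0 S12000 M3" & @CRLF & _
        "(STARTMOP)" & @CRLF & _
        "more" & @CRLF & _
        "text" & @CRLF & _
        "(ENDMOP)" & @CRLF

; Group 1: pass name, group 2: tool name, group 3: the (STARTMOP)...(ENDMOP) block.
; [^)]* keeps each name inside its own pair of brackets.
Local $sPattern = "(?s)N100 \(([^)]*)\)\s+\(([^)]*)\).*?(\(STARTMOP\).*?\(ENDMOP\))"

Local $aParts = StringRegExp($sText, $sPattern, 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ; Three captures per match, stored one after another.
    For $i = 0 To UBound($aParts) - 3 Step 3
        ConsoleWrite("Pass:  " & $aParts[$i] & @CRLF)
        ConsoleWrite("Tool:  " & $aParts[$i + 1] & @CRLF)
        ConsoleWrite("Block: " & @CRLF & $aParts[$i + 2] & @CRLF)
    Next
EndIf
```

Because the tool name is matched with [^)]* rather than \d.*, it does not have to start with a digit.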

PhoenixXL

This looks more promising in this case

Have a look

#include <Array.au3>

; Splitting the pattern into its key parts keeps it readable, though it may take more regexes.

; How it works: at the start of a line, if the text is (STARTMOP), match through (ENDMOP);
; otherwise capture the characters between a pair of brackets, without the brackets;
; if there are no brackets, the match fails.

$string = "(***************)" & @CRLF & _
"N100 (FINISH PASS)" & @CRLF & _
"(150MM SAW BLADE)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"(ENDMOP)" & @CRLF & _
"(***************)" & @CRLF & _
"N100 (Roughing PASS)" & @CRLF & _
"(12MM Rougher)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"more" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"789" & @CRLF & _
"asdf" & @CRLF & _
"(ENDMOP)" & @CRLF

Local $Condition = "(?sm)^(?:\(STARTMOP\).*?\(ENDMOP\)" & _ ;Either (STARTMOP) ... (ENDMOP)
"|" & "\(([\w ]+)\))" ;or (SOME OTHER TEXT)

ConsoleWrite( "Pattern: " & $Condition & @CRLF & "+--------------------------------------------" & @CRLF )

$array = StringRegExp($string,  $Condition , 3) ;Global Match

For $i = 0 To UBound($array) - 1
    ConsoleWrite( $array[$i] & @CR )
Next
Output:
150MM SAW BLADE
(STARTMOP)
some
text
goes
here
123
456
(ENDMOP)
12MM Rougher
(STARTMOP)
some
more
text
goes
here
123
456
789
asdf
(ENDMOP)
