ChrisN

Regular Expressions :(

5 posts in this topic

Someone should figure out an easier way to match things than using regular expressions.

I need some help matching things -- I can't seem to get my regular expressions to work. So here is what I am trying to do:

(***************)

N100 (FINISH PASS)

(150MM SAW BLADE)

M6 T0 S12000 M3

D11

( - - - - - - - - - - - - -)

(STARTMOP)

some

text

goes

here

123

456

(ENDMOP)

(***************)

N100 (Roughing PASS)

(12MM Rougher)

M6 T0 S12000 M3

D11

( - - - - - - - - - - - - -)

(STARTMOP)

some

more

text

goes

here

123

456

789

asdf

(ENDMOP)

Basically, I want to extract just the highlighted parts. I am trying to use "N100 (", "(STARTMOP)" and "(ENDMOP)" to define what to extract, but I can't come up with a regular expression that matches anything - I just get errors :( Can anyone help?

(BTW, I am using GEOSoft's PCRE Toolkit to test my regular expressions)


It would help if you posted what you tried. For example, are you loading the entire file into one string? Did you set the regex option so that it doesn't test each line individually? Also, regular expressions can check for things that are ... regular; what you want to collect looks somewhat random to me.
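To make those two questions concrete, here is a minimal sketch, not a definitive answer. It assumes the NC program lives in a file next to the script (the name program.nc and the variable names are made up for illustration). It reads the whole file into one string and uses the (?s) option so that "." also matches line breaks. Note that literal parentheses have to be escaped as \( and \); an unescaped "(" opens a group, and an unbalanced one makes the pattern invalid.

```autoit
; Hypothetical path - substitute the real NC program file.
Local $sFile = @ScriptDir & "\program.nc"

; FileRead with a file name returns the whole file as a single string.
Local $sText = FileRead($sFile)
If @error Then
    ConsoleWrite("Could not read " & $sFile & @CRLF)
    Exit 1
EndIf

; (?s) lets "." match line breaks too, so one pattern can span several lines.
; \( and \) match literal parentheses; .*? is lazy and stops at the first (ENDMOP).
; Flag 3 returns the captures of every match in one flat array.
Local $aBlocks = StringRegExp($sText, "(?s)\(STARTMOP\)(.*?)\(ENDMOP\)", 3)
If @error Then
    ConsoleWrite("No (STARTMOP)...(ENDMOP) block found" & @CRLF)
Else
    For $i = 0 To UBound($aBlocks) - 1
        ConsoleWrite("Block " & $i + 1 & ":" & $aBlocks[$i] & @CRLF)
    Next
EndIf
```

If StringRegExp sets @error to 2 the pattern itself is bad, and @extended holds the offset of the error in the pattern, which helps track down a stray bracket.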


#3 ·  Posted (edited)

gotta be better ways, but this works for the given example

$string = "(***************)" & @CRLF & _
"N100 (FINISH PASS)" & @CRLF & _
"(150MM SAW BLADE)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"(ENDMOP)" & @CRLF & _
"(***************)" & @CRLF & _
"N100 (Roughing PASS)" & @CRLF & _
"(12MM Rougher)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"more" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"789" & @CRLF & _
"asdf" & @CRLF & _
"(ENDMOP)" & @CRLF

$array = StringRegExp($string,"(?U)\((\d.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",3)
For $i = 0 To UBound($array) - 1
    ConsoleWrite($array[$i] & @CRLF)
Next

or, group them using 4:

$array = StringRegExp($string,"(?U)\((\d.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",4)

For $i = 0 To UBound($array) - 1
    $aTemp = $array[$i]
    For $j = 1 To UBound($aTemp) - 1
        ConsoleWrite("MatchGroup=[" & $i + 1 & "], subGroup=[" & $j & "]: " & $aTemp[$j] & @CRLF)
    Next
Next
Edited by jdelaney


#4 ·  Posted (edited)

@jdelaney: That works for my example - I'll have to test it at work tomorrow & see if it works on more things. Thanks!

Edit: It doesn't work if the tool name (150MM SAW BLADE) doesn't start with a number. I modified it and it works now:

$array = StringRegExp($string,"(?U)N100 \((.*)\)\s+\((.*)\)[\r\n\W\w\s]+(\(STARTMOP\)[\r\n\W\w\s]+\(ENDMOP\))",3)

Edited by ChrisN
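Since flag 3 returns one flat array, a pattern with three capture groups yields three entries per operation. Below is a rough sketch of walking that array in steps of three. It is a variant, not the pattern posted above: it swaps (?U) for [^)]* and lazy .*?, uses a shortened stand-in for the sample data, and the variable names are made up for illustration.

```autoit
; Shortened stand-in for the sample data from the first post.
Local $sText = "N100 (FINISH PASS)" & @CRLF & _
        "(150MM SAW BLADE)" & @CRLF & _
        "D11" & @CRLF & _
        "(STARTMOP)" & @CRLF & _
        "some" & @CRLF & _
        "text" & @CRLF & _
        "(ENDMOP)" & @CRLF

; [^)]* matches everything up to the next closing bracket.
; (?s) lets the lazy .*? run across line breaks up to the first (STARTMOP) / (ENDMOP).
Local $sPattern = "(?s)N100 \(([^)]*)\)\s+\(([^)]*)\).*?(\(STARTMOP\).*?\(ENDMOP\))"

Local $aHits = StringRegExp($sText, $sPattern, 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    ; Three captures per match: pass name, tool name, MOP block.
    For $i = 0 To UBound($aHits) - 3 Step 3
        ConsoleWrite("Pass:  " & $aHits[$i] & @CRLF)
        ConsoleWrite("Tool:  " & $aHits[$i + 1] & @CRLF)
        ConsoleWrite("Block: " & @CRLF & $aHits[$i + 2] & @CRLF)
    Next
EndIf
```

Flag 4 avoids the index arithmetic because each match arrives as its own sub-array, at the cost of a nested loop.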


#5 ·  Posted (edited)

This looks more promising in this case

Have a look

#include <Array.au3>

;Splitting the problem at its key points keeps the pattern readable, though it may take more than one regex.

;How it works - if the line starts with (STARTMOP), match through (ENDMOP);
; otherwise match the text between the brackets, excluding the brackets themselves.
; If there are no brackets, the match fails.

$string = "(***************)" & @CRLF & _
"N100 (FINISH PASS)" & @CRLF & _
"(150MM SAW BLADE)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"(ENDMOP)" & @CRLF & _
"(***************)" & @CRLF & _
"N100 (Roughing PASS)" & @CRLF & _
"(12MM Rougher)" & @CRLF & _
"M6 T0 S12000 M3" & @CRLF & _
"D11" & @CRLF & _
"( - - - - - - - - - - - - -)" & @CRLF & _
"(STARTMOP)" & @CRLF & _
"some" & @CRLF & _
"more" & @CRLF & _
"text" & @CRLF & _
"goes" & @CRLF & _
"here" & @CRLF & _
"123" & @CRLF & _
"456" & @CRLF & _
"789" & @CRLF & _
"asdf" & @CRLF & _
"(ENDMOP)" & @CRLF

Local $Condition = "(?sm)^(?:\(STARTMOP\).*?\(ENDMOP\)" & _ ;Either (STARTMOP) ... (ENDMOP)
"|" & "\(([\w ]+)\))" ;or (SOME OTHER TEXT)

ConsoleWrite( "Pattern: " & $Condition & @CRLF & "+--------------------------------------------" & @CRLF )

$array = StringRegExp($string,  $Condition , 3) ;Global Match

For $i = 0 To UBound($array) - 1
    ConsoleWrite( $array[$i] & @CR )
Next
Output:
150MM SAW BLADE
(STARTMOP)
some
text
goes
here
123
456
(ENDMOP)
12MM Rougher
(STARTMOP)
some
more
text
goes
here
123
456
789
asdf
(ENDMOP)
Edited by PhoenixXL


