van_renier
Posted April 17, 2007

I know this isn't a REGEX (regular expressions) forum, but we have a very bright and knowledgeable group, who I have no doubt could answer this for me. (This does not pertain to any limitations that may exist in AutoIt, but to REGEX itself.)

I'm trying to use regular expressions to extract the first 500 chars of an XML field, storing it in \n (REGEX's \1-\9).

I AM able to extract up to 254 characters, but nothing over that. Does anyone know of REGEX having a limit of 254, and not being able to work with anything over that?

Here's the description from the server's Application event, from which we want to extract the first 500 chars of the "exception" field (the length of the text WITHIN the following sample is 450 chars):

    ...<Exception>Slm.PrivateConsolidation.PreQualifyException: 10: Illegal Characters Borrower Employer^Illegal Characters in Borrower annualIncomeSchedule at Slm.Interfaces.PCW.OvationConsolidationManager.PreQualifyApplication(Application app) in C:\Projects\PCC\Release\src\slm\interfaces\PCW\OvationConsolidationManager.cs:line 149 at Slm.Batch.PCC.PrequalifyApplications.PreQualify() in C:\Projects\PCC\Release\src\slm\batch\PCC\PrequalifyApplications.cs:line 118</Exception>...

I'm using the following REGEX expression (against the above sample, to catch up to the first 254 chars of the exception field):

    .*<Exception>\(\(.\{1,254\}\)</Exception>\).*

But the REGEX fails if I change the above range to anything above 254.

A software vendor (to whom we've raised this concern) tells me that this is a limitation of REGEX itself, but a colleague of mine has shown me tests he's performed indicating that it isn't a REGEX-imposed limitation, but the vendor's implementation of it in their product.

Does anyone know of any such limitations on REGEX 'range' functionality? Any assistance provided would be greatly appreciated (even if just a URL where such a limitation may be confirmed).

Thanks!
Van Renier
Uten
Posted April 18, 2007 (edited)

I think it is a vendor limitation. It will work with AutoIt's PCRE implementation, as this sample shows. Also note that I have changed your original regexp to match AutoIt syntax. And I do think you have a logical error in your original regexp (could be because you provided it as a sample?).

    ;NOTE: Requires SciTe for debugging
    testRegexp()

    Func testRegexp()
        $data = '...<Exception>Slm.PrivateConsolidation.PreQualifyException: 10: Illegal Characters Borrower Employer^Illegal Characters in Borrower annualIncomeSchedule at Slm.Interfaces.PCW.OvationConsolidationManager.PreQualifyApplication(Application app) in C:\Projects\PCC\Release\src\slm\interfaces\PCW\OvationConsolidationManager.cs:line 149 at Slm.Batch.PCC.PrequalifyApplications.PreQualify() in C:\Projects\PCC\Release\src\slm\batch\PCC\PrequalifyApplications.cs:line 118 This text is added to illustrate a lengthy match</Exception>... '
        ;The original regexp could have a logical error as it requires a certain number of chars between two identifiers.
        ;It is kept here for reference only; the next assignment overwrites it.
        $regexp = '.*<Exception>\(\(.\{1,254\}\)</Exception>\).*'
        ;AutoIt (PCRE) syntax: capture up to 270 chars, then skip ahead to the closing tag.
        $regexp = '.*<Exception>(.{0,270}).*</Exception>'
        ;Flag 3 returns an array of all captured groups.
        $res = StringRegExp($data, $regexp, 3)
        dbgarr($res)
        dbg("StringLen($res[0]) = " & StringLen($res[0]))
    EndFunc

    ;Write every element of a one-dimensional array to the console.
    Func dbgarr($arr, $line=@ScriptLineNumber, $err=@error, $ext=@extended)
        Local $i
        If IsArray($arr) Then
            ;UBound($arr) is the element count; UBound($arr, 0) would be the number of dimensions.
            For $i = 0 To UBound($arr) - 1
                dbg("dbgarr[" & $i & "]:=" & $arr[$i])
            Next
        EndIf
    EndFunc

    ;Write a message to the console, prefixed with line number, @error and @extended.
    Func dbg($msg, $line=@ScriptLineNumber, $err=@error, $ext=@extended)
        ConsoleWrite("(" & $line & ") := (" & $err & ")(" & $ext & ") : " & $msg & @CRLF)
    EndFunc

My output from this is:

    >"D:\portableapps\PortableApps\autoit-v3.2.3.0\SciTe\..\autoit3.exe" /ErrorStdOut "C:\slettes\test.au3"
    (28) := (0)(0) : dbgarr[0]:=Slm.PrivateConsolidation.PreQualifyException: 10: Illegal Characters Borrower Employer^Illegal Characters in Borrower annualIncomeSchedule at Slm.Interfaces.PCW.OvationConsolidationManager.PreQualifyApplication(Application app) in C:\Projects\PCC\Release\src\slm\interfaces\PCW\OvationConsolidationManager.cs:line 149 at Slm.Batch.PCC.PrequalifyApplications.PreQualify() in C:\Projects\PCC\Release\src\slm\interfa
    (9) := (0)(0) : StringLen($res[0]) = 270
    >Exit code: 0 Time: 0.299

Edited April 18, 2007 by Uten
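The "logical error" mentioned above can be illustrated outside AutoIt as well. The following is a minimal sketch in Python's `re` module (a different engine from both AutoIt's PCRE and the vendor's product, so it only illustrates the pattern logic). The 450-character filler field is a hypothetical stand-in for the real exception text, and the patterns are the thread's patterns rewritten without the backslash-escaped grouping.

```python
import re

# Hypothetical stand-in for the <Exception> field: 450 characters of filler.
field = "x" * 450
data = "..." + "<Exception>" + field + "</Exception>" + "..."

# Shape of the pattern from the question: the closing tag must follow
# within 254 characters, so a 450-character field cannot match at all.
strict = re.search(r"<Exception>(.{1,254})</Exception>", data)

# Capturing only a prefix: the closing tag is not required to follow
# immediately, so the first 254 characters are captured.
prefix = re.search(r"<Exception>(.{0,254})", data)

print(strict)                # no match object
print(len(prefix.group(1)))  # length of the captured prefix
```

With the closing tag placed directly after the bounded range, the pattern tests whether the whole field is at most 254 characters long. It does not extract the first 254 characters of a longer field.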
van_renier (Author)
Posted April 19, 2007

Uten said: "I think it is a vendor limitation. It will work with AutoIt's PCRE implementation as this sample shows. Also note that I have changed your original regexp to match AutoIt syntax. And I do think you have a logical error in your original regexp (could be because you provided it as an sample?)"

Thanks a lot, Uten!

I thought it was a vendor-imposed limitation, and not REGEX itself. Thanks for pointing out the error in the logical expression, too. I was cutting/pasting a lot during my testing, and forgot to remove that last '\)'.

Appreciate the time, effort, and support!
Van
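For readers who want to check the 500-character case from the original question in yet another engine, here is a small sketch using Python's `re` module. It reuses the shape of Uten's pattern with the range raised to 500. The helper name `first_chars` and the filler data are assumptions made for illustration, and the result says nothing about the vendor's product; it only shows that a range above 254 is accepted by this particular engine.

```python
import re

def first_chars(text, limit=500):
    """Return up to `limit` leading characters of the <Exception> field.

    Hypothetical helper: assumes one <Exception> element on a single line.
    """
    # Same shape as the pattern in the reply above, with a larger range.
    pattern = "<Exception>(.{0,%d}).*</Exception>" % limit
    match = re.search(pattern, text)
    return match.group(1) if match else None

# Hypothetical sample data: one field longer than the limit, one shorter.
long_data = "...<Exception>" + "a" * 600 + "</Exception>..."
short_data = "...<Exception>" + "b" * 100 + "</Exception>..."

print(len(first_chars(long_data)))   # capped at the limit
print(len(first_chars(short_data)))  # whole field, closing tag excluded
```

The trailing `.*</Exception>` is what keeps a short field from spilling past its closing tag: the engine backtracks until the capture ends where the closing tag begins.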