RegExp to parse out text outside of ' or "


Hello all,

Trying to split a string into logical components.

I have string = "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2'"

I want to split out the data by any 'or' that is not inside of a quote, or double quote.

Currently, the best (and only) way I know is to split on ALL 'or's, and then reconstruct the pieces by counting ' or " to determine whether each piece is inside or outside a quote.

I know there is an easier way through regexp, and I'm hoping someone can help me out.

In the above string, I'm expecting an array with:

1) @testand='and or string' and .='and or string2'

2) contains(.='test or quote') and @testand='or string2'

The way I currently do it is something like this:

#include <Array.au3>
; split by all or's
$atest = StringSplit ( "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2'", "or", 1 )
_ArrayDisplay ( $atest )
Dim $asHolder[1][2]
; count the amount of ['|"] in each array element...create a new array to hold it
For $i = 1 to UBound ( $atest ) - 1
 ReDim $asHolder[$i][2]
 $iCount = 0
 $iStart = StringRegExp (  $atest[$i], "['""]", 3 )
 If IsArray ( $iStart ) Then
  $iCount = UBound ( $iStart )
 EndIf
 ConsoleWrite ( $iCount & @CRLF )
 $asHolder[$i-1][0] = $atest[$i]
 $asHolder[$i-1][1] = $iCount
Next
_ArrayDisplay ($asHolder )
;;;would then add code to logically re-construct pieces of the above array
Dim $aFinal[1]
$iCurrent = 0
$iPrevQuoteCounts = 0
$iQuoteCount = 0
$iLast = 0
$sString = ""
$bSkipLogical = False
Dim $asComplete[1]
For $i = 0 To UBound ( $asHolder ) - 1
 ReDim $asComplete[$iCurrent+1]
 $iQuoteCount = $asHolder[$i][1]
 If StringLen ( $sString ) = 0 Then
  $sString = $asHolder[$i][0]
 Else
  $sString = $sString & $asHolder[$i][0]
 EndIf
 If IsInt (($iQuoteCount + $iPrevQuoteCounts)/2) Then
  $iPrevQuoteCounts = 0 ; Reset
  $asComplete[$iCurrent] = $sString
  $iCurrent = $iCurrent  + 1
  $sString = "" ; Reset
 Else
  $sString = $sString & "or" ; put back the delimiter that StringSplit removed (the surrounding spaces are still in the pieces)
  $iPrevQuoteCounts = $iPrevQuoteCounts + $iQuoteCount
 EndIf
Next
_ArrayDisplay ( $asComplete )

Thanks for any assistance.

Edited by jdelaney
IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

Here's a sketch showing how you can do that. Sorry about the empty captures, I sincerely don't know how to avoid them!

#include <Array.au3>
; split by all or's
Local $atest = "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2'"
Local $aCond = StringRegExp($atest, "(?ix) (?(DEFINE) (?<string> '[^']*') (?<var> \. | @[a-z]\w*) (?<equality> (?&var)=(?&string)) (?<expr> (?&equality) | contains\((?&equality)\))) ((?&expr)(?:\s and \s(?&expr))+) \s or \s ((?&expr)(?:\s and \s(?&expr))+)", 1)
For $i = 1 To 4
_ArrayDelete($aCond, 0)
Next
_ArrayDisplay($aCond)

Depending on the allowed input syntax, you may have to add whitespace metacharacters (like \s* or [[:blank:]]*) at some places to match all possible inputs.
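
To make the whitespace point concrete, here is a small sketch. The input string and both patterns are made up for illustration and are not taken from the posts above. It only shows that a single \s stops matching as soon as the input has more than one space (or a tab) around the "or", while \s+ tolerates it.

```autoit
; Illustrative only: $sInput and both patterns are invented for this example.
Local $sInput = "@a='x'   or" & @TAB & ".='y'"

; \s matches exactly one whitespace character, so three spaces before "or" defeat it.
Local $iStrict = StringRegExp($sInput, "'\sor\s[.@]")

; \s+ accepts any run of spaces or tabs on either side of "or".
Local $iLoose = StringRegExp($sInput, "'\s+or\s+[.@]")

ConsoleWrite("strict: " & $iStrict & ", loose: " & $iLoose & @CRLF)
```

With the default flag, StringRegExp returns 1 for a match and 0 otherwise, so this should print "strict: 0, loose: 1".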

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


This works for the above example with single quotes, might be another route...

#Include <Array.au3>
$string = "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2'"
$Acapture = stringsplit($string , "' or " , 3)
for $i = 0 to ubound ($Acapture) - 2
$Acapture[$i] = $Acapture[$i] & "'"
next
_ArrayDisplay ($Acapture)



@jchd That works for the string provided, but when I modify the string, such as changing one of the and conditions to an or, it doesn't return the strings.

Thanks for the example though, I'm going to work on dissecting it now.


If you put two " inside a double-quoted string, that is read as a single ". Is that what you are asking?

"test and somestring""continued"

That is the same as the string: test and somestring"continued
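
A quick way to check the doubling rule is the sketch below. The variable names are invented for this example. It compares the doubled form with a single-quoted literal, which needs no doubling for ".

```autoit
; Minimal illustration; the variable names are made up for this example.
Local $sDoubled = "test and somestring""continued" ; "" inside "..." is one literal "
Local $sMixed = 'test and somestring"continued'     ; single-quoted literal, no doubling needed

ConsoleWrite($sDoubled & @CRLF)
ConsoleWrite(($sDoubled == $sMixed) & @CRLF)
```

The same doubling applies when a regexp pattern is written as an AutoIt string, which is why a character class for both quote types shows up as "['""]" in the first post.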


The escape character is \ but look in the help file for \x. In the case of the double quote (Chr(34)) you would use \x22. In most cases where it could be either a double quote or a single quote you will be safe just using the class [:punct:], but in some situations that can produce unpredictable results.
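
As a rough illustration of why [:punct:] can surprise you, the sketch below counts matches for an explicit quote class and for the POSIX class on the same text. The sample text and variable names are invented here. \x22 is the double quote and \x27 is the single quote.

```autoit
; Illustrative only: $sText is a made-up sample.
Local $sText = "a='one' b=""two"" c=(three)"

; Explicit class: only " (\x22) and ' (\x27).
Local $aQuotes = StringRegExp($sText, "[\x22\x27]", 3)

; POSIX class: every punctuation character, so = ( ) match as well.
Local $aPunct = StringRegExp($sText, "[[:punct:]]", 3)

ConsoleWrite("quotes only: " & UBound($aQuotes) & @CRLF)
ConsoleWrite("[:punct:]:   " & UBound($aPunct) & @CRLF)
```

The second count comes out higher than the first, because the equals signs and parentheses are punctuation too. If only quotes should delimit a string, the explicit class is the safer choice.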

There is a PCRE tool in my signature that will allow you to test against an actual file (Local file tab).

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


Well, I made it ('it' being my code) much better and much simpler, but I'm still not fully getting how to do recursive regular expressions, so I did a loop instead.

#include <Array.au3>
; split by all or's
Local $atest = "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2' or @test='value' and test='orvalue or '"

$iCounter = 0
Dim $aHolder[1]
$aCond = StringRegExp($atest, "(?i)(?s)(.*'.*')\sor\s", 3)
While IsArray ( $aCond )
 ReDim $aHolder[$iCounter+1]
 $temp = StringRegExp($atest, "(?i)(?s).*'.*'\sor\s(.*)", 3)
 If Not IsArray ( $temp ) Then
  $aHolder[$iCounter] = $aCond[0]
  ExitLoop
 Else
  $aHolder[$iCounter] = $temp[0]
 EndIf
 $aCond = StringRegExp($atest, "(?i)(?s)(.*'.*')\sor\s", 3)
$atest = $aCond[0]
$iCounter = $iCounter + 1
WEnd
_ArrayDisplay ( $aHolder)

I believe that and my StringSplit example return the same array. Am I missing something as to why a regular expression is wanted?

Edited by boththose


There may be a condition where there is a ') or ... scenario. Other than that, yours works. Here is my update to handle that:

#include <Array.au3>
; split by all or's
Local $atest = "@testand='and or string' and .='and or string2' or contains(.='test or quote') or @testand='or string2' or @test='value' and test=' orvalue or '"

$iCounter = 0
Dim $aHolder[1]
$sBefore = "(?i)(?s)(.*'.*['[:punct:]])\sor\s"
$sAfter = "(?i)(?s).*'.*['[:punct:]]\sor\s(.*)"
$aCond = StringRegExp($atest, $sBefore, 3)
While IsArray ( $aCond )
 ReDim $aHolder[$iCounter+1]
 $temp = StringRegExp($atest, $sAfter, 3)
 If Not IsArray ( $temp ) Then
  $aHolder[$iCounter] = $aCond[0]
  ExitLoop
 Else
  $aHolder[$iCounter] = $temp[0]
 EndIf
 $aCond = StringRegExp($atest, $sBefore, 3)
 $atest = $aCond[0]
 $iCounter = $iCounter + 1
WEnd
_ArrayDisplay ( $aHolder)

Ok, this is the real deal...

#include <Array.au3>
; split by all or's
Local $atest = "@testand='and or string' and .='and or string2' or contains(.='test or quote') or @testand='or string2' or @test='value' and test=' orvalue '"
; split out ALL or conditions
$sRegExpOrSplit = "(?i)(?U)(.*'.*[')])(?:\sor\s)|.{1,}?"; experiment combining
$aCond = StringRegExp($atest, $sRegExpOrSplit, 3)
_ArrayDisplay ( $aCond )

This modified version works as expected on your new example.

#include <Array.au3>
; split by all or's conditions
Local $atest = "@testand='and or string' and .='and or string2' or contains(.='test or quote') or @testand='or string2' or @test='value' and test=' orvalue '"
Local $aCond = StringRegExp($atest, "(?ix) (?(DEFINE) (?<string> '[^']*') (?<var> \. | @?[a-z]\w*) (?<equality> (?&var)=(?&string)) (?<expr> (?&equality) | contains\((?&equality)\)) (?<andCond> (?&expr)(?: \s and \s (?&expr) )*)) ((?&andCond)) (?= \s or \s |$)", 3)
For $i = UBound($aCond) - 1 To 0 Step -1
If Mod($i + 1, 6) Then _ArrayDelete($aCond, $i)
Next
_ArrayDisplay($aCond)
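
The loop above is there to drop the empty captures that the named DEFINE groups contribute to every match. One possible way around that filtering, offered only as a sketch, is StringRegExp flag 4, which returns one sub-array per match with the full match in element 0. The pattern below is deliberately tiny and the names $sInput, $sPattern and the group name str are invented for this example. It is not the pattern from the post.

```autoit
; Sketch with a deliberately small pattern; $sInput and the group "str" are made up here.
Local $sInput = "a='x or y' b='z'"
Local $sPattern = "(?x) (?(DEFINE) (?<str> '[^']*' ) ) (?&str)"

; Flag 4 returns an array of arrays; element 0 of each sub-array is the full match,
; so the empty slot left by the DEFINE group can simply be ignored.
Local $aAll = StringRegExp($sInput, $sPattern, 4)
For $i = 0 To UBound($aAll) - 1
    Local $aMatch = $aAll[$i]
    ConsoleWrite($aMatch[0] & @CRLF)
Next
```

This should print the two quoted strings, one per line. With the longer pattern from the post, the trailing lookahead is not part of the match, so element 0 would presumably be just the and-condition, but I have not verified that against every input.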


Does this still work when you change some values?

#include <Array.au3>
Local $atest = "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2'"
Local $sBool = "or"
Local $aCond = StringRegExp($atest, "(?i)(@.*')\s+" & $sBool & "\s+([a-z]+.*)", 3)
_ArrayDisplay($aCond)

Br,

UEZ

Edited by UEZ



@UEZ, there may be a '@' following the or, so that only splits a few times. @jchd, I need a regexp that can parse dynamically. Thanks for all the help though; this works (though it can be simplified further):

$sRegExpOrSplit = "(?i)(?U)(.*'.*[')])(?:\sor\s)|.{1,}?"
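
For anyone landing here later, one possible simplification is sketched below. It is a different technique from the pattern above, not a cleaned-up version of it. It skips over complete '...' and "..." literals with the PCRE verbs (*SKIP)(*FAIL), replaces every remaining " or " with a marker character, and then splits on that marker. It assumes the quotes are balanced, that Chr(1) never occurs in the input, and that the installed AutoIt ships a PCRE build that supports backtracking verbs. The variable names are invented for this example.

```autoit
; Sketch, not a drop-in replacement: assumes balanced quotes, that Chr(1) never occurs
; in the input, and that AutoIt's PCRE supports (*SKIP)(*FAIL).
Local $sInput = "@testand='and or string' and .='and or string2' or contains(.='test or quote') and @testand='or string2'"

; The first two alternatives consume a whole '...' or "..." literal and then force a
; failure, so the last alternative only ever sees an "or" that sits outside quotes.
Local $sPattern = "(?i)'[^']*'(*SKIP)(*FAIL)|""[^""]*""(*SKIP)(*FAIL)|\s+or\s+"

Local $sMarked = StringRegExpReplace($sInput, $sPattern, Chr(1))
Local $aParts = StringSplit($sMarked, Chr(1), 2) ; flag 2: no count in element 0

For $i = 0 To UBound($aParts) - 1
    ConsoleWrite(($i + 1) & ") " & $aParts[$i] & @CRLF)
Next
```

On the string from the first post this should give the two pieces that were asked for. Because the skipping works on whole quoted literals, it does not care whether the "or" is followed by '@', '.', or 'contains('.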
