[Solved] StringRegExp, weird case.


MvGulik

whatever Edited by MvGulik

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 


Does this do what you expect?

$aSRE = "word1,word2,,,word3,"
$sPattern = "([\w\s]+?)(?:,|$)"

Edit:

Tip: Don't use \1, \2, \3, etc. in a replace unless you have to. Use $1, $2, $3 instead.
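
A minimal sketch of the tip above, assuming a two-field sample string that is not part of the original post. It swaps the fields with $1 and $2 in the replacement string:

```autoit
; Swap two comma-separated fields using the $1 and $2 back-references.
; The sample string 'word1,word2' is an assumption for illustration only.
Local $sSwapped = StringRegExpReplace('word1,word2', '(\w+),(\w+)', '$2,$1')
ConsoleWrite($sSwapped & @CRLF) ; word2,word1
```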

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


whatever Edited by MvGulik

whatever Edited by MvGulik

This example shows a way of changing the ",,," part of a source string into a ",[0:],[0:]," part of the return string, using a modified version of the RE_Debug function from Post #1.

It might give you some ideas.

MAIN()

Func MAIN()
    Local $sPattern = '[^,]+|(?:,{2,})'
    Local $sSource = 'word1,,word2,,,word3,word4'
    ConsoleWrite('1: = ' & RE_Debug($sSource, $sPattern) & @CRLF)
EndFunc ;==>MAIN

;; Support function.
Func RE_Debug($sSource, $sPattern)
    $sSource = StringRegExpReplace($sSource, $sPattern, _
            '[0:\0][1:\1][2:\2][3:\3][4:\4][5:\5][6:\6][7:\7][8:\8][9:\9]')
    If @error Then Return SetError(@error, @extended, '')
    ; Strip the placeholders of groups that captured nothing.
    $sSource = StringRegExpReplace($sSource, '\[[1-9]:\]', '')
    If @error Then Return SetError(@error, @extended, '')
    $sSource = Execute('"' & StringRegExpReplace($sSource, _
            "\[(\d+):(,{2,})\]", '" & __Rep($1,"$2") & "') & '"')
    If @error Then Return SetError(@error, @extended, '')
    Return $sSource
EndFunc ;==>RE_Debug

;Internal function called from RE_Debug()
Func __Rep($No, $RE)
    Local $sRet = "", $iNum
    StringReplace($RE, ",", ",")
    $iNum = @extended - 1
    For $i = 1 To $iNum
        $sRet &= ",[" & $No & ":]"
    Next
    Return $sRet & ","
EndFunc ;==>__Rep

Interesting.

Although it's not doing what I'm after. For example, it completely skips word2 in the string "word1,word2,,,word3".

It shows a (to me) unexpected result. (So I'm not done yet with your suggestion.)

It also shows that I need to recall that RE_Debug() function for a debug session of its own.

Thanks. :D

I have no idea how it could skip word2. Did it get word1 and word3 and the blanks?

Edited by Richard Robertson

whatever Edited by MvGulik

whatever Edited by MvGulik

whatever Edited by MvGulik

3) I think it's a bug, so I'll focus on finding some proof so that it might be fixed.

Reread my post #10, which uses the pcretest utility. That is pretty close to a proof, and it uses the very same engine (on paper). I didn't try the latest pcretest version; the one I used is a bit outdated.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


whatever Edited by MvGulik

I was initially confused about the problem you were having, but I think I understand now.

I'm coming round to the view that there possibly is some sort of bug here, perhaps in the way the PCRE DLL has been incorporated into AutoIt.

I would expect these 2 bits of code to give the same result.

#include <Array.au3>
; Flag 3 returns an array of all global matches.
$sPattern = '[^,]+'
$sSource = 'word1,word2,word3'
$aArray = StringRegExp($sSource, $sPattern, 3)
_ArrayDisplay($aArray)

#include <Array.au3>
; Same source, but * also allows zero-length matches.
$sPattern = '[^,]*'
$sSource = 'word1,word2,word3'
$aArray = StringRegExp($sSource, $sPattern, 3)
_ArrayDisplay($aArray)
Edited by Bowmore

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook


I would expect these 2 bits of code to give the same result.

Exactly, and that's surprising since the pcretest utility doesn't show this unexpected behavior. As I said, it's possibly due to the build options of the PCRE module that AutoIt is using.
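
To make the comparison with pcretest easier, here is a rough sketch that prints every element returned for both patterns, wrapped in angle brackets so that empty matches are visible. The loop and the output format are assumptions for illustration and are not from the posts above. No expected output is given, since the result is exactly what is in question here.

```autoit
; Sketch: list each match per pattern so the two results can be compared.
Local $sSource = 'word1,word2,word3'
Local $aPatterns[2] = ['[^,]+', '[^,]*']

For $sPattern In $aPatterns
    ; Flag 3 returns an array of all global matches.
    Local $aMatches = StringRegExp($sSource, $sPattern, 3)
    If @error Then
        ConsoleWrite($sPattern & ' -> @error = ' & @error & @CRLF)
        ContinueLoop
    EndIf
    ConsoleWrite($sPattern & ' -> ' & UBound($aMatches) & ' element(s)' & @CRLF)
    For $i = 0 To UBound($aMatches) - 1
        ; Angle brackets make zero-length matches visible as <>.
        ConsoleWrite('  [' & $i & ']: <' & $aMatches[$i] & '>' & @CRLF)
    Next
Next
```

Running the same two patterns against the same string in pcretest gives the reference result to compare with.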


Code:

$result = _StringSplitRegExp('data1,data2,,data3,,,data4', ',')
For $X = 1 to $result[0]
    ConsoleWrite('[' & $X & ']: ' & $result[$X] & @CRLF)
Next

#cs ----------------------------------------------------------------------------

 AutoIt Version: 3.2.10.0
 Author: WeaponX

 Script Function:
    Split string on regular expression

 Parameters:
    String = String to be split
    Pattern = Pattern to split on
    IncludeMatch = True / False - Indicates whether or not to include the match in the return (back-reference)
    Count = Number of splits to perform

#ce ----------------------------------------------------------------------------
Func _StringSplitRegExp($sString, $sPattern, $sIncludeMatch = False, $iCount = 0)

    ;All matches will be replaced with this string.
    ;It must not occur in $sString itself.
    ;Local $sReservedPattern = Chr(0) ;alternative reserved character
    Local $sReservedPattern = "#"
    Local $sReplacePattern = $sReservedPattern

    ;Modify the replace pattern to include the match itself (back-reference $0)
    If $sIncludeMatch Then $sReplacePattern = "$0" & $sReplacePattern

    ;Replace all occurrences of the search pattern with the replace string
    Local $sTemp = StringRegExpReplace($sString, $sPattern, $sReplacePattern, $iCount)

    ;ConsoleWrite($sTemp & @CRLF)

    ;Strip trailing character if it matches the reserved pattern
    If StringRight($sTemp, 1) = $sReservedPattern Then $sTemp = StringTrimRight($sTemp, 1)

    ;Split string using the entire reserved string as one delimiter (flag 1)
    Local $aResult = StringSplit($sTemp, $sReservedPattern, 1)

    Return $aResult
EndFunc

Output:

[1]: data1
[2]: data2
[3]: 
[4]: data3
[5]: 
[6]: 
[7]: data4
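
For a literal single-character delimiter such as the comma used here, the built-in StringSplit should give the same fields without the regular-expression step. This is only a sketch for comparison, not part of the function above, and it does not cover pattern delimiters:

```autoit
; StringSplit keeps empty fields; element [0] holds the field count.
Local $aFields = StringSplit('data1,data2,,data3,,,data4', ',')
For $i = 1 To $aFields[0]
    ConsoleWrite('[' & $i & ']: ' & $aFields[$i] & @CRLF)
Next
```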

Warm thanks Sniper, but the point now boils down to the "is it an AutoIt bug" question rather than "how to get around it".

Oops, I'm putting words in the OP's mouth about his thread!


whatever Edited by MvGulik

I don't think it's a bug. It might have something to do with which regex flavor we are using. Go here and try this example:

String = word1,word2,word3

Pattern = [^,]*

Pick a different dialect and it will return different results.

Hi ;)


Silly thing: the message I had composed vanished into the clouds...

Yes, it's likely a distinct set of build options. Look at the PCRE sources and the configure script and you'll see that there are various options that determine the flavor, or mix of flavors, that will be built.

The issue discussed here may not qualify as a bug, but rather as a side effect of the build options that the AutoIt team decided to use. These options are likely a bit specific to AutoIt.

They aren't building Perl but AutoIt. Regexps are fundamental in Perl, and it is safe to assume that Perl users will learn a lot about them. Regexp efficiency is also paramount in Perl. It isn't the same with AutoIt, where most users are kept away from regexps by not knowing how to use them and by having little programming experience. A regexp can always be replaced by other string manipulation functions, and that's what you see in most forum posts. Don't read anything pejorative into this; those are just facts.

Depending on the engine used, it can be very easy to write an innocent-looking regexp that will backtrack like hell and grab 100% of one CPU for years when trying to match some innocent string. If that occurs in a Perl program, there's always someone around to explain how to fix it. In the AutoIt community things are very different, and the focus is rather on avoiding build options that can cause problems in laymen's hands.
