Chefito

stringregexp search for text reverse


Hi,

 

How can I search for text with StringRegExp starting at the end and working toward the beginning? That is, a backwards search.

Thank you.




StringReverse() the subject and reverse the pattern accordingly.

Caveat emptor: Unicode codepoints outside the BMP (plane 0) will be spoiled by regular StringReverse. If you don't know what I'm talking about you're most probably not concerned.
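
To make the reverse-both idea concrete, here is a minimal sketch. The sample string and variable names are made up for illustration; the point is only that the pattern has to describe the text as it reads backwards, and that the hit must be reversed again afterwards.

```autoit
Local $sText = "alpha=1; beta=22; gamma=333"

; Reverse the subject. The pattern must describe the text as it reads backwards.
Local $sReversed = StringReverse($sText) ; "333=ammag ;22=ateb ;1=ahpla"

; The first run of digits in the reversed text is the last number of the original.
Local $aMatch = StringRegExp($sReversed, "(\d+)", 1) ; flag 1: array with the captured group
If Not @error Then
    ; Un-reverse the hit to read it in the original direction.
    ConsoleWrite(StringReverse($aMatch[0]) & @CRLF) ; 333
EndIf
```

For anything longer than a token, the pattern gets awkward quickly: a "key=value" expression has to be written as "value=key", with every literal spelled backwards.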


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#3 ·  Posted (edited)

jchd, thanks for answering.
But I don't think that is what I wanted. Maybe I explained it badly; my English is poor, sorry.


I wrote a small sample function to show what I want to do with regular expressions. I want to use the options, power and speed of regular expressions.

$text="hi, today we are fine, we are writting about AutoIt language, and search text backwards."
MsgBox(0,"",searchBackwards($text, ",", ","))

; Returns the text between $subStringStart and $subStringEnd, searching from the end of $string.
Func searchBackwards($string, $subStringStart, $subStringEnd, $casesense = 0, $includeSubString = True)
   ; A negative occurrence makes StringInStr search from the right.
   Local $result = StringInStr($string, $subStringEnd, $casesense, -1)
   If @error Then Return SetError(1)
   If $result > 0 Then
      Local $posEnd = $result + StringLen($subStringEnd)
      Local $posStart = StringInStr($string, $subStringStart, $casesense, -1, $result - 1)
      If @error Then Return SetError(1)
      If $posStart > 0 Then
         If $includeSubString = False Then
            ; Trim both delimiters off the returned span.
            $posStart += StringLen($subStringStart)
            $posEnd -= StringLen($subStringEnd)
         EndIf
         Return StringMid($string, $posStart, $posEnd - $posStart)
      Else
         Return SetError(2, Default, 0) ; start delimiter not found
      EndIf
   Else
      Return SetError(3, Default, 0) ; end delimiter not found
   EndIf
EndFunc


Is there any way to do this with regular expressions: find the string that lies between two other strings, scanning the text from the end toward the beginning?
I remember reading something about it, but I'm not sure.

 

Thank you.


Try this.

Local $text = "hi, today we are fine, we are writting about AutoIt language, and search text backwards."

MsgBox(0, "", _SearchBackwards($text, ",", ","))
MsgBox(0, "Case sensitive, include start & end substrings", _SearchBackwards($text, "e", "A", 0))
MsgBox(0, "Case in-sensitive, exclude start & end substrings", _SearchBackwards($text, "e", "A", 1, 0))


; Note: a non-zero $ignoreCase adds "(?i)", i.e. it makes the search case-insensitive.
Func _SearchBackwards($string, $subStringStart, $subStringEnd, $ignoreCase = 0, $includeSubString = 1)
    If $includeSubString Then
        Return StringRegExpReplace($string, ($ignoreCase ? "(?i)" : "") & ".*(" & $subStringStart & ".+" & $subStringEnd & ").+$", "\1")
    Else
        Return StringRegExpReplace($string, ($ignoreCase ? "(?i)" : "") & ".*" & $subStringStart & "(.+)" & $subStringEnd & ".+$", "\1")
    EndIf
EndFunc   ;==>_SearchBackwards

 


Malkey,
.+?   ?   :)


Thanks for answering.
This code is fine, but it is not what I need. Again, it's my fault for giving a single-line sample text. The text I actually have to search is the HTML source of a web page, so it is multi-line, and the code you posted does not work on multi-line text.
Example:

#include <inet.au3>
$text=_INetGetSource("https://www.autoitscript.com/site/")
MsgBox(0,"",searchBackwards($text, '<a href="http://www.autoitconsulting.com/site/', "</a>"))

; Returns the text between $subStringStart and $subStringEnd, searching from the end of $string.
Func searchBackwards($string, $subStringStart, $subStringEnd, $casesense = 0, $includeSubString = True)
   ; A negative occurrence makes StringInStr search from the right.
   Local $result = StringInStr($string, $subStringEnd, $casesense, -1)
   If @error Then Return SetError(1)
   If $result > 0 Then
      Local $posEnd = $result + StringLen($subStringEnd)
      Local $posStart = StringInStr($string, $subStringStart, $casesense, -1, $result - 1)
      If @error Then Return SetError(1)
      If $posStart > 0 Then
         If $includeSubString = False Then
            ; Trim both delimiters off the returned span.
            $posStart += StringLen($subStringStart)
            $posEnd -= StringLen($subStringEnd)
         EndIf
         Return StringMid($string, $posStart, $posEnd - $posStart)
      Else
         Return SetError(2, Default, 0) ; start delimiter not found
      EndIf
   Else
      Return SetError(3, Default, 0) ; end delimiter not found
   EndIf
EndFunc

I know that this particular case can be solved with a normal search for '<a href="http://www.autoitconsulting.com/site/cookie-policy/">', which is unique, but it is just an example.
Is it possible to get the same result as my function above, with these arguments, using regular expressions?
I ask mostly out of interest, because my function works more or less well.

Greetings and thanks.

6 hours ago, mikell said:

Malkey,
.+?   ?   :)

@mikell
An explanation:
$subStringStart and $subStringEnd appear in the text from left to right: start first, then end.
The search for them, however, runs from right to left, backwards.
So, looking at the last example,

MsgBox(0, "Case in-sensitive, exclude start & end substrings", _SearchBackwards($text, "e", "A", 1, 0))

$subStringEnd is the first "a" found going from right to left, which is the "a" in "wards"; that is why the call returns "xt backw".
$subStringStart is the first "e" found, again going from right to left, before that $subStringEnd: the "e" in "text".

If ".+?" were used in the RE pattern, the second-to-last "a", the one in "back", would be incorrectly found, returning "xt b".
So what we want to capture is all the characters, including any intermediate "a"s, from the last "e" to the last "a", left to right.
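
The difference between the two quantifiers can be checked in isolation with a short sketch. The sample string below is only the tail of the sentence used in this thread, shortened for illustration; the patterns are the same shape as in the function above.

```autoit
Local $sText = "we search text backwards."

; Greedy ".+" between the delimiters: the capture runs up to the LAST "a".
ConsoleWrite(StringRegExpReplace($sText, "(?i).*e(.+)a.+$", "\1") & @CRLF)  ; xt backw

; Lazy ".+?" stops at the first "a" that follows the last "e".
ConsoleWrite(StringRegExpReplace($sText, "(?i).*e(.+?)a.+$", "\1") & @CRLF) ; xt b
```

In both patterns it is the leading greedy ".*" that pushes the start delimiter as far right as possible; the quantifier inside the capture only decides which end delimiter is taken.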

 

@Chefito
The single-line (DotAll) option, "(?s)", has been added so that the dots, ".", in the RE pattern match any character, including newline characters.
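
To see the effect of "(?s)" on its own, here is a small sketch on a made-up three-line sample (the $sHtml content is hypothetical and merely stands in for a fetched page):

```autoit
; Hypothetical three-line sample standing in for a fetched page.
Local $sHtml = "<p>one</p>" & @CRLF & "<p>two</p>" & @CRLF & "<p>three</p>" & @CRLF

; Without (?s) the dots cannot cross a line break, so the pattern can never
; cover the whole text and the earlier lines stay in the result.
ConsoleWrite(StringRegExpReplace($sHtml, ".*<p>(.+)</p>.+$", "\1") & @CRLF)

; With (?s) the leading ".*" swallows the earlier lines and backtracks to the last "<p>".
ConsoleWrite(StringRegExpReplace($sHtml, "(?s).*<p>(.+)</p>.+$", "\1") & @CRLF) ; three
```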

#include <inet.au3>

$text = _INetGetSource("https://www.autoitscript.com/site/")

MsgBox(0, "", _SearchBackwards($text, '<a href="http://www.autoitconsulting.com/site/', "</a>"))
MsgBox(0, "", _SearchBackwards($text, '<a href="http://www.autoitconsulting.com/site/', "</a>", 0, 0))


; Note: a non-zero $ignoreCase adds "i" to the options, i.e. it makes the search case-insensitive.
Func _SearchBackwards($string, $subStringStart, $subStringEnd, $ignoreCase = 0, $includeSubString = 1)
    If $includeSubString Then
        Return StringRegExpReplace($string, ($ignoreCase ? "(?is)" : "(?s)") & ".*(" & $subStringStart & ".+" & $subStringEnd & ").+$", "\1")
    Else
        ; Alternative that returns everything between the two delimiters:
        ;Return StringRegExpReplace($string, ($ignoreCase ? "(?is)" : "(?s)") & ".*" & $subStringStart & "(.+)" & $subStringEnd & ".+$", "\1")
        Return StringRegExpReplace($string, ($ignoreCase ? "(?is)" : "(?s)") & ".*" & $subStringStart & ".+?>(.+)" & $subStringEnd & ".+$", "\1") ; Returns only the text between ">" and "<"
    EndIf
EndFunc   ;==>_SearchBackwards
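
One thing to watch with this approach: the delimiters are concatenated into the pattern as-is, so any regex metacharacter in them (".", "(", "?", ...) is interpreted by PCRE. A possible variant, sketched below, wraps each delimiter in \Q...\E so it is taken literally. _SearchBackwardsLiteral and its parameter names are hypothetical, not part of the function above, and it uses StringRegExp rather than StringRegExpReplace so that a failed match can be reported through @error.

```autoit
; Sketch only: _SearchBackwardsLiteral is a hypothetical name, not part of the thread.
Func _SearchBackwardsLiteral($sString, $sStart, $sEnd, $bIgnoreCase = False, $bInclude = True)
    ; \Q...\E makes PCRE treat the delimiter as literal text.
    ; Assumption: neither delimiter contains the sequence "\E".
    Local $sS = "\Q" & $sStart & "\E"
    Local $sE = "\Q" & $sEnd & "\E"
    Local $sPattern = $bIgnoreCase ? "(?is)" : "(?s)"
    If $bInclude Then
        $sPattern &= ".*(" & $sS & ".+" & $sE & ")"
    Else
        $sPattern &= ".*" & $sS & "(.+)" & $sE
    EndIf
    ; Flag 1 returns an array holding the captured group of the first match.
    Local $aMatch = StringRegExp($sString, $sPattern, 1)
    If @error Then Return SetError(1, 0, "") ; no match (or bad pattern)
    Return $aMatch[0]
EndFunc   ;==>_SearchBackwardsLiteral

; "(" and ")" would break the pattern if they were not quoted.
ConsoleWrite(_SearchBackwardsLiteral("a (1) b (2) c (3) d", "(", ")", False, False) & @CRLF) ; 3
```

Unlike the version above, this sketch does not require any text after the end delimiter, since it drops the trailing ".+$".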

 


If it's for dissecting HTML, then the _IE* functions are better.


3 hours ago, Malkey said:

If ".+?" were used in the RE pattern, the second-to-last "a", the one in "back", would be incorrectly found, returning "xt b".
So what we want to capture is all the characters, including any intermediate "a"s, from the last "e" to the last "a", left to right.

OK. Reading post #3, I thought it was about some kind of StringBetween working from the right side, where "xt b" would be the wanted result with "e" and "a" used as delimiters.
Several levels of understanding  :)


Thanks to all.

Thanks, Malkey. Your function is great :thumbsup:.

jchd, I know how to use the IE UDF quite well, and even the DOM in general. But I always try to avoid the InternetExplorer object. I prefer to use the WinHttp and Inet UDFs and to process the HTML code with regular expressions. This is faster and fails less often.

This is solved. Thanks.


Of course you're free to live a dangerous life.
