SRE Non-capturing group


Good afternoon. It's that time again where I need a regular expression, and again I can't figure out why my non-capturing group won't work. Here is the code:

#include <Array.au3>
#include <File.au3>
Local $a, $b, $c
_FileReadToArray(@ScriptDir & "\a.txt", $a)
For $x = 1 To UBound($a)-1
    If StringInStr($a[$x], '94" href="/watch?v=') Then
        $b = StringRegExp($a[$x], '(?:94" href="/watch\?v=)\S{10}', 1)
        _ArrayDisplay($b)
        $c &= $b[0] & @LF
    EndIf
Next
FileWrite(@ScriptDir & "\b.txt", $c)

and here is some of the text it's searching:

<DIV class=playnav-video-thumb><A id=video-thumb-WC9nUBUTQbw-7840901 class="video-thumb ux-thumb-94" href="/watch?v=WC9nUBUTQbw"><SPAN class=img><IMG title="AWESOME SAUCE!" onclick="playnav.playVideo('uploads','0','WC9nUBUTQbw');return false;" src="http://i4.ytimg.com/vi/WC9nUBUTQbw/default.jpg"></SPAN> <SPAN class=video-time>4:13</SPAN></A> </DIV>
<DIV class=playnav-video-info><A id=playnav-video-title-play-uploads-0-WC9nUBUTQbw class="playnav-item-title ellipsis" onclick="playnav.playVideo('uploads','0','WC9nUBUTQbw');return false;" href="/watch?v=WC9nUBUTQbw"><SPAN dir=ltr>AWESOME SAUCE!</SPAN></A>
<DIV class=metadata><SPAN dir=ltr>1,994,873 views - 2 days ago</SPAN> </DIV>
<DIV style="DISPLAY: none" id=playnav-video-play-uploads-0>WC9nUBUTQbw</DIV></DIV></DIV></DIV>
<DIV id=playnav-video-play-uploads-1-ah7hxQIuwD8 class="playnav-item playnav-video">
<DIV style="DISPLAY: none" class=encryptedVideoId>ah7hxQIuwD8</DIV>
<DIV id=playnav-video-play-uploads-1-ah7hxQIuwD8-selector class=selector></DIV>
<DIV class=content>
<DIV class=playnav-video-thumb><A id=video-thumb-ah7hxQIuwD8-4533421 class="video-thumb ux-thumb-94" href="/watch?v=ah7hxQIuwD8"><SPAN class=img><IMG title="I LIKE TURTLES!!" onclick="playnav.playVideo('uploads','1','ah7hxQIuwD8');return false;" src="http://i2.ytimg.com/vi/ah7hxQIuwD8/default.jpg"></SPAN> <SPAN class=video-time>4:10</SPAN></A> </DIV>
<DIV class=playnav-video-info><A id=playnav-video-title-play-uploads-1-ah7hxQIuwD8 class="playnav-item-title ellipsis" onclick="playnav.playVideo('uploads','1','ah7hxQIuwD8');return false;" href="/watch?v=ah7hxQIuwD8"><SPAN dir=ltr>I LIKE TURTLES!!</SPAN></A>
<DIV class=metadata><SPAN dir=ltr>3,547,697 views - 1 week ago</SPAN> </DIV>
<DIV style="DISPLAY: none" id=playnav-video-play-uploads-1>ah7hxQIuwD8</DIV></DIV></DIV></DIV>
<DIV id=playnav-video-play-uploads-2-4cqFbEDsZqE class="playnav-item playnav-video">
<DIV style="DISPLAY: none" class=encryptedVideoId>4cqFbEDsZqE</DIV>
<DIV id=playnav-video-play-uploads-2-4cqFbEDsZqE-selector class=selector></DIV>
<DIV class=content>
<DIV class=playnav-video-thumb><A id=video-thumb-4cqFbEDsZqE-6070002 class="video-thumb ux-thumb-94" href="/watch?v=4cqFbEDsZqE"><SPAN class=img><IMG title="In Soviet Russia..." onclick="playnav.playVideo('uploads','2','4cqFbEDsZqE');return false;" src="http://i1.ytimg.com/vi/4cqFbEDsZqE/default.jpg"></SPAN> <SPAN class=video-time>3:48</SPAN></A> </DIV>
<DIV class=playnav-video-info><A id=playnav-video-title-play-uploads-2-4cqFbEDsZqE class="playnav-item-title ellipsis" onclick="playnav.playVideo('uploads','2','4cqFbEDsZqE');return false;" href="/watch?v=4cqFbEDsZqE"><SPAN dir=ltr>In Soviet Russia...</SPAN></A>
<DIV class=metadata><SPAN dir=ltr>3,259,191 views - 1 week ago</SPAN> </DIV>
<DIV style="DISPLAY: none" id=playnav-video-play-uploads-2>4cqFbEDsZqE</DIV></DIV></DIV></DIV>
<DIV id=playnav-video-play-uploads-3-98Alrg4pFXs class="playnav-item playnav-video">
<DIV style="DISPLAY: none" class=encryptedVideoId>98Alrg4pFXs</DIV>
<DIV id=playnav-video-play-uploads-3-98Alrg4pFXs-selector class=selector></DIV>
<DIV class=content>
<DIV class=playnav-video-thumb><A id=video-thumb-98Alrg4pFXs-2432031 class="video-thumb ux-thumb-94" href="/watch?v=98Alrg4pFXs"><SPAN class=img><IMG title=ACTION!!!!!! onclick="playnav.playVideo('uploads','3','98Alrg4pFXs');return false;" src="http://i2.ytimg.com/vi/98Alrg4pFXs/default.jpg"></SPAN> <SPAN class=video-time>4:06</SPAN></A> </DIV>
<DIV class=playnav-video-info><A id=playnav-video-title-play-uploads-3-98Alrg4pFXs class="playnav-item-title ellipsis" onclick="playnav.playVideo('uploads','3','98Alrg4pFXs');return false;" href="/watch?v=98Alrg4pFXs"><SPAN dir=ltr>ACTION!!!!!!</SPAN></A>
<DIV class=metadata><SPAN dir=ltr>3,418,613 views - 2 weeks ago</SPAN> </DIV>
<DIV style="DISPLAY: none" id=playnav-video-play-uploads-3>98Alrg4pFXs</DIV></DIV></DIV></DIV>
<DIV id=playnav-video-play-uploads-4-jdAbpLooDgM class="playnav-item playnav-video">
<DIV style="DISPLAY: none" class=encryptedVideoId>jdAbpLooDgM</DIV>
<DIV id=playnav-video-play-uploads-4-jdAbpLooDgM-selector class=selector></DIV>
<DIV class=content>

I have checked multiple times to make sure I did it right, and as far as I know it should work, but the non-capturing part still shows up in the array. Any help would be great, and thanks.

You have a non-capturing group, which is fine, but you need a capturing group as well.

StringRegExp($a[$x], '(?:94" href="/watch\?v=)(\S{10})', 1)

or

StringRegExp($a[$x], '94" href="/watch\?v=(\S{10})', 1)
Edited by zorphnog

Will this give the desired results?

#include <Array.au3>
#include <File.au3>
Local $a, $b, $c
_FileReadToArray(@ScriptDir & "\a.txt", $a)
For $x = 1 To UBound($a) - 1
    If StringInStr($a[$x], '94" href="/watch?v=') Then
        $b = StringRegExp($a[$x], '94" href="/watch\?v=(\S{10})', 1)
        _ArrayDisplay($b)
        $c &= $b[0] & @LF
    EndIf
Next
FileWrite(@ScriptDir & "\b.txt", $c)

Beware that your field has 11 characters (apparently it is fixed size), not 10.

I bet zorphnog meant the pattern '94" href="/watch\?v=(\S{11})'

This is because the non-captured part won't make it in the match output as long as there is at least one capturing group.
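To see the difference on one of the sample lines, here is a minimal sketch. It assumes a shortened copy of a line from the posted text, and the variable names ($sLine, $aWhole, $aId) are only illustrative. It uses {11} rather than {10}, per the length correction above.

```autoit
; One sample line from the posted text (shortened; an assumption for this sketch).
Local $sLine = '<A class="video-thumb ux-thumb-94" href="/watch?v=WC9nUBUTQbw"><SPAN class=img>'

; No capturing group: flag 1 returns the whole match, prefix included.
Local $aWhole = StringRegExp($sLine, '(?:94" href="/watch\?v=)\S{11}', 1)
If Not @error Then ConsoleWrite($aWhole[0] & @CRLF) ; 94" href="/watch?v=WC9nUBUTQbw

; One capturing group: flag 1 returns only what the group matched.
Local $aId = StringRegExp($sLine, '94" href="/watch\?v=(\S{11})', 1)
If Not @error Then ConsoleWrite($aId[0] & @CRLF) ; WC9nUBUTQbw
```

The (?:...) wrapper only groups; it does not remove its text from the overall match, which is why the prefix showed up in the original array.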

In this case, I always find it more precise to specify that you want the field which comes after 94" href="/watch?v= and which doesn't contain ". This leads us to the equivalent pattern '94" href="/watch\?v=([^"]{11})', and even the {11} might prove superfluous and can be replaced with +.

Now there is another way to put it: an 11-character field not containing " which is preceded by 94" href="/watch?v=

This translates into: (?<=94" href="/watch\?v=)[^"]+

Of course, this latter version does _not_ need more backtracking than the "positive match" versions above, thanks to internal optimizations in PCRE.
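A minimal sketch of the lookbehind version, assuming a shortened copy of one of the posted lines (the names $sLine and $aId are illustrative only):

```autoit
; One sample line from the posted text (shortened; an assumption for this sketch).
Local $sLine = '<A class="video-thumb ux-thumb-94" href="/watch?v=ah7hxQIuwD8"><SPAN class=img>'

; The lookbehind checks the prefix without consuming it, so the whole match is just the ID.
; [^"]+ stops at the first double quote, which here is the one closing the href value.
Local $aId = StringRegExp($sLine, '(?<=94" href="/watch\?v=)[^"]+', 1)
If Not @error Then ConsoleWrite($aId[0] & @CRLF) ; ah7hxQIuwD8
```

Since the pattern has no capturing group, flag 1 returns the whole match, and the whole match no longer includes the prefix. In the sample text the ID is immediately followed by the closing quote of the href attribute, so + ends up matching exactly the 11 characters.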

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

If you don't want a capturing group then you could mess around with StringRegExpReplace.
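One possible way to do that, as a rough sketch only: delete everything except the ID. The pattern and the names $sLine and $sId are assumptions for illustration, and the line is expected to contain the prefix (the original script already checks that with StringInStr).

```autoit
; One sample line from the posted text (shortened; an assumption for this sketch).
Local $sLine = '<A class="video-thumb ux-thumb-94" href="/watch?v=4cqFbEDsZqE"><SPAN class=img>'

; First alternative: everything from the start of the line through the prefix.
; Second alternative: everything from the next double quote to the end of the line.
; Replacing both with an empty string leaves only the ID; no capturing group is involved.
Local $sId = StringRegExpReplace($sLine, '^.*?94" href="/watch\?v=|".*$', '')
ConsoleWrite($sId & @CRLF) ; 4cqFbEDsZqE
```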

Edit;

Have you tried the PCRE Tester in my signature? It should be fairly easy to find out where your expression fails.

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"

Thanks jchd, I noticed the 10 after I posted. Also, I am still not that great with regular expressions: does the (?< make it non-capturing? Plus, how does the + sign make it stop after 11 characters when [^"] should match anything but " and there's not a " for quite a while after the 11 chars? Also, GEOSoft, I did try your PCRE tester but it was buggy on my computer at work, so I stopped after a couple of minutes.

Did you report the bugs? There were a few bug fixes released and as far as I am aware there are no bugs in the current release.

George

Onichan,

(?<=text) is a positive lookbehind assertion. For instance, (?<=foo)bar matches bar if and only if it is preceded by foo.

(?<!text) is a negative lookbehind assertion. For instance, (?<!foo)bar matches bar if and only if it is NOT preceded by foo.

(?=text) is a positive lookahead assertion. For instance, bar(?=foo) matches bar if and only if it is followed by foo.

(?!text) is a negative lookahead assertion. For instance, bar(?!foo) matches bar if and only if it is NOT followed by foo.

The text part is never captured.
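The four assertions can be tried out with a few lines like these. This is just a sketch on the foo/bar examples above; the name $aMatch is illustrative.

```autoit
; With the default flag 0, StringRegExp only reports whether the pattern matches (1) or not (0).
ConsoleWrite(StringRegExp('foobar', '(?<=foo)bar') & @CRLF) ; 1: bar is preceded by foo
ConsoleWrite(StringRegExp('foobar', '(?<!foo)bar') & @CRLF) ; 0: the only bar follows foo
ConsoleWrite(StringRegExp('barfoo', 'bar(?=foo)') & @CRLF)  ; 1: bar is followed by foo
ConsoleWrite(StringRegExp('barfoo', 'bar(?!foo)') & @CRLF)  ; 0: the only bar is followed by foo

; The asserted text is not part of the match: flag 1 returns just "bar".
Local $aMatch = StringRegExp('foobar', '(?<=foo)bar', 1)
If Not @error Then ConsoleWrite($aMatch[0] & @CRLF) ; bar
```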

Here are the two official reference pages about PCRE syntax and semantics: here and here. You'll find a large number of useful hints, tricks and examples brought to you by the PCRE author himself.
