My StringRegExp doesn't get it


G'day everyone

Please can someone just tell me what's wrong with my StringRegExp query here?

I'm querying a file downloaded from Wikipedia. I want the script to find this:

nl.wikipedia.org/wiki/Lounge">Nederlands

and add this to the array:

Lounge

Here's my non-working code:

$result = StringRegExp($wikipage2, '(?:nl\.wikipedia\.org/wiki/)(.+)(?:">Nederlands)', 1)

Please tell me what's wrong with it.

Thanks

Samuel

Hi,

try this pattern (?<=nl\.wikipedia\.org/wiki/).*(?=">Nederlands)

So long,

Mega

Thanks. I really don't understand it, though. The AutoIt beta help file does not mention "?<=" or "?=" at all. Where did you get those, and what do they mean?

Hi,

look at Wikipedia or any Perl regex explanation.

It is very simple.

(?<=nl\.wikipedia\.org/wiki/).*(?=">Nederlands)

This means:

(?<=...) = check that this text comes immediately before the search pattern (lookbehind)

.* = the search pattern, in this case anything

(?=...) = check that this text comes immediately after the search pattern (lookahead)

The keywords are lookahead and lookbehind. These two are the positive forms; negative forms, (?!...) and (?<!...), are also possible.

So long,

Mega
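
A minimal sketch of the lookbehind/lookahead pattern described above, assuming a recent AutoIt release whose StringRegExp understands PCRE lookarounds (the beta discussed in this thread apparently did not). The variable names are made up for illustration, and `[^"]+` is swapped in for `.*` so the match cannot run past the closing quote of the href.

```autoit
; Sample line taken from this thread.
Local $sHtml = '<li class="interwiki-nl"><a href="http://nl.wikipedia.org/wiki/Wetenschap">Nederlands</a></li>'

; (?<=...) positive lookbehind: this text must come right before the match.
; (?=...)  positive lookahead: this text must come right after the match.
; Neither is part of the match itself.
Local $aMatch = StringRegExp($sHtml, '(?<=nl\.wikipedia\.org/wiki/)[^"]+(?=">Nederlands)', 1)

If @error Then
    ; On failure StringRegExp does not return an array, so do not index it.
    MsgBox(16, "Lookaround", "No match (@error = " & @error & ")")
Else
    ; The pattern has no capturing groups, so the whole match is element 0.
    MsgBox(64, "Lookaround", $aMatch[0])
EndIf
```

Checking @error before touching the array is what avoids the "array error" when nothing matches.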

Thanks. Well, AutoIt doesn't seem to recognise it. But I finally figured out the correct regex:

'(nl\.wikipedia\.org/wiki/)(.*)(">Nederlands)'

Don't ask me why, but it works. According to the AutoIt help file I have to specifically exclude groups by using (?: ... ), and in the example above I did not exclude any groups, and yet AutoIt only adds the middle group to the array. Weird, but as long as it works:

$result = StringRegExp('<li class="interwiki-nl"><a href="http://nl.wikipedia.org/wiki/Wetenschap">Nederlands</a></li>', '(nl\.wikipedia\.org/wiki/)(.+)(">Nederlands)', 1)
MsgBox (0, "", $result[1], 10)

The above MsgBox yields "Wetenschap", which was what I was hoping for.

Thanks for your help.

Samuel
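
A hedged note on the "weird" result above, assuming current StringRegExp behaviour: with flag 1 the returned array holds one element per capturing group, in order. Nothing is being excluded in the three-group pattern; `$result[1]` is simply the second group. `(?: ... )` only matters if the wanted text should land at index 0. The sketch below shows both variants; the variable names are illustrative only.

```autoit
#include <Array.au3> ; for _ArrayDisplay

Local $sHtml = '<li class="interwiki-nl"><a href="http://nl.wikipedia.org/wiki/Wetenschap">Nederlands</a></li>'

; Three capturing groups: all three come back, at indexes 0, 1 and 2.
Local $aAll = StringRegExp($sHtml, '(nl\.wikipedia\.org/wiki/)(.+)(">Nederlands)', 1)
If Not @error Then _ArrayDisplay($aAll, "Three capturing groups")

; Non-capturing groups (?:...) leave only the middle group, at index 0.
Local $aOne = StringRegExp($sHtml, '(?:nl\.wikipedia\.org/wiki/)(.+)(?:">Nederlands)', 1)
If Not @error Then MsgBox(64, "One capturing group", $aOne[0])
```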

Now for my next problem (or rather, the original one):

Here is the script which should work, but gives me an array error:

$pageurl = FileReadLine("URLs.txt", $i)
$wikipage = InetGet($pageurl, "foo.txt")
$wikipage2 = FileOpen ("foo.txt", 0)
$wikipage3 = FileRead ("foo.txt", 0)

Sleep ("2000")

$result = StringRegExp($wikipage3, '(nl\.wikipedia\.org/wiki/)(.*)(">Nederlands)', 1)
MsgBox (0, "", $result[1], 10)

In the above example, the file URLs.txt contains a URL of a page that definitely matches the regexp. I've checked it, and it does match it. So the problem seems to be that I can't get the script to recognise $wikipage3. You'll notice that I had added $wikipage2 and $wikipage3 later, because I initially thought I could pipe the InetGet contents directly into the StringRegExp... but that doesn't seem to be the case.

How can I tell AutoIt to do the StringRegExp on the page that was downloaded using InetGet?

Thanks

Samuel

Hi,

;Extract link

Global $str = '<li class="interwiki-nl"><a href="http://nl.wikipedia.org/wiki/Wetenschap">Nederlands</a></li>'

$link = StringRegExp($str, '(?<=nl\.wikipedia\.org/wiki/).*(?=">Nederlands)', 1)
MsgBox(64, 'Link', $link[0])

$result = StringRegExp($str, '(nl\.wikipedia\.org/wiki/)(.+)(">Nederlands)', 1)
MsgBox (64, "Link", $result[1], 10)

So long,

Mega

Scripts & functions Organize Includes Let Scite organize the include files

Yahtzee The game "Yahtzee" (Kniffel, DiceLion)

LoginWrapper Secure scripts by adding a query (authentication)

_RunOnlyOnThis UDF Make sure that a script can only be executed on ... (Windows / HD / ...)

Internet-Café Server/Client Application Open CD, Start Browser, Lock remote client, etc.

MultipleFuncsWithOneHotkey Start different funcs by hitting one hotkey different times

Link to comment
Share on other sites

Global $str = '<li class="interwiki-nl"><a href="http://nl.wikipedia.org/wiki/Wetenschap">Nederlands</a></li>'

$link = StringRegExp($str, '(?<=nl\.wikipedia\.org/wiki/).*(?=">Nederlands)', 1)
MsgBox(64, 'Link', $link[0])

[...] instead? My main hiccup now is to get AutoIt to do the regex search on the entire opened file.

I guess I could use FileReadLine to read one line at a time and run the regex on each line, but that would slow the script down tremendously.

Thanks again

Samuel

Hi,

Something like this:

$link = StringRegExp(FileRead(FileOpen('c:\test.txt', 0)), '(?<=nl\.wikipedia\.org/wiki/).*(?=">Nederlands)', 1)
If IsArray($link) Then MsgBox(64, "link", $link[0])

So long,

Mega
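
For the download step asked about earlier, here is a sketch of the whole flow under stated assumptions: a recent AutoIt release, `URLs.txt` holding one URL per line as in the question, and a temp file name that is purely a placeholder. The second argument of FileRead is a character count, so the `0` in the earlier script may be why nothing came back; that is a guess, not something confirmed in the thread. Leaving the count out reads the whole file in one call, so no line-by-line loop is needed.

```autoit
; Placeholder names: adjust the URL list and temp file to your setup.
Local $sUrl = FileReadLine("URLs.txt", 1) ; first line of the URL list
Local $sTmpFile = @TempDir & "\wikipage.html"

; Without the background option, InetGet waits until the download has finished.
InetGet($sUrl, $sTmpFile)

; A file name with no count makes FileRead return the whole file as one string.
Local $sPage = FileRead($sTmpFile)
If StringLen($sPage) = 0 Then
    MsgBox(16, "Download", "Nothing was read from " & $sTmpFile)
    Exit
EndIf

; [^"]+ stops at the closing quote of the href, so the match stays inside one link.
Local $aTitle = StringRegExp($sPage, 'nl\.wikipedia\.org/wiki/([^"]+)">Nederlands', 1)
If @error Then
    MsgBox(48, "StringRegExp", "No Dutch interwiki link found")
Else
    MsgBox(64, "Link", $aTitle[0])
EndIf
```

The single capturing group is a design choice: it puts the page title at index 0 without needing lookarounds.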
