StringRegExp question

StungStang

Hi all, I have a question.

For example, I have these MU (MegaUpload) links:

http://www.megaupload.com/?d=VQDRSKUK
http://www.megaupload.com/?d=M3U74HUR
http://www.megaupload.com/?d=WD8ERIWP
http://www.megaupload.com/?d=9ICL8Z3S
http://www.megaupload.com/?d=H5VY3YD0
http://www.megaupload.com/?d=VI9PW4UV
http://www.megaupload.com/?d=GL8SNXWD
http://www.megaupload.com/?d=D7OWADBB
http://www.megaupload.com/?d=HKKB8W4L
http://www.megaupload.com/?d=37S4MJ6A
http://www.megaupload.com/?d=YDE0AEOT

NOTE: These are not illegal links; they are free game download links :mellow:

I can already grab all the links with StringRegExp. Now I also want to grab each link's ID. For example, given this link:

http://www.megaupload.com/?d=VQDRSKUK

I want to grab the code after "?d=", which in this example is:

VQDRSKUK

I have this code:

#include <Array.au3>

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
; First pass: grab every complete MegaUpload link
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")
$String_Link = _ArrayToString($Regex_Link)
; Second pass: try to grab the 8-character ID from each link
$Regex_String = StringRegExp($String_Link, "megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})", 3)
_ArrayDisplay($Regex_String,"Regex ID")
FileClose($Open)

The first regex grabs the MegaUpload links fine, but the second regex:

"megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})"

returns only the ID of the last link in the link.txt file.

The link.txt file contains the MegaUpload links listed above. Why does the script grab only the last link's ID? How can I grab all the link IDs?

Bye!

Edited by StungStang

Goodware

Hello StungStang,

What you're asking is easier done than said.

Simply use a loop in your script :mellow:

Take a look at the following link: http://dundats.mvps.org/help/html/keywords/For.htm (by GEOSoft, so it should be reliable :) )

http://www.autoitscript.com/autoit3/docs/functions/StringRegExp.htm

UEZ

Try this:

#include <Array.au3>

$links = _
"http://www.megaupload.com/?d=VQDRSKUK" & @LF & _
"http://www.megaupload.com/?d=M3U74HUR" & @LF & _
"http://www.megaupload.com/?d=WD8ERIWP" & @LF & _
"http://www.megaupload.com/?d=9ICL8Z3S" & @LF & _
"http://www.megaupload.com/?d=H5VY3YD0" & @LF & _
"http://www.megaupload.com/?d=VI9PW4UV" & @LF & _
"http://www.megaupload.com/?d=GL8SNXWD" & @LF & _
"http://www.megaupload.com/?d=D7OWADBB" & @LF & _
"http://www.megaupload.com/?d=HKKB8W4L" & @LF & _
"http://www.megaupload.com/?d=37S4MJ6A"

$a = StringRegExp($links, "(?m).*=(.*)", 3)
_ArrayDisplay($a)

MsgBox(0, "Test", StringRegExpReplace($links, "(?m)(.*=)(.*)", "$2"))

Br,

UEZ

Edited by UEZ


Rogue5099

This is just a simple readline loop that trims all the way up to "?d=":

$file = FileOpen(@DesktopDir & "\test.txt", 0)

; Check if file opened for reading OK
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

While 1
    $line = FileReadLine($file)
    If @error = -1 Then ExitLoop
    $link = StringTrimLeft($line, StringInStr($line, "?d=") + 2)
    MsgBox(0, "Link", $link)
WEnd

FileClose($file)

smartee helped me out in this topic. Maybe it could help you with the loop as well.

Edited by rogue5099

Goodware

Quoting UEZ's example above (which applies the pattern "(?m).*=(.*)" to a hard-coded string):

StungStang wants to process a file; you've worked on a string... :)

Uhm... (?m).*=(.*)? Does that mean http://www.autoitscript.com/forum/index.php?app=forums counts as valid? :)

Quoting Rogue5099's FileReadLine loop above (which trims everything up to "?d="):

While? For would be better :mellow:

And is http://www.autoitscript.com/forum/index.php?d=app valid? :)

Rogue5099

Goodware wrote: "While? Better For :mellow:"

I just copied it straight from the example and made it work with what he posted.

Goodware wrote: "http://www.autoitscript.com/forum/index.php?d=app is valid? :)"

No, it's not valid, but there is no "?d=" in autoitscript.com's links. It is in MegaUpload's links, which is what he asked about in this topic!

StungStang wrote: "I want grab the code after "?d=", in this example:"

So you search for "?d=" and delete everything before it!

Now Goodware, if you have a better solution than the ones we have come up with, then post it instead of saying "add a loop" and then flaming what we have posted. Maybe ours isn't the best answer; if you know a better one, post it! That would also help others more than giving links to the Help Doc!

StungStang

I need a more complex regex, because I don't have a .txt containing only the links. It is an .html file with other code in it. With my SRE:

StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)

I can grab all the links in the page :mellow:

The problem is here:

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")
$String_Link = _ArrayToString($Regex_Link)
$Regex_String = StringRegExp($String_Link, "megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})", 3)
_ArrayDisplay($Regex_String,"Regex ID")
FileClose($Open)

This SRE:

$Regex_String = StringRegExp($String_Link, "megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})", 3)

returns only the last MegaUpload link in the list =( Why does it do that? The SRE works perfectly if there is one link, but if there is more than one link, it returns only the last one. Why?

Is there a way to fix my code?

Bye!

Edited by StungStang

Rogue5099

Might have to confirm this with Smartee. I'm not good with SRE:

#include <Array.au3>

$aID = StringRegExp(FileRead("test.txt"), "http://www\.megaupload\.com/\?d=(.+)", 3)
_ArrayDisplay($aID)

Or for your code:

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")
$String_Link = _ArrayToString($Regex_Link)
$Regex_String = StringRegExp($Read, "http://www\.megaupload\.com/\?d=(.+)", 3)
_ArrayDisplay($Regex_String,"Regex ID")
FileClose($Open)

Edited by rogue5099

StungStang

This is the magic SRE, all in one :mellow:

#include <Array.au3>

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
FileClose($Open)
; The capture group returns only the 8-character ID of each link
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=([-a-zA-Z0-9]{8})", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")

This is the correct SRE:

"http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=([-a-zA-Z0-9]{8})"

Bye all :)
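
A note on why the earlier two-pass attempt returned only the last ID, as far as can be told from the code posted: _ArrayToString joins the links into a single line (with "|" as the default delimiter), and the greedy .* in the second pattern runs to the end of that line and backtracks only as far as the last "?d=". The following is a minimal sketch, not the poster's code; the two-link sample string and the simplified patterns are assumptions made for illustration. It compares greedy .* with lazy .*?:

```autoit
#include <Array.au3>

; Hypothetical sample: two links joined with "|", the default delimiter of _ArrayToString()
Local $sJoined = "http://www.megaupload.com/?d=VQDRSKUK|http://www.megaupload.com/?d=M3U74HUR"

; Greedy ".*" consumes the rest of the line, then backtracks to the LAST "?d="
Local $aGreedy = StringRegExp($sJoined, "megaupload\.com.*\?d=([A-Za-z0-9]{8})", 3)

; Lazy ".*?" stops at the FIRST "?d=" that follows each "megaupload.com"
Local $aLazy = StringRegExp($sJoined, "megaupload\.com.*?\?d=([A-Za-z0-9]{8})", 3)

_ArrayDisplay($aGreedy, "Greedy .*")
_ArrayDisplay($aLazy, "Lazy .*?")
```

With this sample, the greedy pattern should yield a single ID (the last one) and the lazy pattern one ID per link. The single-pass pattern above sidesteps the issue because it does not use .* at all.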

Goodware

Replying to Rogue5099's points above:

1) OK, I'm only recommending the use of For :mellow:

2) You're wrong. Your code accepts any string that contains a question mark, the fourth letter of the alphabet, and an equals sign ("?d=").

According to your script, even "mushrooms and potatoes?d=eaten" is okay.

The link (http://www.autoitscript.com/forum/index.php?d=app) is valid according to your expression!

3) Maybe I waited too long... here's a version that takes them one by one:

$Path = @HomeDrive & "\Text.txt" ; Your text file

; http://www.autoitscript.com/autoit3/docs/functions/FileReadLine.htm
$File = FileOpen($Path, 0)

; Check if file opened for reading OK
If $File = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

; Read the whole file through the open handle, strip trailing whitespace, then close it
$FileRead = StringStripWS(FileRead($File), 2)
FileClose($File)

; Capture the 8-character ID of each link
$StringRegExp = StringRegExp($FileRead, "http://www\.megaupload\.com/*\?d=([[:alpha:][:digit:]]{8})", 3) ; [[:alpha:][:digit:]] = [[:alnum:]]
For $y = 0 To UBound($StringRegExp) - 1
    MsgBox(0, $y, $StringRegExp[$y])
Next

StungStang, avoid using character strings like A-Z-a-z0-9... you can replace them with [[:alpha:][:digit:]] or [[:alnum:]]
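
The two suggestions argued over in this thread (use a For loop, and check that a line really is a MegaUpload link before extracting anything) can be combined. The following is a minimal sketch under stated assumptions: the file holds one link per line, the path is borrowed from the earlier readline example, and the native FileReadToArray function is available (it exists in recent AutoIt releases; older ones would need _FileReadToArray from File.au3 instead).

```autoit
; Assumption: test.txt on the desktop holds one link per line
Local $aLines = FileReadToArray(@DesktopDir & "\test.txt")
If @error Then
    MsgBox(0, "Error", "Unable to read file.")
    Exit
EndIf

For $i = 0 To UBound($aLines) - 1
    ; Flag 1 returns the captured groups of the first match, or sets @error when nothing matches
    Local $aID = StringRegExp($aLines[$i], "^http://www\.megaupload\.com/\?d=([[:alnum:]]{8})", 1)
    If @error Then ContinueLoop ; not a MegaUpload link, so skip the line
    ConsoleWrite($aID[0] & @CRLF)
Next
```

Anchoring the pattern at the start of the line means a string such as "mushrooms and potatoes?d=eaten" is skipped rather than trimmed.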

GEOSoft

This works fine for me

#include<array.au3>
$sSource = BinaryToString(InetRead("http://www.megaupload.com/?c=top100"))
$aItems = StringRegExp($sSource, "<a.+http://.+megaup.+=([[:alnum:]]+).>", 3)
_ArrayDisplay($aItems)

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"

StungStang

I've solved it with this code :mellow:

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=([-a-zA-Z0-9]{8})", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")

Goodware

StungStang wrote: "I've solved with this code :)"

Try to keep my advice in mind:

StungStang, avoid using character strings like A-Z-a-z0-9... you can replace them with [[:alpha:][:digit:]] or [[:alnum:]]

It's only to simplify things :mellow:

GEOSoft

If you change the

([-a-zA-Z0-9]{8})

to

([[:alnum:]]{8})

you will get the same result for these IDs with easier-to-read code. (Note that [[:alnum:]] does not match the hyphen that the original class allows.)

Another alternative is

(?i)([a-z0-9]{8})

George
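
To see the alternatives side by side, here is a minimal sketch; the sample link is taken from the first post, and the shortened "\?d=" prefix is an assumption made to keep the patterns readable.

```autoit
; Sample link from the first post in this thread
Local $sLink = "http://www.megaupload.com/?d=VQDRSKUK"

; POSIX class: letters and digits only (no hyphen)
Local $aPosix = StringRegExp($sLink, "\?d=([[:alnum:]]{8})", 3)

; Case-insensitive flag with an explicit range
Local $aCaseless = StringRegExp($sLink, "(?i)\?d=([a-z0-9]{8})", 3)

; Both calls return an array on success; print the first captured ID from each
If IsArray($aPosix) And IsArray($aCaseless) Then
    ConsoleWrite($aPosix[0] & " / " & $aCaseless[0] & @CRLF)
EndIf
```

For IDs made up only of letters and digits, both patterns should capture the same text; they differ from the original class only in rejecting a hyphen.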

