StringRegExp question


Hi all, I have a question.

For example, I have these MU (MegaUpload) links:

http://www.megaupload.com/?d=VQDRSKUK
http://www.megaupload.com/?d=M3U74HUR
http://www.megaupload.com/?d=WD8ERIWP
http://www.megaupload.com/?d=9ICL8Z3S
http://www.megaupload.com/?d=H5VY3YD0
http://www.megaupload.com/?d=VI9PW4UV
http://www.megaupload.com/?d=GL8SNXWD
http://www.megaupload.com/?d=D7OWADBB
http://www.megaupload.com/?d=HKKB8W4L
http://www.megaupload.com/?d=37S4MJ6A
http://www.megaupload.com/?d=YDE0AEOT

NOTE: These are not illegal links; they're free game download links :mellow:

Now I want to grab all the links with StringRegExp, which I can do just fine. But I also want to grab each link's ID. For example, given this link:

http://www.megaupload.com/?d=VQDRSKUK

I want to grab the code after "?d=", in this example:

VQDRSKUK

I have this code:

#include <Array.au3>

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
; First pass: collect every MegaUpload link in the file
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)
_ArrayDisplay($Regex_Link, "Regex Link!")
; Second pass: join the links into one string and try to capture each ID
$String_Link = _ArrayToString($Regex_Link)
$Regex_String = StringRegExp($String_Link, "megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})", 3)
_ArrayDisplay($Regex_String, "Regex ID")
FileClose($Open)

The first regex grabs the MegaUpload links fine, but the second regex:

"megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})"

grabs only the last link ID in the link.txt file.

The link.txt file contains the MegaUpload links listed above. Why does the script grab only the last link ID in the txt file? How can I grab all the link IDs?

Hi!

Edited by StungStang

Hello StungStang,

What you're asking is easier done than said.

Simply add a loop to your script :mellow:

Take a look at the following link: http://dundats.mvps.org/help/html/keywords/For.htm (by GEOSoft, so it should be reliable :) )

http://www.autoitscript.com/autoit3/docs/functions/StringRegExp.htm


Try this:

#include <Array.au3>

$links = _
"http://www.megaupload.com/?d=VQDRSKUK" & @LF & _
"http://www.megaupload.com/?d=M3U74HUR" & @LF & _
"http://www.megaupload.com/?d=WD8ERIWP" & @LF & _
"http://www.megaupload.com/?d=9ICL8Z3S" & @LF & _
"http://www.megaupload.com/?d=H5VY3YD0" & @LF & _
"http://www.megaupload.com/?d=VI9PW4UV" & @LF & _
"http://www.megaupload.com/?d=GL8SNXWD" & @LF & _
"http://www.megaupload.com/?d=D7OWADBB" & @LF & _
"http://www.megaupload.com/?d=HKKB8W4L" & @LF & _
"http://www.megaupload.com/?d=37S4MJ6A"

$a = StringRegExp($links, "(?m).*=(.*)", 3)
_ArrayDisplay($a)

MsgBox(0, "Test", StringRegExpReplace($links, "(?m)(.*=)(.*)", "$2"))

Br,

UEZ

Edited by UEZ

This is just a simple readline loop that trims all the way up to "?d=":

$file = FileOpen(@DesktopDir & "\test.txt", 0)

; Check if file opened for reading OK
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

While 1
    $line = FileReadLine($file)
    If @error = -1 Then ExitLoop
    $link = StringTrimLeft($line, StringInStr($line, "?d=") + 2)
    MsgBox(0, "Link", $link)
WEnd

FileClose($file)

smartee helped me out in this topic. Maybe it could help you with the loop as well.

Edited by rogue5099

StungStang wants to process the contents of a file; you've worked on a string... :)

Uhm... (?m).*=(.*)? So http://www.autoitscript.com/forum/index.php?app=forums is valid too? :)

While? For would be better :mellow:

So http://www.autoitscript.com/forum/index.php?d=app is valid? :)


While? For would be better :mellow:

I just copied it straight from the example and made it work with what he posted.

So http://www.autoitscript.com/forum/index.php?d=app is valid? :)

No, it's not valid, but there is no "?d=" in autoitscript.com's links. It's in the MegaUpload links, which is what he asked about in this topic!

I want to grab the code after "?d=", in this example:

So you search for "?d=" and delete everything up to and including it!

Now Goodware, if you have a better solution than the ones we have tried to come up with, then post it instead of saying "add a loop" and then flaming what we have posted. Maybe ours isn't the best answer; if you know a better one, then post it! That would help others more than giving links to the Help Doc!
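
For anyone following along, here is a minimal sketch of the search-and-trim idea on a single sample string rather than a file. The variable names $sLine and $iPos are made up for illustration, and the guard for a missing "?d=" is an addition that the readline loop posted earlier does not have.

```autoit
; Sketch only: $sLine is a sample value, not read from the poster's file
Local $sLine = "http://www.megaupload.com/?d=VQDRSKUK"

; StringInStr returns the 1-based position of the "?" in "?d=", or 0 if it is absent
Local $iPos = StringInStr($sLine, "?d=")

If $iPos = 0 Then
    ; Without this guard, StringTrimLeft would just cut the first 2 characters
    ConsoleWrite("No ?d= found in: " & $sLine & @CRLF)
Else
    ; Trimming $iPos + 2 characters removes everything through the "="
    ConsoleWrite(StringTrimLeft($sLine, $iPos + 2) & @CRLF)
EndIf
```

With the sample link this should write VQDRSKUK to the console. It still accepts any string containing "?d=", so it is only suitable when the input is already known to hold MegaUpload links.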


I need a more complex regex, because I don't have a .txt with only the links. It's an .html file with other code in it. With my SRE:

StringRegExp($Leggi_File, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)

I can grab all the links in the page :mellow:

The problem is here:

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")
$String_Link = _ArrayToString($Regex_Link)
$Regex_String = StringRegExp($String_Link, "megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})", 3)
_ArrayDisplay($Regex_String, "Regex ID")
FileClose($Open)

This SRE:

$Regex_String = StringRegExp($String_Link, "megaupload\.com.*(?:\?|&)(?:(?:folderi)?d|f)=([A-Z-a-z0-9]{8})", 3)

It returns only the last MegaUpload link in the list =(... Why does it do that? The SRE works perfectly when there is one link, but when there is more than one link it returns only the last one in the list. Why?

Is there a way to fix my code?

Hi!

Edited by StungStang

Might have to confirm this with Smartee. I'm not good with SRE:

#include <Array.au3>

$aID = StringRegExp(FileRead("test.txt"), "http://www\.megaupload\.com/\?d=(.+)", 3)
_ArrayDisplay($aID)

Or for your code:

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=[-a-zA-Z0-9]+", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")
$String_Link = _ArrayToString($Regex_Link)
$Regex_String = StringRegExp($Read, "http://www\.megaupload\.com/\?d=(.+)", 3)
_ArrayDisplay($Regex_String,"Regex ID")
FileClose($Open)
Edited by rogue5099

This is the magic SRE, all in one :mellow:

#include <Array.au3>

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
; Dots are escaped so they match only a literal "."; the group captures the 8-character ID
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=([-a-zA-Z0-9]{8})", 3)
_ArrayDisplay($Regex_Link, "Regex Link!")
FileClose($Open)

This is the correct SRE:

"http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=([-a-zA-Z0-9]{8})"

Hi to all :)
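
A possible explanation for why the earlier two-step pattern returned only the last ID, sketched under the assumption that the links were joined with the default "|" delimiter of _ArrayToString: the joined string contains no line breaks, so the greedy ".*" runs to the end of the string and backtracks only as far as the last "?d=". The sample string, the variable names, and the simplified patterns below are illustrative, not the original ones.

```autoit
#include <Array.au3>

; Two sample links joined with "|", as _ArrayToString does by default
Local $sJoined = "http://www.megaupload.com/?d=VQDRSKUK|http://www.megaupload.com/?d=M3U74HUR"

; Greedy: ".*" swallows the rest of the string, then backs up to the LAST "?d="
Local $aGreedy = StringRegExp($sJoined, "megaupload\.com.*\?d=([A-Za-z0-9]{8})", 3)

; Lazy: ".*?" stops at the first "?d=" that follows each "megaupload.com"
Local $aLazy = StringRegExp($sJoined, "megaupload\.com.*?\?d=([A-Za-z0-9]{8})", 3)

ConsoleWrite("Greedy: " & UBound($aGreedy) & " match(es): " & _ArrayToString($aGreedy, ", ") & @CRLF)
ConsoleWrite("Lazy:   " & UBound($aLazy) & " match(es): " & _ArrayToString($aLazy, ", ") & @CRLF)
```

The greedy version should report a single match (M3U74HUR) and the lazy version two (VQDRSKUK and M3U74HUR). The all-in-one SRE above sidesteps the problem entirely because it has no ".*" between the host and the ID.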

1) OK, I was only recommending the use of For :mellow:

2) You're wrong. Your script accepts any string that contains a question mark, the letter "d" and an equals sign ("?d=").

According to your script, even if I type "mushrooms and potatoes?d=eaten", it's okay.

The link (http://www.autoitscript.com/forum/index.php?d=app) is valid according to your expression!

3) Maybe I waited too long... here's a version that takes them one by one:

$Path = @HomeDrive & "\Text.txt" ; Your text file

; http://www.autoitscript.com/autoit3/docs/functions/FileReadLine.htm
$File = FileOpen($Path, 0)

; Check if file opened for reading OK
If $File = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

; Read the whole file through the open handle, then release it
$FileRead = StringStripWS(FileRead($File), 2)
FileClose($File)

; Dots are escaped so they match only a literal "."; [[:alpha:][:digit:]] = [[:alnum:]]
$StringRegExp = StringRegExp($FileRead, "http://www\.megaupload\.com/*\?d=([[:alpha:][:digit:]]{8})", 3)
For $y = 0 To UBound($StringRegExp) - 1
    MsgBox(0, $y, $StringRegExp[$y])
Next

StungStang, avoid using strings like A-Z-a-z0-9... you can replace them with [[:alpha:][:digit:]] or [[:alnum:]]
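
To make the objection in point 2 concrete, here is a small sketch that runs both kinds of test over three sample strings taken from this thread. The anchored pattern and the variable names are assumptions for illustration; it checks whole strings, so it is not a drop-in replacement for scanning an .html page.

```autoit
Local $aSamples[3] = ["http://www.megaupload.com/?d=VQDRSKUK", _
        "http://www.autoitscript.com/forum/index.php?d=app", _
        "mushrooms and potatoes?d=eaten"]

For $i = 0 To UBound($aSamples) - 1
    ; Loose test: only checks that "?d=" occurs somewhere in the string
    Local $bLoose = (StringInStr($aSamples[$i], "?d=") > 0)
    ; Strict test: the host, "?d=" and exactly eight alphanumerics must all match
    Local $bStrict = (StringRegExp($aSamples[$i], "^http://www\.megaupload\.com/\?d=[[:alnum:]]{8}$") = 1)
    ConsoleWrite($aSamples[$i] & " -> loose: " & $bLoose & ", strict: " & $bStrict & @CRLF)
Next
```

All three samples pass the loose test, but only the MegaUpload link should pass the strict one.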


This works fine for me

#include<array.au3>
$sSource = BinaryToString(InetRead("http://www.megaupload.com/?c=top100"))
$aItems = StringRegExp($sSource, "<a.+http://.+megaup.+=([[:alnum:]]+).>", 3)
_ArrayDisplay($aItems)

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


I've solved it with this code :)

$Open = FileOpen(@ScriptDir & "\link.txt")
$Read = FileRead($Open)
$Regex_Link = StringRegExp($Read, "http://www\.megaupload\.com/(?:\w\w/)?\?[fd]=([-a-zA-Z0-9]{8})", 3)
_ArrayDisplay($Regex_Link,"Regex Link!")

Try to keep my advice in mind:

StungStang, avoid using strings like A-Z-a-z0-9... you can replace them with [[:alpha:][:digit:]] or [[:alnum:]]

It's only to simplify things :mellow:

If you change

([-a-zA-Z0-9]{8})

to

([[:alnum:]]{8})

you will get the same result for these IDs, with easier-to-read code.

Another alternative is

(?i)([a-z0-9]{8})

George
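
A quick way to compare the three character classes side by side is sketched below. "VQDRSKUK" is an ID from this thread; "AB-DEFGH" is a made-up value, included only because the original class also allows a hyphen while the other two do not. The anchors (^ and $) are added here so that each test covers the whole string.

```autoit
; First test value is from the thread; the second is hypothetical and contains a hyphen
Local $aTests[2] = ["VQDRSKUK", "AB-DEFGH"]
Local $aPatterns[3] = ["^[-a-zA-Z0-9]{8}$", "^[[:alnum:]]{8}$", "(?i)^[a-z0-9]{8}$"]

For $i = 0 To UBound($aTests) - 1
    For $j = 0 To UBound($aPatterns) - 1
        ; With the default flag, StringRegExp returns 1 for a match and 0 for no match
        ConsoleWrite($aTests[$i] & " vs " & $aPatterns[$j] & " -> " & StringRegExp($aTests[$i], $aPatterns[$j]) & @CRLF)
    Next
Next
```

All three patterns should accept the real ID, and only the first should accept the hyphenated value, so the swap is safe as long as the IDs never contain a hyphen.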
