stratosgr Posted February 10, 2010

Hey, I have a log file with multiple lines like:

    2010.2.1 9:34:46 - 10.0.11.47 http://www.google.com GET 9143 -20 1 200 text/html Default -
    2010.1.31 15:43:34 - 10.0.11.46 http://download.windowsupdate.com/v9/blah/blah/bla01200122.cab GET 20742 0 1 200 - Default -
    2010.2.1 9:33:56 - 10.0.11.47 http://login.live.com/ppcrlcheck.srf GET 63 0 1 200 text/html Default -

How can I grab only the URLs? I used StringRegExp($str, '(http://.*?/)', 3), which worked fine for URLs ending with a '/'. The problem is URLs that are followed by a space rather than a slash. Any suggestions?
PsaltyDS Posted February 10, 2010

Like this:

    #include <Array.au3>

    ; Sample input: the three log lines from the question, joined with line breaks
    $sInput = "2010.2.1 9:34:46 - 10.0.11.47 http://www.google.com GET 9143 -20 1 200 text/html Default -" & @CRLF & _
            "2010.1.31 15:43:34 - 10.0.11.46 http://download.windowsupdate.com/v9/blah/blah/bla01200122.cab GET 20742 0 1 200 - Default -" & @CRLF & _
            "2010.2.1 9:33:56 - 10.0.11.47 http://login.live.com/ppcrlcheck.srf GET 63 0 1 200 text/html Default -"

    ; Capture everything from http:// up to " GET "; flag 3 returns an array of all matches
    $aOutput = StringRegExp($sInput, "(http://.+)(?: GET )", 3)
    If IsArray($aOutput) Then _ArrayDisplay($aOutput, "$aOutput")

Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law
stratosgr Posted February 10, 2010

Thanks for your quick reply, PsaltyDS. I also tried this and it didn't work for me. Those lines were just an example. The actual file might have asterisks, question marks and other symbols after the URL. Also, some lines end with a simple space. Can you modify the statement to stop at a space?
PsaltyDS Posted February 10, 2010

This is a smarter pattern anyway:

    "(http://\S+)"

Note that's a capital S (any non-whitespace character).
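A minimal sketch of how the \S+ pattern behaves on the kind of lines described above. The two sample lines are made up for illustration (they are not from the actual log), and the variable names are arbitrary.

```autoit
; Hypothetical sample lines: one URL followed by a space and an asterisk,
; one URL that is the last thing on its line.
Local $sInput = "2010.2.1 9:34:46 - 10.0.11.47 http://www.google.com/search?q=a_b&x=1 * GET 9143" & @CRLF & _
        "2010.2.1 9:35:02 - 10.0.11.47 http://login.live.com/ppcrlcheck.srf"

; \S+ consumes non-whitespace characters only, so each match stops at the
; first space, tab or line break after the URL. Flag 3 returns all matches.
Local $aUrls = StringRegExp($sInput, "(http://\S+)", 3)

If IsArray($aUrls) Then
    For $i = 0 To UBound($aUrls) - 1
        ConsoleWrite($aUrls[$i] & @CRLF)
    Next
EndIf
```

Because the match ends at whitespace rather than at a fixed token such as " GET ", whatever symbols follow the space do not matter.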
Mison Posted February 11, 2010

    "http://[a-zA-Z0-9./-]+"

Hi ;)
PsaltyDS Posted February 11, 2010

Quoting Mison:

    "http://[a-zA-Z0-9./-]+"

That pattern would leave out other valid punctuation that might appear in a URL, like underscore or ampersand.
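To make the difference concrete, here is a small sketch that runs both patterns against the same line. The URL is a made-up example containing an underscore, a question mark and an ampersand; it does not come from the original log.

```autoit
; Hypothetical log fragment with punctuation that the character class omits.
Local $sLine = "10.0.11.47 http://example.com/get_file.php?id=7&lang=en GET 63"

; The explicit character class stops at the first character it does not list.
Local $aClass = StringRegExp($sLine, "(http://[a-zA-Z0-9./-]+)", 3)

; \S+ keeps going until the first whitespace character.
Local $aNonSpace = StringRegExp($sLine, "(http://\S+)", 3)

If IsArray($aClass) Then ConsoleWrite("Character class: " & $aClass[0] & @CRLF)
If IsArray($aNonSpace) Then ConsoleWrite("Non-whitespace:  " & $aNonSpace[0] & @CRLF)
```

The character-class version ends its match at the underscore in get_file, while \S+ continues to the space before GET.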
Mison Posted February 11, 2010

Erm, you're right. \S is by far the best pattern; its only flaw is that it also accepts non-whitespace characters that are not valid in a URL. Just so you know, spaces can occur in URLs too, but only percent-encoded as %20, so \S+ still captures them.
stratosgr Posted February 11, 2010

Thank you, people! I really appreciate your replies. \S does the trick really nicely.

Another question: can I stop at the first whitespace OR the first '/'? I'm only interested in, for example, www.google.com/ and not www.google.com/search..blahblahblah
PsaltyDS Posted February 11, 2010

No more spoon feeding - you're too old for the high chair. This has gone on long enough for you to put some effort into learning it. Read the help file under StringRegExp(), write a short demo script and try some changes to the pattern.
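For readers following that advice, one possible starting point for such a demo script is sketched below. It is an assumption about what was wanted, not an answer given in the thread: it uses a negated character class to stop at the first whitespace or '/', and the sample lines are invented.

```autoit
; Hypothetical sample lines, one URL with a path and one without.
Local $sInput = "10.0.11.47 http://www.google.com/search?q=test GET 9143" & @CRLF & _
        "10.0.11.47 http://login.live.com GET 63"

; [^\s/]+ matches characters that are neither whitespace nor '/', so the
; captured group ends at whichever of the two comes first.
Local $aHosts = StringRegExp($sInput, "http://([^\s/]+)", 3)

If IsArray($aHosts) Then
    For $i = 0 To UBound($aHosts) - 1
        ConsoleWrite($aHosts[$i] & @CRLF)
    Next
EndIf
```

Moving the parentheses so they enclose http:// as well would return the scheme together with the host name.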