drever44 Posted December 8, 2008

I am fairly new to AutoIt and have been reading up on it for several weeks now. Some time ago I created a neat little script using Aldo's Macro Recorder, and it all worked rather nicely. I am now trying to convert it to AutoIt for many reasons; more functionality and the ability to compile my script into an exe file are just a few. I have read all that I can get my hands on when it comes to StringRegExp for AutoIt, and I must say I am still in the dark. I have spent several days playing around with it and can get it to do what I want most of the time, but I still don't seem to quite have it.

I have a network server with a ton of HTML pages on it that I have created. I want to move things around and reorganize them, and of course this breaks the links. I am able to get the script to go through the list of files one at a time, but I can't get it to find the links and fix them. Well, sort of: it works the first time, but if I try to loop it, the second time around things get weird.

So I am using a page from IMDb as an example. It has several links in it that are similar to what I am trying to do on my server. I want to find all similar instances of this link and change each one to a different link:

From: <a href="/name/nm0000001/">Fred Astaire</a>
To: <a href="Fred Astaire.html">Fred Astaire</a>

Now that sounds simple, I know, but here's the catch: the nm numbers are all different and the names are as well. Each number is assigned to one name, which is the reason for the loop.
Here is what I have written so far:

Code:
Do
    $Tfile = "temp.txt"
    $Tread = FileRead($Tfile)
    $glnum = StringRegExp($Tread, '<a href="/name/*.*/">', 1, 0)
    $glnum = StringTrimLeft($glnum[0], 15)
    $glnum = StringTrimRight($glnum, 3)
    MsgBox(0, 'Link Number Result', $glnum, 1)
    $glname = StringRegExp($Tread, '<a href="/name/' & $glnum & '/">*.*</a>_', 1, 0)
    $glname = StringTrimLeft($glname[0], 3)
    $glname = StringTrimRight($glname, 4)
    MsgBox(0, 'Link Name Result', $glname, 1)
    _ReplaceStringInFile("temp.txt", '<a href="/name/' & $glnum & '/">' & $glname & '</a>', '<a href="' & $glname & '.html/">' & $glname & '</a>', 0, 1)
Until $glnum = ""

Any insight as to what I am doing wrong here would be much appreciated, for I am truly here as a last-ditch effort. I can't find the answers I am looking for in other threads; some are close and I have tried them, but with no luck. Thanks for your time.
ChangMinYang Posted December 8, 2008

If you upload 'temp.txt', I will test and modify it later.
drever44 (Author) Posted December 8, 2008

I am unable to upload for some reason, so this will retrieve the file that I am using and save it as temp.txt:

Code:
InetGet("http://www.imdb.com/name/nm0000092/bio", "temp.txt", 1)

Thanks
ChangMinYang Posted December 9, 2008

Do you want this?

For $i = +1 to +3 Step +1
    $Tindex = "nm" & StringRight( "000000" & $i , 7 )
    $Tfile = "C:\TEMP\" & $Tindex & ".html"
    InetGet( "http://www.imdb.com/name/" & $Tindex & "/bio" , $Tfile , 1 )
    MsgBox( 0 , $Tindex , "Saved " & FileGetSize( $Tfile ) & " bytes to " & $Tfile , 1 )
Next
drever44 (Author) Posted December 9, 2008

Well, that is part of it. I am trying to find and change all similar instances of links like this throughout the entire page:

<a href="/name/nm0000001/">Fred Astaire</a>

and change them to this:

<a href="Fred Astaire.html">Fred Astaire</a>

That is why I was looping until it did not find any more. I was first getting the nm number and assigning it to $glnum, then getting the name associated with that number and assigning it to $glname, so that I could then find the string that contained both the number and the name and replace it with my link, <a href="Fred Astaire.html">Fred Astaire</a>.

In short, I am trying to replace <a href="/name/nm0000001/">Fred Astaire</a> with <a href="Fred Astaire.html">Fred Astaire</a>, and likewise for all other similar links. Only the numbers and names change from link to link, and the number of links varies from page to page, so it seemed a little tricky to me. Perhaps I was overthinking it.

Edited December 9, 2008 by drever44
ChangMinYang Posted December 9, 2008

Try this, friend :-)

#Include <Array.au3>
;====================================================================================================================
; AutoIt3 Forum, GoodMan
; E-Mail to ChangMin,Yang<year1969@naver.com> Republic of Korea
; Reply to http://www.autoitscript.com/forum/index.php?s=&showtopic=85645&view=findpost&p=614352
;====================================================================================================================
Local $Tindex = ""
Local $Tpath = "C:\TEMP\"
Local $Tfile = ""
Local $Turl = ""
Local $Tread = ""
Local $Tname = ""
Local $Tsave = ""
Local $TflagSpaceToDash = 0 ; [0]=Save name with SPACE , [1]=Save name without SPACE (replaced with underscore)
Local $TworkBeg = 1 ; Save from ...
Local $TworkEnd = 3 ; Save to ...

For $i = $TworkBeg to $TworkEnd Step +1
    $Tindex = "nm" & StringRight( "000000" & $i , 7 )
    $Tfile = $Tpath & $Tindex & ".html"
    $Turl = "http://www.imdb.com/name/" & $Tindex & "/bio"
    InetGet( $Turl , $Tfile , 1 )
    If @Error Then
        MsgBox( 0 , $Tindex , "INet Get Error at " & $Turl , 1 )
        If $i = 1 Then
            ExitLoop
        EndIf
    Else
        $Tread = FileRead( $Tfile )
        $Tname = StringRegExp( $Tread , '(<meta name="title" content=")([[:ascii:]]+)([ ]+\-[ ]+Biography">)' , 1 , 1 )
        If IsArray( $Tname ) = 1 AND StringLen( $Tname[1] ) >= 1 Then
            If $TflagSpaceToDash Then
                $Tsave = $Tpath & StringReplace( $Tname[1] , " " , "_" , 0 , 2 ) & ".html"
            Else
                $Tsave = $Tpath & $Tname[1] & ".html"
            EndIf
            If FileExists( $Tsave ) = 1 Then
                FileDelete( $Tsave )
            EndIf
            $Tread = StringRegExpReplace( $Tread , "(<iframe [^>]+>)(<[/]iframe>|)", "" )
            FileWrite( $Tsave , '<BASE href="http://www.imdb.com/">' & @LF & $Tread )
            MsgBox( 0 , $Tindex & " => " & $Tname[1] , "Saved SRC: " & FileGetSize( $Tfile ) & " bytes" & @CRLF & "Saved DST: " & FileGetSize( $Tsave ) & " bytes " & $Tsave , 1 )
            If FileExists( $Tfile ) = 1 Then
                FileDelete( $Tfile )
            EndIf
        EndIf
    EndIf
Next
GEOSoft Posted December 9, 2008

From: <a href="/name/nm0000001/">Fred Astaire</a>
To: <a href="Fred Astaire.html">Fred Astaire</a>

#include <Inet.au3>
$sURL = _InetGetSource("whatever page") ;; This could also be a FileRead() if they are local files.
$aRegExp = StringRegExp($sURL, "<a\s.+\s?=.?/.+</a>", 3) ;; Get the links into an array.
If NOT @Error Then
    For $i = 0 To Ubound($aRegExp) -1 ;; Now we will replace them
        $sURL = StringRegExpReplace($sURL, "(?i)<a\s.+\s?=.?/.+>(.+)</a>", '<a href="$1\.html">$1</a>')
    Next
EndIf

Edit: Removed a capturing group that it didn't need and changed the back-reference number.

Edited December 9, 2008 by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums. Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.
*** The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.
Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. It is soon to be closed and replaced with something else.
"Old age and treachery will always overcome youth and skill!"
drever44 (Author) Posted December 9, 2008

Wow, from the looks of this I was not even close. It must be the downfall of knowing too many different programming languages or something, for I thought I could pick this up rather easily. After much reading and digging around, the list of terms and commands is quite overwhelming, but I will definitely continue to absorb this a little at a time. I will be able to pick this apart tonight and see if I can get it to work with my server. Thanks.

Oh, and one more thing: am I understanding this correctly? At the top of the script there is a list of local strings. Are these to be filled in with the defaults, or are you just clearing them in preparation for the rest of the script?

Many Thanks

Edited December 9, 2008 by drever44
GEOSoft Posted December 9, 2008

He is just declaring the variables for the rest of the script.

Do you already have an array of files? Are there local copies of the files? If they are only on the server, then just put the code I gave you inside another loop. Shorter and faster. I can give you another example for that. If they are local, then it's still shorter to use the RegExp, and I would use it as a separate function in that case.

EDIT: BTW, don't forget that if you have images etc., where the HTML code contains something like src="/images/image1.jpg", then you will have to update all of those as well. Most of what you want could be done in a single function.

Edited December 9, 2008 by GEOSoft
drever44 (Author) Posted December 9, 2008

And thanks to you, GEOSoft, I will utilize your example too. It really helps me if I have something to work from or an example to learn from. Some of the terminology in the help definitions is overwhelming. I see that I will be up late tonight going over all this.

Many Thanks
GEOSoft Posted December 9, 2008

No problem, but make sure you check the edit on my last post.
drever44 (Author) Posted December 9, 2008

Well, I am trying to accomplish several things here. I am an avid movie lover, and I have created a large storage array for my store-bought movies (over 14 terabytes) and also created several databases with movie info, actors and so on. I want to cross-reference them so that if I click on an actor, I get a list of the movies I have that the actor is in. Then I can click on a movie, get its info, and even watch it from any computer in my house, or send it to my projector room for a family viewing.

Most of my files are local, and I will also be gathering some info such as biographies of the actors and plots for the movies. So my dataset is local and the data is not, if that makes sense. After all is done, all the info will be local on my server.

And I would love to look at any examples that you would like to show me. The more the better. This is the best way for me to learn; I know that there is more than one way to skin a cat.

Many Thanks.
GEOSoft Posted December 9, 2008

If they are local then you don't need to use _InetGetSource(). Assuming that you already have an array of the files including the path, we will call that array $aFiles.

For $i = 1 To Ubound($aFiles) -1 ;; If the array is 0 based then change "$i = 1" to "$i = 0"
    $sNewCode = _ModLinks($aFiles[$i])
    If NOT @Error Then
        $oFile = FileOpen($aFiles[$i], 2)
        FileWrite($oFile, $sNewCode)
        FileClose($oFile)
    EndIf
Next

Func _ModLinks($sStr)
    If FileExists($sStr) Then $sStr = FileRead($sStr)
    $aRegExp = StringRegExp($sStr, "<a\s.+\s?=.?/.+</a>", 3) ;; Get the links into an array.
    If NOT @Error Then
        For $i = 0 To Ubound($aRegExp) -1 ;; Now we will replace them
            $sStr = StringRegExpReplace($sStr, "(?i)<a\s.+\s?=.?/.+>(.+)</a>", '<a href="$1\.html">$1</a>')
        Next
        Return $sStr
    EndIf
    Return SetError(1) ;; The array could not be created so set @Error to 1
EndFunc
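For the specific From/To example given at the start of the thread, a narrower pattern may be easier to reason about. The following is only a sketch, not GEOSoft's method: it assumes every link to be rewritten has exactly the form <a href="/name/nmNNNNNNN/">Name</a>, and the sample HTML (including the second, made-up link) is there purely for illustration.

```autoit
; Sketch only: assumes links look exactly like <a href="/name/nm0000001/">Name</a>.
; The sample text below is made up for illustration.
Local $sHtml = '<p><a href="/name/nm0000001/">Fred Astaire</a> and ' & _
        '<a href="/name/nm0000099/">Jane Doe</a></p>'

; [^<]+ stops at the next "<", so each link is matched on its own
; even when several links share one line.
Local $sPattern = '(?i)<a\s+href="/name/nm\d+/">([^<]+)</a>'

; StringRegExpReplace replaces every match in one call (count defaults to 0 = all),
; so no loop over an array of matches is needed. $1 is the captured display name.
Local $sFixed = StringRegExpReplace($sHtml, $sPattern, '<a href="$1.html">$1</a>')
Local $iReplaced = @extended ; number of replacements performed

ConsoleWrite($iReplaced & " link(s) rewritten" & @CRLF & $sFixed & @CRLF)
```

Matching on the literal /name/nm prefix means other links in the page (images, stylesheets, external sites) are left alone, which a broad pattern built from .+ cannot guarantee.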
drever44 (Author) Posted January 2, 2009

I have been trying to understand the many handles for this function, and I must say I have had little luck. Perhaps something a little simpler would help, like returning the title of a page from between <title> and </title> in the <head>. Could you provide me with links that would better explain this? Or maybe I am using the wrong function entirely for this seemingly simple task. Oh, and I have picked your examples apart and fiddled with them quite extensively, and I just can't understand why they work.
GEOSoft Posted January 2, 2009

Can you post the method you use for getting the file list into an array? Then perhaps I can add more comments. The comments in the function itself should be sufficient to understand what's happening there.

$sStr is either a block of text or a file path and name. A file name alone will seldom work with file functions. $aRegExp is an array of the links found in the block or file ($sStr). The function then checks that the array was created; if not, it returns an error. If there is no error, it replaces the found links in the text with the proper string by looping through the array.

I wouldn't exactly call 2 handles "many". 5 if you count those in the code block before the function itself; still not "many".

I really suspect that your issue may be related to not sending the full path to the function. Test it yourself by sending the file contents (use FileRead) of the first element of your file array to the clipboard and then pasting it into a new text editor window. Did you get what you expected?

You could also test my function by simply copying one of your HTML pages to the clipboard and then calling the function as below:

ClipPut(_ModLinks(ClipGet()))

Then paste the new clipboard contents into a new blank editor page.

Edited January 2, 2009 by GEOSoft
drever44 (Author) Posted January 4, 2009

Sorry, when I mentioned handles I was referring to the list below "matching characters" in the function help for StringRegExp. I am getting to understand them a little better; I have been messing around with them and came up with the code below, but I am not sure that it is exactly correct. I have taken a step back from what you last posted and am trying something a little easier so I can better understand how StringRegExp works. As a learning exercise I am trying to extract the name of an HTML page into a string. There is also an occasional &#xxx; HTML code that I will have to work out later.

FileOpen("c:/html/0012.html", 0)
$myfile = FileRead("c:/html/0012.html")
FileClose("c:/html/0012.html")
$name = StringRegExp($myfile, '<title>(.*?)</title>', 1)
MsgBox(0, "Output", $name[0], 20)

I have declared the strings before using them, and I am using a MsgBox to test the string just to see if it is outputting what I want. However, I do not understand why I have to put [0] after the string name to view it in the message box, or to use it in combination with other strings for that matter. I have searched the help files and cannot find any definition of usage for this type of command, if you can call it that. How can I get this to work without using [0]? Sometimes it errors out for seemingly no reason on the MsgBox line with:

==> Subscript used with non-Array variable.:

Perhaps you could point me in the right direction here. This is something I can do in just a few lines of code with VBScript. Perhaps that is my problem, for I keep thinking in VBScript.

Many thanks.
GEOSoft Posted January 4, 2009

In this case you are only expecting the title tag to appear once, so the usage StringRegExp(String, Expression, 1) is correct. If you look at the help file you will see that any flag except 0 returns an array. RegExp arrays are zero-based, which in this case means the data will be in element 0, hence the need to use [0].

Another tip: in most cases there is no need to open and close the file to read it. Just use FileRead("c:/html/0012.html").

In the function I gave you, the first RegExp is expecting 1 or more instances of the string, so I used flag 3. The breakdown for that RegExp is:

Find any occurrences of "<a" followed by a whitespace character (\s), then get anything (.+) until we find an equals sign, which may or may not be preceded by a space (\s?=) and may be followed by another character (.?), which is usually a double quote but in HTML could also be a single quote, followed by a slash and anything until the end of the tag (/.+</a>).

The replace uses much the same pattern, applied to the text in $sStr, with one addition. "(?i)<a\s.+\s?=.?/.+>(.+)</a>" translates to: case-insensitive [(?i)], start with the literal string "<a", which is followed by a space and anything up to a ">", but that must include an = sign and may or may not contain a space before the = sign. We are actually ignoring all of this because the part we want follows the ">". That part I enclosed in parentheses so we can back-reference it [(.+)], which means that we get everything up to but NOT INCLUDING the "</a>".

The replacement string is what we want the string to start with (<a href="), followed by the part we saved. In this case I back-referenced that with $1, but I could just as easily have used \1 (no difference). The back-reference is followed by the remainder of our URL, which is just (.html">). Then I put in the display text, which is just the part we saved, again back-referenced as $1, and closed the link with </a>.

It may seem complex, but really it's not if you compare the breakdown to the actual code I gave you.

RegExps are difficult for most developers because there are so many different engines available and many things are not consistent between the engines. Also, most developers will avoid their use if there is another fairly simple method of doing the same thing. In this case RegExp was the better way to go. Just don't fall into the trap of overusing them, primarily because they are generally slower than normal String functions, and there is no sense in banging your head against a wall just so you can write slower code.
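The "Subscript used with non-Array variable" error reported earlier is most likely what happens when the pattern finds nothing: no array comes back, so [0] has nothing to index. A minimal sketch of guarding against that, assuming the page text is already in a string (the sample HTML and the variable names here are made up for illustration):

```autoit
; Sketch only: $sHtml stands in for the result of FileRead() on a page.
Local $sHtml = '<html><head><title>Fred Astaire</title></head></html>'

; Flag 1 returns an array of captured groups on a match.
; When nothing matches, @error is set to 1 and no array is returned.
; (?is) makes the match case-insensitive and lets "." span line breaks.
Local $aTitle = StringRegExp($sHtml, '(?is)<title>(.*?)</title>', 1)

If @error Or Not IsArray($aTitle) Then
    MsgBox(0, "Output", "No <title> tag found", 20)
Else
    ; Element 0 holds the first (and here only) captured group.
    MsgBox(0, "Output", $aTitle[0], 20)
EndIf
```

Checking @error right after the call, before touching the result, is the design choice that matters here; the [0] itself cannot be avoided when a capture is wanted.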
drever44 (Author) Posted January 5, 2009

Thanks, I can actually understand that now. I also went back and started to reread the help files and found a tutorial on StringRegExp that I must have overlooked before, and that helped a lot. Then I came across @extended, which really shed light on the [0] deal. This is very intriguing to me. I did a 48-hour session straight through, playing around with StringRegExp in a While loop using @extended, and I must say it's a little tricky, but I can see the need for it now. I was able to extract the name from an HTML file and replace all the &#XXX; instances with their corresponding characters, so I have made some progress since my first post.

However, I have experienced some issues. A few times I had a hard crash, and once in a while it seems to skip over a &#XXX; for some reason, sort of like it is running so fast it trips.

The help file said that open and close were not necessary but that it was good practice to do it that way, so that is why I did it.

I was also playing around with StringInStr. Am I right in understanding that this function will not produce a string but only tell you whether it exists?

And my last question: is it possible to do the equivalent of StringRegExp without the array? The script that I have in mind will probably contain 30 to 35 StringRegExp calls, unless you think this is overusing it.
GEOSoft Posted January 5, 2009 (edited)

Without seeing the actual RegExps and the page you are running them against, I won't venture a guess at why it missed some. RegExp doesn't run fast at any time, so I doubt that it "trips". Yes, good antiquated code practice dictates that it's better to use FileOpen(), and I guess I should use it more.
I seldom use it unless I want to delete the contents of the file before writing to it again, or if the files I'm reading are very large, in which case the file handle method is faster than a simple FileRead().

    $hFile = FileOpen($File, 2) ; mode 2 = write, erasing previous contents
    FileWrite($hFile, MyFunc($sStr))
    FileClose($hFile)

One of the reasons I don't bother with FileOpen() when doing a read is that I will often write functions like

    Func MyFunc($sStr)
        ; If $sStr is the path of an existing file, work on its contents instead
        If FileExists($sStr) Then $sStr = FileRead($sStr)
        ;; Do some string manipulation here
        Return $sStr
    EndFunc

StringInStr() does not by itself return a string. It returns the starting position of the given substring, or 0 if the substring cannot be found. It can be used as a reference point in other string functions, though.

    $sStr = "This is some string of text."
    MsgBox(4096, "RESULTS", StringMid($sStr, StringInStr($sStr, "some")))

Without using an array, you cannot RETURN the results of a RegExp. It can be used to verify a string, though, much like StringInStr().

    If StringRegExp($sStr, "(?i)this\s.*\.") Then
        ;; do whatever.
    EndIf

That call could also be written as

    If StringRegExp($sStr, "(?i)this\s.*\.", 0) Then

Edited January 5, 2009 by GEOSoft
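To put the three behaviours from this reply side by side, here is a small sketch that reuses the sample string from the post. The variable names and the second pattern are illustrative only. The last call uses StringRegExpReplace(), which is not mentioned in the reply; it is included as one possible way to get a plain string back without handling an array, since it returns the modified string.

```autoit
Local $sStr = "This is some string of text."

; StringInStr() returns a 1-based position, or 0 when the substring is absent.
Local $iPos = StringInStr($sStr, "some")

; StringRegExp() with flag 0 (the default) only reports match / no match as 1 / 0.
Local $iMatch = StringRegExp($sStr, "(?i)this\s.*\.", 0)

; StringRegExpReplace() returns a string. Here the whole input is replaced by
; the first captured group, i.e. the word that follows "some".
; If the pattern does not match, the input comes back unchanged.
Local $sWord = StringRegExpReplace($sStr, "(?i)^.*?\bsome\s+(\w+).*$", "$1")

ConsoleWrite("Position: " & $iPos & @CRLF)
ConsoleWrite("Matched:  " & $iMatch & @CRLF)
ConsoleWrite("Word:     " & $sWord & @CRLF)
```

Because StringRegExpReplace() hands back the original string when nothing matches, a script that relies on it for extraction should check the match first, for example with the flag 0 test above.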
drever44 Posted January 19, 2009 Author

OK, I have that finished now. I figured it out, thank you for the help. I am now on a new issue with StringRegExp, and perhaps you could help me once more. I want to return a complete "<table> .*? </table>" from a string using StringRegExp. The problem is that the table spans multiple lines and the number of @CR in the table varies. Is this possible? If not, what function should I use? Thank you.
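The thread as captured here ends without a reply to this question, so the following is only a sketch of one common approach, not an answer from the original posters. By default "." in a PCRE pattern does not match line breaks; the (?s) option changes that, so a lazy .*? can run across any number of @CR or @LF characters. The helper name _GetFirstTable and the sample HTML are made up for illustration.

```autoit
; Hypothetical helper: returns the first <table>...</table> block,
; or "" with @error set to 1 when no table is found.
Func _GetFirstTable($sHtml)
    ; (?i) = case-insensitive, (?s) = "." also matches @CR and @LF.
    ; The outer parentheses capture the whole table as element [0].
    Local $aMatch = StringRegExp($sHtml, "(?is)(<table\b.*?</table>)", 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc

Local $sHtml = "<p>intro</p>" & @CRLF & _
        "<table border=1>" & @CRLF & _
        "<tr><td>one</td></tr>" & @CRLF & @CRLF & _
        "</table>" & @CRLF & "<p>outro</p>"

ConsoleWrite(_GetFirstTable($sHtml) & @CRLF)
```

The lazy match stops at the first </table> it meets, so this sketch assumes the tables are not nested. Changing the flag from 1 to 3 would return every table in the string as a separate array element.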