tsue 0 Posted December 12, 2010

Hello, I'm trying to extract part of each link from some HTML. Example:

<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_lantys.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_lantys.jpg" BORDER="0" ALT="Alcione&Lantys" WIDTH="150" HEIGTH="150"></A></TD>
<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_sierra01.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_sierra01.jpg" BORDER="0" ALT="Alcione&Sierra" WIDTH="150" HEIGTH="150"></A></TD>
<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_sierra02.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_sierra02.jpg" BORDER="0" ALT="Alcione&Sierra" WIDTH="150" HEIGTH="150"></A></TD>
<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_zagato01.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_zagato01.jpg" BORDER="0" ALT="Alcione&Zagato" WIDTH="150" HEIGTH="150"></A></TD>
</TR><TR>

I only need the part of each HREF from "images" to ".jpg", for example:

images/magic_knight_rayearth-love-alcione_lantys.jpg

I've searched the AutoIt site but found nothing. Is this possible?
Realm 18 Posted December 12, 2010

Hello tsue,

Have you tried StringRegExp()? This example works fine with your HTML example.

#include <Array.au3>

; @DesktopDir has no trailing backslash, so add one before the file name
$fPath = @DesktopDir & '\HTML_Test.txt'
$HTML_text = FileRead($fPath)
; Flag 3 returns an array of all global matches
$sre_Array = StringRegExp($HTML_text,'IMG SRC\=\"(.*?)\" BORDER',3)
_ArrayDisplay($sre_Array)

Realm

My Contributions:
Unix Timestamp: Calculate Unix time, or seconds since Epoch, accounting for your local timezone and daylight savings time.
RegEdit Jumper: A Small & Simple interface based on Yashied's Reg Jumper Function, for searching Hives in your registry.
tsue 0 Posted December 12, 2010

I'm trying this, but I can't manage to get just images/magic_knight_rayearth-love-alcione_lantys.jpg. Here is the code:

$array = StringRegExp("<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_lantys.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_lantys.jpg" BORDER="0" ALT="Alcione&Lantys" WIDTH="150" HEIGTH="150"></A></TD>", "<(?i)TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="(.*?)"><(?i)IMG SRC="galeria/_magic_knight_rayearth-love-alcione_lantys.jpg" BORDER="0" ALT="Alcione&Lantys" WIDTH="150" HEIGTH="150"></A></TD>" , 3)
for $i = 0 to UBound($array) - 1
    msgbox(0, "RegExp Test with Option 2 - " & $i, $array[$i])
Next
Jury 12 Posted December 12, 2010

Try this:

$array = StringRegExp($HTML_text, '(?s).*?(images.*?jpg).*?', 3, 1)
For $i = 0 To UBound($array) - 1
    MsgBox(0, "RegExp Test with Option 1 - " & $i, $array[$i])
Next
GEOSoft 67 Posted December 12, 2010 (edited)

; +? is lazy, so the capture ends at the first .jpg after HREF="
$aImages = StringRegExp($sHTML, "(?i)HREF\s*=\s*\x22(.+?\.jpg)", 3)
If Not @error Then
    For $i = 0 To UBound($aImages) - 1
        MsgBox(4096, "Result " & $i + 1, $aImages[$i])
    Next
Else
    MsgBox(4096, "Error", "The expression returned error code " & @error)
EndIf

Another one that might be what you are really needing would be:

$aImages = StringRegExp($sHTML, "(?i)<img\ssrc\s*=\s*\x22(.+?\.jpg)", 3)
If Not @error Then
    For $i = 0 To UBound($aImages) - 1
        MsgBox(4096, "Result " & $i + 1, $aImages[$i])
    Next
Else
    MsgBox(4096, "Error", "The expression returned error code " & @error)
EndIf

EDIT: By the way, the test you ran was doomed to failure from the start because of all the double quotes in it:

"<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_lantys.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_lantys.jpg" BORDER="0" ALT="Alcione&Lantys" WIDTH="150" HEIGTH="150"></A></TD>"

The first inner double quote ends the string early. To use that text you would wrap the whole thing in single quotes and do the same for your expression. Rather than put a literal double quote in the expression, I prefer to use \x22 to check whether a double quote appears at a given position.

Edited December 12, 2010 by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.
Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.
*** The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.
Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. It is soon to be closed and replaced with something else.
"Old age and treachery will always overcome youth and skill!"
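The quoting advice above can be sketched as follows. This is only a minimal illustration, assuming a single row of the HTML from the first post; the variable names ($sHTML, $aLiteral, $aHex) are made up for the example, and ConsoleWrite is used instead of MsgBox so the script does not stop for a click.

```autoit
; Single quotes delimit the AutoIt string, so the double quotes
; inside the HTML need no escaping.
Local $sHTML = '<TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_lantys.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_lantys.jpg" BORDER="0" ALT="Alcione&Lantys" WIDTH="150" HEIGTH="150"></A></TD>'

; Option A: single-quoted pattern containing literal double quotes.
Local $aLiteral = StringRegExp($sHTML, '(?i)HREF="(.*?)"', 3)

; Option B: \x22 is the double-quote character, so the pattern can
; sit in an ordinary double-quoted AutoIt string.
Local $aHex = StringRegExp($sHTML, "(?i)HREF=\x22(.*?)\x22", 3)

; With flag 3 the result is an array only when something matched.
If IsArray($aLiteral) And IsArray($aHex) Then
    ConsoleWrite("Literal quotes: " & $aLiteral[0] & @CRLF)
    ConsoleWrite("\x22 escape:    " & $aHex[0] & @CRLF)
EndIf
```

Both patterns should capture the same HREF value; which form to use is mostly a matter of which one is easier to read inside the surrounding AutoIt string.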
tsue 0 Posted December 12, 2010

Thanks, it worked, and thanks GEOSoft for the other information too.

Now I'm trying to understand how it works. OK, so (?s) makes the following . match anything, even newlines, and * makes it repeat. The next part I don't really understand: ? (find the smallest match instead of the largest after a repeat character). But I found that if I take it out, it only gives me one result.

So was all of this first part there to search until it finds everything inside ()? And inside (), does it search from "images" to "jpg"? And why does .*? have to be added again after it?

Can you help me understand this? Thanks.
Jury 12 Posted December 12, 2010

See attached (if I can get the attachment thingy to work).

regex.html
tsue 0 Posted December 14, 2010

Thank you for all of this information. I found that this gives the same result:

$array = StringRegExp($HTML_text, '(?s)(images.*?jpg)', 3, 1)

From what I have read, I haven't found the reason why you add .*? outside (). Thanks.
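The observation above can be checked with a short script. This is a hedged sketch: the two-row sample and the file names a.jpg and b.jpg are invented stand-ins for the real HTML, and the variable names are illustrative only. It runs both patterns with flag 3 and prints the captures side by side.

```autoit
; Hypothetical two-row sample standing in for the real page.
Local $sHTML = 'A HREF="images/a.jpg"><IMG SRC="galeria/_a.jpg">' & @CRLF & _
        'A HREF="images/b.jpg"><IMG SRC="galeria/_b.jpg">'

; Pattern as posted, with .*? on both sides of the group.
Local $aWith = StringRegExp($sHTML, '(?s).*?(images.*?jpg).*?', 3)

; Same group without the surrounding .*? parts.
Local $aWithout = StringRegExp($sHTML, '(?s)(images.*?jpg)', 3)

ConsoleWrite("With outer .*?:    " & UBound($aWith) & " matches" & @CRLF)
ConsoleWrite("Without outer .*?: " & UBound($aWithout) & " matches" & @CRLF)

If IsArray($aWith) And IsArray($aWithout) Then
    For $i = 0 To UBound($aWithout) - 1
        ConsoleWrite($aWith[$i] & " | " & $aWithout[$i] & @CRLF)
    Next
EndIf
```

On this sample both arrays hold the same two captures. The regex engine already scans forward through the text to find where "images" starts, so the leading .*? adds nothing, and a trailing lazy .*? matches zero characters. With flag 3 only the captured group is returned either way.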
GEOSoft 67 Posted December 14, 2010

The *? tells PCRE to find the smallest match, which is particularly important when using (?s), because with (?s) the dot also matches newlines. Using your first post as an example, if you did not add the ? after the * it would return

images/magic_knight_rayearth-love-alcione_lantys.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_lantys.jpg" BORDER="0" ALT="Alcione&Lantys" WIDTH="150" HEIGTH="150"></A></TD><TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_sierra01.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_sierra01.jpg" BORDER="0" ALT="Alcione&Sierra" WIDTH="150" HEIGTH="150"></A></TD><TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_sierra02.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_sierra02.jpg" BORDER="0" ALT="Alcione&Sierra" WIDTH="150" HEIGTH="150"></A></TD><TD WIDTH="150px" HEIGHT="150px"><A TARGET="_blank" A HREF="images/magic_knight_rayearth-love-alcione_zagato01.jpg"><IMG SRC="galeria/_magic_knight_rayearth-love-alcione_zagato01.jpg

in element 0 of the array. Using the ? forces it to stop matching after the first .jpg and start searching for the next match.
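The difference between the greedy and lazy forms can be seen on a small input. The following is a minimal sketch under assumptions: the two-row sample and the names a.jpg, b.jpg, $aGreedy and $aLazy are invented for illustration and are not from the thread.

```autoit
; Hypothetical two-row sample standing in for the real page.
Local $sHTML = 'A HREF="images/a.jpg"><IMG SRC="galeria/_a.jpg">' & @CRLF & _
        'A HREF="images/b.jpg"><IMG SRC="galeria/_b.jpg">'

; Greedy: .* runs to the end of the text and backs up to the LAST "jpg".
Local $aGreedy = StringRegExp($sHTML, '(?s)(images.*jpg)', 3)

; Lazy: .*? grows one character at a time and stops at the FIRST "jpg".
Local $aLazy = StringRegExp($sHTML, '(?s)(images.*?jpg)', 3)

ConsoleWrite("Greedy matches: " & UBound($aGreedy) & @CRLF)
ConsoleWrite("Lazy matches:   " & UBound($aLazy) & @CRLF)

If IsArray($aLazy) Then
    For $i = 0 To UBound($aLazy) - 1
        ConsoleWrite("Lazy[" & $i & "] = " & $aLazy[$i] & @CRLF)
    Next
EndIf
```

On this sample the greedy pattern yields a single element spanning both rows, while the lazy pattern yields one element per HREF, which matches the "only one result" behaviour reported earlier in the thread.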