StungStang, posted February 17, 2011

Hi all, I have another question for you. I have this HTML page:

    <TABLE WIDTH="452" BORDER="0" CELLSPACING="0" CELLPADDING="0" BGCOLOR="#A6B9C8">
    <TD WIDTH="125" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica"><B>MY INFO 1</B></FONT></TD>
    <TD WIDTH="5" VALIGN="TOP" BGCOLOR="#A6B9C8"><IMG SRC="../../images/hello.gif" WIDTH=4 HEIGHT=1 BORDER=0 ALT=""></TD>
    <TD WIDTH="121" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica" SIZE="-1">MY INFO 2</FONT></TD>
    <TD WIDTH="121" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica" SIZE="-1">MY INFO 3</FONT></TD>
    <TD WIDTH="80" VALIGN="TOP" BGCOLOR="#A6B9C8"><A HREF="http://MY INFO 4.com"><IMG SRC="../../images/exit.gif" WIDTH=78 HEIGHT=16 BORDER=0 ALT="Exit"></A></TD>
    </TABLE>

I want to grab, for example, "MY INFO 1", "MY INFO 2", "MY INFO 3" and "MY INFO 4". Of course, the "MY INFO" values are only examples.

With my script I can currently grab only "MY INFO 1", using this code:

    ; _StringBetween() needs #include <String.au3> and returns an array of matches
    $MYINFO1 = _StringBetween($Source, '<FONT FACE="Arial, Helvetica"><B>', '</B>')

But I have trouble grabbing the other values. How can I grab them?

Thanks a lot!

PsaltyDS, posted February 17, 2011 (edited)

Use the _IE* functions of the IE.au3 UDF. See the help file. You could use _IETableWriteToArray(), or get a collection of the TD tags with _IETagNameGetCollection() and loop through it, getting the value of each one with _IEPropertyGet() for "innerText".

If you have to string parse it, look into StringRegExp(). See the help file.

Example with StringRegExp():

    #include <Array.au3>

    $sString = '<TABLE WIDTH="452" BORDER="0" CELLSPACING="0" CELLPADDING="0" BGCOLOR="#A6B9C8">' & _
            '<TD WIDTH="125" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica"><B>MY INFO 1</B></FONT></TD>' & _
            '<TD WIDTH="5" VALIGN="TOP" BGCOLOR="#A6B9C8"><IMG SRC="../../images/hello.gif" WIDTH=4 HEIGHT=1 BORDER=0 ALT=""></TD>' & _
            '<TD WIDTH="121" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica" SIZE="-1">MY INFO 2</FONT></TD>' & _
            '<TD WIDTH="121" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica" SIZE="-1">MY INFO 3</FONT></TD>' & _
            '<TD WIDTH="80" VALIGN="TOP" BGCOLOR="#A6B9C8"><A HREF="http://MY INFO 4.com"><IMG SRC="../../images/exit.gif" WIDTH=78 HEIGHT=16 BORDER=0 ALT="Exit"></A></TD>' & _
            '</TABLE>'

    ; Flag 3 returns an array of all matches: the text just before each </FONT
    $aRET = StringRegExp($sString, '(?U)(?:>)(?:<B>)?([^<]+)(?:</B>)?(?:</FONT)', 3)
    _ArrayDisplay($aRET)

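The _IE* route suggested above could look roughly like the following. This is only a minimal sketch, not PsaltyDS's own code: the URL is a hypothetical placeholder, and it assumes the page loads normally in Internet Explorer. It walks every TD element and prints its visible text, then prints the target of every link, which is where the "MY INFO 4" value lives in the sample table.

```autoit
#include <IE.au3>

; Hypothetical URL, replace with the real page
Local $oIE = _IECreate("http://www.example.com/tables.html")

; Collect every <TD> element on the page and print its visible text
Local $oTDs = _IETagNameGetCollection($oIE, "td")
Local $sText
For $oTD In $oTDs
    $sText = _IEPropertyGet($oTD, "innertext")
    ; Cells that hold only an image have no text, so skip them
    If $sText <> "" Then ConsoleWrite($sText & @CRLF)
Next

; The fourth value is stored in a link target rather than in cell text
Local $oLinks = _IELinkGetCollection($oIE)
For $oLink In $oLinks
    ConsoleWrite($oLink.href & @CRLF)
Next

_IEQuit($oIE)
```

Because this reads the parsed document rather than the raw source, it does not depend on attribute order or on which FONT and B tags wrap the text.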
Melba23 (Moderator), posted February 17, 2011

StungStang,

This does it:

    #include <Array.au3>

    $sText = '<TABLE WIDTH="452" BORDER="0" CELLSPACING="0" CELLPADDING="0" BGCOLOR="#A6B9C8">' & _
            '<TD WIDTH="125" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica"><B>MY INFO 1</B></FONT></TD>' & _
            '<TD WIDTH="5" VALIGN="TOP" BGCOLOR="#A6B9C8"><IMG SRC="../../images/hello.gif" WIDTH=4 HEIGHT=1 BORDER=0 ALT=""></TD>' & _
            '<TD WIDTH="121" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica" SIZE="-1">MY INFO 2</FONT></TD>' & _
            '<TD WIDTH="121" VALIGN="TOP" BGCOLOR="#A6B9C8"><FONT FACE="Arial, Helvetica" SIZE="-1">MY INFO 3</FONT></TD>' & _
            '<TD WIDTH="80" VALIGN="TOP" BGCOLOR="#A6B9C8"><A HREF="http://MY INFO 4.com"><IMG SRC="../../images/exit.gif" WIDTH=78 HEIGHT=16 BORDER=0 ALT="Exit"></A></TD>' & _
            '</TABLE>'

    ; Strip the tags and split on what is left; element [1] is empty, [2] to [4] hold the text
    $aResult = StringSplit(StringRegExpReplace(StringRegExpReplace($sText, "<(.*?)>", "<>"), "<[>|<]+>", "<>"), "<>", 1)
    ; Pull the text between "http://" and ".com"
    $sInfo4 = StringRegExpReplace($sText, ".*http:\/\/(.*?).com.*", "$1")

    ConsoleWrite($aResult[2] & @CRLF)
    ConsoleWrite($aResult[3] & @CRLF)
    ConsoleWrite($aResult[4] & @CRLF)
    ConsoleWrite($sInfo4 & @CRLF)

But I am sure a real SRE guru will come along and do it in one line in a minute! The SREs are pretty simple, but do ask if you want them explained.

M23

StungStang, posted February 17, 2011 (edited)

Oops, I have many tables structured like that in one HTML page, but all the "MY INFO" values are different. How can I adapt the script to read the contents of all the tables?

Thanks

Melba23 (Moderator), posted February 17, 2011

StungStang,

In the first line, the innermost SRER removes any characters within <>. The second SRER then collapses multiple consecutive <> into a single <>. Finally, the StringSplit splits the result on these <> and retrieves the remaining text which was not within <> at the start, and that is the first 3 items you are looking for.

The second line looks for any text between "http://" and ".com", which is how you get the final item.

So as long as your items match those criteria, you can pull them from any page. If you need info which does not match those criteria, then you will have to develop new SREs. You will really enjoy that: they are so much fun, they make my brain bleed!

M23

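To make the three steps described above easier to follow, the nested one-liner can be unrolled so that each intermediate string is visible. This is an illustrative sketch only: the input is a hypothetical, shortened version of the sample table, and the variable names are not from the original post.

```autoit
#include <Array.au3>

; Hypothetical, shortened input for illustration
Local $sText = '<TD><FONT><B>MY INFO 1</B></FONT></TD><TD><FONT>MY INFO 2</FONT></TD>'

; Step 1: replace every tag with the empty marker "<>"
Local $sStep1 = StringRegExpReplace($sText, "<(.*?)>", "<>")
ConsoleWrite($sStep1 & @CRLF) ; <><><>MY INFO 1<><><><><>MY INFO 2<><>

; Step 2: collapse each run of consecutive markers into a single "<>"
Local $sStep2 = StringRegExpReplace($sStep1, "<[>|<]+>", "<>")
ConsoleWrite($sStep2 & @CRLF) ; <>MY INFO 1<>MY INFO 2<>

; Step 3: split on the whole marker string (flag 1), not on each character
Local $aParts = StringSplit($sStep2, "<>", 1)
; $aParts[0] is the element count; [1] and the last element are empty
; because the string starts and ends with a marker
_ArrayDisplay($aParts)
```

The leading empty element is why the earlier script reads its results starting at index 2 rather than index 1.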
PsaltyDS, posted February 17, 2011

...or you could just learn to use the _IE* functions.

Melba23 (Moderator), posted February 17, 2011

PsaltyDS,

"learn to use the _IE* functions"

Quite agree, but he did ask.

M23

StungStang, posted February 17, 2011 (edited)

This is an example of my HTML file: see --->HERE<---

As you can see, I have 10 identical tables, but with different "MY INFO EXAMPLE" values.

Be careful: the last "MY INFO" value at the end may have .com or .net, etc. For example, http://www.MYINFO.com or http://www.MYINFO.net.

I tried with:

    #include <String.au3>
    #include <Array.au3>
    #include <File.au3>
    #include <Inet.au3>

    $Input = InputBox("", "Link")
    $Source = _INetGetSource($Input)

    $aResult = StringSplit(StringRegExpReplace(StringRegExpReplace($Source, "<(.*?)>", "<>"), "<[>|<]+>", "<>"), "<>", 1)
    $sInfo4 = StringRegExpReplace($Source, ".*http:\/\/(.*?).com.*", "$1")

    ConsoleWrite($aResult[2] & @CRLF)
    ConsoleWrite($aResult[3] & @CRLF)
    ConsoleWrite($aResult[4] & @CRLF)
    ConsoleWrite($sInfo4 & @CRLF)

But it doesn't work. How can I fix that?

Thanks!

Melba23 (Moderator), posted February 17, 2011

StungStang,

This works reasonably well, although you do get some false URL matches as well:

    ; Get all text not between <>
    $aResult = StringSplit(StringRegExpReplace(StringRegExpReplace($Source, "<(.*?)>", "<>"), "<[>|<]+>", "<>"), "<>", 1)
    _ArrayDisplay($aResult)

    ; Get any URLs that are between "http://" and either ".com" or ".net"
    $aURL = StringRegExp($Source, "(?i)http:\/\/(.*?)\.(?:com|net)", 3)
    _ArrayDisplay($aURL)

But as PsaltyDS keeps pointing out, you would be much better off using the IE functions that he suggested above. I have only been playing with these SREs for my own amusement and learning.

M23

StungStang, posted February 17, 2011

@M23

It doesn't work for me, because your function grabs every item that is not between "<" and ">". But my HTML does not contain only these tables; the 10 tables are only a part of my source, and I am interested only in the table content.

I looked at the IE functions in the help file, but honestly I didn't understand anything.

Is there another solution?

Thanks

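One way to stay with string parsing while ignoring everything outside the tables is to cut out each table block first and only then extract text and links from inside it. The following is a sketch under stated assumptions, not code from the thread: the page source is a hypothetical stand-in, the tables are assumed not to be nested, and the link is assumed to be written as HREF="..." with double quotes.

```autoit
; Hypothetical page source: unrelated text plus two tables
Local $sSource = '<P>Unrelated text</P>' & _
        '<TABLE><TD><B>INFO A1</B></TD><TD>INFO A2</TD>' & _
        '<TD><A HREF="http://a.example.com"><IMG SRC="exit.gif"></A></TD></TABLE>' & _
        '<P>More unrelated text</P>' & _
        '<TABLE><TD><B>INFO B1</B></TD><TD>INFO B2</TD>' & _
        '<TD><A HREF="http://b.example.net"><IMG SRC="exit.gif"></A></TD></TABLE>'

; (?i) ignores case, (?s) lets . match line breaks; flag 3 returns all matches
Local $aTables = StringRegExp($sSource, "(?is)<table\b.*?</table>", 3)
If @error Then
    ConsoleWrite("No tables found" & @CRLF)
    Exit
EndIf

Local $aText, $aHref
For $i = 0 To UBound($aTables) - 1
    ConsoleWrite("Table " & ($i + 1) & ":" & @CRLF)

    ; Text that sits between a closing ">" and the next "<"
    $aText = StringRegExp($aTables[$i], ">([^<>]+)<", 3)
    If Not @error Then
        For $j = 0 To UBound($aText) - 1
            ConsoleWrite("  text: " & $aText[$j] & @CRLF)
        Next
    EndIf

    ; Link targets inside this table only
    $aHref = StringRegExp($aTables[$i], '(?i)href="([^"]+)"', 3)
    If Not @error Then
        For $j = 0 To UBound($aHref) - 1
            ConsoleWrite("  link: " & $aHref[$j] & @CRLF)
        Next
    EndIf
Next
```

Reading the full HREF value avoids having to list every possible ending such as .com or .net, and the "Unrelated text" paragraphs never reach the output because they are outside the matched table blocks.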
kylomas, posted February 17, 2011

StungStang,

Look at _IETableGetCollection. Run the example in the help file. It shows clearly what is going on and how it fits processing your 10 tables.

kylomas

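For readers without the help file at hand, a loop over all tables with _IETableGetCollection might look like the sketch below. It is not the help file example: the URL is a hypothetical placeholder, and it assumes Internet Explorer can load the page and builds table rows for the cells.

```autoit
#include <IE.au3>
#include <Array.au3>

; Hypothetical URL, replace with the real page
Local $oIE = _IECreate("http://www.example.com/tables.html")

; With the default index (-1) the whole collection is returned
; and @extended holds the number of tables found
Local $oTables = _IETableGetCollection($oIE)
Local $iCount = @extended

Local $oTable, $aTable
For $i = 0 To $iCount - 1
    ; Fetch one table by its zero-based index
    $oTable = _IETableGetCollection($oIE, $i)
    ; True transposes the result so that it reads as [row][column]
    $aTable = _IETableWriteToArray($oTable, True)
    _ArrayDisplay($aTable, "Table " & $i)
Next

_IEQuit($oIE)
```

Each table arrives as its own 2D array, so content outside the tables is never part of the result.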
StungStang, posted February 17, 2011

@kylomas

This method doesn't work: my HTML page issues a redirect, and to catch my table I have to disable the redirect in my browser.

Is there another solution? The _IE functions don't work for this page.

Bye!