Marnie Posted September 3, 2013

Hi all, I need help, please. There is an HTML source file which contains:

    <a class="productDetailLink" title="Bačkora" href="http://www.hracky.cz/backora">

The source has almost 6000 lines. The line containing "productDetailLink" (above) is unique in the whole file. I need a script that copies the text from href (that is, the actual link between the quotes) and saves it to an Excel file. Many thanks.

orbs Posted September 3, 2013

Try to code this:

- read the entire file into a single string variable
- use StringSplit() with the string "productDetailLink" as the delimiter; you will have an array of strings
- loop over the array, and for every string:
  - use StringInStr() to locate only the first instance of "href"
  - locate the 1st and 2nd instances of the double-quote character following that first "href"
  - read what is in between
  - write it as a new line of a text file
- when all done, open the text file with Excel.

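For reference, the steps above could be sketched roughly as follows. This is only an illustration, not orbs' own code: the file names `source.html` and `links.csv` are hypothetical, and it assumes `href` comes after the class attribute inside each tag, as in the sample from the first post.

```autoit
; Sketch of the split-and-scan approach (file names are assumptions).
Local $sHtml = FileRead(@ScriptDir & "\source.html")
If @error Then Exit MsgBox(16, "Error", "Could not read source.html")

; Flag 1: treat the whole delimiter string as a single separator.
Local $aParts = StringSplit($sHtml, "productDetailLink", 1)

; Mode 2 = open for writing and erase previous contents.
Local $hOut = FileOpen(@ScriptDir & "\links.csv", 2)
If $hOut = -1 Then Exit MsgBox(16, "Error", "Could not open links.csv")

; $aParts[0] is the element count; $aParts[1] is the text before the
; first marker, so the scan starts at index 2.
For $i = 2 To $aParts[0]
    Local $iHref = StringInStr($aParts[$i], "href")
    If $iHref = 0 Then ContinueLoop
    ; 1st and 2nd double quote at or after the "href" position.
    Local $iOpen = StringInStr($aParts[$i], '"', 0, 1, $iHref)
    Local $iClose = StringInStr($aParts[$i], '"', 0, 2, $iHref)
    If $iOpen = 0 Or $iClose = 0 Then ContinueLoop
    FileWriteLine($hOut, StringMid($aParts[$i], $iOpen + 1, $iClose - $iOpen - 1))
Next
FileClose($hOut)
```

If the marker never occurs, StringSplit() returns a single element and the loop body simply does not run, so the output file stays empty.
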
mikell Posted September 3, 2013

Or you can InetRead the source code and then use this regular expression on the whole text to extract the link:

    $res = StringRegExpReplace($text, '(?s).+productDetailLink.+?href="([^"]+).+', "$1")
    MsgBox(0, "", $res)

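A minimal sketch of the download step that would feed `$text` into the expression above. The URL is a placeholder, and treating the page as UTF-8 is an assumption (the sample title contains Czech characters); neither comes from mikell's post.

```autoit
; Placeholder URL: replace with the page that contains the link.
Local $dData = InetRead("http://www.example.com/page.html")
If @error Then Exit MsgBox(16, "Error", "Download failed")

; InetRead returns binary data; flag 4 decodes it as UTF-8 (assumed encoding).
Local $text = BinaryToString($dData, 4)

Local $res = StringRegExpReplace($text, '(?s).+productDetailLink.+?href="([^"]+).+', "$1")
MsgBox(0, "", $res)
```
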
Gianni Posted September 3, 2013

Hi Marnie, you could try something like this:

    #include <IE.au3>

    Local $oIE = _IECreate("www.yoursite.com") ; <--- change this link
    Local $oLinks = _IELinkGetCollection($oIE)
    For $olink In $oLinks
        If StringInStr($olink.outerhtml, "productDetailLink") Then
            ConsoleWrite("href Info ===>" & $olink.href & @CRLF)
        EndIf
    Next

bye

Chimp
small minds discuss people, average minds discuss events, great minds discuss ideas.... and use AutoIt....

Marnie Posted September 3, 2013 (Author)

mikell, thanks very much... that's exactly what I was looking for... appreciated.

Marnie Posted September 4, 2013 (Author)

Please, what should I do in the exceptional case where the link with "productDetailLink" is not present? I tried to handle it as follows, but no luck:

    $res = StringRegExpReplace($address, '(?s).+productDetailLink.+?href="([^"]+).+', "$1")
    If @error = 0 Then
        ; do if error occurs
    Else
        FileWriteLine($file, $res & @CRLF)
    EndIf

dragan Posted September 4, 2013

    Local $text1 = '<a class="productDetailLink" title="Bačkora" href="http://www.hracky.cz/backora">'
    Local $text2 = '<a class="productDetailLinkXXXXX" title="Bačkora" href="http://www.hracky.cz/backora">' ;<---- addition with XXXXX

    ;===============================================================================================

    Local $Link1 = _GetLinkBackFrom($text1, 'productDetailLink')
    If Not @error Then
        MsgBox(0, 'Success - $text1', $Link1) ;<----- will be successful
    Else
        MsgBox(0, 'Failure - $text1', $Link1)
    EndIf

    Local $Link2 = _GetLinkBackFrom($text2, 'productDetailLink')
    If Not @error Then
        MsgBox(0, 'Success - $text2', $Link2)
    Else
        MsgBox(0, 'Failure - $text2', 'Link does not exist') ;<---- will have error
    EndIf

    ;===============================================================================================
    ;==================================== Function: ================================================
    ;===============================================================================================
    ; $__text = text from which you want to extract the url
    ; $__attributName = matching attribute value (can be the value of class, id, name, etc...)
    Func _GetLinkBackFrom($__text, $__attributName)
        ; The attribute value must be enclosed in double quotes to match.
        Local $__pattern = '(?s).+\"' & $__attributName & '\".+?href="([^"]+).+'
        If StringRegExp($__text, $__pattern) Then
            Return StringRegExpReplace($__text, $__pattern, "$1")
        Else
            Return SetError(1, 0, '')
        EndIf
    EndFunc

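A note on why the `@error` check in the earlier attempt never fires: StringRegExpReplace() sets `@error` only for an invalid pattern. When nothing matches it returns the original string unchanged and reports 0 replacements in `@extended`. A variant sketch that relies on this, with a hypothetical helper name `_ExtractLink` and made-up sample strings:

```autoit
; Hypothetical sample inputs modeled on the tag from the first post.
Local $sWith = '<a class="productDetailLink" title="x" href="http://www.hracky.cz/backora">'
Local $sWithout = '<a class="other" title="x" href="http://www.hracky.cz/backora">'

Local $sLink = _ExtractLink($sWith)
If @error Then
    ConsoleWrite("No productDetailLink found" & @CRLF)
Else
    ConsoleWrite($sLink & @CRLF)
EndIf

$sLink = _ExtractLink($sWithout)
If @error Then ConsoleWrite("No productDetailLink found" & @CRLF)

; Returns the href value, or sets @error = 1 when the pattern did not match.
Func _ExtractLink($sText)
    Local $sPattern = '(?s).+productDetailLink.+?href="([^"]+).+'
    Local $sRes = StringRegExpReplace($sText, $sPattern, "$1")
    ; Capture both macros right away, before any other call resets them.
    Local $iErr = @error, $iCount = @extended
    If $iErr Or $iCount = 0 Then Return SetError(1, 0, "")
    Return $sRes
EndFunc
```

Compared with testing the pattern first via StringRegExp(), this runs the expression only once per input; unlike dragan's function it does not require the quotes around the class value.
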
Gianni Posted September 4, 2013

Excuse me, maybe I did not understand the question, but isn't it easier to extract the links with this simple script? If you have the source code in an HTML file and not on a web site, you can always use _IECreate() pointing to that local file instead of a URL:

    #include <IE.au3>

    Local $oIE = _IECreate(".\file.html") ; <--- path of local html file
    Local $oLinks = _IELinkGetCollection($oIE)
    For $olink In $oLinks
        If StringInStr($olink.outerhtml, "productDetailLink") Then
            ConsoleWrite("href Info ===>" & $olink.href & @CRLF)
        EndIf
    Next

bye

Marnie Posted September 5, 2013 (Author)

Thanks very much, solved.

Gianni Posted September 5, 2013

Hi Marnie, it would be nice if you showed us your solution.

bye

remin Posted February 26, 2015

I was wondering if it is also possible to copy a link's text with AutoIt. I use a "Copy Link Text" plugin in my Chrome and Firefox browsers (right-click on a link --> "Copy Link Text"). It would be much easier if I could do it using AutoIt.