Sodori, posted July 23, 2014

Hi all! In Google Image search, every result has a link like this:

    <a class="irc_fsl irc_but" href="/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&docid=52J1DT9EdL3YlM&tbnid=HWENGOZPg550qM:&ved=0CAIQjBw&url=http%3A%2F%2Fwww.autoitscript.com%2Fw%2Fimages%2Fc%2Fc7%2FAutoit-1-2-3.jpg&ei=NlbPU9fOCanQ4QSR9IH4Bg&bvm=bv.71667212,d.bGE&psig=AFQjCNGuYULJGRB-DrgwyzLTK9QFXy__yA&ust=1406182924754493" data-ved="0CAIQjBw" data-href="http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg"><span class="irc_but_t">Visa bild</span></a>

Would it be possible in AutoIt to search for every element with, say, class="irc_fsl irc_but" and put each one into an array? Maybe even return them in an array, much like _ArrayFindAll does. It would also be nice if this worked from the default browser rather than IE, because then I can deal with the anti-spam quota!
Xenobiologist, posted July 23, 2014

What should your array look like? Split by class= ... ?

Scripts & functions
- Organize Includes: Let Scite organize the include files
- Yahtzee: The game "Yahtzee" (Kniffel, DiceLion)
- LoginWrapper: Secure scripts by adding a query (authentication)
- _RunOnlyOnThis UDF: Make sure that a script can only be executed on ... (Windows / HD / ...)
- Internet-Café Server/Client Application: Open CD, Start Browser, Lock remote client, etc.
- MultipleFuncsWithOneHotkey: Start different funcs by hitting one hotkey different times
Sodori (author), posted July 24, 2014

To be totally blunt, I'd settle for each row in the array looking like the snippet above, because I can simply use string manipulation to extract the part I want. But if you look at it again, you will see the link "http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg" at the end, which is the direct link to the image. That is my goal.
Jfish, posted July 24, 2014

If you are looking for an image each time, you could do something like this (there are probably much better ways to do it ... sadly I don't know regular expressions, which would probably be far better). This works on the example you provided; I don't know if it would work on every result:

    #include <Array.au3>

    $string = '<a class="irc_fsl irc_but" href="/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&docid=52J1DT9EdL3YlM&tbnid=HWENGOZPg550qM:&ved=0CAIQjBw&url=http%3A%2F%2Fwww.autoitscript.com%2Fw%2Fimages%2Fc%2Fc7%2FAutoit-1-2-3.jpg&ei=NlbPU9fOCanQ4QSR9IH4Bg&bvm=bv.71667212,d.bGE&psig=AFQjCNGuYULJGRB-DrgwyzLTK9QFXy__yA&ust=1406182924754493" data-ved="0CAIQjBw" data-href="http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg"><span class="irc_but_t">Visa bild</span></a>'
    $delim = 'href='
    $valueArray = StringSplit($string, $delim, 1) ; flag 1: use the whole delimiter string as one separator
    ;_ArrayDisplay($valueArray)
    For $a = 1 To $valueArray[0] ; element 0 holds the number of pieces, so start at 1
        $containsHTTP = StringInStr($valueArray[$a], "http:")
        $containsJpg = StringInStr($valueArray[$a], "jpg")
        If $containsJpg <> 0 And $containsHTTP <> 0 Then
            ;MsgBox("","","This index contains your http with JPG: "&$a)
            $endofURL = StringInStr($valueArray[$a], '">') ; position of the closing quote of the URL
            $lineLen = StringLen($valueArray[$a])
            $modifiedResult = StringTrimRight($valueArray[$a], $lineLen - $endofURL) ; drop everything after the URL
            ConsoleWrite(@CRLF & $modifiedResult & @CRLF)
        EndIf
    Next

Build your own poker game with AutoIt: pokerlogic.au3 | Learn To Program Using FREE Tools with AutoIt
mikell, posted July 24, 2014 (edited)

As each link is tagged <a ... data-href="link", you can try this regex approach:

    $string = '<a class="irc_fsl irc_but" href="/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&docid=52J1DT9EdL3YlM&tbnid=HWENGOZPg550qM:&ved=0CAIQjBw&url=http%3A%2F%2Fwww.autoitscript.com%2Fw%2Fimages%2Fc%2Fc7%2FAutoit-1-2-3.jpg&ei=NlbPU9fOCanQ4QSR9IH4Bg&bvm=bv.71667212,d.bGE&psig=AFQjCNGuYULJGRB-DrgwyzLTK9QFXy__yA&ust=1406182924754493" data-ved="0CAIQjBw" data-href="http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg"><span class="irc_but_t">Visa bild</span></a>'
    MsgBox(0, "", StringRegExp($string, 'data-href="([^"]+)', 3)[0])

and (maybe) on the source code of the page:

    #include <Array.au3>

    $txt = ... ; source code
    $aLinks = StringRegExp($txt, 'data-href="([^"]+)', 3)
    _ArrayDisplay($aLinks)

Edited July 24, 2014 by mikell
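If only the tags with the class from the first post should be collected, the same pattern can be anchored on that class. This is a minimal sketch, assuming every result tag starts with <a class="irc_fsl irc_but" and carries data-href later in the same tag, as in the sample. _GetImageLinks and the example.com markup are hypothetical names made up for illustration, not something posted in the thread.

```autoit
; Hypothetical helper (not from the thread): returns a 0-based array with the
; data-href value of every <a> tag whose class is exactly "irc_fsl irc_but".
Func _GetImageLinks($sHtml)
    ; [^>]*? keeps the match inside a single tag; flag 3 returns all matches.
    Local $aLinks = StringRegExp($sHtml, '<a class="irc_fsl irc_but"[^>]*?data-href="([^"]+)"', 3)
    If @error Then Return SetError(1, 0, 0) ; nothing matched
    Return $aLinks
EndFunc   ;==>_GetImageLinks

; Made-up markup: two matching tags and one with a different class.
Local $sHtml = '<a class="irc_fsl irc_but" href="/url?x=1" data-href="http://example.com/a.jpg">A</a>' & _
        '<a class="other" data-href="http://example.com/skip.jpg">B</a>' & _
        '<a class="irc_fsl irc_but" data-href="http://example.com/b.jpg">C</a>'

Local $aLinks = _GetImageLinks($sHtml)
If Not @error Then
    For $i = 0 To UBound($aLinks) - 1
        ConsoleWrite($aLinks[$i] & @CRLF)
    Next
EndIf
```

The pattern depends on the attribute order shown in the sample; if the real page emits the attributes in another order, the plain data-href pattern above is the safer choice.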
Sodori (author), posted July 25, 2014

I greatly appreciate the effort people are putting into helping me with this. However, it appears my explanation of the situation was flawed, so I will try to explain in more depth.

This is the website (just an example for now): https://www.google.se/search?q=autoit+download+google+search&source=lnms&tbm=isch&sa=X&ei=fQ_SU_6FJKrMygP66IDwCw&ved=0CAgQ_AUoAQ&biw=1366&bih=683

On that page, each and every result is represented by:

    <a class="irc_fsl irc_but" href="/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&docid=52J1DT9EdL3YlM&tbnid=HWENGOZPg550qM:&ved=0CAIQjBw&url=http%3A%2F%2Fwww.autoitscript.com%2Fw%2Fimages%2Fc%2Fc7%2FAutoit-1-2-3.jpg&ei=NlbPU9fOCanQ4QSR9IH4Bg&bvm=bv.71667212,d.bGE&psig=AFQjCNGuYULJGRB-DrgwyzLTK9QFXy__yA&ust=1406182924754493" data-ved="0CAIQjBw" data-href="http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg"><span class="irc_but_t">Visa bild</span></a>

So if I can extract every one of these, preferably into an array, that would amount to Google image scraping for me, since extracting the direct link won't be much of a hassle!
As a side note, this is how I managed to clean it up (disclaimer: this is NOT what I originally needed help with, as it is already done):

    ;© Original editor and creator Sodori 2014-07-25
    Local $imageRaw = '<a class="irc_fsl irc_but" href="/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&docid=52J1DT9EdL3YlM&tbnid=HWENGOZPg550qM:&ved=0CAIQjBw&url=http%3A%2F%2Fwww.autoitscript.com%2Fw%2Fimages%2Fc%2Fc7%2FAutoit-1-2-3.jpg&ei=NlbPU9fOCanQ4QSR9IH4Bg&bvm=bv.71667212,d.bGE&psig=AFQjCNGuYULJGRB-DrgwyzLTK9QFXy__yA&ust=1406182924754493" data-ved="0CAIQjBw" data-href="http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg"><span class="irc_but_t">Visa bild</span></a>'
    Local $index = StringInStr($imageRaw, 'data-href="', 0, -1) ; Determining the start
    Local $imageProcess = StringRight($imageRaw, StringLen($imageRaw) - $index) ; Cutting off anything unneeded ahead of the link
    $imageProcess = StringReplace($imageProcess, 'ata-href="', "") ; Cleaning that one up, the link is now clear from start
    ; Starting to clean the end by checking where the link ends. Did not do a for loop since I can manually exit it instead and save time!
    Local $imageProcessed = ""
    Local $i = 0
    While $i <> -1
        If StringInStr(StringRight($imageProcess, $i), '"><') > 0 Then
            $imageProcessed = StringLeft($imageProcess, StringLen($imageProcess) - $i)
            $i = -1
        ElseIf $i >= StringLen($imageProcess) Then
            ExitLoop ; '"><' was never found, so stop instead of looping forever
        Else
            $i += 1
        EndIf
    WEnd
    ConsoleWrite($imageProcessed & @LF)

So, as you can see, I just need an array holding a bunch of $imageRaw strings that I can convert into direct links.
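For comparison, the same cleanup can probably be done in one call with _StringBetween from the standard String.au3 include, which returns everything found between a start and an end marker. This is only a sketch on a shortened, made-up tag (the example.com URL is an assumption for illustration); it takes the text between data-href=" and the next double quote.

```autoit
#include <String.au3>

; Shortened, made-up tag standing in for $imageRaw.
Local $sTag = '<a class="irc_fsl irc_but" data-href="http://example.com/pic.jpg"><span class="irc_but_t">Visa bild</span></a>'

; _StringBetween returns a 0-based array of every piece of text found between
; the two markers and sets @error when nothing is found.
Local $aBetween = _StringBetween($sTag, 'data-href="', '"')
If @error Then
    ConsoleWrite("No data-href attribute found" & @CRLF)
Else
    ConsoleWrite($aBetween[0] & @CRLF)
EndIf
```

Because the result is already an array, running the same call on text that contains several tags would yield one entry per data-href value.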
Jfish, posted July 25, 2014

I am not sure I understand what you still need help with. You already have some ways to get arrays from the link provided and extract just the image. Can you please clarify?
Sodori (author), posted July 25, 2014

Do you see the link provided in my post? That page gives you a good number of images. I want to end up downloading each image, that is, each search result. But to download them I need to extract the direct link for each one. And to do THAT I first need to fetch the raw code for each image and preferably store it in an array, with each result in its own row. I guess my only reason for bringing up such a big piece of code, and not simply the direct link, was to give the person with the know-how a bit of leeway, if you understand me. Was that any better?
Jfish, posted July 25, 2014

Well, when you run the code I posted on the example you posted, you get this in the console: "http://www.autoitscript.com/w/images/c/c7/Autoit-1-2-3.jpg" I thought that was the image link that you wanted. Are you saying you need help getting that HTML string first?
Sodori (author), posted July 25, 2014

I need help converting https://www.google.se/search?q=autoit+download+google+search&source=lnms&tbm=isch&sa=X&ei=fQ_SU_6FJKrMygP66IDwCw&ved=0CAgQ_AUoAQ&biw=1366&bih=683 into an array of every search result it contains. The code I posted that prints to your console was unrelated to my issue. I just guessed it would be helpful material for anyone stumbling over this issue in the future, as well as solid proof that I have already cracked that nut.
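One way to go from a URL to an array is to download the page source and run mikell's pattern over it. The sketch below is an assumption-laden illustration, not a tested scraper: _FetchDataHrefs is a hypothetical helper name, and InetRead only returns the raw response without running any page scripts, so the markup it receives from Google may not contain the data-href attributes that a browser's element inspector shows.

```autoit
; Hypothetical helper (not from the thread): downloads $sUrl and returns a
; 0-based array with every data-href value found in the response.
Func _FetchDataHrefs($sUrl)
    Local $dPage = InetRead($sUrl, 1) ; 1 = force a reload from the remote site
    If @error Then Return SetError(1, 0, 0) ; download failed

    Local $sHtml = BinaryToString($dPage, 4) ; 4 = decode the bytes as UTF-8

    Local $aLinks = StringRegExp($sHtml, 'data-href="([^"]+)', 3) ; 3 = all matches
    If @error Then Return SetError(2, 0, 0) ; page fetched, but no data-href in it
    Return $aLinks
EndFunc   ;==>_FetchDataHrefs
```

Calling _FetchDataHrefs with the search URL from the post and checking @error afterwards distinguishes a failed download (1) from a page that simply has no data-href attributes (2); in the second case the page source itself needs a look before the pattern is adjusted.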
Jfish, posted July 25, 2014 (edited)

Okay, now I get it. You want all the results that come back from a search. Have you looked at the Google search API? Maybe this should be merged with your other thread on the same topic about InetGet?

Edited July 25, 2014 by Jfish
Sodori (author), posted July 25, 2014

I am using AutoIt at work and have quite a lot of projects going at the same time. My other thread was about a different task, so I felt it was better to keep them separate, as they are two different issues, even if I would eventually have had to address InetGet for this one as well. But not anymore, thanks to a good community!

Back to the matter at hand: eventually I will get into that. But that will be for a bigger project than this one, and for when I feel more ready to learn a new programming language, which is what it seems to be anyway. I was hoping I could get away cheaply for now, since speed is not THAT much of a worry for the program I am making. It more or less runs whenever Google lets it.