Sodori

Need easy help to retrieve proxies

3 posts in this topic

Hi all,

I am making real progress in the art of web scraping, but I still have trouble with elements that are identified only by a class attribute... they hate me.

http://proxyipchecker.com/ has a really neat area on its page that shows the most recently checked IPs, and I would like to collect them. I would love it if anyone could help me! Here is the relevant part of the page source:

<div class="innertube">

<div class="innertube">
<div class="hovermenu">
<ul>
<li><a href="/check-my-proxy-ip.html" title="Check my Proxy IP">Check my Proxy IP</a></li>
<li><a href="/proxy-headers-checker.html" title="Proxy Headers Checker">Proxy Headers Checker</a></li>
<li><a href="/proxy-checker-online.html" title="Proxy Checker Online">Proxy Checker Online</a></li>
<li><a href="/buy-proxies-proxy-buy.html" title="Buy Proxies - Proxy Buy" style="background-color:#5c63ff;color:#ffffff">Buy Proxies - Proxy Buy</a></li>
<li><a href="/api.html" title="Proxy Checker API - Proxy List API">Proxy Checker API - Proxy List API</a></li>
</ul>
</div>
</div>

<h2>Latest open proxy servers, fast, checked and alive! Fresh proxies IP address and port continuously updated!</h2>

<ul class="freshproxies">
<li class="down">190.74.203.4 : 8080</li><li class="medium lowbw">118.26.142.5 : 80  </li><li class="medium lowbw">111.11.14.174 : 80  </li><li class="down">41.207.116.233 : 3128</li><li class="fast lowbw">195.40.6.43 : 8080  fresh</li><li class="fast lowbw">200.27.79.74 : 8080  open</li><li class="down">190.37.62.240 : 8080</li><li class="veryfast lowbw">77.243.2.171 : 80  up</li></ul>
</div>

Under "freshproxies", near the bottom, there is a list of recently checked proxies with their IP and port. I would like some simple code that collects every entry whose class is not "down" into an array. Would anyone mind helping me with this? The code ought to be so simple that I am not sure I need to show how I have fared so far, but I will, in case it amuses you :)

Local $oIE = _IECreate("http://proxyipchecker.com/")
;~ Local $fresh = _IEGetObjById($oIE, "rightcolumn")
$tags = $oIE.document.GetElementsByTagName("li")
For $tag in $tags
    $class_value = $tag.className("class")
    If $class_value = "freshproxies" Then
       ConsoleWrite($class_value & @LF)
    EndIf
Next

Thanks again!


#2 ·  Posted (edited)

#include <array.au3>
#include <File.au3>
#include <String.au3>
#include <IE.au3>

Local $oIE = _IECreate("http://proxyipchecker.com/")
WinWait("Online Proxy Checker - IP Checker - Check Proxy - Internet Explorer")
Local $HTML = _IEDocReadHTML($oIE);Gets all HTML
Local $LeftCount = StringInStr($HTML,'<ul class="freshproxies">');find the count of characters that come before the first string you want to find
Local $temp = StringTrimLeft($HTML,$LeftCount + 25);removes all characters before the first ipaddress
Local $RightLocation = StringInStr($temp,"</li></ul>");position of the end of the ip address section in the html
Local $RawData = StringMid($temp,1,$RightLocation - 1);unedited datablock of ip address information
Local $SplitRaw = StringSplit($RawData,'</li>',1)
Local $TempArray[0][3]
For $i = 1 To Ubound($SplitRaw) - 1
    Local $M = StringReplace($SplitRaw[$i],'<li class="',"");remove leading text
    Local $N = StringReplace($M,'">',";");remove unwanted characters
    Local $O = StringReplace($N,":",";")
    _ArrayAdd($TempArray,$O,0,";")
Next
_ArrayDisplay($TempArray)

This just needs the description removed and it is ready to use.

Edit - For anyone who wants to chime in on this one: an optional description sometimes follows the port number (it's optional when entering the data on the website), and when it is present it ends up in the same string as the port. I cannot figure out how to trim the description from my array.
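
One possible way to drop the optional description is to stop capturing at the last digit of the port. The snippet below is only a sketch: it runs on sample markup copied from the first post rather than the live page, and it assumes the class attribute is double-quoted exactly as shown there (the HTML returned by _IEDocReadHTML may be serialized differently). The names $sHtml and $aMatch are made up for the illustration.

```autoit
; Sample input copied from the markup in the first post (not fetched live).
Local $sHtml = '<li class="down">190.74.203.4 : 8080</li>' & _
        '<li class="fast lowbw">195.40.6.43 : 8080  fresh</li>' & _
        '<li class="veryfast lowbw">77.243.2.171 : 80  up</li>'

; Capture class, IP and port; anything after the port digits (the description) is not captured.
Local $aMatch = StringRegExp($sHtml, '<li class="([^"]*)">\s*(\d{1,3}(?:\.\d{1,3}){3})\s*:\s*(\d+)', 3)
If @error Then Exit ConsoleWrite("No proxies found" & @LF)

; Flag 3 returns one flat array: class, ip, port, class, ip, port, ...
For $i = 0 To UBound($aMatch) - 3 Step 3
    ; Skip entries whose class list contains the word "down"
    If StringRegExp($aMatch[$i], '\bdown\b') Then ContinueLoop
    ConsoleWrite($aMatch[$i + 1] & ":" & $aMatch[$i + 2] & @LF)
Next
```

With the sample string above, only the two entries that are not marked "down" should be written to the console, without the trailing "fresh" and "up" descriptions.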

Edited by computergroove

Get Scite to add a popup when you use a 3rd party UDF -> http://www.autoitscript.com/autoit3/scite/docs/SciTE4AutoIt3/user-calltip-manager.html


Big thanks! :) There is a bug in the port column, but I think I can manage it from here. I dare say a function like "_IEGetObjectBy($type (class, name, ID etc.), $oObject, $sName, $iIndex [optional])" would be kinda handy, but I digress. Perhaps it would already exist if it were possible, who knows.

Again, computergroove, thanks!
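
For the class-based lookup wished for above, one option is to walk the "li" collection and test each element's className property. This is a rough sketch and has not been tested against the live site: it assumes the page still uses the markup quoted in the first post, and the variable names are made up. Note that className is a property holding the whole class attribute, not a method, and that "freshproxies" is the class of the parent "ul", not of the "li" items.

```autoit
#include <IE.au3>

Local $oIE = _IECreate("http://proxyipchecker.com/") ; waits for the page to load by default
Local $oItems = _IETagNameGetCollection($oIE, "li")
If @error Then Exit ConsoleWrite("No <li> elements found" & @LF)

For $oItem In $oItems
    ; className holds the full class attribute, e.g. "fast lowbw"
    Local $sClass = $oItem.className
    ; Menu items have no class; skip them and anything marked "down"
    If $sClass = "" Or StringRegExp($sClass, '\bdown\b') Then ContinueLoop
    ; Keep only items that sit directly inside the freshproxies list
    If $oItem.parentNode.className <> "freshproxies" Then ContinueLoop
    ; innerText still includes the optional description after the port
    ConsoleWrite(StringStripWS($oItem.innerText, 3) & @LF)
Next
```

The text written out here still carries the optional description, so it would need the same kind of trimming as the array in the earlier reply.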
