Removing Certain String


Hello,

I am pulling information from Yellow Pages and seem to be having an issue.

I want to pull any websites that are not internal links or yellowpages.com links.

here is my current code

#include <IE.au3>
#include <array.au3>
#Include <File.au3>
#include <string.au3>
#include <INet.au3>
#include <Excel.au3>
 
 
$YellowPagesUrl = "http://www.yellowpages.com/phoenix-az/pet-store?g=Phoenix%2C+AZ&page=2&q=pet+store" ; This will help us find the next URL.
$i = 1 ; This will keep track of how many listings we have pulled
 
 
 
$YellowPages = _INetGetSource($YellowPagesUrl) ; Pulls the page source from the address
; No InetClose() call is needed here: _INetGetSource returns the source as a string, not a download handle from InetGet.
 
$YellowPagesWebsite = _StringBetween($YellowPages, '<a href="', '"') ; List out all Yellow Pages links

_ArrayDisplay($YellowPagesWebsite)

Hmm, this is the second time this has happened: it didn't include what I put in my message after the code.

I want this to exclude anything that is a /ofiheif.html type of link or anything that is a yellowpages.com/ type of link.

How would I do this?

Thanks


Does this work for you?

#include <array.au3>

$YellowPagesUrl = "http://www.yellowpages.com/phoenix-az/pet-store?g=Phoenix%2C+AZ&page=2&q=pet+store" ; This will help us find the next URL.

$YellowPages = BinaryToString(InetRead($YellowPagesUrl)) ; Pulls the data from the address
$YellowPagesWebsite = StringRegExp($YellowPages, '<a href="(http://(?!www\.yellowpages\.com)[^"]+)', 3) ; Flag 3 returns an array of all matches
_ArrayDisplay($YellowPagesWebsite)

This matches only links starting with "http://" and excludes www.yellowpages.com.
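To see what the negative lookahead does without hitting the network, here is a minimal sketch that runs the same pattern against a small hand-written HTML string. The function name _GetExternalLinks and the sample links (example.com and so on) are made up for illustration; they are not from the real Yellow Pages page.

```autoit
; Returns a 0-based array of absolute http:// links that do not point at
; www.yellowpages.com. Sets @error and returns 0 when nothing matches.
; _GetExternalLinks is a hypothetical helper name used only for this sketch.
Func _GetExternalLinks($sHtml)
    ; (?!www\.yellowpages\.com) rejects the match if that host follows "http://".
    Local $aLinks = StringRegExp($sHtml, '<a href="(http://(?!www\.yellowpages\.com)[^"]+)', 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aLinks
EndFunc

; Hypothetical sample markup standing in for the downloaded page source.
Local $sSampleHtml = '<a href="/phoenix-az/some-listing.html">Internal</a>' & @CRLF & _
        '<a href="http://www.yellowpages.com/about">Yellow Pages</a>' & @CRLF & _
        '<a href="http://www.example.com/">External</a>'

Local $aSampleLinks = _GetExternalLinks($sSampleHtml)
If @error Then
    ConsoleWrite("No external links found" & @CRLF)
Else
    For $n = 0 To UBound($aSampleLinks) - 1
        ConsoleWrite($aSampleLinks[$n] & @CRLF)
    Next
EndIf
```

Only the example.com link should come back: the first link does not start with "http://" and the second is rejected by the lookahead. Note that "https://" links are skipped by this pattern as well.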

 

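Or this:

$YellowPagesWebsite = StringRegExp($YellowPages, '<a href="((?![/#])(?![^"]*yellowpages)[^"]+)', 3)

for links that are not in "http://" format. It skips links that start with "/" or "#" and any link whose address contains "yellowpages". The lookahead uses [^"]* so that it only inspects the href value itself; with .* it would scan the rest of the line and could reject a good link just because "yellowpages" appears later on that line.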
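If the regular expression feels hard to maintain, the same filtering can be done by keeping the _StringBetween call from the first post and dropping unwanted entries in a loop. This is only a sketch under a few assumptions: the helper name _FilterExternalLinks is made up, "internal" is taken to mean an href starting with "/" or "#", and the sample links are invented.

```autoit
#include <String.au3>

; Returns a 0-based array of href values that are neither internal links
; (starting with "/" or "#") nor links containing "yellowpages.com".
; Sets @error and returns 0 if no link survives the filter.
; _FilterExternalLinks is a hypothetical helper name used only for this sketch.
Func _FilterExternalLinks($sHtml)
    Local $aAll = _StringBetween($sHtml, '<a href="', '"')
    If @error Then Return SetError(1, 0, 0) ; no links at all

    Local $sKept = "", $sLink, $sFirst
    For $k = 0 To UBound($aAll) - 1
        $sLink = $aAll[$k]
        $sFirst = StringLeft($sLink, 1)
        If $sFirst = "/" Or $sFirst = "#" Then ContinueLoop ; internal link
        If StringInStr($sLink, "yellowpages.com") Then ContinueLoop ; Yellow Pages link
        $sKept &= $sLink & @LF
    Next
    If $sKept = "" Then Return SetError(2, 0, 0) ; everything was filtered out

    ; Flag 2 makes StringSplit return a 0-based array without a count element.
    Return StringSplit(StringTrimRight($sKept, 1), @LF, 2)
EndFunc

; Hypothetical sample markup standing in for the downloaded page source.
Local $sDemoHtml = '<a href="/ofiheif.html">Internal</a>' & _
        '<a href="#top">Anchor</a>' & _
        '<a href="http://www.yellowpages.com/about">Yellow Pages</a>' & _
        '<a href="https://www.example.org/pets">External 1</a>' & _
        '<a href="http://shop.example.com/">External 2</a>'

Local $aDemoLinks = _FilterExternalLinks($sDemoHtml)
For $m = 0 To UBound($aDemoLinks) - 1
    ConsoleWrite($aDemoLinks[$m] & @CRLF)
Next
```

The loop is longer than the one-line pattern, but each exclusion rule sits on its own line, so adding another domain to skip is a one-line change.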

Edited by jguinch