stefionesco

Web page searcher

6 posts in this topic

#1 ·  Posted

Hi there.

I'm trying to make a web page searcher and I don't know where to start.

So...

I have a text file with a list of actors, named actors.txt, and another text file with some IMDb links, named movies.txt.

I need to make an application that opens the links from movies.txt one by one and, inside each opened page, searches for every actor name from actors.txt.

If nothing is found, just close the page. When it finds something, leave the page open and go on to the next one, until all the pages are done.

Can it be done?

 

Thanks in advance

 

Stef Ionesco

 

 




#2 ·  Posted

Sure. Use FileReadToArray to create two separate arrays, then go through each array item: open the page with _IECreate, copy the page text to a string with _IEBodyReadText, and use StringInStr to find the actors. Basic example (this one reads both lists from a single INI file with IniReadSection instead of two text files):

#include <IE.au3>

Local $oIE, $sIMDB
Local $aActors = IniReadSection('Movie_Search.ini', 'Actors')
Local $aIMDB = IniReadSection('Movie_Search.ini', 'IMDb')

For $i = 1 To $aIMDB[0][0]
    $oIE = _IECreate($aIMDB[$i][0], 1)
    _IELoadWait($oIE)
    $sIMDB = _IEBodyReadText($oIE)
    For $j = 1 To $aActors[0][0]
        If StringInStr($sIMDB, $aActors[$j][0]) Then IniWrite('Movie_Search.ini', $aIMDB[$i][1], $aActors[$j][0], $aActors[$j][1])
    Next
Next

Movie_Search.ini

[IMDb]
http://www.imdb.com/title/tt0047396/=Rear Window (1954)
http://www.imdb.com/title/tt0055031/=Judgment at Nuremberg (1961)
http://www.imdb.com/title/tt0068646/=The Godfather (1972)
http://www.imdb.com/title/tt0032976/=Rebecca (1940)
http://www.imdb.com/title/tt0064115/=Butch Cassidy and the Sundance Kid (1969)

[Actors]
Grace Kelly=Actress
James Stewart=Actor
Spencer Tracy=Actor
Marlon Brando=Actor
Laurence Olivier=Actor
Paul Newman=Actor

This should result in the following being added to Movie_Search.ini

[Rear Window (1954)]
Grace Kelly=Actress
James Stewart=Actor

[Judgment at Nuremberg (1961)]
James Stewart=Actor
Spencer Tracy=Actor
Marlon Brando=Actor

[The Godfather (1972)]
Marlon Brando=Actor

[Rebecca (1940)]
Grace Kelly=Actress
James Stewart=Actor
Laurence Olivier=Actor

[Butch Cassidy and the Sundance Kid (1969)]
Paul Newman=Actor
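
The example above reads both lists from an INI file. A variant that reads actors.txt and movies.txt directly, and closes pages with no match as the original question asked, might look roughly like the sketch below. It assumes both text files sit next to the script with one entry per line, and it has not been tested against the live IMDb pages.

```autoit
#include <IE.au3>

; Assumption: both files are in the script folder, one entry per line
Local $aActors = FileReadToArray(@ScriptDir & '\actors.txt')
If @error Then Exit MsgBox(16, 'Error', 'Cannot read actors.txt')
Local $aMovies = FileReadToArray(@ScriptDir & '\movies.txt')
If @error Then Exit MsgBox(16, 'Error', 'Cannot read movies.txt')

Local $oIE, $sText, $bFound
; The built-in FileReadToArray returns a zero-based array
For $i = 0 To UBound($aMovies) - 1
    ; Skip blank lines in movies.txt
    If StringStripWS($aMovies[$i], 8) = '' Then ContinueLoop
    $oIE = _IECreate($aMovies[$i])
    If @error Then ContinueLoop
    $sText = _IEBodyReadText($oIE)
    $bFound = False
    For $j = 0 To UBound($aActors) - 1
        If $aActors[$j] <> '' And StringInStr($sText, $aActors[$j]) Then
            $bFound = True
            ExitLoop
        EndIf
    Next
    ; Close pages with no match; leave matching pages open
    If Not $bFound Then _IEQuit($oIE)
Next
```

Because StringInStr searches the whole page text, a name that appears anywhere on the page (not only in the cast list) counts as a match.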

 


#3 ·  Posted

Thanks Subz.

I will try to put all this together. I will let you know if I succeed.

It doesn't seem so easy to me.

 

Stef Ionesco


#4 ·  Posted

 

Hi, Subz.

It works fine.

I only added a MsgBox at the end, so I know when it's finished.

And I made another INI file for the results only.

The only bad thing is: sometimes, when the page isn't loaded correctly (I think so; I'm not sure, because the page looks right to me), _IELoadWait just waits and does nothing. I have to refresh the page myself, and then the app goes on without any problem.

Anyway, thanks a lot.

Stef Ionesco


#5 ·  Posted

No worries. You could close each IE window after it's been searched, and you could add a splash screen to show which movie you're currently searching; both are shown in the code below. Also, in case you don't have the movie name under the IMDb section, you can use _IEPropertyGet to get the page title for the Movie_Result.ini section names.

Example

Movie_Search.ini

[IMDb]
http://www.imdb.com/title/tt0047396/=Rear Window (1954)
http://www.imdb.com/title/tt0055031/=
http://www.imdb.com/title/tt0068646/=The Godfather (1972)
http://www.imdb.com/title/tt0032976/=
http://www.imdb.com/title/tt0064115/=Butch Cassidy and the Sundance Kid (1969)

[Actors]
Grace Kelly=Actress
James Stewart=Actor
Spencer Tracy=Actor
Marlon Brando=Actor
Laurence Olivier=Actor
Paul Newman=Actor

Movie_Result.ini

[Rear Window (1954)]
Grace Kelly=Actress
James Stewart=Actor
Marlon Brando=Actor
[Judgment at Nuremberg (1961)]
James Stewart=Actor
Spencer Tracy=Actor
Marlon Brando=Actor
Paul Newman=Actor
[The Godfather (1972)]
Marlon Brando=Actor
[Rebecca (1940)]
Grace Kelly=Actress
James Stewart=Actor
Laurence Olivier=Actor
[Butch Cassidy and the Sundance Kid (1969)]
Marlon Brando=Actor
Paul Newman=Actor

Updated code

#include <IE.au3>

Local $oIE, $sIMDB
;~ Source Ini File
Local $sMovie_Search = @ScriptDir & '\Movie_Search.ini'
;~ Result Ini File
Local $sMovie_Result = @ScriptDir & '\Movie_Result.ini'

Local $aActors = IniReadSection($sMovie_Search, 'Actors')
Local $aIMDB = IniReadSection($sMovie_Search, 'IMDb')

SplashTextOn("IMDb Search", "Searching...", 600, 50)
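For $i = 1 To $aIMDB[0][0]
    ;~ Show the movie title (blank if the ini entry has none yet)
    ControlSetText("IMDb Search", "", "Static1", "Searching : " & $aIMDB[$i][1])
    $oIE = _IECreate($aIMDB[$i][0])
    ;~ Skip this link if the browser window could not be created
    If @error Then ContinueLoop
    _IELoadWait($oIE)
    ;~ If the movie title is blank get page title, minus the "- IMDb" suffix and surrounding spaces
    If $aIMDB[$i][1] = "" Then $aIMDB[$i][1] = StringStripWS(StringReplace(_IEPropertyGet($oIE, "title"), "- IMDb", ""), 3)
    ;~ Read the page text once, then test each actor name against it
    $sIMDB = _IEBodyReadText($oIE)
    For $j = 1 To $aActors[0][0]
        If StringInStr($sIMDB, $aActors[$j][0]) Then
            IniWrite($sMovie_Result, $aIMDB[$i][1], $aActors[$j][0], $aActors[$j][1])
        EndIf
    Next
    ;~ Close the window before moving to the next link
    _IEQuit($oIE)
Next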
SplashOff()
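
For the occasional _IELoadWait hang mentioned in post #4, one thing that may help is a shorter timeout followed by a single reload. The sketch below is only an idea, not a confirmed fix: _OpenWithTimeout is a hypothetical helper name, and the 30 second timeout is an arbitrary assumption. It relies on the timeout parameter of _IELoadWait and on telling _IECreate and _IENavigate not to wait themselves.

```autoit
#include <IE.au3>

; Hypothetical helper: open $sUrl, wait at most $iTimeout ms for the page,
; and reload it once if the wait times out.
Func _OpenWithTimeout($sUrl, $iTimeout = 30000)
    ; 4th parameter 0 = return immediately instead of waiting for the load
    Local $oIE = _IECreate($sUrl, 0, 1, 0)
    If @error Then Return SetError(1, 0, 0)
    _IELoadWait($oIE, 0, $iTimeout)
    If @error = $_IESTATUS_LoadWaitTimeout Then
        ; Load the same address again without waiting, then wait with the timeout
        _IENavigate($oIE, $sUrl, 0)
        _IELoadWait($oIE, 0, $iTimeout)
    EndIf
    Return $oIE
EndFunc
```

In the loop above, the _IECreate and _IELoadWait pair could then be replaced with $oIE = _OpenWithTimeout($aIMDB[$i][0]). If the second wait also times out, the helper still returns the window, so the search runs on whatever text has loaded by then.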


#6 ·  Posted

Oh, better one. Yes, I like the splash text. Thanks a lot again.

Stef Ionesco

