Implementing one script into another


andrewz

Hey,

First off, thanks to everyone on this forum who has helped me out so far. As a beginner, I wouldn't have made it this far in such a short time.

Now this is my last problem. (Well, it's not really a problem, as I could run both scripts separately, but I would like to combine them into one script.)

Here is the first script:

#NoTrayIcon
#include <Inet.au3>
#include <Array.au3>
#include <String.au3>
#include <AutoItConstants.au3>
#include <MsgBoxConstants.au3>
Global $url = InputBox("ScoutID","Please enter the Scout-ID (URL)")

If StringInStr($url,"http://www.immobilienscout24.de/") = false Then
MsgBox(48,"Error","URL not specified. Please make sure you enter the URL !")
Exit
Else
EndIf

Global $content = _INetGetSource($url)
Global $string_A = _StringBetween($content, '<li data-item="result" data-obid=', 'id="result')

$number =UBound ( $string_A ,$UBOUND_ROWS)

$timesran = -1
$i=0

$numberarray = $number
$numberarray -= 1

Do
$i += 1
$timesran += 1
If $timesran = $numberarray = True Then
Exit
Else
EndIf
$id= StringReplace($string_A[$timesran],'"',"")
$urlscout = "http://www.immobilienscout24.de/expose/"&$id
export ()
Until $i = $numberarray

func export ()
$url = $urlscout

If StringInStr($url,"http://www.immobilienscout24.de/expose") = false Then
MsgBox(48,"Error","URL not specified. Please make sure you enter the URL !")
Exit
Else
EndIf

If FileExists("Immobilien.csv") =false Then
FileWrite("Immobilien.csv","Name;Adresse;Telefon;Objekttyp;Ort;Baujahr;Zimmer;Bezugfrei ab;Wfl./ qm;Kaltmiete;Warmmiete;Scout- ID"& @CRLF)
EndIf

$content = _INetGetSource($url)
$name_A = _StringBetween($content, '<span data-qa="contactName" class="font-bold">', '</span>')
$preis_A = _StringBetween($content, '<span class="is24-operator">', '<div class="sourcecode section">')
$strase_A = _StringBetween($content, '<strong class="font-standard">' , '</strong><br/>')
$telefon_A = _StringBetween($content, '<div class="is24-phone-number hide">' ,'</div>')
$objekttyp_A = _StringBetween($content, '<dd class="is24qa-wohnungstyp">' ,'</dd>')
$ort_A = _StringBetween($content, '<span id="quickCheckHeader" class="is24-f">' , '</span>')
$baujahr_A = _StringBetween($content, '<dd class="is24qa-baujahr">','</dd>')
$zimmer_A = _StringBetween($content, '<dd class="is24qa-zimmer">','</dd>')
$bezugsfrei_A = _StringBetween($content, '<dd class="is24qa-bezugsfrei-ab">' ,'</dd>')
$wohnflache_A = _StringBetween($content, '<dd class="is24qa-wohnflaeche-ca">' ,'</dd>')
$preiswarm_A =_StringBetween($content, '<strong class="is24qa-gesamtmiete">','</strong>')

If  IsArray($strase_A) Then
$strase_B = $strase_A[0]
Else
$strase_B = "/"
EndIf

If  IsArray($name_A) Then
$name_B = $name_A[0]
Else
$name_B = "/"
EndIf

If  IsArray($telefon_A) Then
$telefon_B = $telefon_A[0]
Else
$telefon_B = "/"
EndIf

If  IsArray($objekttyp_A) Then
$objekttyp_B = $objekttyp_A[0]
Else
$objekttyp_B = "/"
EndIf

If  IsArray($baujahr_A) Then
$baujahr_B = $baujahr_A[0]
Else
$baujahr_B = "/"
EndIf

If  IsArray($preiswarm_A) Then
$preiswarm_B = $preiswarm_A[0]
$preiswarm_C = StringReplace($preiswarm_B , "disabled","")
$preiswarm_D = StringReplace($preiswarm_C, "disabled","",1) ;caused some bugs
$preiswarm_E = StringReplace($preiswarm_D, "disabled","")
Else
$preiswarm_B = "/"
$preiswarm_E = "/" ;fallback for the value that is actually written to the CSV row
EndIf

If  IsArray($zimmer_A) Then
$zimmer_B = $zimmer_A[0]
$zimmer_C = StringReplace($zimmer_B , ".",",")
Else
$zimmer_B = "/"
$zimmer_C = "/" ;fallback for the value that is actually written to the CSV row
EndIf

If  IsArray($ort_A) Then
$ort_B = $ort_A[0]
Else
$ort_B = "/"
EndIf

If  IsArray($bezugsfrei_A) Then
$bezugsfrei_B = $bezugsfrei_A[0]
Else
$bezugsfrei_B = "/"
EndIf

If  IsArray($wohnflache_A) Then
$wohnflache_B = $wohnflache_A[0]
Else
$wohnflache_B = "/"
EndIf

If  IsArray($preis_A) Then
$preis_B = $preis_A[0]
Else
$preis_B = "/"
EndIf


$aio= $name_B&";"&$strase_B&";"&$telefon_B&";"&$objekttyp_B&";"&$ort_B&";"&$baujahr_B&";"&$zimmer_C&";"&$bezugsfrei_B&";"&$wohnflache_B&";"&$preis_B&";"&$preiswarm_E&";"&$url

$sString1 = StringReplace($aio, "  ", "")
$sString2 = StringReplace($sString1, "<p>", "")
$sString3 = StringReplace($sString2, "<span>", "")
$sString4 = StringReplace($sString3, "</p>", "")
$sString5 = StringReplace($sString4, "Â", "")
$sString6 = StringReplace($sString5, '<span class="is24-operator">=</span>', "")     ;Filtering the given Data to get a proper CSV Format
$sString7 = StringReplace($sString6, "EUR", "")
$sString8 = StringReplace($sString7, "</span>","")
$sString9 = StringReplace($sString8, "Dieses Objekt im Vergleich zu anderen in ","")
$sString10 = StringReplace($sString9, "disabled",",00") ;caused bug

$sString11 = StringReplace($sString10, "Ã¼","ü")
$sString12 = StringReplace($sString11, "ÃŸ","ß")     ;Including special characters lower case
$sString13 = StringReplace($sString12, "Ã¤","ä")
$sString14 = StringReplace($sString13, "Ã¶","ö")

$sString15 = StringReplace($sString14, "Ãœ","Ü")    ;Including special characters upper case
$sString16 = StringReplace($sString15, "Ã„","Ä")
$sString17 = StringReplace($sString16, "Ã–","Ö")

$sStringfinal = StringReplace($sString17, @CRLF, "")


FileWrite ( "Immobilien.csv", $sStringfinal & @CRLF )

ToolTip("Importing...", 0, 0)
Sleep(100)
endfunc
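
The long chain of StringReplace calls at the end of export() could also be written as a loop over a table of search/replace pairs, which keeps all the tokens in one place. This is only a minimal sketch: _CleanRow is a hypothetical helper name, and it uses just a few of the tokens from the script above. The order of the pairs matters whenever one token contains another.

```autoit
; Hypothetical helper (not part of the original script): table-driven cleanup
Func _CleanRow($sRow)
    ; Each row is [search, replace]; tokens are copied from export()
    Local $aPairs[6][2] = [ _
            ["<p>", ""], _
            ["</p>", ""], _
            ["<span>", ""], _
            ["</span>", ""], _
            ["EUR", ""], _
            [@CRLF, ""]]
    For $i = 0 To UBound($aPairs) - 1
        $sRow = StringReplace($sRow, $aPairs[$i][0], $aPairs[$i][1])
    Next
    Return $sRow
EndFunc

ConsoleWrite(_CleanRow("<p>850 EUR</p>") & @CRLF)
```

Adding another token then means adding one row to the table instead of renumbering the $sStringN variables.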

This script works perfectly fine for extracting ALL the data from one results page, which looks like this:

http://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Bayern/Muenchen/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/true?enteredFrom=result_list

(The export function downloads all the data for each listing on the page into a CSV file.)
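
The export function repeats the same IsArray check for every field. One way to shorten it would be a small helper that returns the first _StringBetween hit or a default value. This is a sketch under assumptions: _FirstMatch is a hypothetical name, not part of any UDF, and the sample HTML is made up.

```autoit
#include <String.au3>

; Hypothetical helper: first _StringBetween hit, or $sDefault when nothing matches
Func _FirstMatch($sSource, $sStart, $sEnd, $sDefault = "/")
    Local $aHits = _StringBetween($sSource, $sStart, $sEnd)
    If @error Or Not IsArray($aHits) Then Return $sDefault
    Return $aHits[0]
EndFunc

; Made-up sample instead of a downloaded page
Local $sHtml = '<dd class="is24qa-zimmer">2.5</dd>'
ConsoleWrite(_FirstMatch($sHtml, '<dd class="is24qa-zimmer">', '</dd>') & @CRLF)  ; field present
ConsoleWrite(_FirstMatch($sHtml, '<dd class="is24qa-baujahr">', '</dd>') & @CRLF) ; field missing, falls back to "/"
```

With a helper like this, every field always ends up with a value, so no variable is left undeclared when a listing lacks a field.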

BUT this is just one of 19 pages, so I made this second script, which reads the total page count and then, in a loop, increases $currentpage by 1 and builds $currenturl from it until $currentpage matches the total number of pages (19), and then exits.

#NoTrayIcon
#include <Inet.au3>
#include <Array.au3>
#include <String.au3>
#include <AutoItConstants.au3>
#include <MsgBoxConstants.au3>
Global $url = InputBox("ScoutID","Please enter the Scout-ID (URL)")

If StringInStr($url,"http://www.immobilienscout24.de/") = false Then
MsgBox(48,"Error","URL not specified. Please make sure you enter the URL !")
Exit
Else
EndIf

Global $content1 = _INetGetSource($url)
Global $pagecount = _StringBetween($content1, '<li><span>&hellip;</span></li>', '<span class="smallPager">')
$filtered = StringRegExp($pagecount[0],"\d{2}",$STR_REGEXPARRAYMATCH)
$link0 = StringReplace($url,"http://www.immobilienscout24.de/Suche/S-T/","")
$link1 = StringReplace($url,$link0,"")
$link2 =  "/" & $link0
$pages = $filtered[0] ; 19 in the example
$currentpage = 0


Do
$currentpage +=1

$currenturl = $link1 & "p-" & $currentpage & $link2
MsgBox(0,"",$currenturl)

Until $pages = $currentpage
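
One detail worth noting in the script above: the pattern "\d{2}" matches exactly two digits, so it fits the 19 pages of this example, but it would find nothing for a single-digit page count and would cut off a three-digit one. A hedged sketch of a more tolerant variant follows; _PageCount is a hypothetical helper name and the pager fragments are made up, since the real markup is whatever the site returns.

```autoit
#include <StringConstants.au3>

; Hypothetical helper: first run of digits in the pager fragment, or 0 if there is none
Func _PageCount($sPagerHtml)
    Local $aDigits = StringRegExp($sPagerHtml, "\d+", $STR_REGEXPARRAYMATCH)
    If @error Then Return 0 ; no digits found, so no array was returned
    Return Number($aDigits[0])
EndFunc

; Made-up fragments for illustration only
ConsoleWrite(_PageCount('<li><a href="x">19</a></li>') & @CRLF)
ConsoleWrite(_PageCount('<li><a href="x">7</a></li>') & @CRLF)
ConsoleWrite(_PageCount('<li>no pager</li>') & @CRLF)
```

Checking @error before indexing the result avoids a subscript error when the search returns only one page and no pager is present.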

Now my question is whether there is any way to integrate the second script into the first so that they work the following way:

1.) Basically run the second script to generate the first $currenturl

2.) Run the first script to grab all data from $currenturl

3.) The second script "prints" the second $currenturl

4.) The first script again grabs all data from $currenturl

....

5.) Until $currentpage = $filtered ($filtered = total number of pages)

 

I tried my best to explain it as well as possible, even though it's complicated to get across what I mean...

Thanks in advance and best regards,

Andrewz

Edited by andrewz

If you know what to do with the number of pages in your script, then I will just give you this:

Add this to the global variables section at the top of your first script from the example above:

Global $content = _INetGetSource($url)
Global $string_A = _StringBetween($content, '<li data-item="result" data-obid=', 'id="result')
Global $pagecount = _StringBetween($content, '<li><span>&hellip;</span></li>', '<span class="smallPager">')
Global $pages;<-number of websites to scan

NumberOfPages();<-Find the number of pages to scan

Add this to the bottom of your script:

Func NumberOfPages()
$filtered = StringRegExp($pagecount[0],"\d{2}",$STR_REGEXPARRAYMATCH)
$link0 = StringReplace($url,"http://www.immobilienscout24.de/Suche/S-T/","")
$link1 = StringReplace($url,$link0,"")
$link2 =  "/" & $link0
$pages = $filtered[0]
EndFunc

I played with this code for a while and I am having a problem shoehorning it into a loop. I added a For loop:

For $j=1 To $pages

but you have a main URL (i.e. http://www.immobilienscout24.de/Suche/S-T...) and an internal URL (i.e. http://www.immobilienscout24.de/expose/...) that I am having a hard time integrating into the loop. $pages will give you the number of pages this program needs to scan.

Edited by computergroove

Get Scite to add a popup when you use a 3rd party UDF -> http://www.autoitscript.com/autoit3/scite/docs/SciTE4AutoIt3/user-calltip-manager.html

Couldn't this be done using functions, as in this pseudocode?

For $var = 1 to $filtered      ; total amount of pages
   $currenturl = _secondScript($var)    ; generate $currenturl
   $data = _firstScript($currenturl)      ; grab data from $currenturl
Next
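
To make the "generate $currenturl" step of the pseudocode concrete, here is a minimal sketch of a function that builds the URL for a given page. _PageUrl is a hypothetical name. The "P-<n>/" segment follows the URL pattern used elsewhere in this thread, and whether page 1 also accepts a "P-1/" segment is an assumption that has not been checked against the site.

```autoit
; Hypothetical helper: inserts "P-<n>/" right after the search prefix of a result-list URL
Func _PageUrl($sUrl, $iPage)
    Local Const $sPrefix = "http://www.immobilienscout24.de/Suche/S-T/"
    ; Reject anything that is not a result-list URL
    If StringLeft($sUrl, StringLen($sPrefix)) <> $sPrefix Then Return SetError(1, 0, "")
    Return $sPrefix & "P-" & $iPage & "/" & StringTrimLeft($sUrl, StringLen($sPrefix))
EndFunc

Local $sBase = "http://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Bayern/Muenchen"
For $iPage = 1 To 3
    ConsoleWrite(_PageUrl($sBase, $iPage) & @CRLF)
Next
```

Because the function takes the URL and page number as parameters and returns a new string, it does not touch any global variable of the calling script.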

I made a script that goes through all the pages and collects the unique numbers that need to follow "http://www.immobilienscout24.de/expose/":

#include <Inet.au3>
#include <array.au3>
#include <String.au3>
#include <AutoItConstants.au3>
#include <MsgBoxConstants.au3>
#include <File.au3>

Global $Array[0];Array with all the link number info (ie http://www.immobilienscout24.de/expose/'77947839' <- this number
Global $FinalArray[0]
Global $url = "http://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Bayern/Muenchen/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/true?enteredFrom=result_list"
Global $content = _INetGetSource($url)
Global $pagecount = _StringBetween($content, '<li><span>&hellip;</span></li>', '<span class="smallPager">')
Global $pages;<-number of websites to scan

NumberOfPages();Get number of pages to scan
For $k = 2 To 20
    GetNumberedLinks()
    _ArrayConcatenate($FInalArray,$Array)
    $url = "http://www.immobilienscout24.de/Suche/S-T/P-" & $k & "/Wohnung-Miete/Bayern/Muenchen/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/true?enteredFrom=result_list"
    $HTML = _INetGetSource($url)
Next
    _ArrayDisplay($Array)

Func GetNumberedLinks();Get the number that follows http://www.immobilienscout24.de/expose/ in the current html
        ToolTip("Processing Stage: " & $k - 1 & " of " & $pages,0,0)
        $HTML = _INetGetSource($url)
        $j = 1
        Do
            $1 = StringInStr($HTML, "/expose/", 0, $j)
            $2 = StringTrimLeft($HTML, $1)
            $3 = StringReplace($2, '"', ";;")
            $4 = StringSplit($3, ";;", 1)
            $5 = $4[1]
            $6 = StringMid($5, 8, 8)
            If $6 >= 1 Then;Needed to add this if statement because I was getting extra characters at the end of the finalarray
                _ArrayAdd($Array,$6,1)
            EndIf
            $j += 1
        Until $1 = 0;Stop loop when there are no more instances of "/expose/"
        $Array = _ArrayUnique($Array,0,0,0,0)
EndFunc

Func NumberOfPages()
    $filtered = StringRegExp($pagecount[0], "\d{2}", $STR_REGEXPARRAYMATCH)
    $link0 = StringReplace($url, "http://www.immobilienscout24.de/Suche/S-T/", "")
    $link1 = StringReplace($url, $link0, "")
    $link2 = "/" & $link0
    $pages = $filtered[0]
EndFunc   ;==>NumberOfPages
Edited by computergroove


Looks like I got it to work. Test it and let me know if the data is correct.

;~ #NoTrayIcon
#include <Inet.au3>
#include <Array.au3>
#include <String.au3>
#include <AutoItConstants.au3>
#include <MsgBoxConstants.au3>

Global $Array[0];Array with all the link number info (ie http://www.immobilienscout24.de/expose/'77947839' <- this number
Global $FinalArray[0]
Global $url = "http://www.immobilienscout24.de/Suche/S-T/Wohnung-Miete/Bayern/Muenchen/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/true?enteredFrom=result_list"
Global $content = _INetGetSource($url)
Global $string_A = _StringBetween($content, '<li data-item="result" data-obid=', 'id="result')
Global $pagecount = _StringBetween($content, '<li><span>&hellip;</span></li>', '<span class="smallPager">')
Global $pages;<-number of websites to scan
FileDelete(@ScriptDir & "\Immobilien.csv")

NumberOfPages()

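For $k = 2 To $pages + 1 ;$pages is 19 in this example, so this equals the original "2 To 20" without hard-coding the count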
    GetNumberedLinks()
    _ArrayConcatenate($FinalArray,$Array)
    $url = "http://www.immobilienscout24.de/Suche/S-T/P-" & $k & "/Wohnung-Miete/Bayern/Muenchen/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/true?enteredFrom=result_list"
    $HTML = _INetGetSource($url)
Next

;~ $number =UBound ( $string_A ,$UBOUND_ROWS)
;~ $timesran = -1
;~ $i=0
;~ $numberarray = $number
;~ $numberarray -= 1

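For $m = 0 To UBound($Array) - 1 ;UBound() is the element count, so the last valid index is UBound() - 1
$id= $Array[$m] ;$Array holds every ID once; $FinalArray repeats the IDs of earlier pages
$urlscout = "http://www.immobilienscout24.de/expose/"&$id
export ()
Next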

func export ()
$url = $urlscout

If StringInStr($url,"http://www.immobilienscout24.de/expose") = false Then
MsgBox(48,"Error","URL not specified. Please make sure you enter the URL !")
Exit
Else
EndIf

If FileExists("Immobilien.csv") =false Then
FileWrite("Immobilien.csv","Name;Adresse;Telefon;Objekttyp;Ort;Baujahr;Zimmer;Bezugfrei ab;Wfl./ qm;Kaltmiete;Warmmiete;Scout- ID"& @CRLF)
EndIf

$content = _INetGetSource($url)
$name_A = _StringBetween($content, '<span data-qa="contactName" class="font-bold">', '</span>')
$preis_A = _StringBetween($content, '<span class="is24-operator">', '<div class="sourcecode section">')
$strase_A = _StringBetween($content, '<strong class="font-standard">' , '</strong><br/>')
$telefon_A = _StringBetween($content, '<div class="is24-phone-number hide">' ,'</div>')
$objekttyp_A = _StringBetween($content, '<dd class="is24qa-wohnungstyp">' ,'</dd>')
$ort_A = _StringBetween($content, '<span id="quickCheckHeader" class="is24-f">' , '</span>')
$baujahr_A = _StringBetween($content, '<dd class="is24qa-baujahr">','</dd>')
$zimmer_A = _StringBetween($content, '<dd class="is24qa-zimmer">','</dd>')
$bezugsfrei_A = _StringBetween($content, '<dd class="is24qa-bezugsfrei-ab">' ,'</dd>')
$wohnflache_A = _StringBetween($content, '<dd class="is24qa-wohnflaeche-ca">' ,'</dd>')
$preiswarm_A =_StringBetween($content, '<strong class="is24qa-gesamtmiete">','</strong>')

If  IsArray($strase_A) Then
$strase_B = $strase_A[0]
Else
$strase_B = "/"
EndIf

If  IsArray($name_A) Then
$name_B = $name_A[0]
Else
$name_B = "/"
EndIf

If  IsArray($telefon_A) Then
$telefon_B = $telefon_A[0]
Else
$telefon_B = "/"
EndIf

If  IsArray($objekttyp_A) Then
$objekttyp_B = $objekttyp_A[0]
Else
$objekttyp_B = "/"
EndIf

If  IsArray($baujahr_A) Then
$baujahr_B = $baujahr_A[0]
Else
$baujahr_B = "/"
EndIf

If  IsArray($preiswarm_A) Then
$preiswarm_B = $preiswarm_A[0]
$preiswarm_C = StringReplace($preiswarm_B , "disabled","")
$preiswarm_D = StringReplace($preiswarm_C, "disabled","",1) ;caused some bugs
$preiswarm_E = StringReplace($preiswarm_D, "disabled","")
Else
$preiswarm_B = "/"
$preiswarm_E = "/" ;fallback for the value that is actually written to the CSV row
EndIf

If  IsArray($zimmer_A) Then
$zimmer_B = $zimmer_A[0]
$zimmer_C = StringReplace($zimmer_B , ".",",")
Else
$zimmer_B = "/"
$zimmer_C = "/" ;fallback for the value that is actually written to the CSV row
EndIf

If  IsArray($ort_A) Then
$ort_B = $ort_A[0]
Else
$ort_B = "/"
EndIf

If  IsArray($bezugsfrei_A) Then
$bezugsfrei_B = $bezugsfrei_A[0]
Else
$bezugsfrei_B = "/"
EndIf

If  IsArray($wohnflache_A) Then
$wohnflache_B = $wohnflache_A[0]
Else
$wohnflache_B = "/"
EndIf

If  IsArray($preis_A) Then
$preis_B = $preis_A[0]
Else
$preis_B = "/"
EndIf

$aio= $name_B&";"&$strase_B&";"&$telefon_B&";"&$objekttyp_B&";"&$ort_B&";"&$baujahr_B&";"&$zimmer_C&";"&$bezugsfrei_B&";"&$wohnflache_B&";"&$preis_B&";"&$preiswarm_E&";"&$url

$sString1 = StringReplace($aio, "  ", "")
$sString2 = StringReplace($sString1, "<p>", "")
$sString3 = StringReplace($sString2, "<span>", "")
$sString4 = StringReplace($sString3, "</p>", "")
$sString5 = StringReplace($sString4, "Â", "")
$sString6 = StringReplace($sString5, '<span class="is24-operator">=</span>', "")     ;Filtering the given Data to get a proper CSV Format
$sString7 = StringReplace($sString6, "EUR", "")
$sString8 = StringReplace($sString7, "</span>","")
$sString9 = StringReplace($sString8, "Dieses Objekt im Vergleich zu anderen in ","")
$sString10 = StringReplace($sString9, "disabled",",00") ;caused bug
$sString11 = StringReplace($sString10, "Ã¼","ü")
$sString12 = StringReplace($sString11, "ÃŸ","ß")     ;Including special characters lower case
$sString13 = StringReplace($sString12, "Ã¤","ä")
$sString14 = StringReplace($sString13, "Ã¶","ö")
$sString15 = StringReplace($sString14, "Ãœ","Ü")    ;Including special characters upper case
$sString16 = StringReplace($sString15, "Ã„","Ä")
$sString17 = StringReplace($sString16, "Ã–","Ö")
$sStringfinal = StringReplace($sString17, @CRLF, "")

FileWrite ( "Immobilien.csv", $sStringfinal & @CRLF )

ToolTip("Stage 2 of 2, Processing: " & $m + 1 & " of " & Ubound($Array),0,0)

Sleep(100)
endfunc

Func GetNumberedLinks();Get the number that follows http://www.immobilienscout24.de/expose/ in the current html
        ToolTip("Stage 1 of 2, Processing: " & $k - 1 & " of " & $pages,0,0)
        $HTML = _INetGetSource($url)
        $j = 1
        Do
            $1 = StringInStr($HTML, "/expose/", 0, $j)
            $2 = StringTrimLeft($HTML, $1)
            $3 = StringReplace($2, '"', ";;")
            $4 = StringSplit($3, ";;", 1)
            $5 = $4[1]
            $6 = StringMid($5, 8, 8)
            If $6 >= 1 Then;Needed to add this if statement because I was getting extra characters at the end of the finalarray
                _ArrayAdd($Array,$6,1)
            EndIf
            $j += 1
        Until $1 = 0;Stop loop when there are no more instances of "/expose/"
        $Array = _ArrayUnique($Array,0,0,0,0)
EndFunc

Func NumberOfPages()
    $filtered = StringRegExp($pagecount[0], "\d{2}", $STR_REGEXPARRAYMATCH)
    $link0 = StringReplace($url, "http://www.immobilienscout24.de/Suche/S-T/", "")
    $link1 = StringReplace($url, $link0, "")
    $link2 = "/" & $link0
    $pages = $filtered[0]
EndFunc   ;==>NumberOfPages


  • Solution

 


 

Wow, thanks :P Testing it right now; I will report back in a few minutes!

Holy ... it works perfectly! So basically it first saves all the links and then starts the import ... genius :D, I didn't think of doing it this way ^^

Really nicely solved, and it looks clean now.
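
The "collect all links first, then import" idea can be shown in isolation. This is a minimal sketch with made-up IDs standing in for the scraped ones; it is not the posted solution, only the two-phase pattern. The _ArrayUnique call uses the same five arguments as the script in this thread, where the last 0 means the result carries no count element.

```autoit
#include <Array.au3>

; Phase 1: gather IDs; sample values, one of them deliberately duplicated
Local $aIds[4] = ["77947839", "81234567", "77947839", "80000001"]
Local $aUnique = _ArrayUnique($aIds, 0, 0, 0, 0)

; Phase 2: visit each unique ID exactly once
For $i = 0 To UBound($aUnique) - 1
    ConsoleWrite("http://www.immobilienscout24.de/expose/" & $aUnique[$i] & @CRLF)
Next
```

Separating the two phases means the list pages and the detail pages never compete for the same variables while the loops run.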

Is there any way I can thank you? Like a small donation? (Well, I don't have access to a credit card yet, so it would be a Paysafecard or an iTunes/Google Play card.)

Edited by andrewz

 

Couldn't this be done using functions, as in this pseudocode?

For $var = 1 to $filtered      ; total amount of pages
   $currenturl = _secondScript($var)    ; generate $currenturl
   $data = _firstScript($currenturl)      ; grab data from $currenturl
Next

 

No, somehow not. That was my first attempt at fixing this problem, but it messed up some arrays or variables; I don't know why.
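
One possible cause, offered as a guess rather than a diagnosis: export() assigns to $url and $content without declaring them Local, and both names also exist as Global variables, so each call overwrites the values the outer page loop relies on. The sketch below shows that scoping behaviour with hypothetical names and placeholder strings.

```autoit
Global $sUrl = "list-page"

; No Local declaration: the assignment writes to the existing global
Func _Clobber()
    $sUrl = "detail-page"
EndFunc

; Local declaration: the global stays untouched
Func _Safe()
    Local $sUrl = "detail-page"
    Return $sUrl
EndFunc

_Safe()
ConsoleWrite($sUrl & @CRLF) ; still "list-page"
_Clobber()
ConsoleWrite($sUrl & @CRLF) ; now "detail-page"
```

Declaring the working variables of each function as Local, or passing the URL in as a parameter, keeps the page loop and the export step from interfering with each other.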

Yes please lol. I spent all day working on it. Lucky for you I have some free time and I like to code. PM me if you want.
