Weird behaviour when extracting source information


So I'm making a script that should fetch the source of one website and read the info I need from it.

I can do it as a direct request and it works fine, but if I generate the link to the page with another script and pass it into this mining part, it just returns nothing. I hope someone can enlighten me as to what the mistake in the code is.

If I do it like this (setting the path directly), it works flawlessly.

#include <Inet.au3>
#include <String.au3>

$s_URL = "http://www.njuskalo.hr/procesori/cpu-amd-sempron-3400-x64-am2-oglas-9288094"
$source = _INetGetSource($s_URL)
$string = BinaryToString($source)
$cijena = _StringBetween($string, "cijena:", " kn")
Local $zaDBcijena = StringStripWS($cijena[0], 3)
MsgBox(0, "out", $zaDBcijena)

But if I do it as part of another script, it simply doesn't work:

procesor()
Func procesor()
Local $i = 0
$s_URL = "http://www.njuskalo.hr/procesori"
$source = _INetGetSource ( $s_URL)
$string = BinaryToString($source)
Global $url = _StringBetween($string, '<div class="image"><a href="', '"><img class="img_var')
$size = UBound ( $url )

While $i < $size
    $lokacija = $s_URL & $url[$i]
    otvaranje()
    $i = $i + 1
WEnd
EndFunc

;~ //Pregledavanje oglasa
Func otvaranje()
    $s_URL = $lokacija
    $source = _INetGetSource ($s_URL)
    $string = BinaryToString($source)
    $cijena = _StringBetween($string, "cijena:", " kn")
    Local $zaDBcijena = StringStripWS($cijena[0], 3)
    MsgBox(0, "out", $zaDBcijena)
EndFunc

Any ideas are highly appreciated.

Edit 1:

So after going over the code line by line, over and over, it's obvious that this line is the one not working, but why is still a mystery:

$source = _INetGetSource ( $s_URL)
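
One way to narrow down a line like this might be to print the URL that is actually requested and to check each result before using it. The sketch below is only an illustration: _ExtractPrice and _FetchPrice are hypothetical helper names, not part of the original script. It assumes _INetGetSource returns an empty string and sets @error when the download fails, and that _StringBetween returns a non-array value when the markers are missing.

```autoit
#include <Inet.au3>
#include <String.au3>

; Hypothetical helper (not from the thread): pulls the price out of page source.
; Returns "" and sets @error = 1 when the "cijena:" / " kn" markers are missing.
Func _ExtractPrice($sSource)
    Local $aPrice = _StringBetween($sSource, "cijena:", " kn")
    If Not IsArray($aPrice) Then Return SetError(1, 0, "")
    Return StringStripWS($aPrice[0], 3) ; 3 = strip leading and trailing whitespace
EndFunc   ;==>_ExtractPrice

; Hypothetical wrapper: logs the URL and stops early if the download fails.
Func _FetchPrice($sUrl)
    ConsoleWrite("Requesting: " & $sUrl & @CRLF)
    Local $sSource = _INetGetSource($sUrl)
    If @error Or $sSource = "" Then
        ConsoleWrite("No source returned for: " & $sUrl & @CRLF)
        Return SetError(2, 0, "")
    EndIf
    Local $sPrice = _ExtractPrice($sSource)
    Local $iErr = @error ; save it, because the next function call resets @error
    If $iErr Then ConsoleWrite("Price markers not found in: " & $sUrl & @CRLF)
    Return SetError($iErr, 0, $sPrice)
EndFunc   ;==>_FetchPrice
```

Calling _FetchPrice($lokacija) inside the loop would then show the exact URL in the console before each request, which makes a malformed link easy to spot.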
Edited by Centrally

The URL in the first example is different from the URL passed to Func otvaranje() in the second listing.
Try this to see it:

#include <Inet.au3>
#include <string.au3>
Global $lokacija
procesor()
Func procesor()
    Local $i = 0
    $s_URL = "http://www.njuskalo.hr/procesori"
    $source = _INetGetSource($s_URL)
    $string = BinaryToString($source)
    Global $url = _StringBetween($string, '<div class="image"><a href="', '"><img class="img_var')
    $size = UBound($url)

    While $i < $size
        $lokacija = $s_URL & $url[$i]
        otvaranje()
        $i = $i + 1
    WEnd
EndFunc   ;==>procesor

;~ //Pregledavanje oglasa
Func otvaranje()
    ConsoleWrite("http://www.njuskalo.hr/procesori/cpu-amd-sempron-3400-x64-am2-oglas-9288094" & @CRLF)
    ConsoleWrite($lokacija & @CRLF)
    MsgBox(0, "debug", "check the urls")
    Local $s_URL = "http://www.njuskalo.hr/procesori/cpu-amd-sempron-3400-x64-am2-oglas-9288094" ;$lokacija
    $source = _INetGetSource($s_URL)
    ConsoleWrite("source " & $source & @CRLF)
    $string = BinaryToString($source)
    $cijena = _StringBetween($string, "cijena:", " kn")
    Local $zaDBcijena = StringStripWS($cijena[0], 3)
    MsgBox(0, "out", $zaDBcijena)
EndFunc   ;==>otvaranje

edit

You have to remove /procesori from this line:

$s_URL = "http://www.njuskalo.hr/procesori"

edit2:

You have to add this new line just before the While...WEnd loop:

$s_URL = "http://www.njuskalo.hr"

$s_URL = "http://www.njuskalo.hr" ; <-- add this
    While $i < $size
        $lokacija = $s_URL & $url[$i]
        otvaranje()
        $i = $i + 1
    WEnd
Edited by PincoPanco

 


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....


It doesn't really matter whether the link is the same or not, because all the extracted links contain this info in their source. In my second example, at the step $s_URL = $lokacija, debugging shows the correct link in $s_URL, but then after the $source = _INetGetSource($s_URL) step it returns nothing.

Edit:

If I remove that part of the link, then all my generated links will be invalid.

Edit2:

I just cannot believe that I didn't see the double usage of the subcategory. Everything is fixed now, thank you very much for all your help.

Edited by Centrally


 

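The URL passed to the function is not correct: it contains the subcategory twice (/procesori/procesori/...), because $s_URL already ends in /procesori and each extracted link starts with /procesori as well.

To correct it, you have to add this new line just before the While...WEnd loop: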

$s_URL = "http://www.njuskalo.hr"

$s_URL = "http://www.njuskalo.hr" ; <-- add this
    While $i < $size
        $lokacija = $s_URL & $url[$i]
        otvaranje()
        $i = $i + 1
    WEnd
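
To make the difference visible, here is a minimal sketch that builds the link both ways. _JoinUrl is a hypothetical helper, not something from the original script, and the sketch assumes the extracted hrefs are root-relative (they start with /procesori/), as the doubled segment suggests.

```autoit
; Hypothetical helper: joins a site root and a path with exactly one slash between them.
Func _JoinUrl($sRoot, $sPath)
    $sRoot = StringRegExpReplace($sRoot, "/+$", "") ; drop trailing slashes from the root
    $sPath = StringRegExpReplace($sPath, "^/+", "") ; drop leading slashes from the path
    Return $sRoot & "/" & $sPath
EndFunc   ;==>_JoinUrl

; Assumed shape of one extracted href (root-relative)
Local $sHref = "/procesori/cpu-amd-sempron-3400-x64-am2-oglas-9288094"

; Joined against the site root: the subcategory appears once
ConsoleWrite(_JoinUrl("http://www.njuskalo.hr", $sHref) & @CRLF)

; Concatenated onto the listing URL, as in the original loop: the subcategory is doubled
ConsoleWrite("http://www.njuskalo.hr/procesori" & $sHref & @CRLF)
```

Comparing the two lines in the console shows the doubled /procesori/procesori/ segment directly.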

 

