AL3X


Now I have to find a way to download the file without using WinWait, MouseClick and so on...

But I don't want to use things such as ControlClick or MouseClick... I want to make a script that can detect the link for the file and download it... I'm still on it :)

I'm so sorry, I misread your post :">

Actually I got curious about that and I got into it as well... Let's see what we can find ;)

EDIT: OK, let's face it. I have no clue on manually sending 'POST' data to get the download file.

Edited by footswitch


Sorry, I can't understand this: "I have no clue on manually sending 'POST' data to get the download file."

Could you explain it or translate it to Spanish? :">

still no progress with the download... :)

EDIT: but I posted some other tools that I made xD

Alright, so... if you look at the source code of the download page, you can see that the URL of the file that you want is shown there.

But without the correct 4-digit code, instead of the file, you get a code error page.

The code is passed with the POST method when you submit the form. POST is a way of sending data to the server that's used for many purposes, like a shopping cart or a user registration, for instance.

Unlike the GET method, which is quite transparent and easy to use (the data is placed in the URL itself - http://www.autoitscript.com/forum/index.php?act=Post&CODE=02&f=2&t=52505&qpid=398745), POST data is sent in a "hidden" way, inside the body of the request. With POST, the URL remains intact and the user doesn't see exactly what data is being sent. And I have no clue how to send that data manually in order for the server to return the file.
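To make the GET/POST difference concrete, here is a minimal sketch in Python (not AutoIt, and not the script from this thread) that builds both kinds of raw request as text. The host `example.com`, the path `/download` and the field name `accesscode` are made-up placeholders; the real form fields would have to be read from the page source.

```python
from urllib.parse import urlencode


def build_get_request(host, path, params):
    """GET: the parameters travel in the URL's query string."""
    query = urlencode(params)
    return (
        f"GET {path}?{query} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


def build_post_request(host, path, params):
    """POST: the same parameters travel in the request body instead."""
    body = urlencode(params)  # urlencode output is plain ASCII
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"  # ASCII, so chars == bytes
        "Connection: close\r\n"
        "\r\n"
        f"{body}"
    )


if __name__ == "__main__":
    # Hypothetical field name; the real one comes from the form's HTML.
    print(build_get_request("example.com", "/download", {"accesscode": "AB12"}))
    print(build_post_request("example.com", "/download", {"accesscode": "AB12"}))
```

In both cases the server gets the same key/value pairs; with POST they simply come after the blank line that ends the header, which is why they never show up in the address bar.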

google 'methods get post' for more info

EDIT: wait i think i found something... i'll get to this thread soon

Edited by footswitch


OK, I'm here :) waiting... I am searching too... but... I don't know what I'm searching for... xD

I can't understand what you're telling me because my English isn't very good... (I think)....

Well, if you know the thing with the POST data... look into it.... I'm doing a GUI ;)

PS: WoW !!! We are a great team ;) .... making our program for downloading from rapidshare ;):D:D:D xD

Indeed! :D Well I can't explain stuff that well in English, so it makes it even harder!

But guess what... I'm halfway there with this POST stuff =)

I grabbed an old UDF from this guy OverloadUT... He hasn't been here since last November, but he started a really nice idea. This guy has some knowledge of the HTTP protocol and shared a couple of UDFs to get started with.

So I've been doing some experiments to try to learn via brute-force trial/error and I'm starting to feel really hopeful right now :(

For now I can go from the main download page to the '4-digit code' page so let's say that, from now on, it can only get better... I hope :">

Keep it up :(



Done. :)

OK, I'm just messing around! It's almost done... ;)

I'm sharing my thoughts to give you the big picture of where we're standing.

So this made me build the script from scratch but hell did I love to learn this new stuff ;)

The idea here now is totally different and does not involve a browser at all (at least it's not needed anymore, although it could still be done with extra coding, and AutoIt would interact with it like a pseudo-proxy).

ToDo (edited):

- Enable any link support (tricky) - for coding and testing purposes I only used the link you provided earlier. Now I realised that some modifications are needed.

- Implement OCR in this new method (easy, practically effortless)

- Gradually save the received data to a file (don't have a clue)

So let me explain: as I understand it, the data is being received via a TCP connection and is being saved to a variable, so basically it's being held in RAM. Now if we have a 100 MB file, you can imagine what will happen: it will eat RAM like crazy, and then we'll have a 100 MB variable waiting to be saved to the hard disk.

I think that TCPRecv() together with FileOpen() - both in binary mode - could eventually work, but I know little of these functions and the way they work (FileOpen in binary mode and TCPRecv at all), so it'll take some time until I get the whole idea of what to do...
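The "receive a chunk, write a chunk" idea can be sketched like this. It's Python rather than AutoIt, so `sock.recv()` and `open(path, "wb")` are only stand-ins for what TCPRecv() and FileOpen() in binary mode would do; the function name and the chunk size are my own choices, not anything from the UDFs mentioned here.

```python
import socket


def stream_to_file(sock, path, chunk_size=8192):
    """Write each received chunk straight to disk; return total bytes written.

    Only one chunk is held in memory at a time, so a 100 MB download
    never turns into a 100 MB variable.
    """
    total = 0
    with open(path, "wb") as out:
        while True:
            chunk = sock.recv(chunk_size)
            if not chunk:  # empty bytes: the peer closed the connection
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

Note that this writes everything the socket delivers, HTTP header included, so the header still has to be cut off before the first write.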

EDIT2: So basically AutoIt will manage the download directly. No external utilities involved. Maybe someone has already made a downloader that also involves TCPRecv and binary data. That would be awesome and save a LOT of work.

EDIT3: TCPRecv() automatically switches to binary mode when the download is started. Cool, huh? However, it switches to binary before the HTTP headers are downloaded, so this may represent an additional challenge.

EDIT4: There's also this big one... How the hell does one know if the download is completed? I mean, the server closes the connection at the end, but how can we know if the connection didn't drop? I hope it's not as complex as I see it.
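One possible answer to EDIT4, sketched in Python under the assumption that the server announces the file size in a Content-Length header: count the body bytes actually written and compare the two numbers once the connection closes. The function name and the status strings are invented for this example.

```python
def download_status(content_length, bytes_received):
    """Compare the announced size with what actually arrived.

    content_length is None when the server sent no Content-Length header.
    """
    if content_length is None:
        return "unknown"    # nothing to compare against
    if bytes_received < content_length:
        return "truncated"  # connection dropped before the end
    if bytes_received == content_length:
        return "complete"
    return "oversized"      # more data than announced; something is off
```

A "truncated" result is the case where the connection dropped, so that's where a retry would make sense.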

Let's see what happens :(

Edited by footswitch


Wow!!! Excellent :) I made the GUI

Here's a screenshot :(

Posted Image

RAD ..... A great tool ... xD jiji

Let's see now how we can put the links in the "Links" edit box.... hmmm

EDIT: In the "Path" field, with the help of the "Browse" button, the user will be able to choose which file the program should read the links from... xD

Good morning! ... omg, I only slept like 5 hours ;) it's so hot in here I can't sleep...

But hey, nice GUI ;) you gave me some ideas with that (the GUI will need to be bigger, I think):

- Suggestion no. 1:
  - Change 'browse' to 'Load file'
  - Add 'Save File'
  - create 'file_XYZ.txt.old' when saving, for backup reasons

- Suggestion no. 2:
  - 'Add from clipboard' button to add a link from the clipboard (needs _IsRapidShare($link) to verify that the link is actually rapidshare :()
  - Maybe change the Links control to a List control (GUICtrlCreateList) and then add the options:
    - 'Select All'
    - 'Download All'
    - change 'Download' to 'Download Selected'
    - 'Remove Selected'
  - Add an input control to add links by typing them

- Suggestion no. 3:
  - Have two progress bars. One for current file progress and the other for total progress.
  - Maybe there should be a "_RapidShareGetBytesTotal()" to get size information on all the files, so that the total progress bar can be accurate.

Wow... what do you think? ;)


OK, here is the GUI version 0.2 xD

(...)

How is the POST data thing going?

What a great team we are xD

Great, the GUI is taking shape ;)

No kidding, I'm excited about this project because after it's done there's a lot of code we can use in lots of other stuff ;)

We need to think about those buttons. There are a lot of other options, like, 'Select All', 'Reverse Selection'... Maybe we should create a menu instead, like Edit --> Select All Links, Edit --> Copy Links to clipboard, and so on. Because otherwise we're gonna have a thousand buttons to click and the user goes nuts, right? ;)

Regarding the POST stuff... well let's say that, from my ToDo list, the "any" rapidshare URL support is now a reality. It doesn't have a good filter for rapidshare links yet: If the URL contains a file named 'rapidshare.com.rar', StringInStr($url,'rapidshare.com') returns a false positive.
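A stricter filter could look at the host part of the URL only, instead of searching the whole string. Here's a small Python sketch of that idea (the thread's `_IsRapidShare($link)` would be the AutoIt counterpart; this function name and the decision to accept subdomains are my assumptions):

```python
from urllib.parse import urlparse


def is_rapidshare_link(url):
    """True only when the URL's host is rapidshare.com or a subdomain of it.

    The path is ignored, so a file called 'rapidshare.com.rar' hosted
    somewhere else no longer produces a false positive.
    """
    host = (urlparse(url).hostname or "").lower()
    return host == "rapidshare.com" or host.endswith(".rapidshare.com")
```

A URL without "http://" in front has no recognisable host for `urlparse`, so it gets rejected; whether that's wanted depends on how links get pasted into the GUI.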

I'm getting to the File Save now... Wish me luck... :)


Ok ;) Luck !!! :think:

I was out of the house all day.... because this week is the "fiesta" in my town xD. FIESTAAAAAAAA xD !!!

A lot of drinking, a lot of fucking, a lot of playing and almost no programming/sleeping xD !! JAJAJA

But I'll see what I can do with the Edit menu :D

EDIT: OK ;) Version 0.3 of the GUI is ready

(...)

It's pretty good xD

Do you already have the save file thing?

If not, don't worry, we have a lot of time ;) jijiji

EDIT2: WOW !!! This thread is the one and only thread with 440 visits xD and 3 pages :( xD

Lol, who knows, maybe you'll drink too much vodka and forget all about AutoIt :argue:

I think people get in this thread just because of its title. It's not a clear title so people tend to check it out :D

I would suggest that you change the title and add a brief comment on your first post, so that people would know what this project is about and the work that's being done.

Alright, so... I'm able to save the file :( BUT the file still contains the HTTP header.

In other words, how to split a 100MB binary file?

:)


O.o

I don't know :S :(

and... yes :P , I'll drink vodka&eristov black& calimocho&whisky and I'll forget AutoIt, C, VB, and Linux shell xD

jijiji :P

but hey! I'll learn it all again from scratch xD

OK, I'll change the title :)

Good luck with the file split :)

EDIT: hey ! Can I help you with something ?

Well... Right now the only true challenge remains to be the file splitting.

The rest are just safety measures, like URL checking and such... Those are just a matter of spending the time to write them.

I think I'll just post a thread asking for a kind soul's help on this splitting!


OK :) When you're done I'll post the source code of the GUI. You could post the source of the "body" of the program and we could keep the GUI separate from the rest... xD :)

I'll see if I can make some other changes to the GUI...

I have to go to bed now :P (24 hours without sleeping xD)

See you tomorrow :P

Well... after struggling with the whole file split scene... I've decided to perform this step differently.

Every chunk of received data is going to be scanned until a @CRLF&@CRLF is found (0x0D0A0D0A). This sequence is sent by the server right before the file itself starts. At least that's what I've seen from the beginning. I just HOPE that this is regular HTTP 1.1 protocol.

When these 4 bytes are found, everything after them starts to be saved to disk.

Any data received before the file (that is, the HTTP 1.1 header) shall be parsed and saved to variables, to gather some details about the downloaded file (for instance, the server tells the file size in this header, and this can be useful to cross-check with the downloaded file).
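The steps above can be sketched roughly like this, in Python rather than AutoIt. The class and method names are invented. For what it's worth, a blank line ending the header is standard HTTP/1.1, so the 0x0D0A0D0A marker is safe to rely on. The one trap is that the 4 bytes can be split across two chunks, which is why this sketch buffers the header until the marker shows up.

```python
HEADER_END = b"\r\n\r\n"  # 0x0D0A0D0A


class HeaderSplitter:
    """Separates the HTTP header from the body in a chunked byte stream."""

    def __init__(self):
        self._buffer = b""
        self.header = None  # raw header bytes once the marker has been seen

    def feed(self, chunk):
        """Return the body bytes in this chunk (b"" while still in the header)."""
        if self.header is not None:
            return chunk  # header already done: everything is file data
        self._buffer += chunk
        head, sep, rest = self._buffer.partition(HEADER_END)
        if not sep:
            return b""  # marker not complete yet, keep buffering
        self.header = head
        self._buffer = b""
        return rest
```

Each chunk coming from the socket goes through `feed()`, and whatever comes back is what gets written to the file. Only the header is ever buffered, so memory use stays small.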

I'm imagining that it won't be too complicated. Actually, after thinking about it, it looks like it's pretty obvious.

Things I haven't thought about just yet:

- GOCR can return a bad result (but with 4 chars anyway), we have to admit that this can happen.

Possible solutions:

- Cross-check the filesize of the file we want with header's "Content-Length".

- Consider "Content-Type" info in the header.

- There's also a parameter in the header when the server is returning a file for download: "Content-Disposition: Attachment; filename=xxxxxxxxxx.xxx". If the header doesn't contain Content-Disposition, then there's no file to save (this seems the best option)
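Here's a rough Python sketch of how those header checks might look once the raw header has been separated from the file data. The function names are made up (the AutoIt version might end up as something like `_HTTPParseHeader()`), and the filename handling is deliberately simple: it ignores quoting edge cases.

```python
def parse_header(raw_header):
    """Split a raw HTTP header into its status line and a dict of fields.

    Field names are lower-cased because HTTP header names are
    case-insensitive.
    """
    lines = raw_header.decode("iso-8859-1").split("\r\n")
    fields = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip().lower()] = value.strip()
    return lines[0], fields


def attachment_filename(fields):
    """Return the filename from Content-Disposition, or None if absent."""
    disposition = fields.get("content-disposition", "")
    for part in disposition.split(";")[1:]:
        key, sep, value = part.strip().partition("=")
        if sep and key.lower() == "filename":
            return value.strip('"')
    return None  # no file on offer, e.g. an HTML error page came back
```

If `attachment_filename()` returns None, the response is probably the "wrong code" page rather than the file, so nothing should be saved. If it returns a name, `int(fields["content-length"])` gives the size to cross-check against.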

EDIT: no major changes

Edited by footswitch


#cs

Last update:

8-Sep-2007 @ 1:01 GMT

ToDo list:

- implement method to verify that the URL is downloadable (just to be able to perform INetGet() with no flaws)

- implement method to download a file as successfully as possible via INetGet() - try again, wait if internet connection is down, and so on

- implement method to verify if rapidshare download is valid:

- use http://rapidshare.com/en/checkfiles.html ??

- important: get a collection of the error pages that can be returned instead of the expected page and create an appropriate behaviour for each situation

(this may require the script to 'scan' htmls before opening them in IE - because some of them pop up warning dialog boxes and those can be tough to handle)

(also look for a method to avoid dialog boxes in IE. I THINK there's a silent option somewhere)

- parse the HTTP 1.1 header when data is received in binary mode (this requires some modifications in HTTP.au3 - HTTP UDFs by OverloadUT)

; i have to check out that _HTTPRead() because it seems to parse HTTP 1.1 headers pretty well.

; in the future the parser should be a standalone function to use both in _HTTPRead() and _HTTPReadToFile()

; it could be called _HTTPParseHeader() or something like that

url that's being used for tests: "http://rapidshare.com/files/53773278/rapidshare.com.rar.html"

#ce


I think what you guys should do is create a picture control; when you get the image URL, download the image, set it as the control's picture, and then ask the user to type the code in manually.

