--

footswitch · September 5, 2007

Now I have to find a way to download the file without using WinWait... MouseClick... and so on...

But I don't want to use things such as ControlClick or MouseClick... I want to make a script that can detect the link for the file and download it... I'm still on it

I'm so sorry, I misread your post :">

Actually I got curious about that and I got into it as well... Let's see what we can find

EDIT: OK, let's face it. I have no clue on manually sending 'POST' data to get the download file.

Edited September 5, 2007 by footswitch

AL3X · September 5, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 5, 2007

Sorry, I can't understand this: "I have no clue on manually sending 'POST' data to get the download file."
Could you explain it or translate it into Spanish? :">
Still no progress with the download...
EDIT: but I posted some other tools that I made xD

Alright, so... if you look at the source code of the download page, you can see that the URL of the file that you want is shown there.

But without the correct 4-digit code, instead of the file, you get a code error page.

The code is passed with the POST method when you submit the form; it's a way of sending data that is used for many purposes, like a shopping cart or a user registration, for instance.

Unlike the GET method, which is quite transparent and easy to use (placed in the URL itself - http://www.autoitscript.com/forum/index.php?act=Post&CODE=02&f=2&t=52505&qpid=398745), POST's data is sent in a certain "hidden" way. With POST, the URL remains intact and the user doesn't know exactly what data is being sent. And I have no clue on how to send that data manually in order for the server to return the file.

google 'methods get post' for more info
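To make the GET/POST difference concrete, here is a minimal sketch in Python (not AutoIt, just to show the raw text of the two requests). The host `www.example.com`, the path and the field names are placeholders, not the real RapidShare form; the real names have to be read from the page's `<form>`.

```python
from urllib.parse import urlencode


def build_get(host, path, fields):
    """GET: the form data travels in the URL itself, after the '?'."""
    return (
        "GET " + path + "?" + urlencode(fields) + " HTTP/1.1\r\n"
        "Host: " + host + "\r\n"
        "\r\n"
    )


def build_post(host, path, fields):
    """POST: the URL stays clean and the same data goes in the request body."""
    body = urlencode(fields)
    return (
        "POST " + path + " HTTP/1.1\r\n"
        "Host: " + host + "\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: " + str(len(body.encode("ascii"))) + "\r\n"
        "\r\n" + body
    )


# Placeholder values (assumptions, for illustration only).
fields = {"act": "Post", "CODE": "02", "f": "2"}
print(build_get("www.example.com", "/forum/index.php", fields))
print(build_post("www.example.com", "/forum/index.php", fields))
```

So the "hidden" part of POST is simply the request body: the same `name=value&name=value` string, sent after the blank line that ends the header instead of inside the URL.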

EDIT: wait i think i found something... i'll get to this thread soon

Edited September 5, 2007 by footswitch

AL3X · September 5, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 5, 2007

Ok, I'm here waiting... I am searching too... but... I don't know what I am searching for... xD
I can't understand what you are telling me because my English isn't very good... (I think)...
Well, if you know the thing with the POST data... search for it... I'm doing a GUI
PS: WoW !!! We are a great team... making our program for downloading from rapidshare :D:D xD

Indeed!

Well I can't explain stuff that well in english, so it makes it even harder!

But guess what... I'm half the way there with this POST stuff =)

I grabbed an old UDF from this guy OverloadUT... He hasn't been here since last November, but he started a really nice idea. This guy has some knowledge of the HTTP protocol and shared a couple of UDFs to get started with.

So I've been doing some experiments to try to learn via brute-force trial/error and I'm starting to feel really hopeful right now

For now I can go from the main download page to the '4-digit code' page so let's say that, from now on, it can only get better... I hope :">

Keep it up

footswitch · September 6, 2007

Done.

OK, I'm just messing around! It's almost done...

I'm sharing my thoughts to give you the big picture of where we're standing.

So this made me build the script from scratch but hell did I love to learn this new stuff

The idea here now is totally different and does not involve a browser at all (at least it's not needed anymore, although it could still be done with extra coding, and autoit would interact with it like a pseudo-proxy).

ToDo (edited):

- Enable any link support (tricky) - for coding and testing purposes I only used the link you provided earlier. Now I realised that some modifications are needed.

- Implement OCR in this new method (easy, practically effortless)

- Gradually save the received data to a file (don't have a clue)

So let me explain: as I understand it, the data is received over a TCP connection and saved to a variable, so basically it's being kept in RAM. Now if we have a 100 MB file, you can imagine what will happen. It will eat RAM like crazy and then we'll have a 100 MB variable waiting to be saved to the hard disk.

I think that TCPRecv() together with FileOpen() - both in binary mode - could eventually work, but I know little of these functions and the way they work (FileOpen in binary mode and TCPRecv at all), so it'll take some time until I get the whole idea of what to do...
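The idea of saving the data gradually can be sketched like this, in Python rather than AutoIt, so the function names are not the AutoIt ones. `save_stream` and `CHUNK_SIZE` are made-up names for illustration; `source` is assumed to be any object with a `read()` method (for a real socket, something like `sock.makefile("rb")`). The point is that only one chunk is held in memory at a time.

```python
CHUNK_SIZE = 8192  # arbitrary chunk size, an assumption


def save_stream(source, path, chunk_size=CHUNK_SIZE):
    """Copy bytes from source to path one chunk at a time.

    Returns the total number of bytes written.
    """
    total = 0
    with open(path, "wb") as out:  # binary mode, like FileOpen in binary mode
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty read: the source has no more data
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

With this shape, a 100 MB download never needs more than `chunk_size` bytes of memory for the payload, and the returned byte count can later be compared against the size the server announced.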

EDIT2: So basically AutoIt will manage the download directly. No external utilities involved. Maybe someone has already made a downloader that also involves TCPRecv and binary data. That would be awesome and save a LOT of work.

EDIT3: TCPRecv() automatically switches to binary mode when the download is started. Cool, huh? However, it switches to binary before the http headers are downloaded, so this may represent an additional challenge.

EDIT4: There's also this big one... How the hell does one know if the download is completed? I mean, the server closes the connection at the end, but how can we know if the connection didn't drop? I hope it's not as complex as I see it.

Let's see what happens

Edited September 6, 2007 by footswitch

AL3X · September 6, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 6, 2007

WoW !!! Excellent. I made the GUI
Here is a screenshot
RAD... A great tool... xD jiji
Let's see now how we can put the links in the "Links" edit control... hmmm
EDIT: In the "Path" field, with the help of the "Browse" button, the user will be able to point to the file that the program should read the links from... xD

Good morning! ... omg, I only slept like 5 hours

it's so hot in here I can't sleep...

But hey, nice GUI! You gave me some ideas with that (the GUI will need to be bigger, I think):

- Suggestion no. 1:
  - Change 'browse' to 'Load file'
  - Add 'Save File'
  - create 'file_XYZ.txt.old' when saving, for backup reasons

- Suggestion no. 2:
  - 'Add from clipboard' button to add a link from the clipboard (needs _IsRapidShare($link) to verify that the link is actually rapidshare)
  - Maybe change the Links control to a List control (GUICtrlCreateList) and then add the options:
    - 'Select All'
    - 'Download All'
    - change 'Download' to 'Download Selected'
    - 'Remove Selected'
  - Add an input control to add links by typing them

- Suggestion no. 3:
  - Have two progress bars. One for current file progress and the other for total progress.
  - Maybe there should be a "_RapidShareGetBytesTotal()" to get size information on all the files, so that the total progress bar can be accurate.

Wow... what do you think?

AL3X · September 6, 2007

--

Edited July 2, 2015 by AL3X

AL3X · September 6, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 6, 2007

Ok, here is the GUI version 0.2 xD
(...)
How is the POST data thing going?
What a great team we are xD

Great, the GUI is taking shape

No kidding, I'm excited with this project because after it's done there's a lot of code we can use in lots of other stuff

We need to think about those buttons. There are a lot of other options, like, 'Select All', 'Reverse Selection'... Maybe we should create a menu instead, like Edit --> Select All Links, Edit --> Copy Links to clipboard, and so on. Because otherwise we're gonna have a thousand buttons to click and the user goes nuts, right?

Regarding the POST stuff... well let's say that, from my ToDo list, the "any" rapidshare URL support is now a reality. It doesn't have a good filter for rapidshare links yet: If the URL contains a file named 'rapidshare.com.rar', StringInStr($url,'rapidshare.com') returns a false positive.
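One possible way around that false positive is to look only at the host part of the URL instead of searching the whole string. A rough sketch in Python (not AutoIt); `is_rapidshare_url` is a made-up name, and it assumes the URL includes the `http://` scheme, otherwise no host is detected and it returns False.

```python
from urllib.parse import urlparse


def is_rapidshare_url(url):
    """True only if the host itself is rapidshare.com or a subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    return host == "rapidshare.com" or host.endswith(".rapidshare.com")


# The test URL from this thread: the host matches, so this prints True.
print(is_rapidshare_url("http://rapidshare.com/files/53773278/rapidshare.com.rar.html"))
# 'rapidshare.com' appears only in the file name here, so this prints False.
print(is_rapidshare_url("http://example.org/files/rapidshare.com.rar"))
```

Checking the host this way ignores whatever appears in the path or file name, which is exactly where the substring test gets fooled.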

I'm getting to the File Save now... Wish me luck...

AL3X · September 6, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 6, 2007

Ok, good luck !!!
I was out of the house all day... because this week is the "fiesta" in my town xD. FIESTAAAAAAAA xD !!!
A lot of drinking, a lot of fucking, a lot of playing and almost no programming/sleeping xD !! JAJAJA
But I'll see what I can do with the Edit menu
EDIT: Ok, version 0.3 of the GUI is ready
(...)
It's pretty good xD
Do you already have the save file thing?
If not, don't worry, we have a lot of time jijiji
EDIT2: WOW !!! This thread is the one and only thread with 440 visits xD and 3 pages xD

Lol so who knows you'll drink too much vodka and forget all about autoit :argue:

I think people get in this thread just because of its title. It's not a clear title so people tend to check it out

I would suggest that you change the title and add a brief comment on your first post, so that people would know what this project is about and the work that's being done.

Alright, so... I'm able to save the file BUT the file still contains the http header.

In other words, how to split a 100MB binary file?

AL3X · September 6, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 6, 2007

O.o
I don't know :S
and... yes, I'll drink vodka & eristoff black & calimocho & whisky and I'll forget autoit, c, vb, and linux shell xD
jijiji
but hey! I'll learn it all again from 0 xD
OK, I'll change the title
Good luck with the file split
EDIT: hey! Can I help you with something?

Well... Right now the only true challenge remains to be the file splitting.

The rest are just safety measures, like URL checking and such... Those are just a matter of spending the time to write them.

I think I'll just post a thread asking for a kind soul's help on this splitting!

AL3X · September 6, 2007

--

Edited July 2, 2015 by AL3X

footswitch · September 7, 2007

Ok, when you're done I'll post the source code of the GUI. You could post the source of the "body" of the program and we could keep the GUI separate from the rest... xD
I'll see if I can make some other changes to the GUI...
I have to go to bed now (24 h without sleeping xD)
See you tomorrow

Well... after struggling with the whole file split scene... I've decided to perform this step differently.

Every chunk of received data is going to be parsed until a @CRLF&@CRLF is found (0x0D0A0D0A). This sequence (the end of the last header line followed by one empty line) is sent right before the server starts sending the file. At least that's what I've seen from the beginning. I just HOPE that this is regular HTTP 1.1 protocol.

When these 4 bytes are found, chunks after them start to be saved to disk.
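A rough sketch of that step, in Python rather than AutoIt. `split_response` is a made-up name, and `chunks` stands for whatever successive receive calls return. One detail worth handling: the 4 bytes can be cut in half between two chunks, so the sketch keeps buffering until the header is complete instead of searching each chunk on its own.

```python
HEADER_END = b"\r\n\r\n"  # 0x0D0A0D0A


def split_response(chunks, body_out):
    """Write everything after the first blank line to body_out.

    Returns the header bytes, or None if the delimiter never showed up.
    """
    buffer = b""
    header = None
    for chunk in chunks:
        if header is not None:
            # Header already found: the rest is file data, pass it through.
            body_out.write(chunk)
            continue
        buffer += chunk
        pos = buffer.find(HEADER_END)
        if pos != -1:
            header = buffer[:pos]
            # Whatever follows the delimiter in this chunk is already file data.
            body_out.write(buffer[pos + len(HEADER_END):])
            buffer = b""
    return header
```

Only the header is ever accumulated in memory (it is small); once it has been found, each chunk goes straight to the output, and a later 0x0D0A0D0A inside the file itself is not treated as a delimiter.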

Any data received before the file (that is, the http 1.1 header) shall be parsed and saved to variables - to gather some details about the downloaded file (for instance, the server tells the file size in this header, and this can be useful to cross-check with the downloaded file)

I'm imagining that it won't be too complicated. Actually, after thinking about it, it looks like it's pretty obvious.

Things I haven't thought about just yet:

- GOCR can return a bad result (but with 4 chars anyway), we have to admit that this can happen.

Possible solutions:

- Cross-check the filesize of the file we want with header's "Content-Length".

- Consider "Content-Type" info in the header.

- There's also a parameter in the header when the server is returning a file for download: "Content-Disposition: Attachment; filename=xxxxxxxxxx.xxx". If the header doesn't contain Content-Disposition, then there's no file to save (this seems the best option)
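Those checks could look roughly like this, sketched in Python rather than AutoIt. `parse_header`, `looks_like_file` and `is_complete` are made-up names for illustration, and the sample header is invented, not a real RapidShare response. Comparing the received byte count with Content-Length is also one way to tell a finished download from a dropped connection.

```python
def parse_header(header_bytes):
    """Return (status_code, fields) with field names lowercased."""
    lines = header_bytes.decode("iso-8859-1").split("\r\n")
    status_code = int(lines[0].split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK"
    fields = {}
    for line in lines[1:]:
        if ":" in line:
            name, value = line.split(":", 1)
            fields[name.strip().lower()] = value.strip()
    return status_code, fields


def looks_like_file(fields):
    """A file download is expected to carry 'Content-Disposition: attachment'."""
    return "attachment" in fields.get("content-disposition", "").lower()


def is_complete(fields, bytes_received):
    """True if the server announced a size and we received exactly that much."""
    length = fields.get("content-length")
    return length is not None and int(length) == bytes_received


# Invented sample header, for illustration only.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Length: 5\r\n"
    b"Content-Disposition: Attachment; filename=test.rar"
)
status, fields = parse_header(sample)
print(status, looks_like_file(fields), is_complete(fields, 5))
```

If a bad code makes the server return an HTML error page, that response would normally have no Content-Disposition field, so `looks_like_file` returns False and nothing needs to be saved.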

EDIT: no major changes

Edited September 7, 2007 by footswitch

footswitch · September 8, 2007

#cs

Last update:

8-Sep-2007 @ 1:01 GMT

ToDo list:

- implement method to verify that the URL is downloadable (just to be able to perform INetGet() with no flaws)

- implement method to download a file as successfully as possible via INetGet() - try again, wait if internet connection is down, and so on

- implement method to verify if rapidshare download is valid:

- use http://rapidshare.com/en/checkfiles.html ??

- important: get a collection of the error pages that can be returned instead of the expected page and create an appropriate behaviour for each situation

(this may require the script to 'scan' htmls before opening them in IE - because some of them pop up warning dialog boxes and those can be tough to handle)

(also look for a method to avoid dialog boxes in IE. I THINK there's a silent option somewhere)

- parse the HTTP 1.1 header when data is received in binary mode (this requires some modifications in HTTP.au3 - HTTP UDFs by OverloadUT)

; i have to check out that _HTTPRead() because it seems to parse HTTP 1.1 headers pretty well.

; in the future the parser should be a standalone function to use both in _HTTPRead() and _HTTPReadToFile()

; it could be called _HTTPParseHeader() or something like that

url that's being used for tests: "http://rapidshare.com/files/53773278/rapidshare.com.rar.html"

#ce

AL3X · September 8, 2007

--

Edited July 2, 2015 by AL3X

Generator · September 8, 2007

I think what you guys should do is create a picture label and when you get the image url download it and set pic for the label, and then ask user to put it in manually
