KarlosTHG Posted September 16, 2009
Hi again guys, I study at Universidad EAFIT in Colombia, which gives each student a username and password for a platform where we can see the results of our quizzes and coursework. I need to write a script that downloads the page and parses it to fill a GUI with the results. The second part is easy, but I have no idea how to download the page automatically, because it requests authentication info. How can I do that? Sorry for my English, I have only taken a few classes. Any help would be appreciated. Thanks!
jvanegmond Posted September 16, 2009
With the function InetGet. To use a username and password when connecting, simply prefix the server name with "username:password@", e.g. "http://myuser:mypassword@www.somesite.com".
github.com/jvanegmond
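A minimal sketch of that suggestion, assuming the server asks for HTTP authentication. The host, credentials and output file are placeholders taken from the example URL above, not real values. A login handled by an HTML form is a different mechanism, and this prefix does not fill in such a form.

```autoit
; Sketch only: fetch a page that is protected by HTTP authentication.
; The URL, the credentials and the output path are placeholders.
Local $sUrl = "http://myuser:mypassword@www.somesite.com/"
Local $sFile = @TempDir & "\page.html"

; Option 1 forces a reload from the remote site instead of the local cache.
Local $iBytes = InetGet($sUrl, $sFile, 1)

If @error Or $iBytes = 0 Then
    ConsoleWrite("Download failed" & @CRLF)
Else
    ; Show what was actually downloaded, so a login page is easy to spot.
    ConsoleWrite(FileRead($sFile) & @CRLF)
EndIf
```

Printing the downloaded file is a quick way to see whether the server returned the wanted page or something else.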
KarlosTHG (Author) Posted September 16, 2009
OK, I will try it now.
KarlosTHG (Author) Posted September 16, 2009
It doesn't work. It gets the login page, but something different is going on. I opened the page in Firefox and viewed the page source: http://webapps.eafit.edu.co/ulises/login.do
<html:html> <head> <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"/> <meta http-equiv="Cache-Control" Content="no-cache"/> <meta http-equiv="Pragma" Content="no-cache"/> <meta http-equiv="Expires" Content="0"/> <title> Sistema de Admisiones y Registro </title> <link href="http://webapps.eafit.edu.co/imagenes/v1/styles/estiloEafit2005.css" type="text/css" rel="stylesheet"/> </head> <body topmargin="0" leftmargin="0"> <script src="http://www.google-analytics.com/urchin.js" type="text/javascript"> </script> <script type="text/javascript"> _uacct = "UA-1439547-2"; urchinTracker(); </script> <table class="noPrint" width="100%" border="0" cellpadding="0" cellspacing="0" background='http://webapps.eafit.edu.co/imagenes/v1/apps/ulises/fondo.gif' bgcolor='#E5E5E5'> <tr> <td width="182"><img src='http://webapps.eafit.edu.co/imagenes/v1/apps/ulises/logoEafit.gif' border="0"/></td> <td align="center"><img src='http://webapps.eafit.edu.co/imagenes/v1/apps/ulises/titulo.gif' border="0"/></td> <td width="186"> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr> <td align="right"><a href='http://www.eafit.edu.co' target="_blank"><img src='http://webapps.eafit.edu.co/imagenes/v1/apps/ulises/homeEafit.gif' border="0"/></a></td> </tr> <tr> <td align="right"><img src='http://webapps.eafit.edu.co/imagenes/v1/apps/ulises/imagenTop.gif' border="0"/></td> </tr> </table></td> </tr> </table> <script type="text/javascript"> function entrar() { document.getElementById('loginId').style.display = "none"; document.getElementById('imagen').style.display = ""; loginForm.submit(); } function load() { document.getElementById('loginId').style.display = ""; 
document.getElementById('imagen').style.display = "none"; } function hideElement(elementId) { document.getElementById(elementId).style.display = "none"; } function showElement(elementId) { document.getElementById(elementId).style.display = ""; } function logueoPorDocumento(){ if(document.loginForm.tipo.value != '6'){ hideElement('login'); showElement('msnDocumento'); hideElement('msnLogin'); hideElement('msnLogin2'); showElement('tdcto'); showElement('ndcto'); hideElement('clave'); showElement('clave'); hideElement('entrar'); showElement('entrarRecordar'); document.loginForm.tipo.value = '6'; } } function logueoPorlogin(){ if(document.loginForm.tipo.value != '3'){ showElement('login'); showElement('msnLogin'); showElement('msnLogin2'); hideElement('msnDocumento'); hideElement('tdcto'); hideElement('ndcto'); hideElement('clave'); showElement('clave'); hideElement('entrarRecordar'); showElement('entrar'); document.loginForm.tipo.value = '3'; } } function recordarClave(){ document.loginForm.action= "/ulises/user-search.do"; document.loginForm.submit(); } </script> <!-- Título de la Página --> <p align="center" class="titulo"> Ingreso al Sistema </p> <!-- Formulario --> <form name="loginForm" method="post" action="/ulises/login-submit.do"> <!-- Muestra los errores generados por la validación del formulario --> <div id="loginId" align="center" style='display=""'> <table border="0" cellspacing="0" cellpadding="1" align="center"> <!-- Referencia al atributo tipo del formulario --> <input type="hidden" name="tipo" value="3"> <tr id="login"> <td> <!-- Referencia al label del ApplicationResources --> Usuario (*): </td> <td> <!-- Referencia al atributo login del formulario --> <input type="text" name="login" maxlength="30" size="15" value=""> </td> </tr> <tr id="tdcto" style="display:none"> <td> <!-- Referencia al label del ApplicationResources --> Tipo de Documento: </td> <td> <select name="tipoDcto"><option value="CC">Cédula De Ciudadania</option> <option value="CE">Cédula 
De Extranjeria</option> <option value="CO">Código De Estudiante</option> <option value="CG">Guatemala - Cédula De Ciudadanía</option> <option value="NU">Nro Único Identificación Personal</option> <option value="OT">Otro</option> <option value="PP">Pasaporte</option> <option value="RE">Registro</option> <option value="TI">Tarjeta De Identidad</option></select> </td> </tr> <tr id="ndcto" style="display:none"> <td> <!-- Referencia al label del ApplicationResources --> No. de Documento: </td> <td> <input type="text" name="nroDcto" maxlength="12" size="15" value=""> </td> </tr> <tr id="clave"> <td> <!-- Referencia al label del ApplicationResources --> Clave (*): </td> <td> <input type="password" name="clave" maxlength="30" size="15" value=""> </td> </tr> <tr id="entrar"> <td colspan="2"> <div align="center"> <a href="javascript:entrar();"> <input type="image" class="form" src="http://webapps.eafit.edu.co/imagenes/v1/botones/btn_entrar.gif" onclick="javascript:entrar();"/> </a> </div> </td> </tr> <tr id="entrarRecordar" style="display:none"> <td colspan="2"> <div align="center"> <input type="image" name="" src="http://webapps.eafit.edu.co/imagenes/v1/botones/btn_entrar.gif" border="0" class="form" alt="Entrar"> <a href="javascript:recordarClave();"> <img alt="Recordar Clave" border="0" src="http://webapps.eafit.edu.co/imagenes/v1/botones/btn_recordarClave.gif"/> </a> </div> </td> </tr> </table> <p/> <div align="center" id='msnLogin2'> <a href="javascript:logueoPorDocumento();" class="msgMensaje"> Si no recuerda o no tiene asignado su logín por favor de click aquí para logearse con tipo y número de documento de identidad. 
</a> </div> <div align="center" class="mini" id='msnLogin'> <br/> Use el login y la clave que tenga para entrar al correo electrónico asignado por la Universidad <p/> Los campos marcados con * son obligatorios <p/> </div> <div align="center" class="mini" id='msnDocumento' style="display:none"> Use su tipo y número de documento de identidad y la clave que tiene asignada en el sistema. <br/> <a href="javascript:logueoPorlogin();"> Si desea ingresar con el login y clave que tiene asignadas de click aquí </a> <p/> Los campos marcados con * son obligatorios <p/> </div> </div> <div align="center" class="mini"> | <a href="/ulises/comentarios.do">Comentarios y Sugerencias</a> | <br/>Universidad EAFIT: Teléfono: (57) (4) - 2619500| Dirección: Carrera 49 - 7 Sur 50 | Medellín - Colombia - Suramérica <br/>© Copyright 2007 Universidad EAFIT ® Todos los Derechos Reservados - Centro de Informática<br/> Fecha Actualización: 2009-09-09<br/> Utilice <a href='http://www.microsoft.com/downloads/details.aspx?displaylang=es&FamilyID=1e1550cb-5e5d-48f5-b02b-20b602228de6' target='_blank'>Internet Explorer 6.0</a> o una versión superior de este navegador.</div> <script type="text/javascript"> document.loginForm.login.focus(); </script> </form> <div id="imagen" align="center" style='display:none'> <img border="0" src="http://webapps.eafit.edu.co/imagenes/v1/icons/ico_animado.gif"/> </div> </body> </html:html>
Searching the code, I found an interesting line: <form name="loginForm" method="post" action="/ulises/login-submit.do"> with this explanation: <!-- Muestra los errores generados por la validación del formulario --> Translation: "Shows the errors generated by the form validation". I think it may be PHP or something like that. Is there a way to find out the syntax of the POST request for login-submit.do? Thanks
DaleHohm Posted September 16, 2009
When you open a page in your browser, it is essentially downloaded to your machine and the browser gives you an interface to work with it. I am confused by your question. Once you download the page, how do you plan to work with it and what do you intend to do with the results?
Dale
Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl
MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model
Automate input type=file (Related)
Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded
Better Better? IE.au3 issues with Vista - Workarounds
SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y
"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?
Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble
KarlosTHG (Author) Posted September 16, 2009
I want to write a script that shows the schedule, the quiz results and other info in my own way, like an offline version of the platform. So I need to download the page, which is protected by a login written in some kind of PHP, parse it, save it to the hard disk, and show the retrieved information in my application's GUI. I can do everything except the page download. I hope I explained myself well. Thanks
jvanegmond Posted September 16, 2009
OK, so you download the page, but instead of the page you wanted you get a login page? You want to parse the login page or something and build an HTTP request based on that? I fail to see where this is going.
KarlosTHG (Author) Posted September 16, 2009
I don't need the login page, I need a page that comes after the login. But when I use InetGet with the page that I need, I always get the login page source, because I don't know how to pass my login info from the script in order to download the page that I need. I don't think I can explain myself any better, haha. Bye and thanks
jvanegmond Posted September 16, 2009
Read over DaleHohm's reply again and pay close attention. What is happening when you try to download the page directly:
- The page you are trying to download is redirecting you to the login page (probably because you don't have some cookies set)
- InetGet knows it's being redirected, so it downloads the page it's being redirected to instead
- You're stuck with the login page. LoL.
So what you should do is:
- Try to download the page
- Be redirected to the login page
- Log in on the login page
- Check if the login was successful
- Try to download the page
For this you'll need a better way of handling a website than just the InetGet function, because there are a lot of things involved: POST HTTP requests, cookies, redirection. DaleHohm has written a very nice library for this. It uses Internet Explorer's internal workings to interact with a website. You can find out about all these things in the AutoIt help file. The functions all start with _IE and they're very convenient.
I don't attend a university at all. I just do this stuff in my free time. Kthxbye.
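For readers who want to see the POST request, cookies and redirection spelled out, here is a rough sketch that uses the WinHttpRequest COM object instead of the _IE functions recommended above. It is a different technique, not the library mentioned in the post. The field names tipo, login and clave and the action /ulises/login-submit.do come from the page source posted earlier. The credentials, the target page URL and the _UrlEncode helper are hypothetical, and the sketch assumes the server accepts a plain form post and that the object keeps the session cookie between requests.

```autoit
; Sketch only: log in with a POST request, then fetch a page in the same session.
Local $sBase = "http://webapps.eafit.edu.co"
Local $sUser = "myuser"                           ; placeholder
Local $sPass = "mypassword"                       ; placeholder
Local $sTarget = $sBase & "/ulises/some-page.do"  ; hypothetical page behind the login

Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
If Not IsObj($oHttp) Then Exit 1

; Step 1: send the same fields the HTML form would send (tipo=3 means login by user name).
$oHttp.Open("POST", $sBase & "/ulises/login-submit.do", False)
$oHttp.SetRequestHeader("Content-Type", "application/x-www-form-urlencoded")
$oHttp.Send("tipo=3&login=" & _UrlEncode($sUser) & "&clave=" & _UrlEncode($sPass))

; Step 2: request the wanted page with the same object, so the cookies
; received in step 1 are sent along (assumed behaviour of this object).
$oHttp.Open("GET", $sTarget, False)
$oHttp.Send()
Local $sHtml = $oHttp.ResponseText

; Step 3: if the login form is still in the response, the login did not work.
If StringInStr($sHtml, 'name="loginForm"') Then
    ConsoleWrite("Still on the login page" & @CRLF)
Else
    ConsoleWrite($sHtml & @CRLF)
EndIf

; Hypothetical helper: percent-encode everything except unreserved characters.
Func _UrlEncode($sText)
    Local $sOut = "", $sChar
    For $i = 1 To StringLen($sText)
        $sChar = StringMid($sText, $i, 1)
        If StringRegExp($sChar, "[A-Za-z0-9\-_.~]") Then
            $sOut &= $sChar
        Else
            $sOut &= "%" & Hex(Asc($sChar), 2)
        EndIf
    Next
    Return $sOut
EndFunc
```

Because only the HTML document is requested, no images or stylesheets are fetched. If the site depends on JavaScript or extra hidden fields, the _IE functions are likely the safer route.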
DaleHohm Posted September 16, 2009
How much do you know about HTML and the Document Object Model (DOM)?
To log in, see _IECreate, _IEFormGetObjByName, _IEFormElementGetObjByName, _IEFormElementSetValue, _IEFormImgClick.
To save the page, see _IEDocReadHTML or _IEBodyReadHTML.
There are more refined ways of getting individual elements off the page, but the techniques require knowledge of the DOM.
Dale
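The functions listed above could be combined roughly as follows. This is a sketch, not a tested script: the form name loginForm and the field names login and clave are taken from the page source posted earlier, while the credentials, the target URL and the output file are placeholders. It submits the form with _IEFormSubmit rather than _IEFormImgClick, which is a substitution made here for simplicity.

```autoit
#include <IE.au3>

; Placeholders, not real values.
Local $sUser = "myuser"
Local $sPass = "mypassword"
Local $sTarget = "http://webapps.eafit.edu.co/ulises/some-page.do" ; hypothetical
Local $sFile = @ScriptDir & "\page.html"

; Open the login page and wait for it to load.
Local $oIE = _IECreate("http://webapps.eafit.edu.co/ulises/login.do")

; Find the form and its two visible fields by name.
Local $oForm = _IEFormGetObjByName($oIE, "loginForm")
If @error Then
    _IEQuit($oIE)
    Exit 1
EndIf
Local $oLogin = _IEFormElementGetObjByName($oForm, "login")
Local $oClave = _IEFormElementGetObjByName($oForm, "clave")

; Fill in the credentials and submit. _IEFormSubmit waits for the next page by default.
_IEFormElementSetValue($oLogin, $sUser)
_IEFormElementSetValue($oClave, $sPass)
_IEFormSubmit($oForm)

; Go to the page that needs the login and read its HTML.
_IENavigate($oIE, $sTarget)
Local $sHtml = _IEDocReadHTML($oIE)

; Save the HTML, overwriting any older copy (mode 2 = write, erase previous contents).
Local $hFile = FileOpen($sFile, 2)
FileWrite($hFile, $sHtml)
FileClose($hFile)

_IEQuit($oIE)
```

The saved file can then be parsed offline with the usual string functions.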
goldenix Posted September 17, 2009
This is how I log in to Google Mail: you log in, retrieve the info you need and close the browser. To find the name used in $o_signin = _IEGetObjByName($oIE, "signIn"), use DebugBar for Internet Explorer (google it).

#include <IE.au3>
$oIE = _IECreate("http://mail.google.com/mail/?hl=et&tab=wm") ; opens the webpage
; get pointers to the username, password and sign-in fields
$o_login = _IEGetObjByName($oIE, "Email")
_IEFormElementSetValue($o_login, "mudak.yo")
$o_password = _IEGetObjByName($oIE, "passwd")
_IEFormElementSetValue($o_password, "dasitmeinpasswerthohoho")
$o_signin = _IEGetObjByName($oIE, "signIn")
_IEAction($o_signin, "click")
Sleep(5000) ; give the next page time to load
_IEQuit($oIE)

And this is how I work with webpage source, hope it helps:

#include <INet.au3>
$URL = 'www.muwebpage.rt'
$SEARCH_FOR = ' Downloads and Information</title>'
$HTMLSource = _INetGetSource($URL)
$_Arrayline = StringSplit($HTMLSource, @LF) ; array of lines, element [0] holds the count
For $i = 1 To $_Arrayline[0]
    If StringInStr($_Arrayline[$i], $SEARCH_FOR) Then ; the line contains the search text
        _sample($_Arrayline[$i])
        ExitLoop
    EndIf
Next

Func _sample($STRING)
    ; flag 1: use the whole delimiter string, so $split[1] is the text before it
    $split = StringSplit($STRING, ' Downloads and Information</title>', 1)
    ConsoleWrite($split[1] & @CRLF)
    ;~ $split = StringSplit($split[2],'"',1) ; split $split[2] to get the final page URL
    ;~ $DL_Link = $split[1]
    ;~ $split = StringSplit($DL_Link,"/",1) ; get the filename from the link
    ;~ $Filename_Save_as = $split[7]
    Return $split[1] ; $DL_Link is commented out above, so return the matched text
EndFunc

My Projects:
- Guide - ytube step by step tut for reading memory with autoitscript + samples
- WinHide - tool to show hide windows, Skinned With GDI+
- Virtualdub batch job list maker - Batch Process all files with same settings
- Exp calc - Exp calculator for online games
- Automated Microsoft SQL Server 2000 installer
- Image sorter helper for IrfanView - 1 click opens img & move ur mouse to close opened img
KarlosTHG (Author) Posted September 17, 2009
@Manadar: That is exactly what I need. About the university: I study Business Administration and I just do this stuff in my free time too, because it is fun for me and exercises my brain, hahaha.
@DaleHohm: I will try these functions later because I have to study now. About HTML I just know the basics, and about the DOM I just know its name, but maybe I will read something about it later.
@goldenix: I will try DebugBar and your example later. But in your example, will an IE window be shown? That looks a little bad in an application. Can I use the WinSetState function to hide the window?
Thanks to all of you. I cannot wait to try your examples, but right now I have to read a document about globalization and how it affects the internal economy. Bye, and thanks again for your help.
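On the question of the visible window, one possible approach is sketched below, assuming the standard IE.au3 library: the third parameter of _IECreate controls whether the browser window is shown. The URL is the login page from earlier in the thread. The commented-out line shows the WinSetState variant asked about, as an untested alternative.

```autoit
#include <IE.au3>

; Parameters: URL, try-attach = 0, visible = 0, so the browser window starts hidden.
Local $oIE = _IECreate("http://webapps.eafit.edu.co/ulises/login.do", 0, 0)
If @error Then Exit 1

; Alternative for a window that is already visible:
; WinSetState(_IEPropertyGet($oIE, "hwnd"), "", @SW_HIDE)

; Work with the page as usual, here just reading the body HTML.
Local $sHtml = _IEBodyReadHTML($oIE)
ConsoleWrite(StringLen($sHtml) & " characters read" & @CRLF)

; Always quit the browser, because a hidden window cannot be closed by hand.
_IEQuit($oIE)
```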
KarlosTHG (Author) Posted September 21, 2009
@goldenix: Hi again, I tried your script and it works very well, and it hides the window too. Now I can do my project, so many thanks, man, and thanks to all the others who helped me.
KarlosTHG (Author) Posted September 22, 2009
Is there a way to download just the HTML, without the images, to get a faster download?