MonsieurOUXX Posted August 13, 2009

Hi, the scenario is quite simple:
- I'm loading a webpage using _IENavigate.
- I know that the right page is loaded, because _IEBodyReadText returns a certain string that I'm expecting in the page.

Problem: _IEBodyReadHTML doesn't return the same source as the one I get in IE by right-clicking on the page and selecting "View Source". In fact, the source returned by _IEBodyReadHTML doesn't even contain the string found by _IEBodyReadText (even though that string is inside the <body> tags).

I'm very confused. Could it be frames interfering with that?
DaleHohm Posted August 13, 2009

The IE functions return the page HTML AFTER client-side processing. View Source shows you the source BEFORE client-side processing. I suggest you investigate with DebugBar: its View Source icon lets you see either version, and it also shows you frames and iFrames.

Dale
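The difference Dale describes can be made visible by fetching the same URL twice: once as raw bytes from the server, and once through IE's document object. The following is a minimal sketch, not something posted in the thread. The URL is a placeholder, and the approach assumes the page can be fetched without a login prompt, which was not the case for the original poster's page.

```autoit
#include <IE.au3>

; Hypothetical URL for illustration; substitute the page you are automating.
Local $sUrl = "http://www.example.com/"

; Raw source as sent by the server (roughly what "View Source" shows).
; Option 1 forces a reload from the remote site instead of the cache.
Local $sRaw = BinaryToString(InetRead($sUrl, 1))

; The document as IE holds it after parsing and after scripts have run.
Local $oIE = _IECreate($sUrl)
Local $sRendered = _IEDocReadHTML($oIE)

ConsoleWrite("Raw length:      " & StringLen($sRaw) & @CRLF)
ConsoleWrite("Rendered length: " & StringLen($sRendered) & @CRLF)

_IEQuit($oIE)
```

If the two lengths differ noticeably, the page is probably being rewritten on the client side or IE is normalising the markup. Note that _IEDocReadHTML returns the whole document, while _IEBodyReadHTML returns only the contents of the body.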
MonsieurOUXX Posted August 13, 2009

Quote (DaleHohm): "The IE functions return the page HTML AFTER client-side processing... View Source shows you the source BEFORE client-side processing. Suggest you investigate with DebugBar"

So there's no way to simply see the basic code of the page? Why does IE *have* to mess things up???

I'll check that with DebugBar, but if it turns out that what I need is the code BEFORE processing, then I'm f*****, am I not?
MonsieurOUXX Posted August 13, 2009 (edited)

I have installed DebugBar. Here are the steps I follow:
- I open the webpage
- I make sure I have the DebugBar pane open on the left of my browser
- I click on the "DOM" tab
- I expand "Document"
- I expand "HTML"
- I click on "BODY"

In the lower pane of DebugBar, I can see the code of the body. It starts with:

    <BODY onload="LoadDefaults('');">
    <FORM id=Form1 name=Form1 action=home2.aspx method=post>
    <DIV>
    <INPUT id=__VIEWSTATE type=hidden value=Vk23RX9e............ ................. ................. ................. .................+IgmZ76v/6I6 name=__VIEWSTATE>
    </DIV>
    <TABLE id=Table1 cellSpacing=0 cellPadding=0 width="100%" bgColor=#18317b border=0>
    <TBODY>
    ................. ................. .................
    </TBODY>
    </TABLE>
    <DIV>
    <INPUT id=__EVENTVALIDATION type=hidden value=ww/UkllZSLM0FiEMMfaU27TV7pDAkZFtPx+MAfDTL0GYxv9u800tsBzkII1XAcOSciEt0dbN6HYPCdL/JB1xVP/NQjuDxiqM name=__EVENTVALIDATION>
    </DIV>
    </FORM>
    </BODY>

Notes about the code:
- The "LoadDefaults('');" JavaScript function simply applies some formatting (a dynamic menu) but is not meant to encrypt or hide any data.
- The first "value" attribute (in the <INPUT> tag) is very long (50 lines). It looks like some encrypted stuff, but the data that I want to read on the page comes AFTER that. It's in the second part of the <FORM>, after the "<TBODY>" tag.

However, when I call the function "_IEBodyReadHTML":
- it returns ONLY that <FORM> (not the rest of <BODY>)
- the form has the same structure but does not contain the same data

Here is the result:

    <FORM id=Form1 name=Form1 action=home2.aspx method=post>
    <DIV><INPUT id=__VIEWSTATE type=hidden value=Vk23RX9e/d/sObo3c/6iAVHfIa1oHe4kro6yIFO3SwzTRoYIHyBdRLJRdq/OKZ/I3dcX8X+fNDWIzaLD7g8TOkmp5AYkKj/12+T8sGDDTMDimLrHXkLBHTwbp5R3LkYhAbUsj7C/KX8= name=__VIEWSTATE>
    </DIV>
    <TABLE id=Table1 cellSpacing=0 cellPadding=0 width="100%" bgColor=#18317b border=0>
    ...... ......
    </TABLE></DIV>
    <DIV><INPUT id=__EVENTVALIDATION type=hidden value=tK4+iT12Tg4DtuwG4dDKqLI5C7FMDoN+piNBihNhYTVvIBDdZTeddGw9OP8KQUftQwnKQM49sfXMBpLHAvE3NEasUate3jyJ name=__EVENTVALIDATION>
    </DIV>
    </FORM>
    <TBODY>

Notes:
- As you can see, the first <INPUT> tag (encrypted) is much shorter.
- The <TABLE> contains very little data. All the data that I want has been removed.

So, here is the question: why does "_IEBodyReadHTML" return only the FORM and not the rest of the BODY? (And, if possible, why does it "decrypt" it?)

Edited August 13, 2009 by MonsieurOUXX: formatting and clarifications
MonsieurOUXX Posted August 14, 2009

OK, so it seems that the code I get is indeed the page *after* the JavaScript has been applied.

I have tried the following workarounds:
- Use _INetGetSource. PROBLEM: the page I want to load is protected by a credentials popup window, and InetGet doesn't seem to have enough parameters to avoid waiting for the download to be finished.
- Automate the "View Source" option in IE. PROBLEM: this works only if the IE window is visible. If I make it invisible, sending the key sequence "Alt+V, c" to the $oIE object doesn't create the expected Notepad window with the source.
- Iterate through the elements of the page (with $oDoc = _IEDocGetObj($o_IE)). PROBLEM: I'm not good enough with the DOM keywords to get a working program, and I'm not even sure that it gives the raw source of the page anyway.

I haven't found a solution. I will leave this problem aside for now.
jvanegmond Posted August 14, 2009

Looping through all the objects in a page also returns objects created by JavaScript, so what you get is the page after the JavaScript has been applied.
corgano Posted February 7, 2012

Sorry to bump an old topic, but I was searching through the forum for help with this EXACT PROBLEM. Has a solution been found? Has any progress been made on this?
guinness Posted February 7, 2012

Well, have you at least tried the latest version of AutoIt? A lot has changed in the last 2.5 years, and I mean a lot! You're no stranger here, so when you do have a problem it's always best to provide some code that reproduces it, as well as the version of AutoIt you're using and the system you're on, e.g. Win7.
corgano Posted February 8, 2012

Sorry, I should have known that. I'm running AutoIt v3.3.8.0 on Windows 7, 64-bit. I'm using the _IE commands to make a wrapper for our school's "Desire to learn" service that will make the pager tool easier to use.

    $oIE = _IECreate("http://dl.cssd.ab.ca")
    $oForm = _IEFormGetObjByName($oIE, "processLogonForm")
    setfeild($oForm, "userName", $user)
    setfeild($oForm, "password", $pass)
    _IEFormSubmit($oForm)

This works as expected, as does then changing to the page I want to get:

    _IENavigate($oIE,"https://dl.cssd.ab.ca/d2l/tools/pager/pager.asp?ou=41406")
    _IELoadWait($oIE)

Then I want to run some regexp lines on the page to get the names and IDs of my contacts. This is where it stops working. I use Chrome's Inspect Element, and I can see:

Derp. OK, so the data I wanted was in a FRAME on the page. Instead of loading pager.asp I loaded friends.asp, and it showed the proper friends list. It works.

To anyone else looking for help and seeing this thread: use something like Inspect Element in Chrome to check whether what you want is in a FRAME, then navigate to THAT page instead and try to get the page source. I just happened to miss this yesterday.
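The frame check corgano describes can also be done from the script itself, without a separate inspector. The sketch below is an illustration under stated assumptions, not code from the thread: the URL is a placeholder, and it assumes each frame comes from the same domain as the parent page, since IE may refuse script access to cross-domain frames.

```autoit
#include <IE.au3>

; Hypothetical URL for illustration; substitute the page you are automating.
Local $oIE = _IECreate("http://www.example.com/")

; Called without an index, _IEFrameGetCollection returns the frame
; collection and stores the number of frames in @extended.
Local $oFrames = _IEFrameGetCollection($oIE)
Local $iCount = @extended
ConsoleWrite("Frames found: " & $iCount & @CRLF)

For $i = 0 To $iCount - 1
    ; Called with an index, it returns that frame's window object,
    ; which the other _IE functions accept like a browser object.
    Local $oFrame = _IEFrameGetCollection($oIE, $i)
    ConsoleWrite("Frame " & $i & " URL: " & _IEPropertyGet($oFrame, "locationurl") & @CRLF)
    ConsoleWrite(_IEBodyReadHTML($oFrame) & @CRLF)
Next

_IEQuit($oIE)
```

Each URL printed is a page that can be loaded directly with _IENavigate, which is what corgano ended up doing with friends.asp. Reading the frame object in place avoids the second navigation when the frame is accessible.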