Rip text from a website?


Rorka

Try this (it worked for me):

#AutoIt3Wrapper_Change2CUI=y

#include <INet.au3>

Global Const $URL = "http://www.mmowned.com/forums/bots-programs/266946-release-beta-theexplorer-updated-04-11-09-a.html"
ConsoleWrite("Loading URL: " & $URL & @CRLF)

Global $Source = _InetGetSource($URL)
If Not StringLen($Source) Then
    ;MsgBox(16, @ScriptName, StringFormat("Error loading the following page.\n\nURL: %s", $URL))
    ConsoleWrite("Error loading URL." & @CRLF)
    Exit
EndIf

; Find posts div
Global $CurrentVersion = ""
Global $postPos = 0 ; search offset, set after the first post is matched
Global $divPos = StringRegExp($Source,'<div[^>]+id\s*=\s*"posts"[^>]*>', 0)
If $divPos Then
    $divPos = @extended
    ; Find first post
    Global $PostID = StringRegExp($Source, '<div[^>]+id\s*=\s*"post_message_(\d+)"[^>]*>', 1, $divPos)
    $postPos = @extended

    If Not @error And StringIsInt($PostID[0]) Then
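        ; Find the current version. The match array goes into its own variable
        ; so a failed match leaves $CurrentVersion as the empty string.
        Global $aVersion = StringRegExp($Source, "(?i)Current\s*Version[^:]*:\s*(\d+\.\d+)", 1, $postPos)
        If Not @error Then
            $CurrentVersion = $aVersion[0]
        EndIf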
    EndIf
EndIf

If Not StringLen($CurrentVersion) Then
    ;MsgBox(16, @ScriptName, "Current Version not found.")
    ConsoleWrite("Version not found" & @CRLF)
Else
    ;MsgBox(64, @ScriptName, "Current Version: " & $CurrentVersion)
    ConsoleWrite("Current Version: " & $CurrentVersion & @CRLF)
EndIf

; When compiled, pause so the console window stays open long enough to read the version
If @Compiled Then
    Sleep(5000)
EndIf

Thanks.

Edit: But InetGet is better

Not better, just different. If the content you want is generated by dynamic HTML or sits behind authentication, InetGet will not retrieve it for you, but the IE functions will.
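For comparison, here is a rough sketch of the IE-function route, assuming the same thread URL and version pattern as the script earlier in this topic. The variable names ($sUrl, $oIE, $sText, $aMatch) are only illustrative, and it has not been tested against the live page, so treat it as a starting point rather than a finished solution.

```autoit
#include <IE.au3>

; Assumption: same page as in the _InetGetSource script above.
Global Const $sUrl = "http://www.mmowned.com/forums/bots-programs/266946-release-beta-theexplorer-updated-04-11-09-a.html"

; Start a hidden IE instance (no attach, not visible) and wait for the page to load.
Global $oIE = _IECreate($sUrl, 0, 0, 1)
If @error Then
    ConsoleWrite("Could not create the IE object or load the page." & @CRLF)
    Exit 1
EndIf

; Read the text of the document body as the browser rendered it,
; so content produced by scripts on the page is included.
Global $sText = _IEBodyReadText($oIE)
_IEQuit($oIE)

; Same version pattern as the earlier script, applied to the rendered text.
Global $aMatch = StringRegExp($sText, "(?i)Current\s*Version[^:]*:\s*(\d+\.\d+)", 1)
If @error Then
    ConsoleWrite("Version not found" & @CRLF)
Else
    ConsoleWrite("Current Version: " & $aMatch[0] & @CRLF)
EndIf
```

Because a real browser session does the loading, cookies and logins held by IE apply as well, at the cost of being slower and heavier than a plain download.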

Dale
