
[SOLVED] [newbie/StringRegExp] Error: Subscript used with non-Array variable

littlebigman

Hello

I started learning AutoIt today, and I like it very much.

However, I can't figure out why StringRegExp() fails to extract a piece of information from a web page that I copy from the Google Chrome browser into the clipboard:

#include <Clipboard.au3>

WinWaitActive("List of companies - Chrome")
Sleep(500)
;Select and copy current web page, and look for pattern using regex
_ClipBoard_Empty()
Send("^a^c")
$clipboard = _ClipBoard_GetData()

$nbrfound = StringRegExp($clipboard, '^(\d+) companies ', 1)
;Check that array is valid, to avoid "Subscript used with non-Array variable"
If Not IsArray($nbrfound) Then
    MsgBox(48,"Error","Not an array")
Else
    MsgBox(0,"My title",$nbrfound[0])
EndIf

Thank you for any hint.

Edited by littlebigman

enaiman

First make sure that you really get the content of the web page in the clipboard.

A simple MsgBox before the RegEx will show you what you are dealing with.


littlebigman

Thanks for the tip. Unfortunately, StringRegExp() still fails, even though the contents of the page that I copy into the clipboard and then into a variable (as either the default format or $CF_TEXT) are displayed correctly in a MsgBox :idea: I'm beginning to wonder whether what MsgBox() displays is what the variable really contains, which would explain why StringRegExp() fails.

Here's the code:

#include <Clipboard.au3>

;=========== 1. Empty clipboard and check that it's really empty
ClipPut("")
$clipboard = _ClipBoard_GetData()
MsgBox(0,"Checking contents of clipboard",$clipboard)

;=========== 2. Wait for browser window to be displayed
WinWaitActive("List of companies - Chrome")
Sleep(500)

;=========== 3. Copy current page to clipboard
Send("^a^c")
Sleep(500)
;Makes no difference: Text is displayed OK in MsgBox, regardless
;$clipboard = _ClipBoard_GetData($CF_UNICODETEXT)
;$clipboard = _ClipBoard_GetData($CF_TEXT)
$clipboard = _ClipBoard_GetData()
MsgBox(0,"Contents of clipboard",$clipboard)

;=========== 4. Use regex to extract information from text
;^123 companies
$nbrfound = StringRegExp($clipboard, '^(\d+) companies', 1)
If @error <> 0 Then
    ;Error=1 Array is invalid. No matches.
    MsgBox(48,"Error",@error)
Else
    MsgBox(0,"Pattern found",$nbrfound[0])
EndIf

Has someone already struggled with text copied from a web page into the clipboard?

Thank you.

99ojo

Hi,

for more debugging:

$nbrfound = StringRegExp($clipboard, '^(\d+) companies ', 1)

If @error Then MsgBox (0,"", "Error: " & @error & " Extended: " & @extended); insert line straight after your StringRegExp Call

See helpfile:

Flag = 1 or 2:

@Error  Meaning
0       Array is valid. Check @Extended for next offset.
1       Array is invalid. No matches.
2       Bad pattern, array is invalid. @Extended = offset of error in pattern.

;-))

Stefan

littlebigman

Thanks Stefan for the tip. I do get an error (Error 1 Extended 0).

Further investigating, I'm seeing some unexpected behavior:

1. @CRLF doesn't add a #13#10 to a string, and just returns garbage (e.g. "0"):

$clipboard = "dontgetit"
;Displayed OK
MsgBox(0,"Contents of clipboard",$clipboard)

$clipboard = "dontgetit" + @CRLF
;Why "0" ?
MsgBox(0,"Contents of clipboard",$clipboard)

2. StringRegExp() works OK when I'm using a single string, but fails when working on the web page (obviously filled with CRLF's...) that I copy into the clipboard:

;========== GOOD
$clipboard = "123 companies"
$nbrfound = StringRegExp($clipboard, "^(\d+) companies", 1)
;Displays "123", as expected
If @error Then
    MsgBox (0,"", "Error: " & @error & " Extended: " & @extended)
Else
    MsgBox(0,"Result",$nbrfound[0])
EndIf
;========== BAD
ClipPut("")
WinWaitActive("List of companies - Chrome")
Sleep(500)
Send("^a^c")
Sleep(500)
$clipboard = _ClipBoard_GetData()
$nbrfound = StringRegExp($clipboard, "^(\d+) companies", 1)
If @error Then
    MsgBox (0,"", "Error: " & @error & " Extended: " & @extended)
Else
    MsgBox(0,"Result",$nbrfound[0])
EndIf
;==========

I have a couple of questions:

1. Has someone successfully used StringRegExp() with more than just a single-line string?

2. How can I build a string with CRLF's?

Thank you.

Edited by littlebigman

Tvern

instead of

$clipboard = "dontgetit" + @CRLF

use

$clipboard = "dontgetit" & @CRLF

For the regexp I think you might be able to use "\v" which matches any vertical whitespace character.
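
To illustrate both points, here is a minimal sketch. It assumes nothing beyond the built-in operators and StringRegExp(); the variable names are made up for the example, and the "\v" pattern is only one possible way to anchor on a line break (it would not match a number on the very first line).

```autoit
; "+" is arithmetic: both operands are converted to numbers first.
; Neither "dontgetit" nor @CRLF starts with a digit, so the sum is numeric, not text.
Local $sSum = "dontgetit" + @CRLF
ConsoleWrite("With +: " & $sSum & @CRLF)

; "&" is string concatenation: the line break is appended to the text.
Local $sJoined = "dontgetit" & @CRLF
ConsoleWrite("With &: " & StringLen($sJoined) & " characters" & @CRLF)

; \v matches a vertical whitespace character such as @CR or @LF,
; so the digits are anchored to a preceding line break instead of ^.
Local $sText = "first line" & @CRLF & "123 companies"
Local $aFound = StringRegExp($sText, "\v(\d+) companies", 1)
If Not @error Then ConsoleWrite("Captured: " & $aFound[0] & @CRLF)
```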

littlebigman

Thanks for the tip on &/+.

I don't know what a "vertical whitespace character" is compared to just a space character (" " or \s).

It seems like StringRegExp() can only work one line at a time:

;Found OK
$clipboard = "123 a single line"
$nbrfound = StringRegExp($clipboard, "^(\d+)", 1)
MsgBox(0,"Pattern found",$nbrfound[0])

$clipboard = "first line" & @CRLF & "123 companies"
$nbrfound = StringRegExp($clipboard, "^(\d+)", 1)
;Error 1 Extended 0
If @error Then
    MsgBox (0,"", "Error: " & @error & " Extended: " & @extended)
Else
    MsgBox(0,"Pattern found",$nbrfound[0])
EndIf

Before I change the code to loop through each line of the clipboard and call StringRegExp on it, can someone confirm that StringRegExp() can only work with a single line?

Thank you.

Tvern

I'd say space ( Chr(32) ) is a horizontal whitespace character and @LF/@CR ( Chr(10) / Chr(13) ) are vertical whitespace characters, which implies that StringRegExp can work on multiple lines. But that's just my interpretation of the helpfile; I haven't tested it.

edit: this regexp works on a multi-line string:

#include <Array.au3>
Local $string = ""
For $i = 0 To 10
    $string &= "this is line " & $i & @CRLF ;create a multi-line string
Next

MsgBox(0,"test",$string) ;check if we really created a multi-line string

$result = StringRegExp($string,"(this is line \d*)",3) ;search for matching strings
_ArrayDisplay($result) ;display the result
Edited by Tvern

littlebigman

Thanks Tvern. I ran the sample above, and it does work... but I'm still unsuccessfully using StringRegExp() to extract data from the clipboard. I have no idea what else I could try :-/ This is all the more frustrating since if I paste the clipboard into UltraEdit and paste the regex, UE has no problem finding the pattern.

;====== 1. Wait for browser
WinWaitActive("List of companies - Chrome")
Sleep(500)

;====== 2. Copy page to clipboard
Send("^a^c")
Sleep(500)

;====== 3. Copy clipboard to string
$clipboard = _ClipBoard_GetData()
;OK I can see the pattern I'm looking for
MsgBox(0,"Contents of clipboard",$clipboard)

;====== 4. Extract data from string
$nbrfound = StringRegExp($clipboard, "^(\d+) companies", 3)
If @error Then
    MsgBox (0,"", "Error: " & @error & " Extended: " & @extended)
Else
    _ArrayDisplay($nbrfound)
EndIf

If someone has an idea... Maybe StringRegExp() needs some extra setting to work on a big block of text?

Tvern

Sounds like the pattern just isn't right for what you are trying to do. If you add an example string and the desired output, I'm sure the usual regexp gurus will be on it like flies on you know what.

littlebigman

Go for it boys ;-)

$clipboard = "Dummy"& @CRLF & "123 companies"& @CRLF & "Dummy dummy"
$nbrfound = StringRegExp($clipboard, "^(\d+) companies", 3)
If @error Then
    MsgBox (0,"", "Error: " & @error & " Extended: " & @extended)
Else
    _ArrayDisplay($nbrfound)
EndIf

Malkey

Try,

#include <Array.au3>

Local $clipboard = "Dummy" & @CRLF & "123 companies" & @CRLF & "Dummy dummy" & @CRLF & _
        " 321 companies" & @CRLF & "121 companies" & @CRLF


; This RE pattern with flag = 3, captures only the digit/s at the beginning of any line
; which are followed by a space then "companies".
$nbrfound = StringRegExp($clipboard, '(?m)^(\d+) companies', 3); Use 3 for global match
If @error Then
    MsgBox(0, "", "Error: " & @error & " Extended: " & @extended)
Else
    _ArrayDisplay($nbrfound)
EndIf

littlebigman

Damn, PCRE is single-line by default :idea:

The CHM file doesn't say: Is it possible to configure PCRE in AutoIt (to activate PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and PCRE_EXTENDED) so that I don't have to include settings in the pattern every time?

enaiman

I wonder - why do you use the Clipboard?? There is no need to involve any copy-paste.

There are a couple _IE functions to help: _IEBodyReadHTML, _IEDocReadHTML
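
As a rough sketch of that approach: the snippet below assumes the page can be opened in Internet Explorer (the _IE functions drive IE, not Chrome), uses a placeholder URL that is not from this thread, and assumes the "123 companies" text starts a line in the page's body text. _IEBodyReadText is used here instead of the HTML readers because the pattern targets visible text.

```autoit
#include <IE.au3>

; Hypothetical URL, used only for illustration.
Local $oIE = _IECreate("http://www.example.com/companies")
If @error Then Exit MsgBox(48, "Error", "Could not start Internet Explorer")

; _IEBodyReadText returns the visible text of the page body;
; _IEBodyReadHTML and _IEDocReadHTML return markup instead.
Local $sText = _IEBodyReadText($oIE)
_IEQuit($oIE)

; (?m) lets ^ match at the start of every line, not only at the start of the string.
Local $aFound = StringRegExp($sText, "(?m)^(\d+) companies", 1)
If @error Then
    MsgBox(48, "Error", "Pattern not found")
Else
    MsgBox(0, "Pattern found", $aFound[0])
EndIf
```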


jchd

littlebigman said:

"Damn, PCRE is single-line by default :idea:

The CHM file doesn't say: Is it possible to configure PCRE in AutoIt (to activate PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and PCRE_EXTENDED) so that I don't have to include settings in the pattern every time?"

These are build-time options!

PCRE is, oh well, PCRE! If we use non-standard options like these, we completely lose compatibility with Perl, most other PCRE engines and, most importantly, our own AutoIt code base.

Sacrificing all that just so you don't have to write a 4-line function to work around the "problem" you see is a little too much to ask.
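
One possible reading of such a function, as a sketch only: the helper name _StringRegExpMultiline and its defaults are invented for illustration and are not part of AutoIt. It simply prepends the inline option (?m) to whatever pattern it is given and passes @error and @extended back to the caller.

```autoit
#include <Array.au3>

; Hypothetical helper (name chosen for illustration, not part of AutoIt):
; prepends the inline option (?m) so ^ and $ match at every line start/end.
Func _StringRegExpMultiline($sText, $sPattern, $iFlag = 3, $iOffset = 1)
    Local $vResult = StringRegExp($sText, "(?m)" & $sPattern, $iFlag, $iOffset)
    ; Hand StringRegExp's status back to the caller unchanged.
    Return SetError(@error, @extended, $vResult)
EndFunc

Local $sText = "Dummy" & @CRLF & "123 companies" & @CRLF & "Dummy dummy"
Local $aFound = _StringRegExpMultiline($sText, "^(\d+) companies")
If @error Then
    MsgBox(0, "", "Error: " & @error & " Extended: " & @extended)
Else
    _ArrayDisplay($aFound)
EndIf
```

Other inline options such as (?i), (?s) or (?x) could be prepended the same way if a script needs them everywhere.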


littlebigman

enaiman said:

"I wonder - why do you use the Clipboard?? There is no need to involve any copy-paste. There are a couple _IE functions to help: _IEBodyReadHTML, _IEDocReadHTML"

Because I'm only getting started with AutoIt and didn't know about those functions :idea:

Thanks, the script is done and I happily downloaded the web pages I needed.
