
_StringInsert Stops working



 

Hello,

As the title says, when _StringInsert is called in a loop it fails and stops inserting after a seemingly random number of calls. With the code below it stops after 65 insertions when it should insert 1000 times; with other code it managed 335 out of 3000…

Can someone shed some light on what's going on? I really need _StringInsert to work right now.

 

#include <File.au3>
#include <String.au3>

$Handle = FileOpen("SampleDataFile.txt", 0)
$Read = FileRead($Handle)
$X = 0
$Counter = 0
For $I = 1 To StringLen($Read)
    $X += 1
    If $X = 1000 Then
        $Counter += 1
        $Read = _StringInsert($Read, "ABCDE" & $Counter & ",", $I)
        $X = 0
    EndIf
Next
FileWrite("InsertTest.txt", $Read)

 

After it runs, you can check how many insertions were made by searching the output text file for "ABCDE" followed by the counter value and a comma (the counter is the number of insertions made so far).

I'm sure there are easier ways to test it, but this is what I can think of right now.

 

SampleDataFile.txt

Edited by CrypticKiwi

Scratch all that ridiculousness I just posted; you may have to chunk it...

Edited by boththose


Funny problem (more than one issue, actually):
1) First problem: the line For $I = 1 To StringLen($Read) gives a wrong result because StringLen($Read) is evaluated only once, when the loop starts. The length of $Read grows inside the loop, so the end point should grow with it, but you cannot change that bound once the loop has started. You need another way to scan the whole string while its length is changing.
2) Second problem: _StringInsert() really does seem to fail when it is called repeatedly within the loop. That function uses RegExp patterns, and since my regexp skills are close to zero, maybe some regexp genius could look into the matter.
Anyway, I tried changing the _StringInsert() function a bit so that it uses native string functions instead of regexp, and it seems to work that way.
Here is a "workaround" script that tries to solve the above issues:

With the original _StringInsert() function the script appears to fail; if you comment out line 11 (the _StringInsert call) and uncomment line 12 (the _StringInsert_MOD call), it appears to succeed.

#include <File.au3>
#include <String.au3>

Local $Read
For $i = 1 To 1155815
    $Read &= Random(0, 9, 1) ; generate a string similar to your text file
Next
Local $X = 1000, $Counter = 0
Do
    $Counter += 1
    $Read = _StringInsert($Read, "ABCDE" & $Counter & ",", $X)
    ; $Read = _StringInsert_MOD($Read, "ABCDE" & $Counter & ",", $X)
    $X += 1000
Until StringLen($Read) < $X ; use Do ... Until instead of For ... Next

; Test if generated string contains last insertion
If StringInStr($Read, "ABCDE" & $Counter & ",") Then
    MsgBox(0, "_StringInsert", "OK last inserted string was found")
Else
    MsgBox(0, "_StringInsert", "Error: Inserted string not found")
EndIf

; modified version of the original _StringInsert()
Func _StringInsert_MOD($sString, $sInsertString, $iPosition)
    ; Casting Int() takes care of String/Int, Numbers
    $iPosition = Int($iPosition)

    ; Retrieve the length of the source string
    Local $iLength = StringLen($sString)

    ; Check the insert position isn't greater than the string length
    If Abs($iPosition) > $iLength Then
        Return SetError(1, 0, $sString) ; Invalid position as it's greater than the string length
    EndIf

    ; Check if the source and insert strings are strings and convert accordingly if not
    If Not IsString($sInsertString) Then $sInsertString = String($sInsertString)
    If Not IsString($sString) Then $sString = String($sString)
    ; No "\" escaping is needed in this version: the insert string is concatenated literally
    ; rather than used as a RegExp replacement, so escaping would double every backslash
    ; Insert the string
    If $iPosition >= 0 Then
        ; Return StringRegExpReplace($sString, "(?s)\A(.{" & $iPosition & "})(.*)\z", "${1}" & $sInsertString & "$2") ; Insert to the left hand side
        Return StringLeft($sString, $iPosition) & $sInsertString & StringRight($sString, $iLength - $iPosition) ; <----- modified here
    Else
        $iPosition = Abs($iPosition)
        ; Return StringRegExpReplace($sString, "(?s)\A(.*)(.{" & - $iPosition & "})\z", "${1}" & $sInsertString & "$2") ; Insert to the right hand side
        Return StringLeft($sString, $iLength - $iPosition) & $sInsertString & StringRight($sString, $iPosition) ; <----- modified here
    EndIf
EndFunc   ;==>_StringInsert_MOD
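
A minimal sketch of the first problem described above, assuming a three-character stand-in string instead of the real file contents ($sText and $iPasses are illustrative names, not taken from the original script). If the For bound is evaluated only once, as described, the loop should make three passes even though the string keeps growing:

```autoit
; Assumption: a tiny stand-in string replaces the real file contents.
Local $sText = "abc"
Local $iPasses = 0

For $i = 1 To StringLen($sText) ; the end bound is computed here, when the loop starts
    $sText &= "x"               ; the string grows on every pass...
    $iPasses += 1               ; ...so count how many passes actually happen
Next

ConsoleWrite("Passes: " & $iPasses & ", final length: " & StringLen($sText) & @CRLF)
```

This is why the workaround script advances its own position variable inside a Do ... Until loop and re-checks StringLen($Read) on every iteration.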

 

Edited by Chimp

 

Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....


Here is an exercise that reads from one file and writes to another.

#include <FileConstants.au3>
Opt("WinTitleMatchMode", 2) ;1=start, 2=subStr, 3=exact, 4=advanced, -1 to -4=Nocase

Local $Handle = FileOpen("SampleDataFile.txt", $FO_READ)
Local $hFileWrite = FileOpen("InsertTest.txt", $FO_APPEND)
Local $Counter = 0, $Read

While 1
    $Read = FileRead($Handle, 1000)
    If @error Then ExitLoop
    $Counter += 1
    FileWrite($hFileWrite, $Read & "ABCDE" & $Counter & ",")
WEnd

FileClose($Handle)
FileClose($hFileWrite)

; ------ Display results & Clean Up ---------------
Local $iPid = ShellExecute("InsertTest.txt")
WinWait("InsertTest.txt")

MsgBox(0, "Note", $Counter & ' - "ABCDE[nnn]n," inserted.' & @CRLF & @CRLF & 'To end and delete "InsertTest.txt" file.' & @CRLF & 'Press "Ok"', 0)
ProcessClose($iPid)
FileDelete("InsertTest.txt")

Edit: Added the following.

To further highlight that the _StringInsert function is unnecessary for this task, this example achieves the desired result without it and without any file operations.

; https://www.autoitscript.com/forum/topic/173630-_stringinsert-stops-working/?do=findComment&comment=1256163
;-------------------- Get $Read Data -------------------
Local $Read ;= FileOpen("SampleDataFile.txt") ; $FO_READ (0) = Read mode (default)
For $i = 1 To 1155815 ; 1155815
    $Read &= Random(0, 9, 1) ; generate a string similar to your text file
Next
;---------------> End of Get $Read Data -----------------

Local $X = 1000, $Counter = 0, $sRetStr = ""

Do
    $Counter += 1
    $sRetStr &= StringLeft($Read, $X) & "ABCDE" & $Counter & ","
    $Read = StringTrimLeft($Read, $X)
Until $Read = ""

MsgBox(0, "Note", $Counter & ' - "ABCDE[nnn]n," inserted.')
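
A possible variation on the same idea, sketched under a few assumptions: the source string is left untouched and read chunk by chunk with StringMid, so nothing is re-copied by StringTrimLeft on each pass. The 2500-character test string and the names $sData, $iChunk and $sOut are illustrative only. Because the source string never changes length here, a fixed For bound is not a problem:

```autoit
; Assumption: 2500 generated digits stand in for the real file contents.
Local $sData = ""
For $i = 1 To 2500
    $sData &= Mod($i, 10)
Next

Local $iChunk = 1000, $iCounter = 0, $sOut = ""

; Walk the unchanged source in steps of $iChunk and append a marker after each chunk.
For $iPos = 1 To StringLen($sData) Step $iChunk
    $iCounter += 1
    $sOut &= StringMid($sData, $iPos, $iChunk) & "ABCDE" & $iCounter & ","
Next

ConsoleWrite($iCounter & " markers, output length " & StringLen($sOut) & @CRLF)
```

Like the Do ... Until version, this also appends a marker after the final partial chunk.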

 

Edited by Malkey
Additional information.

Anyway, this seems to prove that the _StringInsert() function has issues;
or rather, maybe it is the RegExp engine the function relies on that fails when stressed in that way?

It would be interesting if someone could work out where the problem is... :think:

 

Chimp


  • Developers

The issue seems to be that the position used in the regex quantifier is a 16-bit value, so it can't be bigger than 65,535.

Initial test:

@@ Debug1(113) : $sInsertString = ABCDE65, : $iPosition = 65000  pat:(?s)\A(.{65000})(.*)\z >Error code: 0
@@ Debug1(113) : $sInsertString = ABCDE66, : $iPosition = 66000  pat:(?s)\A(.{66000})(.*)\z >Error code: 2

More focussed test:

@@ Debug1(113) : $sInsertString = ABCDE6, : $iPosition = 65535  pat:(?s)\A(.{65535})(.*)\z >Error code: 0
@@ Debug1(113) : $sInsertString = ABCDE7, : $iPosition = 65536  pat:(?s)\A(.{65536})(.*)\z >Error code: 2
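
A rough way to reproduce this outside the UDF, assuming the same pattern shape shown in the debug lines and a short dummy subject (the pattern has to compile before the subject matters). The names $aCounts and $aErrors are illustrative. Going by the output above, the first count should report @error 0 and the second @error 2:

```autoit
; Compile the UDF-style pattern with a quantifier just below and just above the suspected limit.
Local $aCounts[2] = [65535, 65536]
Local $aErrors[2]
Local $sPattern

For $i = 0 To UBound($aCounts) - 1
    $sPattern = "(?s)\A(.{" & $aCounts[$i] & "})(.*)\z"
    StringRegExpReplace("abc", $sPattern, "$1-$2") ; return value ignored, only @error matters here
    $aErrors[$i] = @error                          ; save it before any other call resets it
    ConsoleWrite("Quantifier {" & $aCounts[$i] & "} -> @error = " & $aErrors[$i] & @CRLF)
Next
```

A non-zero @error on the second line would point at the pattern itself rather than at the number of times _StringInsert is called.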

Jos

SciTE4AutoIt3 Full installer Download page   - Beta files       Read before posting     How to post scriptsource   Forum etiquette  Forum Rules 
 
Live for the present,
Dream of the future,
Learn from the past.
  :)


  • Developers

From the PCRE website: http://www.pcre.org/current/doc/html/pcre2limits.html

pcre2limits man page


SIZE AND OTHER LIMITATIONS

There are some size limitations in PCRE2 but it is hoped that they will never in practice be relevant.

The maximum size of a compiled pattern is approximately 64K code units for the 8-bit and 16-bit libraries if PCRE2 is compiled with the default internal linkage size, which is 2 bytes for these libraries. If you want to process regular expressions that are truly enormous, you can compile PCRE2 with an internal linkage size of 3 or 4 (when building the 16-bit library, 3 is rounded up to 4). See the README file in the source distribution and the pcre2build documentation for details. In these cases the limit is substantially larger. However, the speed of execution is slower. In the 32-bit library, the internal linkage size is always 4.

The maximum length (in code units) of a subject string is one less than the largest number a PCRE2_SIZE variable can hold. PCRE2_SIZE is an unsigned integer type, usually defined as size_t. Its maximum value (that is ~(PCRE2_SIZE)0) is reserved as a special indicator for zero-terminated strings and unset offsets.

Note that when using the traditional matching function, PCRE2 uses recursion to handle subpatterns and indefinite repetition. This means that the available stack space may limit the size of a subject string that can be processed by certain patterns. For a discussion of stack issues, see the pcre2stack documentation.

All values in repeating quantifiers must be less than 65536.

There is no limit to the number of parenthesized subpatterns, but there can be no more than 65535 capturing subpatterns. There is, however, a limit to the depth of nesting of parenthesized subpatterns of all kinds. This is imposed in order to limit the amount of system stack used at compile time. The limit can be specified when PCRE2 is built; the default is 250.

There is a limit to the number of forward references to subsequent subpatterns of around 200,000. Repeated forward references with fixed upper limits, for example, (?2){0,100} when subpattern number 2 is to the right, are included in the count. There is no limit to the number of backward references.

The maximum length of name for a named subpattern is 32 code units, and the maximum number of named subpatterns is 10000.

The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb is 255 for the 8-bit library and 65535 for the 16-bit and 32-bit libraries.

So I guess somebody will have to revisit that UDF.

Jos


  • Developers

I think the limitations of PCRE shouldn't be detailed alongside, or mixed with, the AutoIt3 limitations, but maybe a pointer to their webpage would be good where PCRE is mentioned in the regexp pages of the help file?

Jos


A few remarks: Jos, beware that referring to the PCRE2 docs may be misleading. PCRE2 is a completely new interface and library; it shares most of the PCRE core code, but there are significant differences. AutoIt links to PCRE for now, and given the work Jon still has keeping the forum afloat, I don't see an urgent need to rewrite that part of the internal AutoIt functions.

Besides, PCRE and PCRE2 both have the said limitation. And there is more under the hood: a repeating pattern like (abcdef){42} is expanded into abcdef repeated 42 times at pattern compile time. That "unrolling of loops" can make the pattern explode well before the 64K limit. I've submitted the idea of introducing a small set of bytecode instructions to avoid unrolling repeats, but clearly Philip hasn't implemented this yet. From what I can tell, he concentrated on building a more efficient and powerful interface to the existing engine, but the core engine has been left mostly unchanged. There have been a number of dark-corner fixes recently, and painful work on EBCDIC support, so this idea is still in the to-do tray.

Yes, I've mentioned the 64K limit on the number of groups in the help but forgot to mention the 64K limit on repetition. Maybe this is the right time to create a Limitations section covering all this, rather than mentioning limits in the middle of the blurb. Regexps are not easy to explain in a few words, and exposing every fine-grained detail would only paraphrase the voluminous official PCRE documentation, which only advanced users bother to read fully.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


  • Developers

Thanks for the clarification. I came across the 64K limit in several places, so I assumed that was the issue, as my tests suggested. :)

Jos


Certainly. The "unrolling of repetitions" has another effect, this time on the stack, as previously discussed in a lengthy thread. AutoIt links PCRE with the "use stack" compile option. With repeated patterns like (?:.*\N){14000} there are 14000 stack frames in the compiled code generated by the engine, since it may have to backtrack and hence needs to keep track of where each subpattern starts. Because the stack is limited and not settable at launch time (unlike on Unices), trancexx advised using the heap instead, even if this has the drawback of silently allowing badly formed patterns to run. This abuse of the stack is a problem with complex patterns applied to large subjects, and I'm starting to think the best option would be to follow trancexx's advice and get rid of hard crashes in such cases.

As a rather technical aside, the JIT (Just In Time) engine does fold repetitions back into a loop and doesn't have the same problem, but the catch is that JIT runs from the output produced by the core compiler...

Using the heap, switching to the PCRE2 interface, and allowing callbacks and inspection of the internal structures produced on the fly by *MARK, *PRUNE and other backtracking control verbs will all need to be considered for the next full release of AutoIt. But I don't want to bother Jon with any of those for the time being.

