Posted (edited)

Hi @jchd

Since you've mastered RegEx and StringFormat, could you please have a look at these two answers that ChatGPT just gave? Thanks :)

1st question to ChatGPT

Hello, please have a look at this regular expression

^([LRClrc]|[0-9]|%[\-+0\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs])$

Is the third part of the alternation fully compatible with the AutoIt StringFormat function ?

1st answer from GPT-4o mini

The regular expression you provided is designed to match specific patterns, and the third part of the alternation is:

%[\-+0\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs]
 
This part is intended to match format specifiers similar to those used in C-style string formatting functions, which is also relevant for the AutoIt StringFormat function.

Breakdown of the Third Part:

  1. %: Indicates the start of a format specifier.
  2. [\-+0\x20#]?: Matches an optional flag character:
    • -: Left-justify the output.
    • +: Forces to include a sign for numeric types.
    • 0: Pads the output with zeros instead of spaces.
    • \x20: Represents a space character.
    • #: Used for alternate form (e.g., for o, x, X).
  3. \d*: Matches an optional width specifier, which can be one or more digits.
  4. (?:\.\d+)?: Matches an optional precision specifier, which starts with a dot followed by one or more digits.
  5. [diouxXeEfgGs]: Matches the conversion specifier, which can be:
    • d: Decimal integer.
    • i: Integer.
    • o: Octal.
    • u: Unsigned decimal.
    • x: Hexadecimal (lowercase).
    • X: Hexadecimal (uppercase).
    • e: Scientific notation (lowercase).
    • E: Scientific notation (uppercase).
    • f: Floating-point.
    • g: General format (lowercase).
    • G: General format (uppercase).
    • s: String.

Compatibility with AutoIt StringFormat:

The AutoIt StringFormat function uses a similar syntax for format specifiers. The format specifiers in your regex are compatible with AutoIt, as it supports:

  • Numeric types (d, i, u, o, x, X, e, E, f, g, G).
  • Flags (-, +, 0, #).
  • Width and precision specifiers.

Conclusion:

Yes, the third part of the alternation in your regular expression is fully compatible with the AutoIt StringFormat function. It correctly captures the format specifiers that AutoIt can interpret for string formatting.

2nd question to ChatGPT

Hello, in the preceding regular expression [\-+0\x20#]? corresponds to the optional flag character of the AutoIt StringFormat function. Is it correct to use "?" as the repeating character (0 or 1 time) or should we use "*" as the repeating character (0 or more times) to make it fully compatible with the C-style string formatting functions ?

2nd answer from GPT-4o mini

In the context of the AutoIt StringFormat function and C-style string formatting functions, the use of ? for the optional flag character is correct. The ? quantifier means that the preceding element (in this case, the character class [\-+0\x20#]) can appear either 0 or 1 time, which is appropriate for optional flags.

Using * instead of ? would mean that the flag characters could appear 0 or more times, which is not correct for format specifiers in C-style formatting. In these formats, the flags are typically limited to a single occurrence (or none), so ? is the appropriate choice.

Summary

  1. The third part of the regular expression is compatible with AutoIt StringFormat.
  2. Using ? for the optional flag character is correct, as flags should appear 0 or 1 time, not 0 or more times.

==============================================

So jchd, could you please confirm whether a flag may appear only once in the AutoIt StringFormat function?

Because in the AutoIt help file (StringFormat topic), the flags section contains sentences like these...

"If 0 and - appear, the 0 is ignored"
"the blank is ignored if both the blank and + flags appear"

...which seems to indicate that several flag characters can be combined. Are there useful cases where several flag characters are really needed?

^([LRClrc]|[0-9]|%[\-+0\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs])$
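A quick way to see what this pattern accepts and rejects is to run it through StringRegExp. This is only a minimal sketch: the sample inputs are illustrative and are not taken from the _StringToTable() script.

```autoit
; Sketch: check a few candidate values against the pattern discussed above.
; StringRegExp with its default flag returns 1 on a match and 0 otherwise.
Local $sPattern = "^([LRClrc]|[0-9]|%[\-+0\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs])$"
Local $aSamples = ["L", "7", "%s", "%-10s", "%5.2f", "%#010x", "%-+10d", "%0#10x", "%10", "LR"]

For $sSample In $aSamples
    ConsoleWrite($sSample & " => " & StringRegExp($sSample, $sPattern) & @CRLF)
Next
```

The first six samples match. The last four are rejected: "%-+10d" and "%0#10x" because the ? quantifier allows at most one flag character before the width, "%10" because the type letter is missing, and "LR" because only a single alignment letter is allowed.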

For the record, this regex pattern is related to @ioa747's thread _StringToTable(), which we're working on: it lets the user specify not only [LRClrc] or [0-9] but also a valid StringFormat expression (starting with %). The final tests look great, fingers crossed.

Thanks for reading and have a great evening :bye:

Edited by pixelsearch
typo

"I think you are searching a bug where there is no bug... don't listen to bad advice."

Posted

Well, this is opening a can of worms!

Nowhere in the AutoIt specification of flags is it mentioned whether more than one occurrence of a given flag is permitted or acted upon: is "%+++123.0f" the same as "%+123.0f"? This seems easy to check in a few cases, but without a clear spec we can't rely on that being innocuous.

We do have a hint that more than one distinct flag is supported (albeit ignored) :

  • If 0 and - appear, the 0 is ignored.
  • the blank is ignored if both the blank and + flags appear.

To provide a partial answer we need to test all combinations of flags, from "" to "-+0*#" (where * stands for a 0x20 space, for clarity).

Below is some code to do that; it reveals that many flag combinations are not supported:

#include <Array.au3>            ; _ArrayDisplay
#include <StringConstants.au3>  ; $STR_NOCOUNT

; There are 5 flag specifiers in the StringFormat specification: '-', '+', '0', ' ', '#'.
; Assuming that a given flag may appear only once in the flags field, and given that
; more than one flag may be specified, the set of flag strings to test is the power set
; of the set of flag specifiers: 2^5 = 32 flag strings.
; This also assumes that the order of flags within a given flag string is irrelevant.
;
; It is easy to build that power set using binary mapping: each binary digit of the
; counter decides whether the corresponding flag is included.

Local $aFlags = ['-', '+', '0', ' ', '#']
Local $aFlagComb[2^5][2]
Local $aComb2
For $i = 0 To 2^5 - 1
    $aComb2 = StringSplit(StringFormat("%05s", _IntToString($i, 2)), "", $STR_NOCOUNT)
    For $j = 0 To 4
        $aFlagComb[$i][0] &= ($aComb2[$j] = 1 ? $aFlags[$j] : "")
    Next
Next

For $i = 0 To UBound($aFlagComb) - 1
    $aFlagComb[$i][1] = StringFormat("%" & $aFlagComb[$i][0] & "10.0f", 123)
    $aFlagComb[$i][0] = StringReplace($aFlagComb[$i][0], " ", "*")
Next

_ArrayDisplay($aFlagComb)

Func _IntToString($i, $base = 16)
    Return DllCall("msvcrt.dll", "wstr:cdecl", "_i64tow", "int64", $i, "wstr", "", "int", $base)[0]
EndFunc   ;==>_IntToString

It is also unclear whether the order of individual flags matters inside a given input. To fully answer this new question, one would need to create new entries in $aFlagComb for every flag string longer than one character: split that string into individual characters and add the new entries produced by _ArrayPermute.

If you're cautious, also create new entries with duplicate flags in various positions inside the flag strings, to see whether some rule emerges.
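As a starting point for those two extra checks, here is a minimal sketch. It is a simplification, not the full permutation test described above: it only doubles each single flag and swaps the order of each pair of distinct flags, using plain nested loops. It makes no assumption about what StringFormat returns; it just prints the results side by side.

```autoit
; Sketch only: compare StringFormat output for single vs. doubled flags,
; and for both orders of every pair of distinct flags. No result is assumed.
Local $aFlags = ['-', '+', '0', ' ', '#']
Local $sTail = "10.0f" ; same width, precision and type as in the test above
Local $sOnce, $sTwice, $sAB, $sBA

; 1) Doubled flags: "%-10.0f" versus "%--10.0f", and so on.
;    Caveat: a second "0" cannot be told apart from a leading zero of the width.
For $i = 0 To UBound($aFlags) - 1
    $sOnce = StringFormat("%" & $aFlags[$i] & $sTail, 123)
    $sTwice = StringFormat("%" & $aFlags[$i] & $aFlags[$i] & $sTail, 123)
    ConsoleWrite("[" & $aFlags[$i] & "] once >" & $sOnce & "<  twice >" & $sTwice & "<" & @CRLF)
Next

; 2) Order of two distinct flags: "%-+10.0f" versus "%+-10.0f", and so on.
For $i = 0 To UBound($aFlags) - 2
    For $j = $i + 1 To UBound($aFlags) - 1
        $sAB = StringFormat("%" & $aFlags[$i] & $aFlags[$j] & $sTail, 123)
        $sBA = StringFormat("%" & $aFlags[$j] & $aFlags[$i] & $sTail, 123)
        ConsoleWrite("[" & $aFlags[$i] & $aFlags[$j] & "] >" & $sAB & "<  [" & _
                $aFlags[$j] & $aFlags[$i] & "] >" & $sBA & "<" & @CRLF)
    Next
Next
```

The ">" and "<" markers make leading and trailing spaces visible in the console, as in the other examples in this thread.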

Making a long answer short: both AI answers are incorrect.

(My) conclusion: never trust a pile of unknown hardware fed with social-network toilet noises.


Posted (edited)

Hello @jchd
I'm gonna test the output of your code soon (not tomorrow, as it will be a great day with family!)

56 minutes ago, jchd said:

[...] whether more than one occurrence of a given flag is permitted or acted upon: is "%+++123.0f" the same as "%+123.0f"

A few hours ago, I asked the bot (for the 3rd time in the same day!) how to prevent a flag from appearing twice or more. Here is its answer (untested), which I kept in a text file just in case, losing the nice output formatting:

===========
My question
Hello, this is a regex question : how to check for unique output values from the following pattern ?

The pattern is [\-+0\x20#]*
If the user types "-0" then accept it, because each character is unique in the output
If the user types "--0" then reject it, because the "-" character is not unique in the output

What pattern would allow this ?

==============
ChatGPT answer

To create a regex pattern that checks for unique characters in the input string based on the specified character set [-+0\x20#]*, you can use a negative lookahead assertion to ensure that no character appears more than once.

Here's a regex pattern that accomplishes this:

regex

^(?!.*(.).*\1)[\-+0\x20#]*$

Explanation:

    ^ asserts the start of the string.
    (?!.*(.).*\1) is a negative lookahead that checks for any character (.) followed by any characters .* and then the same character \1 again. If this condition is met, the match fails.
    [\-+0\x20#]* matches any combination of the specified characters (including zero occurrences).
    $ asserts the end of the string.

Usage:

    If the input is "-0", it will match because both characters are unique.
    If the input is "--0", it will not match because the "-" character is repeated.

You can use this regex pattern in your programming language of choice to validate the input accordingly.
===========
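Since the bot's answer was left untested, here is a minimal sketch of how it could be checked with StringRegExp. The sample inputs are illustrative additions, and the check applies to a flags string taken on its own, not to a complete format specifier.

```autoit
; Sketch: test the bot's "unique characters" pattern on a few flag strings.
; The negative lookahead (?!.*(.).*\1) fails as soon as any character repeats.
Local $sUnique = "^(?!.*(.).*\1)[\-+0\x20#]*$"
Local $aInputs = ["-0", "--0", "-+0 #", "0-0", "-x"]

For $sInput In $aInputs
    ConsoleWrite('"' & $sInput & '" => ' & StringRegExp($sInput, $sUnique) & @CRLF)
Next
```

Here "-0" and "-+0 #" match (every character is distinct), "--0" and "0-0" are rejected because a character repeats, and "-x" is rejected because "x" is not in the character class.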

By the way, if I'm not mistaken: when it is correctly placed (just before the width), the useful "0" flag is always "eaten" by the next token in the pattern, \d*. For example:

reminder of the actual pattern :
%[\-+0\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs]

reminder of the help file :
"If width is prefixed with 0, zeros are added until the minimum width is reached."

$v = "65535"
ConsoleWrite(">" & StringFormat("%#010x", $v) & "<" & @crlf & @crlf)
>0x0000ffff<

Please note that "%0#10x" won't work in this situation: it simply outputs >%0#10x<

Though "#" and "0" are both "flags", the fact that "0" has to be placed just before the optional width "10" should always cover this kind of situation, even if we keep the simpler "?" quantifier for the optional flag class.

For the record, I just discovered ChatGPT today and we started... with an argument, that was fun !
Probably @mikell kicked the chatbox a.. up in the clouds, forcing it to change its answer in this post :D

Thanks for your answer and have a great week-end :bye:

Edited by pixelsearch
typo

Posted (edited)
On 5/17/2025 at 7:51 PM, jchd said:

We do have a hint that more than one distinct flag is supported (albeit ignored) :

  • If 0 and - appear, the 0 is ignored.
  • the blank is ignored if both the blank and + flags appear.

(albeit ignored) is important in your answer, because specifying these 2 flags at the same time produces the same output as if only 1 of them had been given.

Your code is instructive: _ArrayDisplay shows the result for the 32 possible cases (every subset of a set of 5 elements => 31 + 1 possibilities).

What we notice is that only 8 combinations (out of 32) correctly display the number 123:
  1 = no flag at all
+5 = each flag used on its own ("123." is correct when # is applied to 123 with the "f" type, per the help file)
+1 = "-0" flags ("If 0 and - appear, the 0 is ignored", help file)
+1 = "+0" flags (the sign is displayed and the zeros are added correctly): this case is discussed below, in point 2)

We note that "+-" and "-+" don't work (they were meant to left-align and show the sign).
Also, "0+10" doesn't work while "+010" works; likewise "0#10x" doesn't work while "#010x" works (previous post).

1) These tests push me to keep the ? quantifier, which seems easier to manage than the * quantifier (no need for an additional test to make sure that a flag is not duplicated):

[\-+0\x20#]?

2) I'd even go further and remove the "0" flag from the flag class: when needed, 0 can always be placed just before the optional width, where it is correctly recognized and "eaten" by the \d* token, like this:

[\-+\x20#]?\d*

So this could be the final pattern for ioa747's script:

^([LRClrc]|[0-9]|%[\-+\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs])$
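Dropping "0" from the class should not change which specifiers are accepted: with the ? quantifier, a leading 0 that the class no longer takes is still a digit, so \d* matches it instead. A small sketch to check this on a few samples (the sample list is illustrative, not taken from the script):

```autoit
; Sketch: compare the pattern with and without "0" in the flag class.
Local $sWithZero = "^([LRClrc]|[0-9]|%[\-+0\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs])$"
Local $sNoZero = "^([LRClrc]|[0-9]|%[\-+\x20#]?\d*(?:\.\d+)?[diouxXeEfgGs])$"
Local $aSpecs = ["%d", "%010d", "%+010d", "%#010x", "%0#10x", "%-0d", "%00010d", "%0.3f"]
Local $iWith, $iWithout

For $sSpec In $aSpecs
    $iWith = StringRegExp($sSpec, $sWithZero)
    $iWithout = StringRegExp($sSpec, $sNoZero)
    ConsoleWrite($sSpec & " : with 0 = " & $iWith & ", without 0 = " & $iWithout & @CRLF)
Next
```

Both patterns give the same verdict on every sample: only "%0#10x" is rejected, and it is rejected by both, which agrees with the StringFormat behaviour observed above.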

And well... if an issue should appear with this pattern, then it will be time to reopen this discussion :D
Thanks jchd for your help

Edited by pixelsearch
typo
