Posted

I have a text file, and I read each line before running StringRegExp() on it.

Each line is similar to this text: 40397782_44_PC6654 97_1044_0040_05_02_001-CH1-Gate1-TOF.tif

I want to extract "0040", so I do:

$a_Fit = StringRegExp($s_EndPath, "\d{2,}_(\d{4})_[^\.]+\.tif", 1)
If @error = 0 Then
    ConsoleWrite("$a_Fit :" & $a_Fit[0] & @LF)
EndIf

If I use this instead:

$a_Fit = StringRegExp($s_EndPath, "\d{5,}_(\d{4})_[^\.]+\.tif", 1)

The output is still 0040. I expected no match, because there aren't 5 digits before the first "_".

Next I need to extract 1044, so I tested:

$a_Fit = StringRegExp($s_EndPath, "_(\d{2,})_d{4}_[^\.]+\.tif", 1)
    If @error = 0 Then
        ConsoleWrite("$a_Fit :" & $a_Fit[0] & @LF)
    EndIf

It doesn't work.

I also tested with the $ anchor at the end, and it doesn't work either.

The last value I want is "PC6654 97".

As the first two didn't work, I didn't try to extract the third one.
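
One likely cause of the failure in the second pattern: `d{4}` is missing its backslash, so it looks for four literal "d" characters instead of four digits. Note also that none of these patterns is anchored, so a prefix such as `\d{2,}_` can match anywhere in the line (for example at `1044_`), not only at the start. The following is a minimal sketch with the backslash restored, assuming the sample line from this post; under that assumption it should capture 1044.

```autoit
; Sample line copied from the post above
Local $s_EndPath = "40397782_44_PC6654 97_1044_0040_05_02_001-CH1-Gate1-TOF.tif"

; "\d{4}" instead of "d{4}": without the backslash, d{4} means four literal "d" characters
Local $a_Fit = StringRegExp($s_EndPath, "_(\d{2,})_\d{4}_[^.]+\.tif", 1)
If @error = 0 Then
    ConsoleWrite("$a_Fit :" & $a_Fit[0] & @LF)
Else
    ConsoleWrite("No match" & @LF)
EndIf
```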

 

Posted (edited)
  On 6/13/2023 at 5:41 PM, _Func said:

I Want extract "0040"...

After I must extract 1044

The last value I want is "PC6654 97"


Is your goal to learn regular expressions or is it just to be able to parse the values that you need from the file name? 

Below, you will find an example using StringRegExp() and StringSplit().  They both produce the exact same array of values.  Also, as you can see, there's no need to do multiple function calls against the file name when you can get all of the values at once.

#AutoIt3Wrapper_AU3Check_Parameters=-w 3 -w 4 -w 5 -w 6 -d

#include <Constants.au3>
#include <Array.au3>

Const $FILE_NAME = "40397782_44_PC6654 97_1044_0040_05_02_001-CH1-Gate1-TOF.tif"


stringsplit_example()

regex_example()

Func stringsplit_example()
    Local $aParts = StringSplit($FILE_NAME, "_", $STR_NOCOUNT)

    ;Display parts
    ConsoleWrite("StringSplit Example" & @CRLF)
    ConsoleWrite($aParts[4] & @CRLF)
    ConsoleWrite($aParts[3] & @CRLF)
    ConsoleWrite($aParts[2] & @CRLF)
    ConsoleWrite(@CRLF)

    _ArrayDisplay($aParts, "StringSplit Example")
EndFunc

Func regex_example()
    Local $aParts = StringRegExp($FILE_NAME, "([^_]+)", $STR_REGEXPARRAYGLOBALMATCH)

    ;Display parts
    ConsoleWrite("Regex Example" & @CRLF)
    ConsoleWrite($aParts[4] & @CRLF)
    ConsoleWrite($aParts[3] & @CRLF)
    ConsoleWrite($aParts[2] & @CRLF)
    ConsoleWrite(@CRLF)

    _ArrayDisplay($aParts, "Regex Example")
EndFunc

Console output:

StringSplit Example
0040
1044
PC6654 97

Regex Example
0040
1044
PC6654 97

 

Edited by TheXman
Posted (edited)

I think he is after those specific positions, not the actual numbers in the test pattern.

EDIT: I think Nine has it!

Edited by Shark007
Posted

Thanks to all.

Attached is an extract of the file I read.

@Nine The regex works fine with the line I put in the first post, but I found some lines where it doesn't work, or which should not be processed.

I'm sorry I didn't include a file in the first post.

These lines:

111_1044_4490_00_28_001-CH1-Gate1-AMP.tif

111 isn't a valid part number.

ESSAI_1044_4490_00_56_001-CH1-Gate2-TOF.tif

ESSAI isn't a valid part number.

In fact, I see that a part number contains 1 or 2 letters followed by several digits, before the next underscore.

 

For this line:

40637383-11_HD112899 Y_1044_4490_00_18_001-CH1-Gate2-TOF.tif

The first group is 1044, the 2nd is 4490 and the last is 00.

The correct output should be: 1st => HD112899 Y, 2nd => 1044, last => 4490.

 

Another example:

40744918-25_PC944867 8_1044_3035_00_05_002-CH1-Gate1-AMP.tif

The first group is 1044, the 2nd is 3035 and the last is 00.

The correct output should be: 1st => PC944867 8, 2nd => 1044, last => 3035.

 

Extract.txt

Posted
#include <Array.au3>

Local $sRegex = "(?m).+_(.+?)_(1044)_(.+?)_"
Local $sString = "111_1044_4490_00_28_001-CH1-Gate1-AMP.tif" & @CRLF & _
                "40637383-11_HD112899 Y_1044_4490_00_18_001-CH1-Gate2-TOF.tif" & @CRLF & _
                "ESSAI_1044_4490_00_56_001-CH1-Gate2-TOF.tif" & @CRLF & _
                "40397782_44_PC6654 97_1044_0040_05_02_001-CH1-Gate1-TOF.tif"

Local $aArray = StringRegExp($sString, $sRegex, 3)
_ArrayDisplay($aArray)

 

Posted

I replaced (1044) with (\d{4}) because the value is always 4 digits, but it's not always 1044.

After that change, it doesn't work.
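
A possible explanation, offered as a sketch rather than a definitive fix: the leading `.+` is greedy, so it consumes as much of the line as it can and then backtracks. With the literal `(1044)` there is only one place where the rest of the pattern can match, but `(\d{4})` also fits `4490`, and the greedy `.+` prefers that later position, which shifts every group one field to the right. One way around this is to drop the leading `.+` and require the part number to be followed by two 4-digit fields. That second requirement is an assumption drawn from the sample lines, not something confirmed in the thread.

```autoit
#include <Array.au3>

; Assumption: the two fields after the part number are both exactly 4 digits,
; as in every sample line posted in this thread.
; [^_\r\n]+ keeps the first group inside one field and on one line.
Local $sRegex = "_([^_\r\n]+)_(\d{4})_(\d{4})_"
Local $sString = "111_1044_4490_00_28_001-CH1-Gate1-AMP.tif" & @CRLF & _
        "40637383-11_HD112899 Y_1044_4490_00_18_001-CH1-Gate2-TOF.tif" & @CRLF & _
        "ESSAI_1044_4490_00_56_001-CH1-Gate2-TOF.tif" & @CRLF & _
        "40397782_44_PC6654 97_1044_0040_05_02_001-CH1-Gate1-TOF.tif"

; Flag 3 returns all captures of all matches in one flat array
Local $aArray = StringRegExp($sString, $sRegex, 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    _ArrayDisplay($aArray)
EndIf
```

Under that assumption, the lines starting with 111 and ESSAI produce no match, because in those lines the field after `1044_4490` has only 2 digits.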

Posted
  On 6/13/2023 at 7:28 PM, _Func said:

In fact, I see that a part number contains 1 or 2 letters followed by several digits, before the next underscore.


?

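#include <Array.au3>

; Read the whole attached file; (?m) lets ^ and $ match at each line boundary
$s = FileRead("Extract.txt")
; The first group must start with a capital letter; the next two fields are all digits
$res = StringRegExp($s, "(?m)^.+_([A-Z].+?)_(\d+)_(\d+).+$", 3)
;_ArrayDisplay($res)

; Reshape the flat result (3 captures per match) into one row per matched line
Local $n = UBound($res), $k = 3
Local $res2D[Ceiling($n/$k)][$k]
For $i = 0 To $n - 1
    $res2D[Int($i / $k)][Mod($i, $k)] = $res[$i]
Next
_ArrayDisplay($res2D)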

 

Posted

Doesn't StringSplit satisfy the criteria?

;~  For this lines :
;~ 40637383-11_HD112899 Y_1044_4490_00_18_001-CH1-Gate2-TOF.tif
;~ The first group is 1044, the 2nd => 4490 and the last => 00
;~ The correct output should be 1° => HD112899 Y, 2nd => 1044 and the last 4490

Local $aGroup = _GetFileGroup("40637383-11_HD112899 Y_1044_4490_00_18_001-CH1-Gate2-TOF.tif")
ConsoleWrite("$aGroup[0]=" & $aGroup[0] & @CRLF)
ConsoleWrite("$aGroup[1]=" & $aGroup[1] & @CRLF)
ConsoleWrite("$aGroup[2]=" & $aGroup[2] & @CRLF)
ConsoleWrite("$aGroup[3]=" & $aGroup[3] & @CRLF)


Func _GetFileGroup($sFileName)
    Local $aParts = StringSplit($sFileName, "_")
    Local $aResult[4] = [$sFileName, $aParts[2], $aParts[3], $aParts[4]]
    Return $aResult
EndFunc   ;==>_GetFileGroup
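
One caveat with fixed positions: the line from the first post has an extra "_44" field before the part number, so the part number is not always the second field. A hedged variant is sketched below. It counts fields from the end of the name instead, and rejects names whose candidate part number does not start with 1 or 2 letters followed by a digit. The function name `_GetFileGroupFromEnd`, the assumption that five fields always follow the part number, and the validation rule are all illustrative, based only on the sample lines in this thread.

```autoit
Local $aGroup = _GetFileGroupFromEnd("40397782_44_PC6654 97_1044_0040_05_02_001-CH1-Gate1-TOF.tif")
If @error Then
    ConsoleWrite("Line skipped" & @CRLF)
Else
    ConsoleWrite($aGroup[0] & " | " & $aGroup[1] & " | " & $aGroup[2] & @CRLF)
EndIf

; Hypothetical helper: returns [part number, 2nd value, 3rd value] or sets @error
Func _GetFileGroupFromEnd($sFileName)
    Local $aParts = StringSplit($sFileName, "_")
    Local $n = $aParts[0] ; element 0 holds the number of fields
    If $n < 6 Then Return SetError(1, 0, 0)

    ; Assumption: exactly five fields follow the part number in every valid name
    Local $sPart = $aParts[$n - 5]

    ; Assumption: a valid part number starts with 1 or 2 letters, then a digit
    If Not StringRegExp($sPart, "^[A-Za-z]{1,2}\d") Then Return SetError(2, 0, 0)

    Local $aResult[3] = [$sPart, $aParts[$n - 4], $aParts[$n - 3]]
    Return $aResult
EndFunc   ;==>_GetFileGroupFromEnd
```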

 
