Extracting part of a file name?


ReconX

I've read a few of the examples after searching, and used them for reference. I'm stuck after trying a few things, and would like some guidance.

I am trying to extract portions of a file name. For example: C:\Directory\Mary Poppins (1964).mkv. I would like my ListView to show Name=Mary Poppins and Year=1964. I have the year extracting, but I cannot figure out the name part.

Here is what I have so far:

$listview = GUICtrlCreateListView("File Name|Year", -101, 40, 420, 190) ;,$LVS_SORTDESCENDING)
    For $n = 1 To UBound($FileList) - 1
        $extractName = StringMid($FileList[$n], StringInStr($FileList[$n], "(", 2, 1, 1) +1)
        $File_Name = $extractName
        $extractYear = StringMid($FileList[$n], StringInStr($FileList[$n], "(", 2, -1) + 1)
        $File_Year = StringMid($extractYear, 1, 4)
        $item = GUICtrlCreateListViewItem($File_Name & "|" & $File_Year, $listview)
    Next

Any help is appreciated. In the meantime, I'll try to correct it.


had some sleepless time :P  there you go:

#include <Array.au3>
#include <StringConstants.au3>
#include <File.au3>
#include <ListViewConstants.au3>
Local $aPath[10]


$aPath[0] = "C:\Directory\Mary Poppins (2020).mkv"
$aPath[1] = "C:\Directory\Terminator 3(1234).mkv"
$aPath[2] = "C:\Directory\Pirates of the caribbean 4 (1934).mkv"
$aPath[3] = "C:\Directory\Test film 123(1324).mkv"
$aPath[4] = "C:\Directory\What else (1923).mkv"
$aPath[5] = "C:\Directory\and so on (1969).mkv"
$aPath[6] = "C:\Directory\why i do so much 34 (1999).mkv"
$aPath[7] = "C:\Directory\Mary Poppins (2000).mkv"
$aPath[8] = "C:\Directory\Mary Poppins (2001).mkv"
$aPath[9] = "C:\Directory\Mary Poppins 3(2005).mkv"



$hGUI = GUICreate("RegExp", 450, 450)

$listview = GUICtrlCreateListView("File Name|Year", 0, 0, 400, 400) ;,$LVS_SORTDESCENDING)
Local $aSplitted = _Split($aPath)
For $i = 0 To UBound($aSplitted) - 1
    $item = GUICtrlCreateListViewItem($aSplitted[$i][0] & "|" & $aSplitted[$i][1], $listview)
Next
GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -2)
GUISetState()

While GUIGetMsg() <> -3

WEnd



Func _Split($aArray)
    Local $aReturn[UBound($aArray)][2]
    For $i = 0 To UBound($aArray) - 1
        Local $_Drive, $_Path, $_Filename, $_Extension
        _PathSplit($aArray[$i], $_Drive, $_Path, $_Filename, $_Extension)
        Local $aName = StringRegExp($_Filename, "(([0-9\w\s])+).[^().]", $STR_REGEXPARRAYGLOBALMATCH, 1)
        If Not @error Then
            $aReturn[$i][0] = $aName[0]
        Else
            $aReturn[$i][0] = "***ERROR***"
        EndIf
        Local $aYear = StringRegExp($_Filename, '\d{4}', $STR_REGEXPARRAYGLOBALMATCH, 1)
        If Not @error Then
            $aReturn[$i][1] = $aYear[0]
        Else
            $aReturn[$i][1] = "***ERROR***"
        EndIf
    Next
    Return $aReturn
EndFunc   ;==>_Split

By the way, I guess there are better RegExp patterns than mine... I can't get my head around it and it's late for me :wacko:

So if anyone has a better pattern, please post it. I want to improve on that :)

regards

Edited by Aelc

why do i get garbage when i buy garbage bags? <_<


@Aelc
Amazing code, but it could be even more compact:

#include <Array.au3>
#include <GUIConstantsEx.au3>
#include <ListViewConstants.au3>
#include <WindowsConstants.au3>

Opt("GUIOnEventMode", 1)

Global $arrPaths[] = ["C:\Directory\Mary Poppins (2020).mkv", _
                      "C:\Directory\Terminator 3(1234).mkv", _
                      "C:\Directory\Pirates of the caribbean 4 (1934).mkv", _
                      "C:\Directory\Test film 123( 1324 ).mkv", _
                      "C:\Directory\What else (1923).mkv", _
                      "C:\Directory\and so on (1969).mkv", _
                      "C:\Directory\why i do so much 34 (1999).mkv", _
                      "C:\Directory\Mary Poppins (2000).mkv", _
                      "C:\Directory\Mary Poppins (2001).mkv", _
                      "C:\Directory\Mary Poppins 3(2005).mkv"]

Global $strPattern = '.*\\+\s*(\b[^(]+\b)\s*\(\s*(\d+)\s*\)\.\S+$'

#Region ### START Koda GUI section ### Form=
Global $frmMain = GUICreate("Films", 405, 293, -1, -1)
GUISetOnEvent($GUI_EVENT_CLOSE, "ExitApplication")
Global $lvFilms = GUICtrlCreateListView("Name|Year", 8, 8, 386, 278, $LVS_REPORT, BitOR($LVS_EX_FULLROWSELECT, $LVS_EX_GRIDLINES))
GUICtrlSendMsg($lvFilms, $LVM_SETCOLUMNWIDTH, 0, 150)
_PopulateListView()
GUISetState(@SW_SHOW)
#EndRegion ### END Koda GUI section ###

While 1
    Sleep(100)
WEnd

Func ExitApplication()
    GUIDelete($frmMain)
    Exit
EndFunc

Func _PopulateListView()

    Local $strListViewItem

    ; Create the ListView Items
    For $i = 0 To UBound($arrPaths) - 1 Step 1
        $strListViewItem = StringRegExpReplace($arrPaths[$i], $strPattern, '$1|$2')
        If Not @error Then GUICtrlCreateListViewItem($strListViewItem, $lvFilms)
    Next

EndFunc

:)



A mix of @Aelc and @FrancescoDiMuro 

#include <File.au3>
#include <Array.au3>
#include <ListViewConstants.au3>

Local $aPath[] = [ _
  "C:\Directory\Mary Poppins (2020).mkv", _
  "C:\Directory\Terminator 3(1234).mkv", _
  "C:\Directory\Pirates of the caribbean 4 (1934).mkv", _
  "C:\Directory\Test film 123(1324).mkv", _
  "C:\Directory\What else (1923).mkv", _
  "C:\Directory\and so on (1969).mkv", _
  "C:\Directory\why i do so much 34 (1999).mkv", _
  "C:\Directory\Mary Poppins (2000).mkv", _
  "C:\Directory\Mary Poppins (2001).mkv", _
  "C:\Directory\Mary Poppins 3(2005).mkv" ]

Local $hGUI = GUICreate("RegExp", 450, 450)

Local $listview = GUICtrlCreateListView("File Name|Year", 0, 0, 400, 400) ;,$LVS_SORTDESCENDING)
Local $sFilePath, $sDrive, $sDir, $sFileName, $sExtension
For $i = 0 To UBound($aPath) - 1
  _PathSplit($aPath[$i], $sDrive, $sDir, $sFileName, $sExtension)
  GUICtrlCreateListViewItem(_ArrayToString(StringRegExp($sFileName, "([^\(]*)\((\d+)\)", 1)), $listview)
Next
GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
GUISetState()

While GUIGetMsg() <> -3
WEnd

 


3 hours ago, Nine said:

"([^\(]*)\((\d+)\)"

Hi Nine :)
In case the filename contains 2 left parenthesis, for example...

"C:\Directory\Mary Poppins (again) (2001).mkv"

...then this pattern could/should take care of it :

"(.*)\((\d+)\)"

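To see the difference between the two patterns, here is a minimal sketch (not taken from anyone's script; $sFileName, $aFirst and $aLast are names made up for the example). It runs both patterns against the file name part as _PathSplit() would return it, and wraps each group in brackets so stray spaces are visible.

```autoit
; Sketch only: variable names are illustrative, the sample name is from this thread.
Local $sFileName = "Mary Poppins (again) (2001)"

; Pattern with [^\(]* : the name group may not contain a "(".
Local $aFirst = StringRegExp($sFileName, "([^\(]*)\((\d+)\)", 1)
If Not @error Then ConsoleWrite("[^\(]* -> [" & $aFirst[0] & "] [" & $aFirst[1] & "]" & @CRLF)

; Pattern with greedy .* : the name group runs up to the last "(" that is followed by digits and ")".
Local $aLast = StringRegExp($sFileName, "(.*)\((\d+)\)", 1)
If Not @error Then ConsoleWrite(".*     -> [" & $aLast[0] & "] [" & $aLast[1] & "]" & @CRLF)
```

With the first pattern the name group cannot hold the whole title, because it is not allowed to cross the "(" of "(again)". The greedy version keeps the full title, trailing space included.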
Edit: trying now a wider approach without _PathSplit()

#include <Array.au3>
#include <ListViewConstants.au3>

Local $aPath[] = [ _
  "C:\Directory\Mary Poppins (2020).mkv", _
  "C:\Directory\Terminator 3(1234).mkv", _
  "C:\Directory\Pirates of the caribbean 4 (1934).mkv", _
  "C:\Directory\Test film 123(1324).mkv", _
  "C:\Directory\What else (1923).mkv", _
  "C:\Directory\and so on (1969).mkv", _
  "C:\Directory\why i do so much 34 (1999).mkv", _
  "C:\Directory\Mary Poppins (2000).mkv", _
  "C:\Directory\Mary Poppins (again) (2001).mkv", _
  "C:\Directory\Mary Poppins 3(2005).mkv" ]

Local $hGUI = GUICreate("RegExp", 450, 450)

Local $listview = GUICtrlCreateListView("File Name|Year", 0, 0, 400, 400) ;,$LVS_SORTDESCENDING)
For $i = 0 To UBound($aPath) - 1
  GUICtrlCreateListViewItem(_ArrayToString(StringRegExp($aPath[$i], "(?:.*)\\(.*)\((\d+)\)", 1)), $listview)
Next
GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
GUISetState()

While GUIGetMsg() <> -3
WEnd

Personal comments on the RegEx part (let's hope they're correct):

(?:.*)  non-capturing group for all characters found from the beginning of the path until...

\\  the last backslash found (last because * is greedy in the line above)

(.*)  1st capturing group (file name), composed of all characters after the last backslash until...

\(  the last left parenthesis found (last because * is greedy in the line above)

(\d+)  2nd capturing group (year), composed of all decimal digits until...

\)  a right parenthesis is found.
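A quick way to check these comments is to print the two groups for a single path. This is only a sketch: $sPath and $aMatch are names picked for the example, and the sample path is the one with two left parentheses.

```autoit
; Sketch only: print both capture groups for one sample path.
Local $sPath = "C:\Directory\Mary Poppins (again) (2001).mkv"
Local $aMatch = StringRegExp($sPath, "(?:.*)\\(.*)\((\d+)\)", 1)

If @error Then
    ConsoleWrite("no match" & @CRLF)
Else
    ; Brackets make leading/trailing spaces in each group visible.
    ConsoleWrite("name: [" & $aMatch[0] & "]" & @CRLF)
    ConsoleWrite("year: [" & $aMatch[1] & "]" & @CRLF)
EndIf
```

The name group should come back with the space that sits before the last "(" still attached, since nothing in this pattern strips it.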

Edited by pixelsearch
tried a wider approach

@Nine : I didn't test that one, good point :)
There's surely a RegEx way to test if there's a "\" within the string, but modifying this very RegEx pattern just to do that is beyond my expertise (it would require studying all that damn RegEx thing again!)

Meanwhile, using the Ternary operator to add a virtual "\" in front of the string (when none is found in the string) should solve the case you described.

GUICtrlCreateListViewItem(_ArrayToString(StringRegExp( _
    ((StringInStr($aPath[$i], "\", 1)) ? ($aPath[$i]) : ("\" & $aPath[$i])) , _
    "(?:.*)\\(.*)\((\d+)\)", 1)), $listview)

IIRC, you wrote somewhere (or was it Zedna ?) that the 3rd parameter of StringInStr (case sensitive or not) should always be 1 (case sensitive) when "the substring to search for" isn't a letter, so it will speed up the function StringInStr to its max, as used just above. I'll try to find the link where it was discussed and thoroughly tested.

Edited by pixelsearch
typo

Here it is without _PathSplit():

#include <Array.au3> ; needed for _ArrayToString()
#include <ListViewConstants.au3>

Local $aPath[] = [ _
  "C:\Directory\Mary Poppins (2020).mkv", _
  "C:\Directory\Terminator 3(1234).mkv", _
  "C:\Directory\Pirates of the caribbean 4 (1934).mkv", _
  "C:\Directory\Test film 123(1324).mkv", _
  "C:\Directory\What else (again) (1923).mkv", _
  "C:\Directory\and so on (1969).mkv", _
  "C:\Directory\why i do so much 34 (1999).mkv", _
  "Mary Poppins (new) (2020).mkv", _
  "C:\Directory\Mary Poppins (2000).mkv", _
  "C:\Directory\Mary Poppins (2001).mkv", _
  "C:\Directory\Mary Poppins 3(2005).mkv" ]

Local $hGUI = GUICreate("RegExp", 450, 450)

Local $listview = GUICtrlCreateListView("File Name|Year", 0, 0, 400, 400) ;,$LVS_SORTDESCENDING)
For $i = 0 To UBound($aPath) - 1
  GUICtrlCreateListViewItem(_ArrayToString(StringRegExp($aPath[$i], "(?:.*\\)?(.+)\((\d*)", 1)), $listview)
Next
GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
GUISetState()

While GUIGetMsg() <> -3
WEnd

 


@mikell  Excellent. So obvious when you think about it.  But if it was your intention to remove space at the end of the film title, it does not.

However "([^\\]*?)\h*\((\d+)" does.  :guitar:
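A small side-by-side sketch of the two patterns, assuming the only goal is to see the trailing spaces (the variable names are illustrative; the sample path with extra spaces is one used elsewhere in this thread):

```autoit
; Sketch only: compare the greedy name group with the lazy one followed by \h*.
Local $sPath = "C:\Directory\Pirates of the caribbean 4       (1934).mkv"

; Greedy (.+): the spaces before "(" stay inside the name group.
Local $aGreedy = StringRegExp($sPath, "(?:.*\\)?(.+)\((\d*)", 1)
If Not @error Then ConsoleWrite("greedy: [" & $aGreedy[0] & "] [" & $aGreedy[1] & "]" & @CRLF)

; Lazy ([^\\]*?) followed by \h*: the horizontal whitespace is matched outside the group.
Local $aLazy = StringRegExp($sPath, "([^\\]*?)\h*\((\d+)", 1)
If Not @error Then ConsoleWrite("lazy:   [" & $aLazy[0] & "] [" & $aLazy[1] & "]" & @CRLF)
```

The lazy quantifier matters here: a greedy [^\\]* would grab the spaces first and leave nothing for \h* to consume.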


So, I also slept on it and figured it out before coming back to this thread. I try what I can before posting. I come back and see all of this awesome coding compared to mine. 🤣

$listview = GUICtrlCreateListView("File Name|Year", -101, 40, 420, 190) ;,$LVS_SORTDESCENDING)
    For $n = 1 To UBound($FileList) - 1
        $File_Name = StringLeft($FileList[$n], StringInStr($FileList[$n], "(") - 2)
        $extractYear = StringMid($FileList[$n], StringInStr($FileList[$n], "(", 2, -1) + 1)
        $File_Year = StringMid($extractYear, 1, 4)
        $item = GUICtrlCreateListViewItem($File_Name & "|" & $File_Year, $listview)
    Next
    _GUICtrlListView_SetColumnWidth($listview, 0, 351) ; set once, after the loop

I will use and mess around with the answers given to me, so that I can understand them a bit better. Thank you all!

Edited by ReconX

@ReconX :   Glad it worked for you  :)

@mikell :  could you please explain why you chose  ([^\\]*?)  and not  ([^\\]*)  ? Thanks.

Edit : forget it mikell, I guess it's so the group doesn't capture any spaces following the file name (just tested it), since you introduced \h* in the pattern to take care of them.

Edited by pixelsearch
Maybe I got the answer why mikell used *? and not * only

@pixelsearch: So does this code split up the file name into two different strings? I'm reading the code and I can see where it splits it, but if they are strings, I am unsure how it displays them. 

Using this code:

Local $listview = GUICtrlCreateListView("File Name|Year", -101, 40, 420, 190) ;,$LVS_SORTDESCENDING)
    For $i = 0 To UBound($FileList) - 1
        GUICtrlCreateListViewItem(_ArrayToString(StringRegExp($FileList[$i], "(?:.*\\)?(.+)\((\d*)", 1)), $listview)
    Next
    GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
    GUISetState()

Filename is split here?

(StringRegExp($FileList[$i], "(?:.*\\)?(.+)\((\d*)", 1)), $listview)

But where does it call the two strings back to be displayed? 

EDIT: The code works, I'm just trying to understand it. :P

Edited by ReconX

Well, StringRegExp returns an array, which holds the name in element [0] and the year in element [1].

That is what I meant about there being better patterns: as the masters :graduated: have shown, theirs split the path cleanly into these 2 strings, which are then joined into a single string with

_ArrayToString()

and passed to the GUICtrlCreateListViewItem() call.

If you want to understand some code, it's helpful to display the result of each function after every step to see what happened :)

Local $listview = GUICtrlCreateListView("File Name|Year", -101, 40, 420, 190) ;,$LVS_SORTDESCENDING)
    For $i = 0 To UBound($aPath) - 1
        Local $aArray = StringRegExp($aPath[$i], "(?:.*\\)?(.+)\((\d*)", 1)
        For $y = 0 To Ubound($aArray) -1
            ConsoleWrite ("returned value StringRegExp " & $y & ": " & $aArray[$y] & @CRLF )
        Next
        Local $sString = _ArrayToString($aArray)
        ConsoleWrite ("returned value _ArrayToString " & $sString & @CRLF )
        GUICtrlCreateListViewItem($sString, $listview)
    Next
    GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
    GUISetState()

regards

Edited by Aelc



@ReconX : exactly, the StringRegExp pattern splits the filename in 2.

* The 1st string is the filename : this string will start from the 1st character following the last "\" until the last "(" is found.
And as Nine noted, if there is no path (i.e. no "\") in your initial string (ex: "Mary Poppins (2004).mkv") then the filename string will start from the very 1st character.

* The 2nd string is the year, i.e. every digit following the last "(" in your initial string.

In fact  StringRegExp generates an "Array of matches", then Nine's use of _ArrayToString() recreates a string (with a | delimiter, same as your code) to populate each row (2 columns) of the listview.

RegExp is very powerful but it's not really easy (at least for me). Glad we have some RegExp gurus on this site, they always find a RegExp solution when it's possible. Lucky us !


15 hours ago, Nine said:

if it was your intention to remove space at the end of the film title

Yes it was, I edited my code just before reading your post  :>

BTW the final goal could be something like this

#include <Array.au3>
#include <ListViewConstants.au3>

Local $aPath[] = [ _
  "C:\Directory\test\Mary Poppins (2020).mkv", _
  "C:\Directory\Terminator 3(1234).mkv", _
  "C:\Directory\Pirates of the caribbean 4       (1934).mkv", _
  "C:\Directory\Test film 123(1324).mkv", _
  "What else (1923).mkv", _
  "C:\Directory\and so on (1969).mkv", _
  "C:\Directory\why i do so much 34 (1999).mkv", _
  "C:\Directory\Mary Poppins (2000).mkv", _
  "C:\Directory\Mary Poppins (again) (2001).mkv", _
  "C:\Directory\Mary Poppins 3(2005).mkv" ]


Local $hGUI = GUICreate("RegExp", 450, 450)

Local $listview = GUICtrlCreateListView("Path|File Name|Year", 0, 0, 400, 400) ;,$LVS_SORTDESCENDING)
For $i = 0 To UBound($aPath) - 1
  GUICtrlCreateListViewItem(StringRegExpReplace(_ArrayToString(StringRegExp($aPath[$i], "(.+\\)?([^\\]*?)\h*\((\d+)\)", 1)), "^(?=\|)", @scriptdir & "\\"), $listview)
Next
GUICtrlSendMsg($listview, $LVM_SETCOLUMNWIDTH, 0, -1)
GUISetState()

While GUIGetMsg() <> -3
WEnd

:D

 


11 hours ago, mikell said:

BTW the final goal could be something like this

There is still a small imperfection (sorry ;) )

Let's assume @ScriptDir is C:\AutoIt\Projects\Test; then the output will be C:AutoItProjectsTest\ (the backslashes inside the path are lost).

EDIT : Here is an abbreviated script to show the issue :

Local $sString = "|What else (1923).mkv"

If StringRegExp($sString, "^(?=\|)") Then
    ConsoleWrite("==> String: " & $sString & @CRLF)
    ConsoleWrite("+ positive look-ahead matches" & @CRLF)
    ConsoleWrite("+ ScriptDir = " & @ScriptDir & @CRLF)
    ConsoleWrite(StringRegExpReplace ($sString, "^(?=\|)", @ScriptDir & "\\") & @CRLF)
    ; (just a test) replace first occurrence of | with StringReplace :
    ConsoleWrite(StringReplace($sString, "|", @ScriptDir & "\", 1) & @CRLF)
EndIf
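One possible workaround, sketched under the assumption that the single backslashes are swallowed because StringRegExpReplace treats "\" in the replacement text as the start of a back-reference: double every backslash before passing the string in. $sPrefix is just an illustrative name.

```autoit
; Sketch only: escape the backslashes of the replacement text before using it.
Local $sString = "|What else (1923).mkv"

; Double every "\" so each one is read as a literal backslash in the replacement.
Local $sPrefix = StringReplace(@ScriptDir & "\", "\", "\\")

ConsoleWrite(StringRegExpReplace($sString, "^(?=\|)", $sPrefix) & @CRLF)
```

This does not cover a "$" followed by a digit in the script path, which could presumably still be read as a back-reference. The plain StringReplace variant in the test above avoids the problem entirely, since its replacement text is taken literally.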

 

Edited by Musashi

"In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move."

