Excluding dot with StringRegExp

Ascend4nt

Okay, so I've been looking EVERYWHERE for a way to implement this in AutoIt and I'm still clueless. Regular expressions bewilder the heck out of me. I've been trying to exclude filenames with a dot (period or '.') in their name, and I can't figure out how to do it. I've tried every combination I can imagine and tried to understand things through web examples and whatnot, but to no avail.

I hoped "[^\.]" would do it, but no - every filename with a dot in it is still captured. I just want files without extensions, period. I tried appending "/z" to it, prefixing it with ".*" and everything else I can imagine. Does anyone have a clue how this can be done?

Thanks!

Ascend4nt
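A likely reason "[^\.]" matches every name: without anchors, the pattern only asks for one non-dot character somewhere in the string, and nearly every filename has one. The sketch below contrasts it with an anchored version that requires every character from start to end to be a non-dot. The sample names and the anchored pattern "\A[^.]+\z" are illustrations, not taken from the thread.

```autoit
; Compare an unanchored negated class with an anchored one.
; The sample names below are made up for illustration.
Local $aNames[4] = ["readme", "readme.txt", "this.is.a.file", "no dots here"]

For $i = 0 To UBound($aNames) - 1
    ; Unanchored: succeeds if ANY single character is not a dot
    Local $bLoose = StringRegExp($aNames[$i], "[^.]")
    ; Anchored: succeeds only if EVERY character, start to end, is not a dot
    Local $bStrict = StringRegExp($aNames[$i], "\A[^.]+\z")
    ConsoleWrite($aNames[$i] & " -> unanchored: " & $bLoose & ", anchored: " & $bStrict & @CRLF)
Next
```

With the default flag (0), StringRegExp() returns 1 on a match and 0 otherwise, so the anchored test should come back 1 only for "readme" and "no dots here". Because the class is "anything but a dot", it does not restrict which Unicode characters a name may contain.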

dbzfanatic

Since extensions are 3 alphanumeric characters 99% of the time, why not just use this (it trims the dot plus the 3 characters)?

StringTrimRight($string,4)

Edit: If you still want to use StringRegExp try this pattern: (?i)([[:alnum:]]*)\.

The data:

ilia.avi

movie.mp3

lol.mp4.mpg

The Results:

ilia

movie

lol

mp4
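The results listed above look like what a global match would return. The post does not say which flag was used, so flag 3 (return an array of all matches) is an assumption in this sketch:

```autoit
; Assumption: flag 3 (array of global matches) produced the results in the post.
Local $sName = "lol.mp4.mpg"
Local $aParts = StringRegExp($sName, "(?i)([[:alnum:]]*)\.", 3)

If @error Then
    ; @error is set when the pattern finds nothing, e.g. a name with no dot
    ConsoleWrite("No match for " & $sName & @CRLF)
Else
    For $i = 0 To UBound($aParts) - 1
        ; Each element is a run of alphanumerics that sits directly before a dot
        ConsoleWrite($aParts[$i] & @CRLF)
    Next
EndIf
```

Note that this pattern extracts the pieces in front of each dot; on its own it does not answer the yes/no question of whether a name contains a dot.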

Edited by dbzfanatic

enaiman

$text = "hhghjdfg.hgj.aaa"
$test = StringRegExp($text, "((\..{1,3}\z))", 1)
If @error Then
    MsgBox(0, "error", @error)
Else
    MsgBox(0, "", $test[1])
EndIf

... always trial and error for me ...


SNMP_UDF ... for SNMPv1 and v2c so far, GetBulk and a new example script

wannabe "Unbeatable" Tic-Tac-Toe

Paper-Scissor-Rock ... try to beat it anyway :)

dbzfanatic

Your pattern seems to just return the extensions instead of the names.

enaiman

That was the point - OP said he wanted to exclude files with an extension - I gave him a way to check whether a file name has an extension or not.


Xenobiologist

The question is what should be the result?

Try this pattern : .*?\..*?\..*

Mega


Scripts & functions Organize Includes Let Scite organize the include files

Yahtzee The game "Yahtzee" (Kniffel, DiceLion)

LoginWrapper Secure scripts by adding a query (authentication)

_RunOnlyOnThis UDF Make sure that a script can only be executed on ... (Windows / HD / ...)

Internet-Café Server/Client Application Open CD, Start Browser, Lock remote client, etc.

MultipleFuncsWithOneHotkey Start different funcs by hitting one hotkey different times

Ascend4nt

Hey, thanks everyone for trying to figure this out.. I'm sorry that I hadn't totally expressed what I needed. ;)

Basically I want a True/False return from StringRegExp() indicating a match or no match (flag=0).

I've tried all of the Regular Expressions you've all posted, thank you again for trying :D

Unfortunately, I get a match in all cases, except for Xenobiologist's approach (which crashed AutoIt). =\

BTW, I've thought about using [:alnum:], but I want to include all of the Unicode symbols that NTFS-based file systems support (including regular things like an accent ' or comma , or special things like the © copyright symbol etc)... and I don't need extensions, I just want to exclude any file that has a dot in it (including "this.is.a.file" and "readme.txt"). It should only ring true on filenames like "readme" or "I have no periods because I'm pregnant" ;)

Again, much appreciated guys/gals. It's good to know there are people willing to try =)

Ascend4nt

Xenobiologist

Hi,

if you are just looking for one single dot then just use StringInStr which might be even faster than RegExp.
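A minimal sketch of the StringInStr approach, wrapped in a helper so it returns a True/False answer. The function name _NameHasNoDot is hypothetical and only used here for illustration:

```autoit
; Hypothetical helper (the name is illustrative): True when $sName contains no dot.
Func _NameHasNoDot($sName)
    ; StringInStr() returns 0 when the substring is not found
    Return StringInStr($sName, ".") = 0
EndFunc

ConsoleWrite(_NameHasNoDot("readme") & @CRLF)      ; name without a dot
ConsoleWrite(_NameHasNoDot("readme.txt") & @CRLF)  ; name with one dot
```

Since no pattern is involved, there is nothing to escape or anchor, which makes this form harder to get wrong than a regular expression.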

I thought you want to exclude all files containing a dot in the name like

bla.hugo.txt

whatfile.xls

and do not exclude

hugo.txt

Mega


Robjong

hey,

i take it you also want to check for other invalid filename characters,

here is an example:

$filename = 'some file'

If _StringInvalidChars($filename) Then 
    ConsoleWrite('Filename includes invalid chars.' & @CRLF)
Else
    ConsoleWrite('Filename does not include invalid chars.' & @CRLF)
EndIf

; Invalid chars: * ? " < > | \ / .
Func _StringInvalidChars($string)
    If StringRegExp($string, '[*\?"\<>|\\\/\.]') Then 
        Return 1
    Else
        Return 0
    EndIf
EndFunc

hope it is what you were looking for,

Robjong.

Ascend4nt

Again, thanks everyone. This is proving to be an impossible chore, yet it seems such a simple concept that you'd think there *must* be some regular expression that can be used.

Unfortunately, every one of the regular expressions given here returns True, even the two that cause crashes (from Xenobiologist & trancexx), which give partial results. (Both crash at about 3000 compares - and yeah, the high number of results is because I'm using a modified filesearch function (original design courtesy of Weapon-X).)

In any case, this looks like the only case I can think of where a search would need an optional mechanism for compares. It just bugs the heck out of me that regular expressions can't return False for something with a simple dot in it.. argh..

Oh, btw Robjong - a '.' isn't an invalid filename character, it's just something I wanted to exclude. It's funny you should mention that type of function though, as I just finished writing a "_FilePathnameValid()" function a day ago which is a bit more complicated (disassembles the path piece by piece checking for full/relative paths, invalid characters, spaces before/after filenames, extensions without filenames, etc)

Thanks y'all

Ascend4nt

SmOke_N

When using the others' RegExes, there seems to be a bug when zero is passed as the value of the 3rd and 4th parameters of StringRegExp.

Zero is the default, so try just doing something like:

If StringRegExp($s_value, ".*?\..*?\.\w+\z") Then

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

AdmiralAlkex

@ascendant

If Xenobiologist's and trancexx's code crashes after 3000 results, it must be something else that doesn't work. Are you perhaps using an old version of AutoIt? Try updating so the control limit goes from 4000 to 65000 (or so).

Ascend4nt

SmOke_n, thx for the try - but it keeps returning True for files with dots in it.

Admiral Alex, this is running version 3.2.12.1. I was also a little quick to assume it was ~3000... on closing and retesting it actually varies, which is odd.. at first it had higher numbers, then it would decrease more and more, with fewer results before crashing. Very odd. It's almost as if time is a factor.. because the file results come quicker the more times you run them on the same files.

And like I said, all the expressions keep returning True for the names containing dots that it is able to scan. StringRegExp() is passed the file (or folder) name returned from FileFindNextFile() [checking for @error before continuing] and compares that name against the regular expression.

I've tested the function up and down using dozens of different regular expressions that haven't had any problems.. it's when I use unusual or 'invalid' expressions that it crashes.
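For reference, the loop described above might look roughly like the sketch below. This is not the modified filesearch function from the thread (which is not shown); the folder path is a placeholder and the anchored pattern "\A[^.]+\z" is an assumption about what "no dot anywhere" should mean.

```autoit
; Sketch only: list entries in a folder whose names contain no dot.
; $sFolder is a placeholder path, not one from the thread.
Local $sFolder = @ScriptDir
Local $hSearch = FileFindFirstFile($sFolder & "\*.*")

If $hSearch = -1 Then
    ConsoleWrite("No entries found in " & $sFolder & @CRLF)
Else
    While 1
        Local $sName = FileFindNextFile($hSearch)
        If @error Then ExitLoop ; no more entries
        ; Keep the name only when the whole string is free of dots
        If StringRegExp($sName, "\A[^.]+\z") Then ConsoleWrite($sName & @CRLF)
    WEnd
    FileClose($hSearch) ; release the search handle
EndIf
```

The loop tests names only, so folders without a dot in their name are listed as well; a FileGetAttrib() check would be needed to drop them.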

SmOke_N

Ascend4nt wrote: "SmOke_n, thx for the try - but it keeps returning True for files with dots in it."

Yes it does... That's how I wrote it. You can't make an exception?

If StringRegExp($s_value, ".*?\..*?\.\w+\z") = 0 Then
That will only allow files without them in it.
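One thing worth checking with this pattern: it asks for two literal dots, so a single-dot name such as "readme.txt" would not match and would therefore pass the "= 0" test. The sketch below compares it with a plain "\." test; the sample names are illustrative, not from the thread.

```autoit
; Sample names are illustrative. Compare the two-dot pattern with a plain "\." test.
Local $aNames[4] = ["readme", "readme.txt", "this.is.a.file", "archive.tar.gz"]

For $i = 0 To UBound($aNames) - 1
    ; Needs two literal dots, then word characters running to the end of the string
    Local $bTwoDots = StringRegExp($aNames[$i], ".*?\..*?\.\w+\z")
    ; Matches as soon as one dot appears anywhere in the string
    Local $bAnyDot = StringRegExp($aNames[$i], "\.")
    ConsoleWrite($aNames[$i] & " -> two-dot pattern: " & $bTwoDots & ", any dot: " & $bAnyDot & @CRLF)
Next
```

If the goal is to reject every name that contains a dot, testing the "any dot" result against 0 is the stricter of the two checks.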

trancexx

Ok, I understand that a bad regular expression can crash ... everything, but it is not likely that a good one would do that. As AdmiralAlkex suggested, the problem is then in something else.

Another thing: the pattern in my previous post is meant to exclude files that have a dot ('.') in their names, where filename = name + . + extension (name is the part of the filename before the last dot, and extension follows the last occurrence of the dot).

So, if I'm reading you correctly (now finally - sorry), you want only files with no extension, therefore no dot should appear in the filename. The pattern to detect a dot would be "\."

$file_name = "fg.nh.fg"
Dim $end_1 = 0, $end_2 = 0

; Time the regular-expression test for a dot
$start_1 = TimerInit()
If StringRegExp($file_name, "\.", 0) = 1 Then
    $end_1 = TimerDiff($start_1)
EndIf

; Time the plain substring test for a dot
$start_2 = TimerInit()
If StringInStr($file_name, '.') <> 0 Then
    $end_2 = TimerDiff($start_2)
EndIf

; TimerDiff() returns milliseconds, not seconds
MsgBox(0, "Return time", 'StringRegExp  - ' & $end_1 & ' ms' & @CRLF & 'StringInStr  - ' & $end_2 & ' ms')

Is that what you were looking for?

And use FileGetAttrib() to eliminate folders

$attrib = FileGetAttrib($file)
If StringInStr($attrib, 'D') <> 0 Then
    MsgBox(0, '', "This is folder")
EndIf
Edited by trancexx

♡♡♡

.

eMyvnE

trancexx

btw, edit function on this forum... lol

half of that post before is missing, hope you did get it before destruction


enaiman

This whole long thread could have been avoided if the OP had the good will to research StringRegExp just a little, because all the code posted works and all it needs is a small modification (flag set to 0 for match/no match).

That's what I did with my example.

$text = "hhghjdfg.tr"
If StringRegExp($text, "((\..{0,3}\z))", 0) Then
    MsgBox(0, "No Match", $text&" file has extension")
Else
    MsgBox(0, "Match", $text&" file has NO extension")
EndIf

This expression looks at the end of the name: anything like "asdf.", "asdf.a", "asdf.as", "asdf.asd" will return a match, while "asdf.asdf" will return none. Here an "extension" is taken to be at most 3 characters, which is why 4 characters are not considered an extension. The number of characters allowed after the "." is set by {0,3}, which also lets a bare trailing dot match.

Using my pattern "asdf.asd.sdf" will match an extension and the extension will be "sdf".
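If the 3-character limit is not wanted, one possible variation is to capture everything after the last dot, whatever its length. This is a sketch under that assumption; the pattern "\.([^.]*)\z" and the sample name are not from the thread.

```autoit
; Assumption: "extension" means everything after the last dot, of any length.
Local $sName = "asdf.asd.sdfx"
; Flag 1 returns an array of captured groups; @error is set when nothing matches
Local $aExt = StringRegExp($sName, "\.([^.]*)\z", 1)

If @error Then
    ConsoleWrite($sName & " has no extension" & @CRLF)
Else
    ; [^.]* cannot cross a dot, so the capture starts after the LAST dot
    ConsoleWrite($sName & " has extension: " & $aExt[0] & @CRLF)
EndIf
```

The same pattern with flag 0 would give the plain match/no-match answer for "has a dot somewhere before the end".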

TO OP - do your homework and read about StringRegExp because you have everything you need within this thread.

Edited by enaiman

Ascend4nt

Seems some here should try running their patterns on a few thousand strings to see what happens ;)

And I *did* say in my original post that I looked everywhere - on the forums, on the web, so yeah - homework was done (for quite a while actually)

I don't know why you refer to me as 'OP' enaiman.. I can be addressed directly if you don't mind by my nickname. Thanks anyway for trying to make a newbie feel at home though.

And as trancexx pointed out, extensions are not defined as 1-3 characters anymore.. you're thinking of the old DOS 8.3 days...

Edited by ascendant
