webb34

RegEx Issue. Possible use of lookarounds

2 posts in this topic

So I have a CSV of call logs that keeps track of who is calling where and when within our system. I am using it to generate a web page (that part's done), but I want to make things simpler. When the program starts up, after asking where the CSV is and where to save the HTML file, I want it to ask which "Department" to use. I know there are 14 columns, and the 6th column contains the department that is calling (the value I want to use). My issue is getting only one occurrence of each unique department name. I know there are only a few departments right now, but I would rather allow for future departments, and I also need to learn RegEx sooner or later. Unfortunately, I keep running into issues.

The closest I've gotten is

([^,\n]+)(?=((,[^,\n]*){8}\s*$))(?!(.*\1(,[^,\n]*){8}\s*$))

But that only works in regexpal, and even then only with multiline mode turned on and the dot set to match everything, including new lines.

I believe that if I find 

([^,\n]+)(?=((,[^,\n]*){8}[\n$]))

then clearly it should be the 6th column out of 14, because I am hitting 8 more commas that may or may not have characters after them, and then the end of the line or a new line character. This works in AutoIt, but when I add

(?!([.\n]*\1(,[^,\n]*){8}[\n$]))

it changes nothing. I would expect this to reject a match if it is followed by any characters (including new lines), then the text captured in group 1, then 8 commas with any (or no) characters after them, and then either a new line or the end of the line. But I am apparently missing something here.

Any help is greatly appreciated.

 

To be clear, this is my function call:

Local $departments = StringRegExp($openedContents, '([^,\n]+)(?=((,[^,\n]*){8}[\n$]))(?!([.\n]*\1(,[^,\n]*){8}[\n$]))', 3)

$openedContents is simply the CSV file put into a string using FileRead.
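
For reference, here is a minimal sketch of just the first half (the part that does work), run on made-up data rather than the real log. The sample rows and the names $sSample and $aColumn6 are placeholders of mine. I swapped the inner groups to non-capturing (?:...) so that flag 3 returns only the department text, and every sample row ends with a line feed so the lookahead can stop on a plain \n.

```autoit
#include <Array.au3>

; Made-up sample: 14 columns per row, department in column 6.
; Every row ends with @LF so the lookahead can stop on a plain \n.
Local $sSample = "1,2,3,4,5,Sales,7,8,9,10,11,12,13,14" & @LF & _
        "1,2,3,4,5,Support,7,8,9,10,11,12,13,14" & @LF & _
        "1,2,3,4,5,Sales,7,8,9,10,11,12,13,14" & @LF

; Capture a field only if exactly 8 more comma-separated fields
; follow it before the line feed, which makes it column 6 of 14.
Local $aColumn6 = StringRegExp($sSample, '([^,\n]+)(?=(?:,[^,\n]*){8}\n)', 3)

If @error Then
    ConsoleWrite("No matches found" & @CRLF)
Else
    ; Duplicates are still present at this stage.
    _ArrayDisplay($aColumn6, "Column 6 of every row")
EndIf
```

This only pulls out column 6 of every row; removing the repeats is the part I am stuck on.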


Nevermind. I figured it out. Apparently [.\n] does not mean any character including new lines. Inside a character class the . is just a literal dot, so [.\n] only matches a dot or a new line. Once I realized this was the issue, I added (?s) to the beginning so that . would match all characters including new lines. I wasn't sure whether (?s) counts as a group, so I tried switching \1 to \2, but that didn't work; (?s) apparently doesn't count as a group, so the backreference stays \1.
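
A quick way to see the difference, as a minimal sketch on a made-up two-line string ($sText is just a placeholder name). As far as I can tell, a dot inside square brackets is a literal dot, so only the (?s) version can cross the line break:

```autoit
; "ab", a line feed, then "cd".
Local $sText = "ab" & @LF & "cd"

; Flag 0 returns 1 if the pattern matches and 0 if it does not.
; [.\n]* can only consume dots and line feeds, so it never gets past "ab".
ConsoleWrite(StringRegExp($sText, '\A[.\n]*cd', 0) & @CRLF)

; Without (?s) the dot stops at the line feed.
ConsoleWrite(StringRegExp($sText, '\A.*cd', 0) & @CRLF)

; With (?s) the dot also matches the line feed, so .* can reach "cd".
ConsoleWrite(StringRegExp($sText, '(?s)\A.*cd', 0) & @CRLF)
```

I'd expect these to print 0, 0 and 1, in that order.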

So in the end I ended up with

Local $departments = StringRegExp($openedContents, '(?s)([^,\n]+)(?=((,[^,\n]*){8}[\n$]))(?!(.*\1(,[^,\n]*){8}[\n$]))', 3)

and now every 3rd result in the returned array, starting at index 0 (i.e. $departments[0], $departments[3], $departments[6], ... $departments[i*3]), is a unique department name.
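
If you'd rather not step through the array in threes, here is a minimal sketch of a tightened variant, again on made-up data. As far as I can tell, the extra entries come from the capturing groups inside the lookahead, so this version uses non-capturing (?:...) groups and keeps only group 1. It also makes two changes that are my own assumptions about what the data needs: (?<=,) and the comma before \1 force whole-field comparisons, so a department such as Sales is not treated as a repeat of a later PreSales, and (?:\n|$) replaces [\n$] so the last row counts even without a trailing new line. $sSample, $sPattern and $aUnique are placeholder names.

```autoit
#include <Array.au3>

; Made-up sample: 14 columns per row, department in column 6,
; and no line feed after the last row.
Local $sSample = "1,2,3,4,5,Sales,7,8,9,10,11,12,13,14" & @LF & _
        "1,2,3,4,5,Support,7,8,9,10,11,12,13,14" & @LF & _
        "1,2,3,4,5,PreSales,7,8,9,10,11,12,13,14" & @LF & _
        "1,2,3,4,5,Support,7,8,9,10,11,12,13,14"

; (?s)                              let . match line feeds too
; (?<=,)([^,\n]+)                   a whole field that starts right after a comma
; (?=(?:,[^,\n]*){8}(?:\n|$))       followed by exactly 8 more fields, so column 6
; (?!.*,\1(?:,[^,\n]*){8}(?:\n|$))  and no later row has the same column-6 value
Local $sPattern = '(?s)(?<=,)([^,\n]+)(?=(?:,[^,\n]*){8}(?:\n|$))(?!.*,\1(?:,[^,\n]*){8}(?:\n|$))'

; With a single capturing group, flag 3 returns one array entry per match.
Local $aUnique = StringRegExp($sSample, $sPattern, 3)

If @error Then
    ConsoleWrite("No matches found" & @CRLF)
Else
    _ArrayDisplay($aUnique, "Unique departments")
EndIf
```

Each department is matched at its last occurrence, so the array follows the order of last appearance rather than first. On this sample I'd expect Sales, PreSales, Support.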
