Posted (edited)

I am trying to get the name and email of a person from strings like these:

NAPD Associate - Dan Unanue [du@xymeetgrp.com]
NAPD Associate - Dave Chodosch, CIMA® [dc@xymeetgrp.com]
NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]
NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]
NAPD Associate - Saul West Jr. [Saul_Best@xymeetgrp.com]
NAPD Associate - Reid Moon [Reid.Moon@xymeetgrp.com]
NAPD Associate - Jane Horwitz-Marcus AIFA®, AIF®, PPC® [jdbh@xymeetgrp.com]
NAPD Associate - John Valencia, CFP®, CIMA®, CRPC®, CRPS®, AIF® [jmv@xymeetgrp.com]
NAPD Associate - Jason Terry (JTAK) [Jason.Terry@xymeetgrp.com]
Eddie Wolf [esw@xymeetgrp.com]
Sales Member - Scott Naples [SN@xymeetgrp.com]

With my attempts I am able to strip out the first clause (NAPD Associate and Sales Member) and the licenses (CFP, CIMA, etc.) but not the special characters after the license. If I try to use \W, it also affects the email.

$var = StringRegExpReplace($string, "[\w\s]* - ", "")
;~ ConsoleWrite($var & @CRLF)

$var2 = StringRegExpReplace($var, ",[\w\s]*", "")
ConsoleWrite($var2 & @CRLF)

any help is greatly appreciated

Edited by gcue
Posted (edited)

I think this might be what you are looking for on your second replace: match everything after the comma that is NOT a left bracket [.

$var2 = StringRegExpReplace($var, ", [^\[]+", "")
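
To make the two replacements concrete, here is a minimal sketch run on one of the sample strings from the first post. The variable names ($sLine, $sStep1, $sStep2) are only illustrative.

```autoit
; Sketch only: one sample line taken from the first post.
Local $sLine = "NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]"

; Step 1: drop the leading "NAPD Associate - " clause.
Local $sStep1 = StringRegExpReplace($sLine, "[\w\s]* - ", "")

; Step 2: drop everything from the first ", " up to (but not including) "[".
; [^\[] matches any character except "[", so the e-mail is left untouched.
Local $sStep2 = StringRegExpReplace($sStep1, ", [^\[]+", "")

ConsoleWrite($sStep2 & @CRLF)
```

Note that the second pattern also consumes the space in front of the bracket, so the result reads Will Gilmartin[wfg@xymeetgrp.com]; the name and the e-mail still have to be separated afterwards.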

 

Although, I would personally do a single StringRegExp and pull out the groups I am interested in.
*This is assuming you are missing commas in your example data.

#include <StringConstants.au3>
#include <Array.au3>

$str = "NAPD Associate - Dan Unanue, [du@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Dave Chodosch, CIMA® [dc@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Saul West Jr., [Saul_Best@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Reid Moon, [Reid.Moon@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Jane Horwitz-Marcus, AIFA®, AIF®, PPC® [jdbh@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - John Valencia, CFP®, CIMA®, CRPC®, CRPS®, AIF® [jmv@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Jason Terry, (JTAK) [Jason.Terry@xymeetgrp.com]" & @CRLF & _
"Eddie Wolf [esw@xymeetgrp.com]" & @CRLF & _
"Sales Member - Scott Naples, [SN@xymeetgrp.com]"

$aNames = StringSplit($str, @CRLF,  $STR_ENTIRESPLIT+$STR_NOCOUNT )
For $name in $aNames
    $aMatches = StringRegExp($name, "^.*?- (.*?),.*?\[(.*?)\]", $STR_REGEXPARRAYMATCH)
    if UBound($aMatches) = 2 Then
        ConsoleWrite( $aMatches[0] & " - " & $aMatches[1] & @CRLF)
    Else
        ConsoleWrite( "**Could not parse: " & $name & @CRLF)
    EndIf
Next

 

Edited by kurtykurtyboy
Posted (edited)

Or in a single pass :

#include <Array.au3>

Local $str = "NAPD Associate - Dan Unanue [du@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Dave Chodosch, CIMA® [dc@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Saul West Jr. [Saul_Best@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Reid Moon [Reid.Moon@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Jane Horwitz-Marcus, AIFA®, AIF®, PPC® [jdbh@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - John Valencia, CFP®, CIMA®, CRPC®, CRPS®, AIF® [jmv@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Jason Terry (JTAK) [Jason.Terry@xymeetgrp.com]" & @CRLF & _
"Eddie Wolf [esw@xymeetgrp.com]" & @CRLF & _
"Sales Member - Scott Naples [SN@xymeetgrp.com]"

Local $aMatches = StringRegExp($str, "(?:.* - )?(.+?[^,\[\(]*).*?[^\[]*\[(.*)\]", $STR_REGEXPARRAYGLOBALMATCH)

_ArrayDisplay($aMatches)

 

Edited by Nine
Posted

My 2 cents (using the exact strings provided in post #1 )

#include <Array.au3>

$str = "NAPD Associate - Dan Unanue [du@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Dave Chodosch, CIMA® [dc@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Saul West Jr. [Saul_Best@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Reid Moon [Reid.Moon@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Jane Horwitz-Marcus AIFA®, AIF®, PPC® [jdbh@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - John Valencia, CFP®, CIMA®, CRPC®, CRPS®, AIF® [jmv@xymeetgrp.com]" & @CRLF & _
"NAPD Associate - Jason Terry (JTAK) [Jason.Terry@xymeetgrp.com]" & @CRLF & _
"Eddie Wolf [esw@xymeetgrp.com]" & @CRLF & _
"Sales Member - Scott Naples [SN@xymeetgrp.com]"

$res = StringRegExp($str, '(?mx)^(?:.*-\h)?   ((?:[\w\h\.-]+(?=[,\h\[]))+)  .*?  \[([^\]]+)', 3)

Local $res2D[Ceiling(UBound($res)/2)][2]
For $i = 0 To UBound($res) - 1
    $res2D[Int($i / 2)][Mod($i, 2)] = $res[$i]
Next
_ArrayDisplay($res2D)
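
For anyone who finds the pattern hard to read, here is a sketch that builds the same expression piece by piece with a comment on each part and applies it to a single record. The variable names are illustrative, and the trailing-blank cleanup is an addition of this sketch: group 1 allows blanks and its lookahead accepts "[", so a name that is directly followed by " [" can come back with a trailing space.

```autoit
#include <StringConstants.au3>

; One record from post #1, used only as sample input.
Local $sRecord = "NAPD Associate - Saul West Jr. [Saul_Best@xymeetgrp.com]"

; The pattern from above, assembled in steps so each part can carry a comment.
Local $sPattern = '(?mx)'                     ; m: ^ also matches at line starts, x: whitespace in the pattern is ignored
$sPattern &= '^(?:.*-\h)?'                    ; optional prefix ending in a hyphen followed by a blank
$sPattern &= '((?:[\w\h\.-]+(?=[,\h\[]))+)'   ; group 1: name characters, followed by a comma, a blank or "["
$sPattern &= '.*?'                            ; skip licenses and anything else before the bracket
$sPattern &= '\[([^\]]+)'                     ; group 2: the text between "[" and "]"

Local $aMatch = StringRegExp($sRecord, $sPattern, $STR_REGEXPARRAYMATCH)
If Not @error Then
    ; Remove the blank that group 1 may have picked up in front of "[".
    Local $sName = StringStripWS($aMatch[0], $STR_STRIPTRAILING)
    ConsoleWrite("[" & $sName & "] [" & $aMatch[1] & "]" & @CRLF)
EndIf
```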

 

Posted

thank you!  all elegant solutions.. a bit harder for me to follow when the expression has so many symbols :)

Currently the data set is not too large, so a single string works (thanks again for that!). It may grow to a much larger data set since this is a fairly new rollout, so here is some working code that loops over an array instead (sorry I did not do this last time, I should have thought of this possibility sooner).

I tried to plug in a few of the expressions you guys used but could not get them working consistently, even with the small data set I have provided.

#include <array.au3>

$records_array = Get_Array()

For $x = 0 To UBound($records_array) - 1
    $full_name = StringRegExpReplace($records_array[$x], "..", "")
    $email = StringRegExpReplace($records_array[$x], "..", "")

    ConsoleWrite("Full Name: " & $full_name & @crlf)
    ConsoleWrite("Email: " & $email & @crlf)
    ConsoleWrite(@crlf & @crlf)
Next

Func Get_Array()

    Local $array[11]

    $array[0] = "NAPD Associate - Dan Unanue [du@xymeetgrp.com]"
    $array[1] = "NAPD Associate - Dave Chodosch, CIMA® [dc@xymeetgrp.com]"
    $array[2] = "NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]"
    $array[3] = "NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]"
    $array[4] = "NAPD Associate - Saul West Jr. [Saul_Best@xymeetgrp.com]"
    $array[5] = "NAPD Associate - Reid Moon [Reid.Moon@xymeetgrp.com]"
    $array[6] = "NAPD Associate - Jane Horwitz-Marcus AIFA®, AIF®, PPC® [jdbh@xymeetgrp.com]"
    $array[7] = "NAPD Associate - John Valencia, CFP®, CIMA®, CRPC®, CRPS®, AIF® [jmv@xymeetgrp.com]"
    $array[8] = "NAPD Associate - Jason Terry (JTAK) [Jason.Terry@xymeetgrp.com]"
    $array[9] = "Eddie Wolf [esw@xymeetgrp.com]"
    $array[10] = "Sales Member - Scott Naples [SN@xymeetgrp.com]"

    Return $array

EndFunc   ;==>Get_Array

thank you very much!

 

Posted

If you really want to go that route, I would suggest adapting Nine's method above. This is much more robust (and efficient?) than the multiple replacements method for a larger dataset.

#include <array.au3>

$records_array = Get_Array()

Local $aMatches
For $x = 0 To UBound($records_array) - 1
    $aMatches = StringRegExp($records_array[$x], "(?:.* - )?(.+?[^,\[\(]*).*?[^\[]*\[(.*)\]", $STR_REGEXPARRAYGLOBALMATCH)
    If @error Then ContinueLoop ; no match: skip the record instead of indexing a non-array

    $full_name = $aMatches[0]
    $email = $aMatches[1]

    ConsoleWrite("Full Name: " & $full_name & @crlf)
    ConsoleWrite("Email: " & $email & @crlf)
    ConsoleWrite(@crlf & @crlf)
Next

Func Get_Array()

    Local $array[11]

    $array[0] = "NAPD Associate - Dan Unanue [du@xymeetgrp.com]"
    $array[1] = "NAPD Associate - Dave Chodosch, CIMA® [dc@xymeetgrp.com]"
    $array[2] = "NAPD Associate - Will Gilmartin, CFA [wfg@xymeetgrp.com]"
    $array[3] = "NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]"
    $array[4] = "NAPD Associate - Saul West Jr. [Saul_Best@xymeetgrp.com]"
    $array[5] = "NAPD Associate - Reid Moon [Reid.Moon@xymeetgrp.com]"
    $array[6] = "NAPD Associate - Jane Horwitz-Marcus AIFA®, AIF®, PPC® [jdbh@xymeetgrp.com]"
    $array[7] = "NAPD Associate - John Valencia, CFP®, CIMA®, CRPC®, CRPS®, AIF® [jmv@xymeetgrp.com]"
    $array[8] = "NAPD Associate - Jason Terry (JTAK) [Jason.Terry@xymeetgrp.com]"
    $array[9] = "Eddie Wolf [esw@xymeetgrp.com]"
    $array[10] = "Sales Member - Scott Naples [SN@xymeetgrp.com]"

    Return $array

EndFunc   ;==>Get_Array

 

Posted

hmm actually i did notice something in the example provided.. the name (Jane Horwitz-Marcus AIFA®) still came up with the license (AIFA®)

perhaps because she had more licenses?  I think there will be users with multiple licenses.. so tricky!

Posted
14 minutes ago, Nine said:

Here a better pattern

"(?:.*- )?([\w\h\.-]+(?=[\h,])).*\[(.*)\]"

Let me know...

looks great Nine!  i ran it across a larger data set and all licenses have been removed!

i am trying to understand your syntax - this clause addresses name right?

?([\w\h\.-]+(?=[\h,])).*\

If for whatever reason someone does not have a name associated (we are planning on importing some older data later in the month, so names might not be available), the record would probably look something like this (where we will put the email instead of the name):

"NAPD Associate - pzo@xymeetgrp.com [pzo@xymeetgrp.com]"

Posted
6 minutes ago, Deye said:

you can leave them in the name for reference, And here is one more way to do it:

Local $aName = StringSplit(StringRegExpReplace($records_array[$x], "\w.*- |\]", ""), "[")
    $full_name = $aName[1]
    $email = $aName[2]

 

Actually we have to remove the licenses, because the data repository we are importing them into won't handle them well.

thanks for the suggestion

Posted
2 minutes ago, Deye said:

Sure, updated the above to show no license

hmm strange.. still seeing license for the following 

Will Gilmartin CFA 
Bill Tsukudi-Boudreau CFA 

Posted (edited)
20 minutes ago, gcue said:

i am trying to understand your syntax - this clause addresses name right?

Yes, it lists all authorized characters in the name, then looking forward, needs to have a space or a comma. 

If you're importing email addresses as well, you just need to add the allowed characters in an email address.

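In this particular case, just add @ to the character class. Keep the hyphen last: written as \.-@ it is read as a range from "." to "@", which does not contain "-", so hyphenated names would be cut short.

"(?:.*- )?([\w\h.@-]+(?=[\h,])).*\[(.*)\]"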

 

Edited by Nine
typos
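
As a quick check of the pattern with @ allowed, here is a minimal sketch that runs it over three records: a hyphenated name, the e-mail-as-name case from the question, and a line without a prefix. The sample array and variable names are illustrative, and the character class is written with the hyphen at the end so that it is taken literally rather than as part of a range.

```autoit
#include <StringConstants.au3>

; Sample records: two from the first post plus the e-mail-as-name case.
Local $aSamples[3] = [ _
        "NAPD Associate - Bill Tsukudi-Boudreau, CFA [blmb@xymeetgrp.com]", _
        "NAPD Associate - pzo@xymeetgrp.com [pzo@xymeetgrp.com]", _
        "Eddie Wolf [esw@xymeetgrp.com]"]

; "." and "@" are literal inside [...]; "-" comes last so it is not read as a range.
Local $sPattern = "(?:.*- )?([\w\h.@-]+(?=[\h,])).*\[(.*)\]"

Local $aMatch
For $sRecord In $aSamples
    $aMatch = StringRegExp($sRecord, $sPattern, $STR_REGEXPARRAYMATCH)
    If @error Then
        ; No match (or a bad pattern): report the record and move on.
        ConsoleWrite("Could not parse: " & $sRecord & @CRLF)
        ContinueLoop
    EndIf
    ; $aMatch[0] is the name (or e-mail used as name), $aMatch[1] the e-mail.
    ConsoleWrite($aMatch[0] & " | " & $aMatch[1] & @CRLF)
Next
```

A name followed by a license with neither a comma nor a special character in between (for example a plain "CFA" right after the name) would still be captured together with the license, since every character of it is allowed in the class.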
Posted
2 minutes ago, Nine said:

Yes, it lists all authorized characters in the name, then looking forward, needs to have a space or a comma. 

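If you're importing email addresses as well, you just need to add the allowed characters in an email address.

In this particular case, just add @ to the character class. Keep the hyphen last: written as \.-@ it is read as a range from "." to "@", which does not contain "-", so hyphenated names would be cut short.

"(?:.*- )?([\w\h.@-]+(?=[\h,])).*\[(.*)\]"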

 

yes that would work!!!!

thank you soooo much!!!🥳

Posted
1 hour ago, gcue said:

the name (Jane Horwitz-Marcus AIFA®) still came up with the license (AIFA®)

Looks like you didn't even try my code ...

Posted
7 hours ago, mikell said:

Looks like you didn't even try my code ...

Of course I did! It was one of the elegant methods I mentioned :) I tried it as a string and it worked beautifully.  As our project continues we came to find out we may be dealing with a larger data set which is why I needed a for loop. Thank you again for your quick help and expertise

Posted
21 hours ago, Nine said:

Woot, Cat shows its claws 

Bah.... such things can happen when you spend some time to provide a working solution and get no return  :idiot:
