twinturbosubaru

Find email address in body of email

25 posts in this topic

#1 ·  Posted (edited)

Hi folks, I have a task presented to me to try and find an email address in the body of an email.

I have started by using the _pop3 udf which seems good, but I am having trouble figuring out how to actually find the address in the string returned.

Currently I can get the string which is the body of the email, and I can find the initial text which dictates where the address will be, but I can't figure out how to grab the text from that point on up to the end of the line.

I obviously don't know how long the email address will be, and I only know an email address will have an @ symbol in it, I can't be sure of anything else about it.

In the email the address will be preceded by the text on the same line as follows:

Email Address: fred@joebloggs.com

Currently I am just using a StringInStr function to find the "Email Address: " bit, but I don't know how to then retrieve everything after that until either a whitespace character or the end of the line.
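The StringInStr approach described here can be completed with StringMid and StringSplit. The following is only a minimal sketch: the sample body and the variable names are made up for illustration, and it assumes the label text is exactly "Email Address: " and that the address ends at the first space, tab, or line break.

```autoit
; Sample body (assumption); in the real script this would come from the POP3 UDF
Local $sBody = "Daytime number: 12345678" & @CRLF & _
        "Email Address: fred@joebloggs.com" & @CRLF & _
        "Preferred Contact Time: 12:30PM"
Local $sLabel = "Email Address: "

; Position of the label, or 0 if it is not present
Local $iPos = StringInStr($sBody, $sLabel)
If $iPos = 0 Then
    ConsoleWrite("Label not found" & @CRLF)
Else
    ; Everything after the label, up to the end of the body
    Local $sRest = StringMid($sBody, $iPos + StringLen($sLabel))
    ; Split on any whitespace character; element [1] is the text before the first one
    Local $aParts = StringSplit($sRest, " " & @TAB & @CR & @LF)
    ConsoleWrite("Address: " & $aParts[1] & @CRLF)
EndIf
```

If the address is the last thing in the body, StringSplit finds no delimiter and element [1] simply holds the whole remaining string, so the same code still applies.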

I would really appreciate any advice anybody could offer.

Thanks

Paul

Edited by twinturbosubaru


#2 ·  Posted (edited)

I think StringSplit($body, @CR & @LF & ':;., ') will get you a long way.

Then all you have to do is check each element of the array that contains an '@' (StringInStr($element, '@')) for a valid username and domain.

Edited by Djarlo


Thanks for the suggestion, I will give that a try !

Regards

Paul


#4 ·  Posted (edited)

Surely there will be better ways, but hey it works :-)

#Include <Array.au3>
$body = 'Forward email to email1@hotmail.com;email2@hotmail.com;email3@hotmail.com' & @CRLF & @CRLF
$body &= 'Hi im sending you this email to check a script, reply to me at:myemail@live.com.'& @CRLF & @CRLF
$body &= 'regards, the Queen of Britain'
MsgBox(64,'Body',$body)
$adresses = _adressesInBody($body)
_ArrayDisplay($adresses)

Func _adressesInBody($body)
    Local $i, $adress, $rc[1] = [0]
    $body = stringsplit($body, @cr & @lf & ':;, ')
    If @error Then Return SetError(1,0,0)
    For $i = 1 To $body[0]
        If StringInStr ($body[$i],'@') Then
            $adress = StringSplit($body[$i],'@',2)
            If UBound($adress) <> 2 Then ContinueLoop
            If $adress[0] = '' Then ContinueLoop
            If StringRight($adress[1],1) = '.' Then $adress[1] = StringReplace($adress[1],'.','',-1)
            If StringInStr($adress[1],'.') = 0 Then ContinueLoop
            If StringRight($body[$i],1) = '.' Then $body[$i] = StringReplace($body[$i],'.','',-1)
            If StringLeft($body[$i],1) = '.' Then $body[$i] = StringReplace($body[$i],'.','',1)
            $rc[0] = $rc[0] +1
            _ArrayAdd($rc, $body[$i])
        EndIf
    Next
    Return $rc
EndFunc

[Edit] to remove last char dot from domain before checking for dot

Edited by Djarlo


twinturbosubaru,

You need to use a StringRegExp - which is not something you can pick up just like that! :P

Here is an example of an SRE working:

$sText = "blahblahblah" & @CRLF & _
         "Email Address: fred@joebloggs.com" & @CRLF & _
         "blahblahblah"

$aAddress = StringRegExp($sText, "(?U).*Address:\s(.*)\v.*", 1)

MsgBox(0, "Address", $aAddress[0])

It works like this:

(?U)         - look for the smallest number of characters to match
.*Address:\s - look for any number of characters followed by "Address:" and a space
(.*)         - find the smallest number of characters before... (this is the bit we are looking for, so it is in parentheses)
\v.*         - an EOL followed by any number of characters

And as you can see it only finds one such string - the address itself. :)

If you post the sanitized text of an e-mail, we can refine the SRE further. :)

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind._______My UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


Wow, that's impressive. That StringRegExp is certainly daunting. I have written a lot of stuff with this scripting language over the years and it's been fantastic, but it seems to just get better and better, and more complicated :)

An example of an email is:

Daytime number: 12345678

Mobile number: 99883321

Email Address: fred.bloggs@gmail.com

Preferred Contact Time: 12:30PM

It's the email I have to pull out, but to add some minor complications, the next step I have to do is figure out if there isn't an email address, what is their phone number instead.....it never ends, but I figure if I can get the email one sorted that will get me well on the way to getting the phone number one sorted as well.

Thanks heaps for your help

Regards

Paul


twinturbosubaru,

SREs are daunting at first, but well worth getting to grips with - if only to my amateurish level. :)

You could start here - a nice tutorial site. :)

Anyway - here is something for you to be getting on with:

;#cs
$sText = "Daytime number: 12345678" & @CRLF & _
         "Mobile number: 99883321" & @CRLF & _
         "Email Address: fred.bloggs@gmail.com" & @CRLF & _
         "Preferred Contact Time: 12:30PM"
;#ce
#cs
$sText = "Daytime number: 12345678" & @CRLF & _
         "Mobile number: 99883321" & @CRLF & _
         "Email Address:" & @CRLF & _
         "Preferred Contact Time: 12:30PM"
#ce
#cs
$sText = "Daytime number:" & @CRLF & _
         "Mobile number: 99883321" & @CRLF & _
         "Email Address:" & @CRLF & _
         "Preferred Contact Time: 12:30PM"
#ce
#cs
$sText = "Daytime number:" & @CRLF & _
         "Mobile number:" & @CRLF & _
         "Email Address:" & @CRLF & _
         "Preferred Contact Time: 12:30PM"
#ce

$aAddress = StringRegExp($sText, "(?i)(?U).*Address:\s(.*)\v.*", 1)

If Not $aAddress[0] Then

    $aDayNumber = StringRegExp($sText, "(?i)(?U).*?Daytime number:\s(.*)\v.*", 1)

    If Not $aDayNumber[0] Then

        $aMobNumber = StringRegExp($sText, "(?i)(?U).*?Mobile number:\s(.*)\v.*", 1)

        If Not $aMobNumber[0] Then

            MsgBox(0, "Ooops", "Nothing found")

        Else

            MsgBox(0, "Mobile No", $aMobNumber[0])

        EndIf

    Else

        MsgBox(0, "Day No", $aDayNumber[0])

    EndIf

Else

    MsgBox(0, "E-mail", $aAddress[0])

EndIf

Just get one of the texts uncommented and you can see what happens in the various cases - I leave it to you to decipher the SREs (you had a big clue last time and they are not very different!). :D

Djarlo,

Same goes for you - SREs are worth trying to get a handle on. :P

M23



Thanks for everybody's help. This is what I have ended up with, which works, sort of....

The code runs fine, but when I use the $i variable when retrieving the pop3 message I get an error about "subscript used with non-array variable" and I'm not sure what I have done wrong.

If I simply put a number, like 1 or 2, in place of the $i in the function to retrieve the message, it works perfectly.

Any suggestions would be greatly appreciated as I just can't figure it out....

; Connecting to POP3
_pop3Connect($MyPopServer, $MyLogin, $MyPasswd)
If @error Then
    MsgBox(16, "Error. Code " & @error, "Unable to connect to " & $MyPopServer)
    Exit
Else
    ConsoleWrite("Connected to server pop3 " & $MyPopServer & @CR)
EndIf

; Get total messages in mailbox
Local $sMessages = _Pop3Stat()
$Last = int($sMessages[1])
MsgBox(0, "Number of Messages", $Last & " Messages")

; Define first and last message numbers
$i2 = $Last + 1
$i = 1

Do
; Retrieve message and loop until last message number
$body = _Pop3Retr($i)
If Not @error Then
    $i = $i + 1
    $DaytimeNumber = StringRegExp($body, "(?U).*Daytime number:\s(.*)\v.*", 1)
    $aMobileNumber = StringRegExp($body, "(?U).*Mobile Number:\s(.*)\v.*", 1)
    $aEmailAddress = StringRegExp($body, "(?U).*Email Address:\s(.*)\v.*", 1)
    $aContactTime = StringRegExp($body, "(?U).*Preferred Contact Time:\s(.*)\v.*", 1)
    ConsoleWrite("Details: " & $DaytimeNumber[0] & @CR)
Else
    ConsoleWrite("Retr commande failed" & @CR)
    Exit
EndIf
; Exit loop when last message number
Until $i = $i2

; Closing connection
ConsoleWrite(_Pop3Quit() & @CRLF)


#10 ·  Posted (edited)

Thanks for the link. Yes, I see it gets used a lot.

And using it usually saves me a ton of code-writing, so I have used it in my scripts occasionally, copied from example scripts posted here that related to what I was working on.

But in all honesty I don't know exactly what those lines do, lol. I'll keep trying to make sense of it all :-p

I think I have been using AutoIt for 3 years now and this is the only thing so far I haven't managed to get a grasp on. But I will get it sooner or later :)

Edited by Djarlo


#11 ·  Posted (edited)

If @error is set, your $i doesn't increment. Is it supposed to?

try:

; Connecting to POP3
_pop3Connect($MyPopServer, $MyLogin, $MyPasswd)
If @error Then
    MsgBox(16, "Error. Code " & @error, "Unable to connect to " & $MyPopServer)
    Exit
Else
    ConsoleWrite("Connected to server pop3 " & $MyPopServer & @CR)
EndIf

; Get total messages in mailbox
Local $sMessages = _Pop3Stat()
$Last = Int($sMessages[1])
MsgBox(0, "Number of Messages", $Last & " Messages")

; Define first and last message numbers
$i2 = $Last + 1

For $i = 1 To $i2
    ; Retrieve message and loop until last message number
    $body = _Pop3Retr($i)
    If Not @error Then
        $DaytimeNumber = StringRegExp($body, "(?U).*Daytime number:\s(.*)\v.*", 1)
        $aMobileNumber = StringRegExp($body, "(?U).*Mobile Number:\s(.*)\v.*", 1)
        $aEmailAddress = StringRegExp($body, "(?U).*Email Address:\s(.*)\v.*", 1)
        $aContactTime = StringRegExp($body, "(?U).*Preferred Contact Time:\s(.*)\v.*", 1)
        ConsoleWrite("Details: " & $DaytimeNumber[0] & @CR)
    Else
        ConsoleWrite("Retr commande failed" & @CR)
        Exit
    EndIf
    ; Exit loop when last message number
Next

; Closing connection
ConsoleWrite(_Pop3Quit() & @CRLF)

[Edit] also paste your error please so we know what variable.

[Edit2] check post below

Edited by Djarlo


twinturbosubaru,

I do not use the PoP3 UDF so this is a shot in the dark, but I would suggest that the problem is the limits you set for the _Pop3Retr loop - you are obviously trying to read an email which is not there. :)

I would try something like this:

Local $sMessages = _Pop3Stat()
$Last = int($sMessages[1])
MsgBox(0, "Number of Messages", $Last & " Messages")

For $i = 1 To $Last
    ; Retrieve message and loop until last message number
    $body = _Pop3Retr($i)
    If Not @error Then
        ; SRE code
    Else
        Exit
    EndIf
Next

Now your loop will only run to the value in $Last and you should not overrun the number of e-mails waiting to be read. :)

Let me know if it works - I will try and find the PoP3 UDF and see if I have guessed correctly. :P

M23



No, it will just exit out because there is a problem retrieving the message.

Regards

Paul


OK, I tried that, this is what happens.

When I add this code to your piece of code, there is no error, but I get no result.

$DaytimeNumber = StringRegExp($body, "(?U).*Daytime number:\s(.*)\v.*", 1)
ConsoleWrite("Details: " & $DaytimeNumber & @CR)

When the code looks like this, it errors out.

I think it has something to do with trying to use an array when it's a string or the other way round or something crazy.....

$DaytimeNumber = StringRegExp($body, "(?U).*Daytime number:\s(.*)\v.*", 1)
ConsoleWrite("Details: " & $DaytimeNumber[0] & @CR)

The only difference is the ConsoleWrite, it should be an array element I would have thought, but that's what generates the error.

Regards

Paul


Unless it's something with the way the POP3 udf works inside its functions perhaps.

Paul


Sorry, didn't see that request for the error:

C:\temp\POP3 Reader\pop3_reader4.au3 (36) : ==> Subscript used with non-Array variable.:

ConsoleWrite("Details: " & $DaytimeNumber[0] & @CR)

ConsoleWrite("Details: " & $DaytimeNumber^ ERROR

Line 36 is:

ConsoleWrite("Details: " & $DaytimeNumber[0] & @CR)

Regards

Paul


Unfortunately I have tried that. When I don't treat it as an array I get no results from retrieving the message details, and to fix that I have to change the $i in _Pop3Retr($i) to a literal message number. However, I need to loop through all the messages to process them, and I can't do that with a static value here.

Regards

Paul


#19 ·  Posted (edited)

twinturbosubaru,

You are correct - you are not getting an array returned from the SRE so AutoIt does not have an element to read. I would suggest that the format of the e-mail is not as suggested earlier and so the SRE is failing - you can check by looking for @error after the StringRegExp line. Could you please ConsoleWrite the body of one of them and post it - SREs are very sensitive to the pattern used and you do need to get it correct. :)
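To make the @error check concrete, here is a small sketch. The wrapper function _GetFieldSketch and the sample strings are hypothetical, not part of any UDF, and the sketch assumes the label contains no characters that are special in a regular expression. It tests @error (and IsArray) before touching element [0], so a failed match no longer raises the "Subscript used with non-Array variable" error.

```autoit
; Hypothetical wrapper: returns the text after "<label>: " on the same line,
; or "" with @error set to 1 when the pattern does not match.
Func _GetFieldSketch($sBody, $sLabel)
    Local $aMatch = StringRegExp($sBody, "(?i)(?U).*" & $sLabel & ":\s(.*)\v.*", 1)
    ; On failure StringRegExp sets @error and does not return an array
    If @error Or Not IsArray($aMatch) Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc

; Sample bodies (assumptions for illustration only)
Local $sGood = "Daytime number: 0444 111 222" & @CRLF & "Mobile number: " & @CRLF
Local $sBad = "no labelled fields here"

Local $sValue = _GetFieldSketch($sGood, "Daytime number")
If @error Then
    ConsoleWrite("Daytime number not found" & @CRLF)
Else
    ConsoleWrite("Daytime number: [" & $sValue & "]" & @CRLF)
EndIf

$sValue = _GetFieldSketch($sBad, "Daytime number")
If @error Then ConsoleWrite("Daytime number not found in second body" & @CRLF)
```

The same check can be written inline after each StringRegExp call; the wrapper just keeps the loop over messages short.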

By the way, the SRE for the time should read as follows if the format is as previously posted:

$aContactTime = StringRegExp($sText, "(?U).*Preferred Contact Time:\s(.*)\z", 1)

I told you they were sensitive! :)

Djarlo,

Using the 1 flag with an SRE forces it to return an array. Please do not muddy the water! :P

M23

Edit: Typnig!

Edited by Melba23


Mmmm, not sure I really understand what you are saying sorry.....but here is as much of the email as I can give you due to privacy reasons as it's an actual email.

Thanks a lot for your help.

Regards

Paul

Content-Type: text/plain;

charset="us-ascii"

Content-Transfer-Encoding: quoted-printable

Dear xxxxxxxxxxxx,

This is a lead alert email from xxxxxxxxxxxx. We have matched you with =

a customer with the following project information.

If you have any questions regarding this lead you can contact the =

xxxxx team on xxxxxxxxxxxx or via xxxxxxxxxxxx@xxxxxxxxxx

_________________________________________________________________________=

_____________

John Smith

Daytime number: 0444 111 222

Mobile number:=20

Email Address: myaddress@thedomain.com

Preferred Contact Time:=20

_________________________________________________________________________=

_____________

Job Details

Conveyancing at suburb21, state postcode

Buy / Sell - Sell - Dwelling - Under $1M

Primary reason for needing a conveyancer:Selling a property

Type of the property:Unit / Townhouse

Property location:

The property is:Investment Property

Approximate value of the property:$500,000 - $1 Million

Is the property Torrens or Strata title?:Strata Title

Start Timeframe:Immediately

Project Stage:Ready to hire a provider
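One thing visible in the body above is the quoted-printable transfer encoding: a "=" at the end of a line is a soft line break and "=20" is an encoded space. Whether this is what trips up the patterns is only a guess, but normalising the text before running any StringRegExp is cheap. The helper below is a hypothetical sketch, not part of the POP3 UDF, and it assumes single-byte characters such as the us-ascii charset shown in the header.

```autoit
; Hypothetical helper: undo quoted-printable encoding in a message body.
Func _QPDecodeSketch($sText)
    ; A "=" at the very end of a line is a soft line break: join the two lines
    $sText = StringRegExpReplace($sText, "=\r?\n", "")
    ; Collect every =XX escape (two hex digits) that occurs in the text
    Local $aHex = StringRegExp($sText, "=([0-9A-Fa-f]{2})", 3)
    If IsArray($aHex) Then
        For $i = 0 To UBound($aHex) - 1
            ; Leave =3D ("=" itself) until last so it cannot create new escapes
            If StringUpper($aHex[$i]) = "3D" Then ContinueLoop
            $sText = StringReplace($sText, "=" & $aHex[$i], Chr(Dec($aHex[$i])))
        Next
    EndIf
    Return StringReplace($sText, "=3D", "=")
EndFunc

; Sample fragment modelled on the body above
Local $sRaw = "We have matched you with =" & @CRLF & _
        "a customer" & @CRLF & _
        "Mobile number:=20" & @CRLF
ConsoleWrite(_QPDecodeSketch($sRaw))
```

After decoding, "Mobile number:=20" becomes "Mobile number: " followed by the line break, which is the shape the earlier sample texts assumed.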
