leuce, posted November 27, 2009 (edited November 27, 2009)

G'day everyone

I was interested to know which mail programs are most popular among the people who send me mail, particularly those who belong to traffic-intensive mailing lists such as Yahoogroups or Googlegroups. I use Thunderbird, which stores mail in MBOX, a plaintext format, and most mail clients identify themselves in the mail headers. Unfortunately they do so in a variety of ways... some use "X-Mailer", others use "User-Agent", and still others use other header types.

So, this script compares a list of mail clients with all lines in an MBOX file that contain a colon. If you can think of better, easier ways of counting mail clients, let me know.

Issues:

* I have 2 GB RAM on my computer, and the script refuses to open an MBOX file of 200 000 KB (saying it's too big).
* If a line in the body of a mail contains both a colon and the name of a mail client, the script will count it.
* The results for Elm include the results for Elmo (the match is a plain substring match).
* I have no idea how Global works (hence all them Globals).
* You need a list of mail clients (see the attached list; feel free to improve it).

    #Include <Timers.au3>

    #cs
    MUAcount, by Samuel Murray - A small program that counts the number of occurrences of mail clients' names in an MBOX file in all lines that contain a colon.
    Tweaks: First check to see if mail client name occurs in the entire MBOX file before checking the MBOX file line by line; TrayTip or ToolTip to tell the user about the progress.
    #ce

    Global $i
    Global $j
    Global $k
    Global $l
    Global $mboxfile
    Global $mboxfileopen
    Global $mboxfileread
    Global $mboxfilesplit
    Global $clientlistfile
    Global $clientlistfileopen
    Global $clientlistfileread
    Global $clientlistfilesplit
    Global $writefileopen
    Global $writefilewrite
    Global $starttime

    $k = 0 ; matches found for the current mail client
    $l = 0 ; colon-containing lines checked, summed over all mail clients

    ; First, open the MBOX file and segment it by line
    $mboxfile = FileOpenDialog ("Select the MBOX file", "", "All (*.*)") ; to specify more file types, do this: "All (*.*)|Text files (*.txt)"
    $mboxfileopen = FileOpen ($mboxfile, 0) ; use 0 for ANSI, 32 for UTF16LE and 128 for UTF8
    $mboxfileread = FileRead ($mboxfileopen)
    FileClose ($mboxfileopen)
    $mboxfilesplit = StringSplit ($mboxfileread, @CRLF, 1) ; you can also use @CR and @LF for other line endings
    MsgBox (0, "Number of lines in MBOX file", $mboxfilesplit[0], 0)
    ; TrayTip ("Number of lines in MBOX file", $mboxfilesplit[0] & " lines will be checked.", 20)
    ; I think the initial TrayTip may cause the script to malfunction if the user takes too long (not sure)

    ; Next, read the mail client list and segment it by line
    $clientlistfile = FileOpenDialog ("Select the list of mail clients", "", "All (*.*)")
    $clientlistfileopen = FileOpen ($clientlistfile, 0)
    $clientlistfileread = FileRead ($clientlistfileopen)
    FileClose ($clientlistfileopen)
    $clientlistfilesplit = StringSplit ($clientlistfileread, @CRLF, 1)
    ; MsgBox (0, "Number of mail clients to check for", $clientlistfilesplit[0], 0)

    ; Write an explanatory header to the results file (mode 1 = append)
    $writefileopen = FileOpen ("countfile.txt", 1)
    FileWrite ($writefileopen, @CRLF & "==" & @CRLF & @CRLF & _
        "These are the number of times that a mail client's name occurs in a line in the MBOX file that also contains a colon. " & _
        "Most lines with colons that also contain a mail client's name are header lines, probably the line that identifies the sender's mail client. " & _
        "The count is therefore only approximate, but generally close to the truth. " & _
        "Another problem is that the count for 'Elm' will include the count for 'Elmo', unfortunately. " & _
        "By default, the search is case-sensitive." & @CRLF & @CRLF)
    FileClose ($writefileopen)

    $starttime = _Timer_Init()

    ; Next, check the two arrays against each other, and if a match, write it to a file.
    For $j = 1 to $clientlistfilesplit[0]
        ; Quick case-insensitive pre-check: skip the line-by-line pass if the name occurs nowhere in the file
        If StringInStr ($mboxfileread, $clientlistfilesplit[$j]) Then
            For $i = 1 to $mboxfilesplit[0]
                If StringInStr ($mboxfilesplit[$i], ":", 0) Then
                    $l = $l + 1
                    If StringInStr ($mboxfilesplit[$i], $clientlistfilesplit[$j], 1) Then ; 0 = case-insensitive (locale-aware, the default), 1 = case-sensitive, 2 = case-insensitive (basic comparison)
                        $k = $k + 1
                    EndIf
                EndIf
            Next
            $writefileopen = FileOpen ("countfile.txt", 1)
            $writefilewrite = FileWrite ($writefileopen, $clientlistfilesplit[$j] & @TAB & $k & @CRLF)
            Sleep (100)
            FileClose ($writefileopen)
            ; TrayTip ("Mail client found", $clientlistfilesplit[$j] & " found " & $k & " times", 1)
            ToolTip ($clientlistfilesplit[$j] & " found " & $k & " times", 0, 0)
            $k = 0
        Else
            ; TrayTip ("Mail client not found", $clientlistfilesplit[$j] & " not found", 1)
            ToolTip ($clientlistfilesplit[$j] & " not found", 0, 0)
            $writefileopen = FileOpen ("countfile.txt", 1)
            $writefilewrite = FileWrite ($writefileopen, $clientlistfilesplit[$j] & @TAB & "0" & @CRLF)
            Sleep (100)
            FileClose ($writefileopen)
        EndIf
        ; If you want the ToolTip to go away after 1 second, do this:
        ; Sleep (1000)
        ; ToolTip ("")
    Next

    ; MsgBox (0, "Report on MBOX file", $l & " lines checked, in " & _Timer_Diff($starttime) / 1000 & " seconds.", 0)
    TrayTip ("Done!", $l & " lines checked, in " & _Timer_Diff($starttime) & " milliseconds.", 1)

If anyone knows of a freeware program that does this (or more), please let me know. :-)

Samuel

Attachment: mailprogs.zip
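Several of the issues listed above (the file being too big to load, body lines with colons being counted, and a short name such as Elm also matching Elmo) come from reading the whole file into memory and doing plain substring matching. For anyone with a Unix-like shell at hand, here is a minimal sketch of one alternative. It is not a port of the script: `count_clients` is a made-up helper name, and it assumes a grep that supports `-w` (GNU and BSD grep do) and that only the X-Mailer, User-Agent and X-MimeOLE headers matter.

```shell
#!/bin/sh
# count_clients MBOX CLIENTLIST   (hypothetical helper, not part of the original script)
# Prints "client<TAB>count" for every non-empty line of CLIENTLIST.
count_clients() {
    mbox=$1
    list=$2
    # Strip carriage returns so a list saved with Windows line endings works.
    tr -d '\r' < "$list" | while IFS= read -r client || [ -n "$client" ]; do
        [ -n "$client" ] || continue
        # grep streams the file, so its size does not matter.
        # First keep only the identifying header lines (header names are
        # case-insensitive), then count case-sensitive whole-word matches,
        # so "Elm" no longer matches "Elmo".
        count=$(grep -i -E '^(X-Mailer|User-Agent|X-MimeOLE):' "$mbox" |
            grep -c -w -F -e "$client" || true)
        printf '%s\t%s\n' "$client" "$count"
    done
}
```

Something like `count_clients Inbox clients.txt > countfile.txt` (both file names are placeholders) would then produce the same tab-separated layout as the script's countfile.txt. A body line that happens to start with one of those header names would still be counted, so the result stays approximate.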
leuce, posted November 27, 2009

Someone from another mailing list said that this also works on some computers:

    grep X-Mailer <mailboxfile(s)> | sort | uniq -c | sort -nr

One must just remember to repeat the action for User-Agent and for X-MimeOLE. And this line does not compare the MBOX files to a list of mail programs -- it merely gives a list of them (and every version of a program is counted as a separate client).
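The two caveats above (repeating the command per header, and every version showing up as a separate client) can be reduced with a slightly longer pipeline. This is only a sketch: `tally_clients` is a made-up name, `sed -E` is assumed to be available (GNU and BSD sed have it), and the version stripping is a crude heuristic that cuts each value at the first token that starts with a digit, so a client whose name contains a digit will be truncated.

```shell
#!/bin/sh
# tally_clients MBOX...   (hypothetical helper)
# Tallies the values of the three identifying headers in one pass,
# with version numbers stripped so that versions are counted together.
tally_clients() {
    # -h drops the "file:" prefix grep adds when given several files.
    grep -h -i -E '^(X-Mailer|User-Agent|X-MimeOLE):' "$@" |
        tr -d '\r' |
        sed -E -e 's/^[^:]*:[[:space:]]*//' \
               -e 's/[[:space:]/(]*[vV]?[0-9].*$//' |
        sort | uniq -c | sort -nr
}
```

The first sed expression removes the header name, the second removes everything from the version number onwards; the rest is the same sort and uniq tally as in the one-liner. It still does not compare the result against a list of mail programs.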