Posted

Hi everyone, I am looking to make an email parser in AutoIt. I was tasked with pulling just the email addresses and the document name from about 1.5 TB of documents. Doing this by hand is painfully slow. What I want is a program that reads .txt, .csv and .sql files, pulls all the email addresses, and writes them to a new CSV file with the source file name in the second column.

For example, folder A has text documents called 1.txt, 2.txt, 3.txt, and I need the output from those documents in one file, named whatever I choose, containing the emails plus the document name, like

Johndoe@whatever.com, documentA
janedoe@whatever.com, documentA

Can this be done? And if so, where should I start? Any help would be great.

Posted

1.5 TB of documents :o
What do you want to do with the output of your script (mail address plus document name)? Depending on the further processing, we might recommend another format such as a database or Excel.
Do you need the document name so you can later link or refer to this document?
 


Posted

@FrancescoDiMuro I already have all the .txt, .csv and .sql files. I just need to extract all the emails from them and create an output file with the name of the document, like above.

 

@water Output to a CSV is what I am going for, so I can upload the results to a database. That's why I created the two fields, email and document name. These documents also have a lot of other text in them, which makes them a pain to deal with.
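
For reference, here is a minimal sketch of how the two-column rows could be written so that a comma or quote inside a file name does not break the database import. The helper names `_CsvField` and `_CsvRow` and the sample path are hypothetical, made up for illustration; the quoting follows the usual CSV convention of wrapping each field in double quotes and doubling any embedded quote.

```autoit
; Hypothetical helper: quote one CSV field, doubling any embedded double quotes.
Func _CsvField($sValue)
    Return '"' & StringReplace($sValue, '"', '""') & '"'
EndFunc   ;==>_CsvField

; Hypothetical helper: build one "email,document" row from two fields.
Func _CsvRow($sEmail, $sDocument)
    Return _CsvField($sEmail) & "," & _CsvField($sDocument)
EndFunc   ;==>_CsvRow

; Sample values (assumed): the comma in the file name stays inside its quoted field.
ConsoleWrite(_CsvRow("janedoe@whatever.com", "C:\Docs\report, final.txt") & @CRLF)
```

Whether the importer expects quoted fields depends on the target database, so this is worth checking against a small sample before a full run.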

Posted
3 hours ago, FrancescoDiMuro said:

@OpSecMonkey

I suggest you take a look at _FileListToArray() to get the list of all the files you want to read, parse, and process.

Use the File*/_File* functions to read the content of the files, and to create and write to the output file :)

Thanks @FrancescoDiMuro I will check that out. 

Posted (edited)

You should be able to get something out of this working example.

#include <Array.au3>
#include <File.au3>
#include <MsgBoxConstants.au3>
Opt("WinTitleMatchMode", -2) ;1=start, 2=subStr, 3=exact, 4=advanced, -1 to -4=Nocase (Allows title in WinWaitActive() to match)

; ----------- Create test directory and files ---------------
Local Const $sFilePath = @ScriptDir & "\TestFolder_A"
For $i = 1 To 6
    $hFileOpen = FileOpen($sFilePath & "\" & $i & ".txt", 9) ; 9 = $FO_APPEND (1) plus $FO_CREATEPATH (8).
    If Mod($i, 2) Then ; If $i is an odd number.
        FileWrite($hFileOpen, "email address is name" & $i & "@whatever.com in this file") ; Write through the open handle.
    Else
        FileWrite($hFileOpen, "No email address is in this file")
    EndIf
    FileClose($hFileOpen)
Next
; ----------- End of Create test directory and files ---------------

$aFileList = _Files2Array($sFilePath)
;_ArrayDisplay($aFileList, "$aFileList")

$ResultFile = $sFilePath & "\ResultFile.txt" ; The required file.
$hFileOpen = FileOpen($ResultFile, 10) ; 10 = $FO_OVERWRITE (2) = Write mode (erase previous contents) plus $FO_CREATEPATH (8) = Create directory structure if it doesn't exist.
For $i = 1 To $aFileList[0]
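    ; Flag 3 = $STR_REGEXPARRAYGLOBALMATCH: returns every match in the file as an array. The pattern is a simplified email matcher, not a full RFC validator.
    $aEMail = StringRegExp(FileRead($aFileList[$i]), "[\w.+-]+@[\w-]+(?:\.[\w-]+)+", 3)
    If IsArray($aEMail) Then ; An array is returned only when at least one address was found.
        For $j = 0 To UBound($aEMail) - 1
            FileWriteLine($hFileOpen, $aEMail[$j] & ", " & $aFileList[$i])
        Next
    EndIf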
Next
FileClose($hFileOpen)

; Display required file
ShellExecute($ResultFile)
$hWnd = WinWaitActive("ResultFile.txt", "", 10)

; Clean up the created directory and files
$iMsgRet = MsgBox(1, "Delete directory", 'Press "Ok" to delete the directory:' & @CRLF & '"' & $sFilePath & '"', 0, $hWnd)
If $iMsgRet = 1 Then ; If "Ok" is pressed.
    DirRemove($sFilePath, 1) ; 1 = $DIR_REMOVE (also removes files and subdirectories).
    WinClose($hWnd) ; Close the application that the .txt file was opened in by using ShellExecute().
EndIf


Func _Files2Array($sDir)
    ; List only the files (1 = $FLTA_FILES) in $sDir and return the full path of each.
    Local $aFileList = _FileListToArray($sDir, "*", 1, True)
    If @error = 1 Then
        MsgBox($MB_SYSTEMMODAL, "", "Path was invalid.")
        Exit
    EndIf
    If @error = 4 Then
        MsgBox($MB_SYSTEMMODAL, "", "No file(s) were found.")
        Exit
    EndIf
    Return $aFileList
EndFunc   ;==>_Files2Array
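
With about 1.5 TB of input, reading each file whole with FileRead() may use a lot of memory, and the documents are likely spread over subfolders. The following is a rough sketch only, not a tested solution for that volume: it lists .txt, .csv and .sql files recursively with _FileListToArrayRec() and scans them line by line with FileReadLine(). The function name `_ExtractEmails`, the two paths, and the simplified email pattern are assumptions made for illustration.

```autoit
#include <File.au3>
#include <FileConstants.au3>

; Hypothetical paths: adjust to your own layout.
Local Const $sSourceDir = @ScriptDir & "\TestFolder_A"
Local Const $sOutFile = @ScriptDir & "\Emails.csv"

_ExtractEmails($sSourceDir, $sOutFile)

; Hypothetical function: writes "email,file" rows to $sCsv and returns the number of rows written.
Func _ExtractEmails($sDir, $sCsv)
    ; List .txt, .csv and .sql files in $sDir and all subfolders, with full paths.
    Local $aFiles = _FileListToArrayRec($sDir, "*.txt;*.csv;*.sql", $FLTAR_FILES, $FLTAR_RECUR, $FLTAR_NOSORT, $FLTAR_FULLPATH)
    If @error Then Return SetError(1, 0, 0) ; Invalid path or no matching files.

    Local $hOut = FileOpen($sCsv, $FO_OVERWRITE + $FO_CREATEPATH)
    If $hOut = -1 Then Return SetError(2, 0, 0) ; Output file could not be opened.

    Local $iCount = 0, $hIn, $sLine, $aHits
    For $i = 1 To $aFiles[0]
        If $aFiles[$i] = $sCsv Then ContinueLoop ; Never parse the output file itself.
        $hIn = FileOpen($aFiles[$i], $FO_READ)
        If $hIn = -1 Then ContinueLoop ; Skip files that cannot be opened.
        While 1
            $sLine = FileReadLine($hIn)
            If @error Then ExitLoop ; End of file or read error.
            ; Flag 3 returns all matches on the line; @error is set when there are none.
            $aHits = StringRegExp($sLine, "[\w.+-]+@[\w-]+(?:\.[\w-]+)+", 3)
            If @error Then ContinueLoop
            For $j = 0 To UBound($aHits) - 1
                FileWriteLine($hOut, $aHits[$j] & "," & $aFiles[$i])
                $iCount += 1
            Next
        WEnd
        FileClose($hIn)
    Next
    FileClose($hOut)
    Return $iCount
EndFunc   ;==>_ExtractEmails
```

Duplicates are not removed here; since the results are going into a database anyway, deduplicating there is probably simpler than doing it in the script.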

 

Edited by Malkey
Tidied up script & added comments.
