OpSecMonkey Posted September 15, 2018

Hi everyone,

I am looking to make an email parser in AutoIt. I was tasked with pulling just the email addresses and the document name from about 1.5 TB of documents. Doing this by hand sucks a ton. So what I was looking to do was make a program that can read .txt, .csv and .sql files, pull all the emails, and write them to a new file with the name of the source file in the second column of the CSV.

An example: folder A has text documents called 1.txt, 2.txt and 3.txt, and I need the output from those documents in one file, called whatever I name it, containing the emails plus the document name, like:

Johndoe@whatever.com, documentA
janedoe@whatever.com, documentA

Can this be done? And if so, where should I start? Any help would be great.
FrancescoDiMuro Posted September 15, 2018

Hi @OpSecMonkey

Yes, it is possible with AutoIt. Do you already have the .txt files, or do you have to generate them (from emails)?
water Posted September 15, 2018

"1.5tb of documents"

What do you want to do with the output of your script (mail address plus document name)? Depending on the further processing, we might recommend another format such as a database, Excel, etc. Do you need the document name so you can later link or refer to this document?
OpSecMonkey Posted September 15, 2018

@FrancescoDiMuro I already have all the .txt, .csv and .sql files. I just need to be able to extract all the emails from them and create an output file with the name of the document, like above.

@water Output to a CSV is what I am going for, so I can upload it to a database. That's why I created the two fields, email and document name. These documents also have a lot of other text in them, and it's a pain in the butt to deal with.
FrancescoDiMuro Posted September 15, 2018 (edited)

@OpSecMonkey

I suggest you take a look at _FileListToArray() to get the list of all the files you want to read, parse, and process, and at the File*/_File* functions to read the content of the files and to create and write to the output file.

Edited September 15, 2018 by FrancescoDiMuro
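The file-listing step suggested here could look roughly like the sketch below. It assumes the recursive variant _FileListToArrayRec() from File.au3 is acceptable in place of _FileListToArray(), because it takes several masks at once and can descend into subfolders. The folder name is a placeholder, not something from the thread.

```autoit
#include <File.au3>

; Hypothetical source folder; replace with the real location of the documents.
Local Const $sSourceDir = @ScriptDir & "\FolderA"

; Files only, recurse into subfolders, unsorted, full paths returned.
Local $aFiles = _FileListToArrayRec($sSourceDir, "*.txt;*.csv;*.sql", $FLTAR_FILES, $FLTAR_RECUR, $FLTAR_NOSORT, $FLTAR_FULLPATH)
Local $iError = @error
If $iError Then
    ConsoleWrite("No matching files found or invalid path, @error = " & $iError & @CRLF)
    Exit 1
EndIf

; Element [0] holds the number of files; the paths start at index 1.
For $i = 1 To $aFiles[0]
    ConsoleWrite($aFiles[$i] & @CRLF)
Next
```

Each path in $aFiles can then be handed to FileOpen() and FileReadLine() for the actual parsing.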
OpSecMonkey Posted September 15, 2018

Thanks @FrancescoDiMuro, I will check that out.
Malkey Posted September 16, 2018 (edited)

You should be able to get something out of this working example.

#include <Array.au3>
#include <File.au3>
#include <MsgBoxConstants.au3>

Opt("WinTitleMatchMode", -2) ; 1=start, 2=subStr, 3=exact, 4=advanced, -1 to -4=Nocase (Allows title in WinWaitActive() to match)

; ----------- Create test directory and files ---------------
Local Const $sFilePath = @ScriptDir & "\TestFolder_A"
For $i = 1 To 6
    $hFileOpen = FileOpen($sFilePath & "\" & $i & ".txt", 9) ; 9 = $FO_APPEND (1) plus $FO_CREATEPATH (8)
    If Mod($i, 2) Then ; If $i is odd number.
        FileWrite($hFileOpen, "web address is name" & $i & "@whatever.com in this file") ; Write through the open handle.
    Else
        FileWrite($hFileOpen, "No web address is in this file")
    EndIf
    FileClose($hFileOpen)
Next
; ----------- End of Create test directory and files ---------------

$aFileList = _Files2Array($sFilePath)
;_ArrayDisplay($aFileList, "$aFileList")

$ResultFile = $sFilePath & "\ResultFile.txt" ; The required file.
$hFileOpen = FileOpen($ResultFile, 10) ; 10 = $FO_OVERWRITE (2) = Write mode (erase previous contents) plus $FO_CREATEPATH (8) = Create directory structure if it doesn't exist.
For $i = 1 To $aFileList[0]
    $aEMail = StringRegExp(FileRead($aFileList[$i]), "\S+@\S+", 1) ; Captures 1st occurrence of email address in file (if present, returns an array).
    If IsArray($aEMail) Then
        FileWriteLine($hFileOpen, $aEMail[0] & ", " & $aFileList[$i])
    EndIf
Next
FileClose($hFileOpen)

; Display required file
ShellExecute($ResultFile)
$hWnd = WinWaitActive("ResultFile.txt", "", 10)

; Clean up the created directory and files
$iMsgRet = MsgBox(1, "Delete directory", 'Press "Ok" to delete the directory:' & @CRLF & '"' & $sFilePath & '"', 0, $hWnd)
If $iMsgRet = 1 Then ; If "Ok" is pressed.
    DirRemove($sFilePath, 1) ; 1 = $DIR_REMOVE
    WinClose($hWnd) ; Close the application that the .txt file was opened in by using ShellExecute().
EndIf

Func _Files2Array($sDir)
    ; List all the files and folders in the $sDir using the default parameters and return the full path.
    Local $aFileList = _FileListToArray($sDir, Default, Default, True)
    If @error = 1 Then
        MsgBox($MB_SYSTEMMODAL, "", "Path was invalid.")
        Exit
    EndIf
    If @error = 4 Then
        MsgBox($MB_SYSTEMMODAL, "", "No file(s) were found.")
        Exit
    EndIf
    Return $aFileList
EndFunc   ;==>_Files2Array

Edited September 16, 2018 by Malkey: Tidied up script & added comments.
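The example above records only the first address found in each file and loads each file whole with FileRead(). For the original request (every email, from .txt, .csv and .sql files, across a very large data set), a variation along the following lines may be closer to what is needed. It is only a sketch under stated assumptions: the folder and output names are placeholders, the email pattern is a simplified one chosen for illustration rather than a complete address validator, and reading line by line is swapped in to avoid holding large files in memory.

```autoit
#include <File.au3>
#include <FileConstants.au3>
#include <StringConstants.au3>

; Hypothetical paths; the output is kept outside the source folder so it is never re-scanned.
Local Const $sSourceDir = @ScriptDir & "\TestFolder_A"
Local Const $sOutputCsv = @ScriptDir & "\emails.csv"

; Simplified email pattern (an assumption, not a full RFC 5322 matcher).
Local Const $sEmailPattern = "[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}"

; Collect .txt, .csv and .sql files, including subfolders, as full paths.
Local $aFiles = _FileListToArrayRec($sSourceDir, "*.txt;*.csv;*.sql", $FLTAR_FILES, $FLTAR_RECUR, $FLTAR_NOSORT, $FLTAR_FULLPATH)
If @error Then Exit 1

Local $hOut = FileOpen($sOutputCsv, $FO_OVERWRITE + $FO_CREATEPATH)
If $hOut = -1 Then Exit 2

Local $hIn, $sLine, $aMatches
For $i = 1 To $aFiles[0]
    $hIn = FileOpen($aFiles[$i], $FO_READ)
    If $hIn = -1 Then ContinueLoop ; Skip files that cannot be opened.

    While True
        $sLine = FileReadLine($hIn)
        If @error Then ExitLoop ; End of file or read error.

        ; Global match: returns every address on the line, not just the first.
        $aMatches = StringRegExp($sLine, $sEmailPattern, $STR_REGEXPARRAYGLOBALMATCH)
        If @error Then ContinueLoop ; No address on this line.

        For $j = 0 To UBound($aMatches) - 1
            FileWriteLine($hOut, $aMatches[$j] & "," & $aFiles[$i])
        Next
    WEnd

    FileClose($hIn)
Next

FileClose($hOut)
```

Each output row has the form email,full path of the source file. If the paths can contain commas, the second field would need to be wrapped in double quotes before a database import.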
OpSecMonkey Posted September 18, 2018

Thanks @Malkey, I will give this a look tonight in my hotel room. On the road this week.