[SOLVED] Improving PCRE knowledge, parsing HOSTS file for hostnames.


47 replies to this topic

#1 guinness

guinness

    guinness

  • MVPs
  • 10,299 posts

Posted 20 May 2012 - 05:17 PM

The following code works very well from what I've tested, but I'm wondering if there's a different or better approach to parsing the HOSTS file for the hostnames it contains. I know there is a HOSTS UDF by JScript, but if there are several hostnames on a single line, e.g. 127.0.0.1 www.text.com www.test.com www.example.com, it will fail where mine doesn't.

Thanks to anyone who offers improvements.

Working code.
AutoIt         
#include <Array.au3> ; For _ArrayDisplay.

Example()

Func Example()
    Local $sData = FileRead(@SystemDir & '\drivers\etc\HOSTS')
    ; Strip comments e.g. # This is a single comment. #
    $sData = StringRegExpReplace($sData, '#.*', '')
    ; Strip forwarding IP address e.g. 127.0.0.1
    $sData = StringRegExpReplace($sData, '(?m)^\s*(?:\d{1,3}\.){3}\d{1,3}', '')
    ; Parse HOSTS file to create an array with the hostnames in the file.
    Local $aArray = StringRegExp(' HOSTS_Count ' & $sData, '(?:[\w\.\-/]{3,})', 3) ; Past ideas: '((?:\d{1,3}\.){3}\d{1,3})\s+(.*?)\r\n' ... '((?:[\w\-]*\.+)+[\w\-]*)'
    If @error Then
        MsgBox(4096, '', 'An error occurred when parsing the HOSTS file.')
    Else
        ; Replace the 0th element with the number of items in the array.
        $aArray[0] = UBound($aArray, 1) - 1
        ; Ask whether or not to display the array using _ArrayDisplay, can be slow depending on how large your HOSTS file is.
        If MsgBox(4 + 4096, '', 'Would you like to display the Array? (Could be slow depending on how large your HOSTS file is.)') = 6 Then
            _ArrayDisplay($aArray)
        EndIf
    EndIf
EndFunc   ;==>Example


Updated: With suggestions by Spiff59.

Edited by guinness, 24 May 2012 - 10:47 PM.

Example List: _AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_DesktopDimensions()_DisplayPassword()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUISetIcon()_Icon_Clear()/_Icon_Set()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringIsValid()_StringReplaceWholeWord()_StringStripChar()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()AutoIt SearchAutoIt3 PortableAutoItWinGetTitle()/AutoItWinSetTitle()CodingFileInstallrGeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIGetBkColor()LockFile()PasteBinSciTE JumpSignature CreatorWM_COPYDATAMore Examples...Updated: 11/04/2013






#2 Spiff59

Spiff59

    Universalist

  • Active Members
  • 1,312 posts

Posted 20 May 2012 - 10:35 PM

I notice your SRE to strip comments seems to leave the end-of-line characters in place, so wouldn't inserting "" instead of @CRLF just leave less data to parse later?

On the last SRE, I'd think appending a dummy entry to the front of $sData would save you having to Redim() the array, or shift the first entry to the end (leaving the list out of sync with the actual file), something like?
Local $aArray = StringRegExp("xxx" & $sData, '(?:[\w\.\-/]{3,})', 3)
$aArray[0] = UBound($aArray, 1) - 1
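The dummy-entry idea can be tried on an in-memory string. This is only a sketch: $sSample is made-up data standing in for a HOSTS file that has already had its comments and IP addresses stripped, and the placeholder is given a trailing space (an assumption on my part) so it cannot merge with the first hostname.

```autoit
; Made-up sample: what is left of a HOSTS file once comments and IPs are stripped.
Local $sSample = ' localhost' & @CRLF & ' www.test.com www.example.com' & @CRLF

; Prefix a placeholder token so element 0 exists and can be reused for the count.
Local $aHosts = StringRegExp('xxx ' & $sSample, '[\w.\-/]{3,}', 3)

; Overwrite the placeholder with the number of real entries.
$aHosts[0] = UBound($aHosts) - 1

ConsoleWrite('Count: ' & $aHosts[0] & @CRLF)
For $i = 1 To $aHosts[0]
    ConsoleWrite($aHosts[$i] & @CRLF)
Next
```

The array then follows the usual AutoIt convention of holding the count in element 0, with no ReDim needed.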

Edited by Spiff59, 20 May 2012 - 10:43 PM.


#3 guinness

guinness

    guinness

  • MVPs
  • 10,299 posts

Posted 21 May 2012 - 05:31 AM

I notice your SRE to strip comments seems to leave the end-of-line characters in place, so wouldn't inserting "" instead of @CRLF just leave less data to parse later?

The reason I did this was another SRE was parsing both the forwarding address and hostname, though it's not needed anymore. Changed.

On the last SRE, I'd think appending a dummy entry to the front of $sData would save you having to Redim() the array, or shift the first entry to the end (leaving the list out of sync with the actual file), something like?

I normally do this, no idea why I didn't do it this time. Thanks Spiff59.



#4 ripdad

ripdad

    Member

  • Active Members
  • 537 posts

Posted 21 May 2012 - 11:30 PM

been playing with this some -- maybe it will help.
AutoIt         
#include <Array.au3> ; For _ArrayDisplay.

Example()

Func Example()
    Local $aArray, $sData = '127.0.0.1 DummyEntry' & @CRLF
    ;$sData &= FileRead(@SystemDir & '\drivers\etc\HOSTS.');<- dot
    ;#cs
    $sData &= '# string one' & @CRLF
    $sData &= '   # string two' & @CRLF
    $sData &= @CRLF & @CRLF & @CRLF
    $sData &= ' 127.0.0.1   localhost' & @CRLF
    $sData &= ' 127.0.0.1' & @TAB & ' www.url1.com # string three' & @CRLF
    $sData &= '  127.0.0.1' & @TAB & '  1www.url2.com # string four' & @CRLF
    $sData &= '   127.0.0.1' & @TAB & '   123.url3.com/page.htm' & @CRLF
    $sData &= ' 127.0.0.1' & @TAB & '    url4.com/folder/text.txt' & @CRLF
    $sData &= '127.0.0.1    www.some-site.com/file.zip' & @CRLF
    $sData &= '# string five' & @CRLF
    ;#ce
    ; Capture whatever follows each IP address up to the first non-hostname character.
    $aArray = StringRegExp($sData, '(?:\d{1,3}\.){3}\d{1,3}\W+(.*?)[^\d\w\-.]', 3)
    ; The dummy first entry is replaced with the item count.
    $aArray[0] = UBound($aArray, 1) - 1
    _ArrayDisplay($aArray)
EndFunc

Edited by ripdad, 21 May 2012 - 11:34 PM.

I'm pretty sure this script has "some flaws" (somewhere). Welcome to programming!

#5 guinness

guinness

    guinness

  • MVPs
  • 10,299 posts

Posted 21 May 2012 - 11:34 PM

Thanks ripdad, a different approach with the SRE, exactly what I was looking for to improve my knowledge.

Edit: I think you're missing a forward slash in your SRE if I'm not mistaken, and \w includes digits so there's no need for \d. Also, you can have multiple hostnames on a single line, e.g. 127.0.0.1 localhost localhost2 localhost3, which isn't being picked up by your SRE.
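Both points can be checked quickly. A rough sketch follows; the sample strings and variable names are made up for illustration, and the IP is stripped first so that only hostname tokens remain.

```autoit
; \w covers letters, digits and underscore, so a separate \d adds nothing.
; With the default flag, StringRegExp returns 1 on a match and 0 otherwise.
Local $iMatched = StringRegExp('1www.url2.com', '^[\w.\-]+$')
ConsoleWrite('Digits matched by \w: ' & $iMatched & @CRLF)

; Hypothetical line with three hostnames after the IP address.
Local $sLine = '127.0.0.1 localhost localhost2 localhost3'
; Drop the leading IP, then collect every remaining token.
Local $sNames = StringRegExpReplace($sLine, '^\s*(?:\d{1,3}\.){3}\d{1,3}', '')
Local $aNames = StringRegExp($sNames, '[\w.\-]+', 3)
ConsoleWrite('Hostnames found: ' & UBound($aNames) & @CRLF)
```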

Thanks for participating.

Edited by guinness, 21 May 2012 - 11:43 PM.



#6 ripdad

ripdad

    Member

  • Active Members
  • 537 posts

Posted 21 May 2012 - 11:42 PM

If your hosts file is fairly clean, it should work. (I have seen some messy ones in the past)

I left the forward-slash out on purpose. I only wanted the root website address.
You can add it in if you wish.

#7 guinness

guinness

    guinness

  • MVPs
  • 10,299 posts

Posted 21 May 2012 - 11:45 PM

See above for additional comments, but otherwise a different insight into PCRE.



#8 ripdad

ripdad

    Member

  • Active Members
  • 537 posts

Posted 21 May 2012 - 11:50 PM

-"Thanks for participating"
glad to be of service.

-"multiple host names on a single line"
I put each entry on its own line, but I threw it out there ... modify how you like.

#9 jchd

jchd

    Whatever your capacity, resistance is futile.

  • MVPs
  • 3,250 posts

Posted 22 May 2012 - 12:17 AM

@ripdad,

Also a commented line will silently be processed as if uncommented:
'# string five 127.0.0.1 www.absolutely-forbiden.site'
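A small check makes the effect visible. This sketch does not use ripdad's exact pattern; $sPattern is a simplified stand-in (an IP address followed by one token), and the point is only that the comment has to be removed before any IP/hostname matching is done.

```autoit
Local $sLine = '# string five 127.0.0.1 www.absolutely-forbiden.site' & @CRLF
; Simplified stand-in pattern: an IP address, whitespace, then one token.
Local $sPattern = '(?:\d{1,3}\.){3}\d{1,3}\s+\S+'

; Without stripping, the IP/hostname pair inside the comment still matches (1).
Local $iRaw = StringRegExp($sLine, $sPattern)
ConsoleWrite('Raw line matches: ' & $iRaw & @CRLF)

; Remove everything from # to the end of the line first, then test again (0).
Local $sClean = StringRegExpReplace($sLine, '#.*', '')
Local $iClean = StringRegExp($sClean, $sPattern)
ConsoleWrite('Stripped line matches: ' & $iClean & @CRLF)
```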
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQL tutorial (covers generic SQL, but most of it apply to SQLite as well)An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious workPCRE v8.32 regexp pattern documentation. AutoIt uses a slightly older version so that more advanced features are not all available.RegExp tutorial: enough to get started

#10 ripdad

ripdad

    Member

  • Active Members
  • 537 posts

Posted 22 May 2012 - 12:29 AM

jchd,

I didn't see that one coming.
I don't have sites that are commented, so I didn't think of it.

Thanks for pointing it out.

#11 Spiff59

Spiff59

    Universalist

  • Active Members
  • 1,312 posts

Posted 22 May 2012 - 02:12 PM

Does this work?
Local $aArray = StringRegExp('HOSTS_Count ' & $sData, '[\S*]+', 3)

Edited by Spiff59, 22 May 2012 - 02:13 PM.


#12 guinness

guinness

    guinness

  • MVPs
  • 10,299 posts

Posted 22 May 2012 - 02:15 PM

Looks as though that SRE works just like mine does.

Edited by guinness, 23 May 2012 - 06:03 AM.



#13 jchd

jchd

    Whatever your capacity, resistance is futile.

  • MVPs
  • 3,250 posts

Posted 23 May 2012 - 02:38 AM

I found it interesting to try to come up with a one-liner regexp, just for fun. And here it is (AFAIK it works):
#include <Array.au3> ; For _ArrayDisplay.

Example()

Func Example()
    Local $sData = FileRead(@SystemDir & '\drivers\etc\HOSTS')
    ; Parse HOSTS file to create an array with the hostnames in the file.
    Local $aArray = StringRegExp($sData, "(?im)\G(?:(?:^\s*#.*\s*)*|(?:^\s*)*)*(?:|(?:\d{1,3}\.){3}\d{1,3}\s+)([[:alpha:]][\w\.\-/]{2,})\s*", 3)
    If @error > 1 Then
        MsgBox(4096, '', 'An error ' & @error & ' occurred when parsing the HOSTS file.')
    Else
        _ArrayDisplay($aArray)
    EndIf
EndFunc   ;==>Example

In fact it isn't that tricky. If someone has a counter-example, please post it.

#14 Spiff59

Spiff59

    Universalist

  • Active Members
  • 1,312 posts

Posted 23 May 2012 - 04:46 AM

Ouch! That one hurts my eyes to look at and might cause insanity were I to try and decode it!
Kudos for being able to assemble that! It is dropping the last entry from my HOSTS file though.

#15 guinness

guinness

    guinness

  • MVPs
  • 10,299 posts

Posted 23 May 2012 - 06:27 AM

It returned @error 1 (I had to change @error > 1 to see it in the MsgBox). The reason is that you can have multiple hostnames on a single line (see http://technet.microsoft.com/en-us/library/cc958812.aspx), which is what I do to optimise the HOSTS file.

Otherwise it works, jchd, if a single hostname is used per line. Thanks for your input.

Edited by guinness, 23 May 2012 - 06:29 AM.



#16 UEZ

UEZ

    Never say never

  • MVPs
  • 3,602 posts

Posted 23 May 2012 - 07:57 AM

Here is what I got but I cannot get rid of the #... suffix if available.

AutoIt         
#include <array.au3>

$sHosts = _
        '# Copyright © 1993-1999 Microsoft Corp.' & @CRLF & _
        '#' & @CRLF & _
        '# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.' & @CRLF & _
        '#' & @CRLF & _
        '# This file contains the mappings of IP addresses to host names. Each' & @CRLF & _
        '# entry should be kept on an individual line. The IP address should' & @CRLF & _
        '# be placed in the first column followed by the corresponding host name.' & @CRLF & _
        '# The IP address and the host name should be separated by at least one' & @CRLF & _
        '# space.' & @CRLF & _
        '#' & @CRLF & _
        '# Additionally, comments (such as these) may be inserted on individual' & @CRLF & _
        '# lines or following the machine name denoted by a "#" symbol.' & @CRLF & _
        '#' & @CRLF & _
        '# For example:' & @CRLF & _
        '#' & @CRLF & _
        '#      102.54.94.97     rhino.acme.com          # source server' & @CRLF & _
        '#       38.25.63.10     x.acme.com              # x client host' & @CRLF & _
        '127.0.0.1       localhost  #current host' & @CRLF & _
        '10.10.10.10 nirvana.somewhere-test.com' & @CRLF & _
        '                          20.20.20.20                                   www.text.com www.test.com www.example.com #dummy' & @CRLF & _
        '10.10.10.20 test'

; Capture everything after the IP address on each line (this still includes any trailing # comment).
$aFilterHosts = StringRegExp($sHosts, "(?m:^)\s*\d+\.\d+\.\d+\.\d+\s*(.*)", 3)
_ArrayDisplay($aFilterHosts)


Any help?

Br,
UEZ

Edited by UEZ, 23 May 2012 - 07:59 AM.

 
The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯


#17 jchd

jchd

    Whatever your capacity, resistance is futile.

  • MVPs
  • 3,250 posts

Posted 23 May 2012 - 10:06 AM

@guinness, Spiff59

Strange, that was precisely why I had some fun working on it. It works fine here: multiple entries on the same IP work and last site works too. Please can you both post a sample file making it fail (post a file, not a copy'n'paste else the forum is likely to reformat whitespaces, which may make a difference in this case).

@UEZ,

Hi, the problem with the line-by-line approach (^ anchoring) is that you get a single result entry when there are multiple targets for the same IP. That's the whole purpose of my use of \G (which deserves to be documented in our help, BTW). All the mess around the capture only serves to "eat" insignificant portions. At least that was the idea... sigh!
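For comparison, the line-by-line approach can still return every hostname if it is followed by a second pass over each captured line. This is a different technique from the \G one-liner, shown only as a sketch: the sample data and variable names are made up, and a real script would FileRead the HOSTS file instead.

```autoit
; Made-up sample with a trailing comment, a commented-out entry and a multi-host line.
Local $sData = '127.0.0.1 localhost #current host' & @CRLF & _
        '# 38.25.63.10 x.acme.com' & @CRLF & _
        ' 20.20.20.20  www.text.com www.test.com www.example.com #dummy' & @CRLF & _
        '10.10.10.20 test'

; Pass 1: drop comments, whether they fill the line or trail an entry.
$sData = StringRegExpReplace($sData, '#.*', '')
; Pass 2: capture whatever follows the IP address on each remaining line.
Local $aLines = StringRegExp($sData, '(?m)^[ \t]*(?:\d{1,3}\.){3}\d{1,3}[ \t]+(.*)', 3)

; Pass 3: split each captured remainder into individual hostnames.
Local $sHosts = ''
For $i = 0 To UBound($aLines) - 1
    Local $aNames = StringRegExp($aLines[$i], '\S+', 3)
    For $j = 0 To UBound($aNames) - 1
        $sHosts &= $aNames[$j] & '|'
    Next
Next

; StringSplit puts the count in element 0.
Local $aHosts = StringSplit(StringTrimRight($sHosts, 1), '|')
ConsoleWrite('Hostnames found: ' & $aHosts[0] & @CRLF)
```

It costs extra passes over the data, but each pattern stays short enough to read, and the trailing #... suffix is gone before any hostname is captured.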

Edited by jchd, 23 May 2012 - 10:15 AM.


#18 Spiff59

Spiff59

    Universalist

  • Active Members
  • 1,312 posts

Posted 23 May 2012 - 01:48 PM

@guinness, Spiff59

My actual hosts file is boring, so I snagged the first one off the web and then "doubled-up" one line by adding the testo.com address. Your mega-regexp loses the last entry with this file:

Attached Files



#19 BrewManNH

BrewManNH

    באָבקעס מיט קודוצ׳ה

  • MVPs
  • 6,829 posts

Posted 23 May 2012 - 01:58 PM

I tested jchd's script with that hosts.txt file and it was able to grab all entries, including the doubled-up entry for testo.com, except for the last entry "127.0.0.1 www.abcsearch.com #[Restricted Zone site]"

[0]|localhost
[1]|test.bleepingcomputer.com
[2]|admin.abcsearch.com
[3]|www.testo.com
[4]|www3.abcsearch.com


I even tested it with the blank lines at the end of the file removed and no CRLF at the end of the last entry and it picked up everything but the last one. The regex is way beyond me so I can't figure out why.

How to ask questions the smart way!

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.

Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.

_FileGetProperty - Retrieve the properties of a file SciTE Toolbar - A toolbar demo for use with the SciTE editorGUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.

GUIToolTip UDF Demo - Demo script to show how to use the GUIToolTip UDF to create and use customized tooltips.



#20 UEZ

UEZ

    Never say never

  • MVPs
  • 3,602 posts

Posted 23 May 2012 - 03:10 PM

My test hosts file in post#16 is also not working with jchd's regex.

Br,
UEZ

 




