
How to effectively find anagrams in a word list?


E1M1

Hello. I am trying to find anagrams in a word list, but I can't come up with an effective algorithm. I am not sure my code does the right thing at all, but the main problem is time: if you run it with the words.txt below, it would take hours to get the result.

Here's what I have so far:

#include <array.au3>
$words = StringSplit(FileRead("words.txt"), @CRLF, 1)

For $word In $words
    $arr1 = StringSplit($word, "")
    ; _ArraySort sorts in place; its return value is a success flag, not the array
    _ArraySort($arr1, 0, 1)
    $anagrams = 0
    ; Compare against every other word: this inner loop is what makes it slow
    For $word2 In $words
        $arr2 = StringSplit($word2, "")
        _ArraySort($arr2, 0, 1)
        If _ArrayToString($arr1, "", 1) == _ArrayToString($arr2, "", 1) And $word2 <> $word Then
            $anagrams += 1
        EndIf
    Next
    If $anagrams > 0 Then
        ConsoleWrite($word & @CRLF)
    EndIf
Next

words.txt (UTF-8)

Edit:

The idea is that you split each word into a char array and then sort the array to see if arr1 == arr2. If yes, an anagram is found. The problem is that this needs one For loop inside another.
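One possible way around the nested loop, sketched under a few assumptions: compute the sorted-letter key once per word and group words by that key, so the list is walked only once. The helper name `_SortedKey`, the lowercasing, and the use of the Windows `Scripting.Dictionary` COM object are illustrative choices, not something taken from the posts in this thread.

```autoit
#include <Array.au3>

; Hypothetical helper: returns the letters of $sWord in sorted order,
; so "listen" and "silent" both give "eilnst".
Func _SortedKey($sWord)
    Local $aChars = StringSplit(StringLower($sWord), "", 2) ; flag 2: no count in element 0
    _ArraySort($aChars) ; sorts in place
    Return _ArrayToString($aChars, "")
EndFunc

; Read the list, one word per line (assumes words.txt is in the script directory).
Local $aWords = StringSplit(StringStripCR(FileRead("words.txt")), @LF, 2)

; Assumption: the Scripting.Dictionary COM object is available (standard on Windows).
Local $oGroups = ObjCreate("Scripting.Dictionary")
Local $sKey

; Single pass: append each word to the group that shares its key.
For $sWord In $aWords
    If $sWord = "" Then ContinueLoop
    $sKey = _SortedKey($sWord)
    If $oGroups.Exists($sKey) Then
        $oGroups.Item($sKey) = $oGroups.Item($sKey) & ", " & $sWord
    Else
        $oGroups.Add($sKey, $sWord)
    EndIf
Next

; A group that holds more than one word is a set of anagrams.
For $sKey In $oGroups
    If StringInStr($oGroups.Item($sKey), ",") Then ConsoleWrite($oGroups.Item($sKey) & @CRLF)
Next
```

With this layout each word is split and sorted once instead of once per comparison, and the word-against-word comparison disappears.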

Edited by E1M1


I would search the net for "anagram builder visual basic". If you find an algorithm it should be easy to translate it to AutoIt.

Here you get a bunch of code in different languages.



E1M1,

Just some thoughts...

you need to define what you mean by "anagram" and

Spiff59 uses a technique to mimic an array by creating "dynamic variables". The increase in processing speed is astounding.
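As a rough illustration of what "dynamic variables" means here (a sketch, not Spiff59's actual code): AutoIt's built-in Assign(), IsDeclared() and Eval() let a string act as a variable name, so a key can be looked up by name instead of by scanning an array. The variable name "_eilnst" (the sorted letters of "listen" with an underscore prefix) is only an example.

```autoit
; The key string becomes the name of a global variable.
Local $sVar = "_eilnst"

; Create the variable the first time the key is seen (flag 2 = force global scope).
If Not IsDeclared($sVar) Then Assign($sVar, "listen", 2)

; Later words with the same key are appended to the stored value.
Assign($sVar, Eval($sVar) & ", silent", 2)

ConsoleWrite(Eval($sVar) & @CRLF) ; prints: listen, silent
```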

kylomas



I cleaned up one of my posts from the "descramble" thread. It loads and preprocesses your 30,000-word list in 12 seconds, then does searches, allowing multiple hits, fairly quickly:

#include<array.au3>
#include<file.au3>
#include <GUIConstants.au3>
#include <ButtonConstants.au3>

Global $aDictionary[1]

GUICreate("Input the Scrambled Word", 320,120, @DesktopWidth/2-160, @DesktopHeight/2-45, -1, 0x00000018); WS_EX_ACCEPTFILES (0x10) + WS_EX_TOPMOST (0x08)
$input = GUICtrlCreateInput("", 10, 5, 300, 20)
$output = GUICtrlCreateLabel( "", 10, 40, 300, 20)
$btn = GUICtrlCreateButton("SEARCH", 10, 70, 60, 20, $BS_DEFPUSHBUTTON)
$count = GUICtrlCreateLabel( "", 10, 95, 300, 20)
GUISetState ()

GUICtrlSetState($input, $GUI_DISABLE)
GUICtrlSetData($output,"Loading dictionary...")
GuiCtrlSetColor($output, 0xFF0000); red
$timer = TimerInit()
Process_Dictionary()
$timer = TimerDiff($timer) / 1000
GUICtrlSetData($output,"")
GuiCtrlSetColor($output, 0x000000); black
GUICtrlSetData($count,"Words loaded: " & $aDictionary[0][0] & " (in " & Round($timer, 2) & " seconds)")
GUICtrlSetState($input, $GUI_ENABLE)
GUICtrlSetState($input, $GUI_FOCUS)

While 1
    $msg = GUIGetMsg()
    Switch $msg
        Case $btn
            $str = Search_Dictionary(GUICtrlRead($input))
            GUICtrlSetData($output, $str)
        Case $GUI_EVENT_CLOSE
            ExitLoop
    EndSwitch
WEnd
Exit

;-------------------------------------------------------------------------------
Func Search_Dictionary($word)
    ; Build the sorted-letter key for the input
    Local $i = StringSplit($word, "", 2)
    _ArraySort($i)
    $word = _ArrayToString($i, "")
    ; Find the first row whose key (column 1) matches
    $i = _ArraySearch($aDictionary, $word, 1, 0, 0, 0, 1, 1)
    If @error Then
        $str = "<not found>"
    Else
        $str = $aDictionary[$i][0]
        ; The array is sorted by key, so further matches are adjacent
        For $i = ($i + 1) to $aDictionary[0][0]
            If $aDictionary[$i][1] <> $word Then ExitLoop
            $str &= ", " & $aDictionary[$i][0]
        Next
    EndIf
    Return $str
EndFunc

;-------------------------------------------------------------------------------
Func Process_Dictionary()
    Local $aWords
    _FileReadToArray("words.txt", $aWords)
    ; Column 0 = original word, column 1 = sorted-letter key
    Redim $aDictionary[$aWords[0] + 1][2]
    $aDictionary[0][0] = $aWords[0]
    For $x = 1 to $aWords[0]
        $aDictionary[$x][0] = $aWords[$x]
        $i = StringSplit($aWords[$x], "", 2)
        _ArraySort($i)
        $aDictionary[$x][1] = _ArrayToString($i, "")
    Next
    ; Sort rows by key (column 1) so anagrams end up next to each other
    _ArraySort($aDictionary, 0, 1, 0, 1)
EndFunc

@Kylomas - I have been on a "kick" lately promoting the use of Assign(), IsDeclared() and Eval(). Multiple alternate indexes for a ListView, allowing sorting by any column in record time, was the most fun to date.

Edit: moved gui stuff out of the load subfunction

Edit2: I guess I shouldn't be surprised. This is much faster at loading the dictionary and does almost instantaneous lookups:

#include<array.au3>
#include<file.au3>
#include <GUIConstants.au3>
#include <ButtonConstants.au3>

Global $aDictionary[1], $iWords

GUICreate("Input the Scrambled Word", 320,120, @DesktopWidth/2-160, @DesktopHeight/2-45, -1, 0x00000018); WS_EX_ACCEPTFILES (0x10) + WS_EX_TOPMOST (0x08)
$input = GUICtrlCreateInput("", 10,  5, 300, 20)
$output = GUICtrlCreateLabel( "", 10,  40, 300, 20)
$btn = GUICtrlCreateButton("SEARCH", 10,  70, 60, 20, $BS_DEFPUSHBUTTON)
$count = GUICtrlCreateLabel( "", 10,  95, 300, 20)
GUISetState ()
GUICtrlSetState($input, $GUI_DISABLE)
GUICtrlSetData($output,"Loading dictionary...")
GuiCtrlSetColor($output, 0xFF0000); red
$timer = TimerInit()
$iWords = Process_Dictionary()
$timer = TimerDiff($timer) / 1000
GUICtrlSetData($output,"")
GuiCtrlSetColor($output, 0x000000); black
GUICtrlSetData($count,"Words loaded: " & $iWords & " (in " & Round($timer, 2) & " seconds)")
GUICtrlSetState($input, $GUI_ENABLE)
GUICtrlSetState($input, $GUI_FOCUS)

While 1
    $msg = GUIGetMsg()
    Switch $msg
        Case $btn
            $str = Search_Dictionary(GUICtrlRead($input))
            GUICtrlSetData($output, $str)
        Case $GUI_EVENT_CLOSE
            ExitLoop
    EndSwitch
WEnd
Exit

;-------------------------------------------------------------------------------
Func Search_Dictionary($word)
    Local $sVar = StringSplit($word, "", 2)
    _ArraySort($sVar)
    $sVar = "_" & _ArrayToString($sVar, "")
    If IsDeclared($sVar) Then
        $str = Eval($sVar)
    Else
        $str = "<not found>"
    EndIf
    Return $str
EndFunc

;-------------------------------------------------------------------------------
Func Process_Dictionary()
    Local $aWords, $sVar
    _FileReadToArray("words.txt", $aWords)
    For $x = 1 to $aWords[0]
        $sVar = StringSplit($aWords[$x], "", 2)
        _ArraySort($sVar)
        $sVar = "_" & _ArrayToString($sVar, "")
        If IsDeclared($sVar) Then
            Assign($sVar, Eval($sVar) & ", " & $aWords[$x], 2)
        Else
            Assign($sVar, $aWords[$x], 2)
        EndIf
    Next
    Return $aWords[0]
EndFunc
Edited by Spiff59