E1M1

How to effectively find anagrams in a word list?

E1M1

Hello. I am trying to find anagrams in a word list, but I can't come up with an effective algorithm. I am not sure my code does the right thing at all, but the main problem is time: if you run it with the words.txt below, it would take hours to get the result.

Here's what I have so far:

#include <Array.au3>
; Flag 3 = treat @CRLF as one delimiter and return no count in element 0
$words = StringSplit(FileRead("words.txt"), @CRLF, 3)

For $word In $words
    ; The sorted letters act as the word's signature (flag 2 = no count element)
    $arr1 = StringSplit($word, "", 2)
    _ArraySort($arr1) ; sorts in place; the return value is only a success flag
    $key1 = _ArrayToString($arr1, "")
    $anagrams = 0
    For $word2 In $words
        $arr2 = StringSplit($word2, "", 2)
        _ArraySort($arr2)
        If $key1 == _ArrayToString($arr2, "") And $word2 <> $word Then
            $anagrams += 1
        EndIf
    Next
    If $anagrams > 0 Then
        ConsoleWrite($word & @CRLF)
    EndIf
Next

words.txt (UTF-8)

Edit:

The idea is to split each word into a character array and sort that array. If the sorted arrays of two words are equal, an anagram has been found. The problem is that this needs one For loop nested inside another.
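
The sorted-letters idea can be isolated in a small helper, shown here only as a minimal sketch. The function name `_AnagramKey` is hypothetical and not part of any UDF; it assumes the standard Array.au3 include.

```autoit
#include <Array.au3>

; Hypothetical helper: returns the letters of $sWord in sorted order.
; Two words are anagrams of each other when their keys are equal.
Func _AnagramKey($sWord)
    Local $aChars = StringSplit($sWord, "", 2) ; flag 2 = no count in element 0
    _ArraySort($aChars)                        ; sorts the array in place
    Return _ArrayToString($aChars, "")         ; join the letters without a delimiter
EndFunc

ConsoleWrite(_AnagramKey("listen") & @CRLF) ; eilnst
ConsoleWrite(_AnagramKey("silent") & @CRLF) ; eilnst
```

Computing each key once per word, instead of once per pair of words, already removes most of the repeated sorting from the nested loops.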

Edited by E1M1

water

I would search the net for "anagram builder visual basic". If you find an algorithm it should be easy to translate it to AutoIt.

Here you get a bunch of code in different languages.


E1M1

Thank you.


Spiff59

There were some related threads here as well; search for "descramble".

kylomas

E1M1,

Just some thoughts...

you need to define what you mean by "anagram" and

Spiff59 uses a technique that mimics an array by creating "dynamic variables". The increase in processing speed is astounding.
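
As a rough illustration of what "dynamic variables" means here (a sketch only, with a made-up variable name), AutoIt's Assign(), IsDeclared() and Eval() let a script create and read variables whose names are built at run time:

```autoit
; The variable name is built at run time; "_eilnst" is just an example key.
Local $sName = "_eilnst"

; Create the variable on first use (flag 2 = force creation in the global scope).
If Not IsDeclared($sName) Then Assign($sName, "listen", 2)

; Append another word to the value already stored under that name.
Assign($sName, Eval($sName) & ", silent", 2)

ConsoleWrite(Eval($sName) & @CRLF) ; listen, silent
```

The lookup by name replaces a search through an array, which is where the speed-up comes from.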

kylomas


Spiff59

I cleaned up one of my posts from the "descramble" thread. It loads and preprocesses your 30,000-word list in 12 seconds, then does searches fairly quickly and allows multiple hits:

#include<array.au3>
#include<file.au3>
#include <GUIConstants.au3>
#include <ButtonConstants.au3>

Global $aDictionary[1]

GUICreate("Input the Scrambled Word", 320,120, @DesktopWidth/2-160, @DesktopHeight/2-45, -1, 0x00000018); WS_EX_ACCEPTFILES + WS_EX_TOPMOST
$input = GUICtrlCreateInput("", 10, 5, 300, 20)
$output = GUICtrlCreateLabel( "", 10, 40, 300, 20)
$btn = GUICtrlCreateButton("SEARCH", 10, 70, 60, 20, $BS_DEFPUSHBUTTON)
$count = GUICtrlCreateLabel( "", 10, 95, 300, 20)
GUISetState ()

GUICtrlSetState($input, $GUI_DISABLE)
GUICtrlSetData($output,"Loading dictionary...")
GuiCtrlSetColor($output, 0xFF0000); red
$timer = TimerInit()
Process_Dictionary()
$timer = TimerDiff($timer) / 1000
GUICtrlSetData($output,"")
GuiCtrlSetColor($output, 0x000000); black
GUICtrlSetData($count,"Words loaded: " & $aDictionary[0][0] & " (in " & Round($timer, 2) & " seconds)")
GUICtrlSetState($input, $GUI_ENABLE)
GUICtrlSetState($input, $GUI_FOCUS)

While 1
    $msg = GUIGetMsg()
    Switch $msg
        Case $btn
            $str = Search_Dictionary(GUICtrlRead($input))
            GUICtrlSetData($output, $str)
        Case $GUI_EVENT_CLOSE
            ExitLoop
    EndSwitch
WEnd
Exit

;-------------------------------------------------------------------------------
Func Search_Dictionary($word)
    Local $i = StringSplit($word,"", 2)
    _ArraySort($i)
    $word = _ArrayToString($i,"")
    $i = _ArraySearch($aDictionary, $word, 1, 0, 0, 0, 1, 1)
    If @error Then
        $str = "<not found>"
    Else
        $str = $aDictionary[$i][0]
        For $i = ($i + 1) to $aDictionary[0][0]
            If $aDictionary[$i][1] <> $word Then ExitLoop
            $str &= ", " & $aDictionary[$i][0]
        Next
    EndIf
    Return $str
EndFunc

;-------------------------------------------------------------------------------
Func Process_Dictionary()
    Local $aWords
    _FileReadToArray("words.txt", $aWords)
    Redim $aDictionary[$aWords[0] + 1][2]
    $aDictionary[0][0] = $aWords[0]
    For $x = 1 to $aWords[0]
        $aDictionary[$x][0] = $aWords[$x]
        $i = StringSplit($aWords[$x], "", 2)
        _ArraySort($i)
        $aDictionary[$x][1] = _ArrayToString($i, "")
    Next
    _ArraySort($aDictionary, 0, 1, 0, 1)
EndFunc

@Kylomas - I have been on a "kick" lately promoting the use of Assign(), IsDeclared() and Eval(). Multiple alternate indexes for a ListView, allowing sorting by any column in record time, was the most fun to date.

Edit: Moved the GUI stuff out of the load function.

Edit2: I guess I shouldn't be surprised. This version loads the dictionary much faster and does almost instantaneous lookups:

#include<array.au3>
#include<file.au3>
#include <GUIConstants.au3>
#include <ButtonConstants.au3>

Global $aDictionary[1], $iWords

GUICreate("Input the Scrambled Word", 320,120, @DesktopWidth/2-160, @DesktopHeight/2-45, -1, 0x00000018); WS_EX_ACCEPTFILES + WS_EX_TOPMOST
$input = GUICtrlCreateInput("", 10,  5, 300, 20)
$output = GUICtrlCreateLabel( "", 10,  40, 300, 20)
$btn = GUICtrlCreateButton("SEARCH", 10,  70, 60, 20, $BS_DEFPUSHBUTTON)
$count = GUICtrlCreateLabel( "", 10,  95, 300, 20)
GUISetState ()
GUICtrlSetState($input, $GUI_DISABLE)
GUICtrlSetData($output,"Loading dictionary...")
GuiCtrlSetColor($output, 0xFF0000); red
$timer = TimerInit()
$iWords = Process_Dictionary()
$timer = TimerDiff($timer) / 1000
GUICtrlSetData($output,"")
GuiCtrlSetColor($output, 0x000000); black
GUICtrlSetData($count,"Words loaded: " & $iWords & " (in " & Round($timer, 2) & " seconds)")
GUICtrlSetState($input, $GUI_ENABLE)
GUICtrlSetState($input, $GUI_FOCUS)

While 1
    $msg = GUIGetMsg()
    Switch $msg
        Case $btn
            $str = Search_Dictionary(GUICtrlRead($input))
            GUICtrlSetData($output, $str)
        Case $GUI_EVENT_CLOSE
            ExitLoop
    EndSwitch
WEnd
Exit

;-------------------------------------------------------------------------------
Func Search_Dictionary($word)
    Local $sVar = StringSplit($word, "", 2)
    _ArraySort($sVar)
    $sVar = "_" & _ArrayToString($sVar, "")
    If IsDeclared($sVar) Then
        $str = Eval($sVar)
    Else
        $str = "<not found>"
    EndIf
    Return $str
EndFunc

;-------------------------------------------------------------------------------
Func Process_Dictionary()
    Local $aWords, $sVar
    _FileReadToArray("words.txt", $aWords)
    For $x = 1 to $aWords[0]
        $sVar = StringSplit($aWords[$x], "", 2)
        _ArraySort($sVar)
        $sVar = "_" & _ArrayToString($sVar, "")
        If IsDeclared($sVar) Then
            Assign($sVar, Eval($sVar) & ", " & $aWords[$x], 2)
        Else
            Assign($sVar, $aWords[$x], 2)
        EndIf
    Next
    Return $aWords[0]
EndFunc
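
For the original goal of listing every word that has at least one anagram, the same sorted-letter key can also drive a single pass over the list. The following is only a sketch of a swapped-in alternative, not the approach used above: it assumes Windows' Scripting.Dictionary COM object is available and that words.txt holds one word per line. The variable names are illustrative.

```autoit
#include <Array.au3>
#include <File.au3>

Local $aWords
If Not _FileReadToArray("words.txt", $aWords) Then Exit ; element 0 holds the line count

; Maps each sorted-letter key to a comma-separated list of words sharing it.
Local $oGroups = ObjCreate("Scripting.Dictionary")

For $i = 1 To $aWords[0]
    If $aWords[$i] = "" Then ContinueLoop ; skip blank lines
    Local $aChars = StringSplit($aWords[$i], "", 2)
    _ArraySort($aChars)
    Local $sKey = _ArrayToString($aChars, "")
    If $oGroups.Exists($sKey) Then
        $oGroups.Item($sKey) = $oGroups.Item($sKey) & ", " & $aWords[$i]
    Else
        $oGroups.Add($sKey, $aWords[$i])
    EndIf
Next

; A group containing a separator holds at least two words, i.e. a set of anagrams.
For $sKey In $oGroups.Keys
    If StringInStr($oGroups.Item($sKey), ", ") Then ConsoleWrite($oGroups.Item($sKey) & @CRLF)
Next
```

Unlike names created with Assign(), dictionary keys are not restricted to characters that are valid in variable names, which may matter for words containing apostrophes or hyphens.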
Edited by Spiff59
