Words Filter


SuPr3M
Hi everyone,

I am trying to load list items (which are strings) from a text file (.txt).

The idea is to split a dictionary of words into individual words in order to run some tests on them, like filtering the words that start with a vowel and stuff like that...

I tried using an edit control (GUICtrlCreateEdit) to paste the dictionary into and then filter each word, but that didn't work out, or I couldn't make it work...

My question is quite simple... How the hell can that be done?

Thanks in advance for any help or support!

#include <EditConstants.au3>
#include <GUIConstantsEx.au3>

Global Const $sString = 'Welcome to the AutoIt Script home page - the home of AutoIt scripting and related applications.' & @CRLF & @CRLF & _
    'This site provides everything you need to get started with AutoIt and features great user support via the forum.' & @CRLF & @CRLF & _
    'AutoIt' & @CRLF & @CRLF & 'AutoIt is a freeware Windows automation language. ' & _
    'It can be used to script most simple Windows-based tasks (great for PC rollouts or home automation).' & @CRLF & @CRLF & _
    'AutoIt has been in popular use since 1999 and continues to provide users and administrators with an easy ' & _
    'way to script the Windows GUI. In February 2004 the latest version of AutoIt - known as AutoIt v3 - was ' & _
    'released and added powerful scripting features.' & @CRLF & @CRLF & _
    'AutoIt v3 was developed in a small team with the help of contributors around the world and this has led to a great ' & _
    'set of help files, examples, support forum, mailing list, editor files, and third-party utilities.  Oh, and lets not ' & _
    'forget some nice graphics and wallpapers too!'
    
Global $hGUI
Global $Button, $Edit, $ComboBox

$hGUI = GUICreate('Title', 400, 550)
$Edit = GUICtrlCreateEdit($sString, 10, 10, 380, 340)
$Button = GUICtrlCreateButton('&Filter', 110, 360, 80, 25)
$ComboBox = GUICtrlCreateCombo('', 200, 360, 120, 100)
GUICtrlSetData($ComboBox, 'Consonants|Vowels', 'Vowels')

GUISetState()
While 1
    Switch GUIGetMsg()
        Case $Button
            Switch GUICtrlRead($ComboBox)
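                Case 'consonants'
                    ; Keep consonant-initial words: remove every word that starts with a vowel
                    GUICtrlSetData($Edit, StringRegExpReplace($sString, '(?i)\b[aeiou]\w*', ''))
                Case 'vowels'
                    ; Keep vowel-initial words: remove every word that starts with a consonant
                    ; (the range h-n would also cover the vowel i, so the class uses hj-n)
                    GUICtrlSetData($Edit, StringRegExpReplace($sString, '(?i)\b[bcdfghj-np-tv-z]\w*', ''))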
                Case Else
                    MsgBox(0x10, 'Select option', 'Select option to filter in text')
            EndSwitch
        Case $GUI_EVENT_CLOSE
            GUIDelete()
            Exit
    EndSwitch
WEnd
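
The script above filters a built-in sample string. For the original question of loading the words from a .txt file, a minimal sketch could look like the one below. The file name dictionary.txt is only a placeholder, and note that \w+ also treats digits and underscores as word characters.

```autoit
; Placeholder path (an assumption): point it at your own word list
Local $sPath = @ScriptDir & '\dictionary.txt'

; Read the whole file into one string
Local $sText = FileRead($sPath)
If @error Then
    MsgBox(0x10, 'Error', 'Could not read ' & $sPath & ' (or the file is empty)')
    Exit
EndIf

; Flag 3 returns an array holding every match of the pattern
Local $aWords = StringRegExp($sText, '\w+', 3)
If @error Then Exit ; no words found

For $i = 0 To UBound($aWords) - 1
    ; Print only the words that start with a vowel
    If StringRegExp($aWords[$i], '(?i)^[aeiou]') Then ConsoleWrite($aWords[$i] & @CRLF)
Next
```

Once the words are in an array, any per-word test can go inside the loop in place of the vowel check.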

Case 'consonants'

GUICtrlSetData($Edit, StringRegExpReplace($sString, '(?i)\b[aeiou]\w*', ''))

Case 'vowels'

GUICtrlSetData($Edit, StringRegExpReplace($sString, '(?i)\b[bcdfghj-np-tv-z]\w*', ''))

Ok now that's kinda chinese for me...

Thanks a bunch anyways

I will try to find out the rest myself

Alright, I really learned how to play with character matching in AutoIt. Now I need to find out how to play with character counting.

Example: removing all the words that have more than 8 characters or fewer than 6 characters.

Thanks in advance for the help.

You use the quantifiers: {}, +, *...

Removing words that consist of more than 8 characters:

StringRegExpReplace($sString, '\w{9,}', '')

Less than 6 characters:

StringRegExpReplace($sString, '\w{1,5}\b', '')

..or both:

StringRegExpReplace($sString, '\w{9,}|\w{1,5}\b', '')

\w{9,} means match at least 9 word characters (letters, digits or underscore) with no upper limit, so it consumes the whole word whenever the word is longer than 8 characters.

\w{1,5}\b means match between 1 and 5 word characters, but only where they end at a word boundary, so it won't match the "abcde" in "abcde_anotherpart", because the underscore is matched by \w as well and there is no boundary in front of it.

Edit: My mistake, the patterns should be:

\b\w{9,} and \b\w{1,5}\b, because of the regex engine's bump-along mechanism: without the leading \b, after a failed attempt the engine retries from the next character and can end up matching the tail of a longer word.
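
To see the difference the leading \b makes, here is a small sketch that runs both versions and then the combined pattern. The sample sentence is made up for illustration.

```autoit
; Made-up sample text, for illustration only
Local $sText = 'a tiny extraordinary sentence'

; No leading \b: after failing at the start of a long word the engine
; bumps along and removes the last 5 characters of that word as well
ConsoleWrite(StringRegExpReplace($sText, '\w{1,5}\b', '') & @CRLF)

; Leading \b: a match may only start at the beginning of a word,
; so only whole words of 1 to 5 characters are removed
ConsoleWrite(StringRegExpReplace($sText, '\b\w{1,5}\b', '') & @CRLF)

; Both limits together: only words of 6 to 8 characters remain
ConsoleWrite(StringRegExpReplace($sText, '\b\w{9,}|\b\w{1,5}\b', '') & @CRLF)
```

The removed words leave their surrounding spaces behind, so the output may need a whitespace clean-up afterwards.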

Edited by Authenticity
That worked, thanks a lot!

One more thing I was trying to figure out... How to replace or delete all the words that are not made up of a given set of characters, or that are made up of them but have one or more extra characters, even when the extra character is one of the set appearing a second time...

yea man ... I know lol

Here is an example to maybe make it clearer

Example: 1="sea" 2="seat" 3="tase" 4="asset" 5="step". If the set of characters is [aest], it should keep only 1, 2 and 3, and not 4.

I think it's a little more complicated than you may think. Matching this particular case is not a problem: use a fixed string alternation, first removing all the words longer than 4 characters and then running a second match-and-replace on the modified string. The problem appears when the pattern has to work for a character set of arbitrary size. I guess that playing with _ArrayPermute() or _ArrayCombinations(), in conjunction with string concatenation to assemble a nice giant pattern of all valid permutations, may be the right direction.

Edit: I think this is what you need. I hope the pattern is correct:

Global $sString = 'sea seat tase seas asset step rol seas ssse se sa eas eat'
Global $sPatt = '\b(?:((?:([aest])(?!\w*?\2)){1,4})|\w+)\b'

$sString = StringRegExpReplace($sString, $sPatt, '\1')
ConsoleWrite(@error & @TAB & @extended & @CRLF & $sString & @CRLF)
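
For a character set of arbitrary size, the same idea could be wrapped in a small function. This is only a sketch: _KeepWordsFromSet is a hypothetical helper name, and it assumes the set contains plain letters only, since nothing is escaped before it goes into the character class.

```autoit
; Hypothetical helper, not part of the original post
Func _KeepWordsFromSet($sText, $sSet)
    ; Assumption: $sSet holds plain letters only, so it is safe inside [...]
    ; The (?!\w*?\2) lookahead rejects a letter that occurs again later in
    ; the word, which already bounds the repetition, so + replaces {1,4}
    Local $sPatt = '\b(?:((?:([' & $sSet & '])(?!\w*?\2))+)|\w+)\b'
    ; Words that fit are put back through \1; every other word becomes empty
    Local $sResult = StringRegExpReplace($sText, $sPatt, '\1')
    ; Flags 1 + 2 + 4: trim both ends and collapse runs of spaces
    Return StringStripWS($sResult, 7)
EndFunc

ConsoleWrite(_KeepWordsFromSet('sea seat tase seas asset step rol', 'aest') & @CRLF)
```

The StringStripWS() call is there only to tidy up the gaps left by the removed words.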
Edited by Authenticity