forummember

How do I remove all duplicate words in a string?

I'm looking for a way to clean up a string so that it contains only unique words.

I'm trying to use StringRegExp, but I can't get it to work.

Right now I have something like this:

dim $string = "one one two one three 123"
dim $pattern = "(?\b\w+\b)(?!.+\b\k\b)"
dim $regxp = StringRegExp($string,$pattern, 3)

Can someone give me some help please?! :)


I don't know whether it's possible with RegEx alone, but this works:

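dim $string = "one one two one three 123"
dim $pattern = "(\b\w+\b)"
; Flag 3 returns an array of all matches
dim $return = StringRegExp($string, $pattern, 3)
; ArrayList keeps insertion order; Contains() compares case-sensitively
dim $obj = ObjCreate("System.Collections.ArrayList")
For $i = 0 To UBound($return) -1
    ; Add each word only the first time it is seen
    If Not $obj.Contains($return[$i]) Then $obj.Add($return[$i])
Next
dim $cleared = ''
For $word In $obj
    $cleared &= $word & ' '
Next
; Strip the trailing space before printing
ConsoleWrite(StringTrimRight($cleared, 1) & @CRLF)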

Best Regards BugFix  


Big Thx!

worked perfectly! :)


A simpler and easier script (maybe not as good, though):

Local $sString = InputBox ("Test", "Insert string here!", "one two three one 123")
Local $aSplit = StringSplit ($sString, " ") ; $aSplit[0] holds the word count
Local $new = "", $i, $x

For $i = 1 to $aSplit[0]
   For $x = 1 to $i - 1
      If $aSplit[$x] = $aSplit[$i] Then ContinueLoop 2
   Next
   $new &= $aSplit[$i] & " "
Next
$new = StringTrimRight ($new, 1)
MsgBox (0, "result", $new)

No objects, so it should be a bit easier to use!

MDiesel


Pure RegExp only, but the resulting order may not be what you want, because the lookahead keeps the last occurrence of each word rather than the first:

dim $string = "one one two one three 123"
; \b around \1 so that a word is not dropped just because it appears inside a later, longer word
dim $pattern = "(\b\w+\b)(?!.+\b\1\b)"
Dim $sResult = ""
Dim $aReturn = StringRegExp($string, $pattern, 3)
For $i = 0 To UBound($aReturn) - 1
    ConsoleWrite($aReturn[$i] & @CRLF)
    $sResult &= $aReturn[$i] & " "
Next
MsgBox(0, "", $sResult)

Edit: change the pattern to

$pattern = "(?i)(\b\w+\b)(?!.+\b\1\b)"

to make the match case-insensitive
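
If the order matters, one possible workaround that still relies only on regular expressions is to delete the later repeats instead of collecting the survivors. This is just a sketch: `_DropRepeats` is a made-up name, not something from the posts above, and it assumes a single-line string with words separated by whitespace.

```autoit
; Hypothetical helper: removes later repeats of a word and keeps the first occurrence.
Func _DropRepeats($sText)
    ; A whole word, whatever follows it, then whitespace and the same whole word again
    Local $sPattern = "\b(\w+)\b(.*)\s+\b\1\b"
    Local $iCount
    Do
        ; Keep the first word and the text in between, drop the repeat
        $sText = StringRegExpReplace($sText, $sPattern, "$1$2")
        $iCount = @extended ; number of replacements made in this pass
    Until $iCount = 0
    Return $sText
EndFunc

ConsoleWrite(_DropRepeats("one one two one three 123") & @CRLF) ; one two three 123
```

Each pass removes at most one repeat per word, so the loop runs until a pass changes nothing. Prefix the pattern with (?i) to ignore case.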


Thanks to you all!

I like MDiesel's solution, since it allows characters other than a-z, A-Z and 0-9, for example åäö, ÅÄÖ and so on.
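
For anyone who wants to keep the regex-based splitting and still handle such characters: as far as I know, putting (*UCP) at the start of a StringRegExp pattern makes \w Unicode-aware. Below is a rough sketch under that assumption. `_UniqueWords` is a made-up name, and the Scripting.Dictionary lookup is case-sensitive by default, so "One" and "one" count as different words here.

```autoit
; Hypothetical helper: returns the words of $sText, each kept only at its first occurrence.
Func _UniqueWords($sText)
    ; (*UCP) is assumed to make \w match non-ASCII letters as well
    Local $aWords = StringRegExp($sText, "(*UCP)\w+", 3)
    If @error Then Return "" ; no words found

    ; The dictionary is only used to remember which words were already seen
    Local $oSeen = ObjCreate("Scripting.Dictionary")
    Local $sResult = ""
    For $i = 0 To UBound($aWords) - 1
        If Not $oSeen.Exists($aWords[$i]) Then
            $oSeen.Add($aWords[$i], 1)
            $sResult &= $aWords[$i] & " "
        EndIf
    Next
    Return StringTrimRight($sResult, 1) ; drop the trailing space
EndFunc

ConsoleWrite(_UniqueWords("one one two one three 123") & @CRLF) ; one two three 123
```

Unlike splitting on a single space, this also copes with several spaces in a row and with punctuation between words.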
