_GetIntersection()


I have been experimenting to grab the first array (the intersection column) that this code produces.

But without any results.

I need to check whether the words in words.txt (which contains many words) are also in badwords.txt.

If they are in that file, they should be written to a new file.

As if they were filtered. But I don't know how.

I have tested this method too.

But that doesn't work well. It picks up some random words and not all of the ones that are in badwords.txt.

I hope someone understands what I mean. :)

Example:

if ("word from words.txt is in badwords.txt") then

write this word to a new file.

end

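#include <Array.au3>
#include <File.au3>

Dim $1, $2
; _FileReadToArray() returns 1 on success and 0 on failure; by default element [0] holds the line count.
If Not _FileReadToArray("words.txt", $1) Then Exit
If Not _FileReadToArray("badwords.txt", $2) Then Exit

; Column 0 of the result holds the intersection: words present in both files.
Local $ret = _GetIntersection($1, $2, 0, @CRLF)
If IsArray($ret) Then
    _ArrayDisplay($ret)
EndIf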

;==================================================================================================
; Function Name:   _GetIntersection($Set1, $Set2 [, $GetAll=0 [, $Delim=Default]])
; Description:     Determines from 2 sets
;                 - Intersection (elements contained in both sets)
;                 - Difference 1 (elements contained only in $Set1)
;                 - Difference 2 (elements contained only in $Set2)
; Parameter(s): $Set1   set 1 (1D-array or delimited string)
;                 $Set2  set 2 (1D-array or delimited string)
;     optional:   $GetAll  0 - only one occurrence of every different element is shown (Default)
;                          1 - all elements of differences are shown
;     optional:   $Delim   Delimiter for strings (Default uses the separator character set by Opt("GUIDataSeparatorChar") )
; Return Value(s): Success  2D-array    [i][0]=Intersection
;                                      [i][1]=Difference 1
;                                      [i][2]=Difference 2
;                 Failure  -1   @error  a set that was given as an array is not a 1D-array
; Note:         Comparison is case-sensitive and type-sensitive, e.g. the number 9 differs from the string '9'!
; Author(s):       BugFix (bugfix@autoit.de)
;==================================================================================================
Func _GetIntersection(ByRef $Set1, ByRef $Set2, $GetAll=0, $Delim=Default)
    Local $o1 = ObjCreate("System.Collections.ArrayList")
    Local $o2 = ObjCreate("System.Collections.ArrayList")
    Local $oUnion = ObjCreate("System.Collections.ArrayList")
    Local $oDiff1 = ObjCreate("System.Collections.ArrayList")
    Local $oDiff2 = ObjCreate("System.Collections.ArrayList")
    Local $tmp, $i
    If $GetAll <> 1 Then $GetAll = 0
    If $Delim = Default Then $Delim = Opt("GUIDataSeparatorChar")
    If Not IsArray($Set1) Then
        If Not StringInStr($Set1, $Delim) Then
            $o1.Add($Set1)
        Else
            $tmp = StringSplit($Set1, $Delim, 1)
            For $i = 1 To UBound($tmp) -1
                $o1.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set1, 0) > 1 Then Return SetError(1,0,-1)
        For $i = 0 To UBound($Set1) -1
            $o1.Add($Set1[$i])
        Next
    EndIf
    If Not IsArray($Set2) Then
        If Not StringInStr($Set2, $Delim) Then
            $o2.Add($Set2)
        Else
            $tmp = StringSplit($Set2, $Delim, 1)
            For $i = 1 To UBound($tmp) -1
                $o2.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set2, 0) > 1 Then Return SetError(1,0,-1)
        For $i = 0 To UBound($Set2) -1
            $o2.Add($Set2[$i])
        Next
    EndIf
    For $tmp In $o1
        If $o2.Contains($tmp) And Not $oUnion.Contains($tmp) Then $oUnion.Add($tmp)
    Next
    For $tmp In $o2
        If $o1.Contains($tmp) And Not $oUnion.Contains($tmp) Then $oUnion.Add($tmp)
    Next
    For $tmp In $o1
        If $GetAll Then
            If Not $oUnion.Contains($tmp) Then $oDiff1.Add($tmp)
        Else
            If Not $oUnion.Contains($tmp) And Not $oDiff1.Contains($tmp) Then $oDiff1.Add($tmp)
        EndIf
    Next
    For $tmp In $o2
        If $GetAll Then
            If Not $oUnion.Contains($tmp) Then $oDiff2.Add($tmp)
        Else
            If Not $oUnion.Contains($tmp) And Not $oDiff2.Contains($tmp) Then $oDiff2.Add($tmp)
        EndIf
    Next
    Local $UBound[3] = [$oDiff1.Count,$oDiff2.Count,$oUnion.Count], $max = 1
    For $i = 0 To UBound($UBound) -1
        If $UBound[$i] > $max Then $max = $UBound[$i]
    Next
    Local $aOut[$max][3]
    If $oUnion.Count > 0 Then
        $i = 0
        For $tmp In $oUnion
            $aOut[$i][0] = $tmp
            $i += 1
        Next
    EndIf
    If $oDiff1.Count > 0 Then
        $i = 0
        For $tmp In $oDiff1
            $aOut[$i][1] = $tmp
            $i += 1
        Next
    EndIf
    If $oDiff2.Count > 0 Then
        $i = 0
        For $tmp In $oDiff2
            $aOut[$i][2] = $tmp
            $i += 1
        Next
    EndIf
    Return $aOut
EndFunc ;==>_GetIntersection
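
For reference, a minimal sketch of how the first column of the returned array (the intersection) could be written to a new file. It assumes that _GetIntersection() from above is pasted into the same script and that both files hold one word per line; the output name filtered.txt is made up for illustration.

```autoit
#include <Array.au3>
#include <File.au3>

Local $aWords, $aBad
If Not _FileReadToArray("words.txt", $aWords) Then Exit 1
If Not _FileReadToArray("badwords.txt", $aBad) Then Exit 1

; Drop the line count stored in element [0], so it cannot be compared like a word.
_ArrayDelete($aWords, 0)
_ArrayDelete($aBad, 0)

; Assumption: _GetIntersection() is defined elsewhere in this script.
Local $aRet = _GetIntersection($aWords, $aBad)
If Not IsArray($aRet) Then Exit 1

Local $hOut = FileOpen("filtered.txt", 2) ; 2 = overwrite; hypothetical file name
If $hOut = -1 Then Exit 1
For $i = 0 To UBound($aRet) - 1
    ; Column 0 can be shorter than the other two columns, so skip the empty padding cells.
    If $aRet[$i][0] <> "" Then FileWriteLine($hOut, $aRet[$i][0])
Next
FileClose($hOut)
```

Since the comparison is case-sensitive, "Word" and "word" would not count as the same entry here, and trailing spaces on a line would also prevent a match.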

Try this:

$sFirstFile = ''                ;Replace with the path of your first file
$sSecondFile = ''               ;Replace with the path of your second file

$sFirstWords = FileRead($sFirstFile)
$sSecondWords = FileRead($sSecondFile)

$sComparisonFile = @ScriptDir&'\Comparison.txt'

$hComparison = FileOpen($sComparisonFile, 10)   ;10 = 2 (overwrite) + 8 (create the directory if needed)
$aStrWords = StringRegExp($sFirstWords, '\b\w+\b', 3)

For $i = 0 To UBound($aStrWords)-1
    ;StringInStr() also matches substrings, e.g. "cat" inside "category"
    If StringInStr($sSecondWords, $aStrWords[$i]) Then
        FileWrite($hComparison, $aStrWords[$i]&@CRLF)   ;write to the handle opened above
    EndIf
Next
FileClose($hComparison)

$sWords = @DesktopDir & "\words.txt" ; use your own files here
$sBad = @DesktopDir & "\Bad.txt"
$sNewFile = @DesktopDir & "\NewList.txt"

; Collect every word (\w+) from the bad-word file; flag 3 returns all matches as an array
$aBad = StringRegExp(FileRead($sBad), "(?i)\b\w+\b", 3)
If Not @error Then
    $sStr = FileRead($sWords)
    For $i = 0 To UBound($aBad) - 1
        ; Case-insensitive test: does any word in words.txt start with this bad word?
        If StringRegExp($sStr, "(?i)\b" & $aBad[$i] & ".*") Then FileWriteLine($sNewFile, $aBad[$i])
    Next
EndIf

Edited by GEOSoft
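
Because \w+ only captures letters, digits and underscores, entries with punctuation get cut apart by the snippet above. A possible variation, offered as a sketch only: split the bad-word file on line breaks instead, and wrap each entry in \Q...\E so that PCRE treats it literally. This assumes one entry per line in both files and that no entry itself contains the sequence \E; the variable names are illustrative.

```autoit
Local $sNewFile = @DesktopDir & "\NewList.txt"

; Strip carriage returns so that every line ends in a plain @LF.
Local $sStr = StringStripCR(FileRead(@DesktopDir & "\words.txt"))
; Flag 2 = do not store the element count in [0].
Local $aBad = StringSplit(StringStripCR(FileRead(@DesktopDir & "\Bad.txt")), @LF, 2)

Local $sPattern
For $i = 0 To UBound($aBad) - 1
    If $aBad[$i] = "" Then ContinueLoop ; skip blank lines
    ; (?i) = ignore case, (?m) = ^ and $ match at line starts and ends,
    ; \Q...\E = take the enclosed text literally, punctuation included.
    $sPattern = "(?im)^\Q" & $aBad[$i] & "\E$"
    If StringRegExp($sStr, $sPattern) Then FileWriteLine($sNewFile, $aBad[$i])
Next
```

This only reports an entry when a whole line of words.txt equals it, so partial matches inside longer lines are ignored.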


GEOSoft's version works, but it doesn't keep the whole word; it strips special characters.

And I'm not good at regexp.

If the word is "omg\".!fuck" it gets stripped and only "omg" is left.

I see I described it wrong: if a word is not in badwords.txt, it should be written to the new file.

Edited by cparadis
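
One way the corrected requirement could be approached without any regexp, as a rough sketch: load badwords.txt into a Scripting.Dictionary and write every line of words.txt that is not found in it. The assumptions here are one entry per line in both files, case-insensitive matching, and a made-up output name clean.txt.

```autoit
#include <File.au3>

Local $aWords, $aBad
If Not _FileReadToArray(@ScriptDir & "\words.txt", $aWords) Then Exit 1
If Not _FileReadToArray(@ScriptDir & "\badwords.txt", $aBad) Then Exit 1

Local $oBad = ObjCreate("Scripting.Dictionary")
If Not IsObj($oBad) Then Exit 1
$oBad.CompareMode = 1 ; 1 = text compare, so keys are matched case-insensitively

Local $sWord
; Element [0] holds the line count, so the real lines start at index 1.
For $i = 1 To $aBad[0]
    $sWord = StringStripWS($aBad[$i], 3) ; 3 = strip leading and trailing whitespace
    If $sWord <> "" And Not $oBad.Exists($sWord) Then $oBad.Add($sWord, 1)
Next

Local $hOut = FileOpen(@ScriptDir & "\clean.txt", 2) ; 2 = overwrite; hypothetical file name
If $hOut = -1 Then Exit 1
For $i = 1 To $aWords[0]
    $sWord = StringStripWS($aWords[$i], 3)
    ; Keep the line only when it is NOT listed in badwords.txt.
    If $sWord <> "" And Not $oBad.Exists($sWord) Then FileWriteLine($hOut, $sWord)
Next
FileClose($hOut)
```

Since whole lines are compared as plain text, an entry such as "omg\".!fuck" stays intact, special characters included. Removing the Not in the second loop would give the original behaviour (only the words that are in badwords.txt).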