Jump to content
Sign in to follow this  
phatzilla

Optimizing array difference search speed (or other solutions welcome!)

Recommended Posts

Hi gang,

I'm in a bit of a pickle here, the gist of it is that i have one "main" file with ~200k lines, and literally hundreds of other files each ranging from 1k - 100k lines.

I need to go through each of the "other" files and (separately) compare them to the "main" file, and save the differences between the two (no duplicates)

 

The issue is that each comparison (especially if the "other" file has 50+ k lines) takes over a minute each... 

Is there anyway to cut this time down? As far as i know im using the most optimized array difference script


Here's a rough mockup of the script im currently using

note: the main file has all unique lines, and the "other" files wont ever have any lines that *DO NOT* appear in the main file, if that helps

#include <array.au3>
#include <file.au3>



global $info_file
global $compare_file
global $Differece

$info_file = FileReadToArray("info-file(200k lines).txt")




$compare_file = FileReadToArray("compare-file(100k lines).txt")




$Differece = _Diff($info_file, $compare_file, 0) ; get the difference between 2 arrays, NO duplicates
_ArrayDisplay($Differece)







;=================================================
; Function Name:   _Diff($Set1, $Set2 [, $GetAll=0 [, $Delim=Default]])
; Description::    Find values in $Set1 that do not occur in $Set2
; Parameter(s):    $Set1    set 1 (1D-array or delimited string)
;                  $Set2    set 2 (1D-array or delimited string)
;      optional:   $GetAll  0 - only one occurence of every difference are shown (Default)
;                           1 - all differences are shown, allowing duplicates
;      optional:   $Delim   Delimiter for strings (Default use the separator character set by Opt("GUIDataSeparatorChar") )
; Return Value(s): Succes   1D-array of values from $Set1 that do not occur in $Set2
;                  Failure  -1  @error  set, that was given as array, isn't 1D-array
; Note:            Comparison is case-sensitive! - i.e. Number 9 is different to string '9'!
; Author(s):       BugFix (bugfix@autoit.de) Modified by ParoXsitiC for Faster _Diff (Formally _GetIntersection)
;=================================================
Func _Diff(ByRef $Set1, ByRef $Set2, $GetAll = 0, $Delim = Default)
    Local $o1 = ObjCreate("System.Collections.ArrayList")
    Local $o2 = ObjCreate("System.Collections.ArrayList")
    Local $oDiff1 = ObjCreate("System.Collections.ArrayList")
    Local $tmp, $i
    If $GetAll <> 1 Then $GetAll = 0
    If $Delim = Default Then $Delim = Opt("GUIDataSeparatorChar")
    If Not IsArray($Set1) Then
        If Not StringInStr($Set1, $Delim) Then
            $o1.Add($Set1)
        Else
            $tmp = StringSplit($Set1, $Delim, 1)
            For $i = 1 To UBound($tmp) - 1
                $o1.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set1, 0) > 1 Then Return SetError(1, 0, -1)
        For $i = 0 To UBound($Set1) - 1
            $o1.Add($Set1[$i])
        Next
    EndIf

    If Not IsArray($Set2) Then
        If Not StringInStr($Set2, $Delim) Then
            $o2.Add($Set2)
        Else
            $tmp = StringSplit($Set2, $Delim, 1)
            For $i = 1 To UBound($tmp) - 1
                $o2.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set2, 0) > 1 Then Return SetError(1, 0, -1)
        For $i = 0 To UBound($Set2) - 1
            $o2.Add($Set2[$i])
        Next
    EndIf

    For $tmp In $o1
        If Not $o2.Contains($tmp) And ($GetAll Or Not $oDiff1.Contains($tmp)) Then $oDiff1.Add($tmp)
    Next

    If $oDiff1.Count <= 0 Then Return 0

    Local $aOut[$oDiff1.Count]
    $i = 0
    For $tmp In $oDiff1
        $aOut[$i] = $tmp
        $i += 1
    Next
    Return $aOut
EndFunc   ;==>_Diff

 

Edited by phatzilla

Share this post


Link to post
Share on other sites

Merge the 2 arrays into one, then run that merged array through _ArrayUnique to get the array with no repeats.


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Share this post


Link to post
Share on other sites

_ArrayUnique uses a scripting dictionary

 


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Share this post


Link to post
Share on other sites

BrewManNH,

Will _arrayunique remove ALL duplicate entries?   If i recall, it simply keeps one of the entries.

If you run:

1
3
3
2

through _arrayunique, you'll get

1
3
2

I need it to be
1
2
 

 

 

What i need is a new array that's the difference between "main file" and "other file".

So if main file array is

2
1
4
5
3

and other file array is

2
3
1

new array should be

4
5

 

In other words, i dont want to preserve any instances of the duplicate lines.  the _Diff function achieves this, but i need to get it a bit faster, if possible...

Or, failing that, a new solution.

 

Edited by phatzilla

Share this post


Link to post
Share on other sites

What's in the files that you're comparing between the 2? What do some of the entries look like? I personally would sort the array to be searched and then use _ArrayBinarySearch to quickly search the array for any entries that are duplicates. I'd need a sample of a couple of files you're comparing, even generic ones that the format is similar to what you're working with, I can probably create something that will do this very fast. Using objects on arrays probably isn't very efficient.


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Share this post


Link to post
Share on other sites

_ArrayUnique uses a scripting dictionary

I know, I said directly because the OP asked for "save the differences between the two"  and _ArrayUnique doesn't do that

Edited by mikell

Share this post


Link to post
Share on other sites

How do lines differ, typically? Are they 99% the same or much more random? Are the differences located near the head, near the tail or can they differ anywhere in between?

This can dramatically change the speed of comparisons. Have you tried comparing the hash of the strings (MD5 will do here): comparing MD5s is pretty fast.

Depending on actual contents, SQLite might prove useful for your use case.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

Actually, i just adjusted my script based on mikells suggestion....



 

global $a = FileReadToArray("info-file(200k lines).txt")
global $b = FileReadToArray("compare-file(100k lines).txt")

Local $sda = ObjCreate("Scripting.Dictionary")
Local $sdb = ObjCreate("Scripting.Dictionary")
Local $sdc = ObjCreate("Scripting.Dictionary")

For $i In $a
    $sda.Item($i)
Next
For $i In $b
    $sdb.Item($i)
Next

For $i In $a
    If $sdb.Exists($i) Then $sdc.Item($i)
Next
$asd3 = $sdc.Keys()

For $i In $asd3
    If $sda.Exists($i) Then $sda.Remove($i)
    If $sdb.Exists($i) Then $sdb.Remove($i)
Next
$asd1 = $sda.Keys()
$asd2 = $sdb.Keys()

_ArrayDisplay($asd1, "$asd1")
_ArrayDisplay($asd2, "$asd2")
_ArrayDisplay($asd3, "$asd3")


Now this is much faster... $ASD1 is the only array that i need as it contains the lines that are *ONLY* found in "info" file (because the "other" compare file will never contain any lines that aren't found in info file).  Is there any way to speed this script up?  I dont really need the other 2 new arrays asd2,asd3.  asd1 contains all the information i need.

Edited by phatzilla

Share this post


Link to post
Share on other sites

I have a  small doubt. You people see the above code ?. Is this safe to use the same variable name twice or thrice in the same script ?. I mean "$i". 

Edit - Never mind i got the answer. 

Edited by kcvinu

My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Share this post


Link to post
Share on other sites

kcvinu,

To answer your question for any later readers: Yes, it is quite safe to use the same variable name in several For...Next loops as long as each loop is entirely separate and has terminated before the next one starts. If there is any overlap or nesting then the loop count variables must be differentiated.

M23


Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Share this post


Link to post
Share on other sites

In this case you can spare significant steps:

Local $a = FileReadToArray("info-file(200k lines).txt")
Local $sda = ObjCreate("Scripting.Dictionary")

For $i In $a
    $sda.Item($i)
Next

; for each "compare-file" do
; see _FileListToArray and friends

    ; also $a content is no more needed,
    ; we reuse it for all target files
    $a = FileReadToArray("compare-file(100k lines).txt")

    For $i In $a
        If $sda.Exists($i) Then $sda.Remove($i)
    Next
; next file

Local $asd1 = $sda.Keys()
$sda = 0
_ArrayDisplay($asd1, "$asd1")

 


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

@Melba23 Thank you for the clarification. 


My Contributions

UDF Link Viewer   --- A tool to visit the links of some most important UDFs 

 Includer_2  ----- A tool to type the #include statement automatically 

 Digits To Date  ----- date from 3 integer values

PrintList ----- prints arrays into console for testing.

 Alert  ------ An alternative for MsgBox 

 MousePosition ------- A simple tooltip display of mouse position

GRM Helper -------- A littile tool to help writing code with GUIRegisterMsg function

Access_UDF  -------- An UDF for working with access database files. (.*accdb only)

 

Share this post


Link to post
Share on other sites

phatzilla,

Just for grins can you run your largest files through the following.  I'm interested in comparing the results to the dictionary method...

#include <FileConstants.au3>
#include <sqlite.au3>
#include <array.au3>

Local $sSQLiteMod = 'C:\Program Files (x86)\AutoIt3\install\Extras\SQLite\sqlite3.exe'

; do NOT use AutoIt macros for the next two file names

Local $sMasterFile = "fully.qualified.filename" ; your file with ALL entries
Local $sDetailFile = "fully.qualified.filename" ; your file with SOME entries

$st = TimerInit()

Local $sIn = 'create table if not exists mstr (c1);' & _
        'create table if not exists det (c1);' & _
        @CRLF & '.import ' & $sMasterFile & ' MSTR' & _
        @CRLF & '.import ' & $sDetailFile & ' DET' & _
        @CRLF & 'SELECT DISTINCT c1 FROM mstr WHERE c1 Not IN (SELECT DISTINCT c1 FROM det);'

Local $Out = _TempFile()
Local $in = _TempFile()

FileWrite($in, $sIn)

Local $sCmd = @ComSpec & ' /c "' & $sSQLiteMod & '" < ' & $in & ' > ' & $Out
Local $pid = RunWait($sCmd, @WorkingDir, @SW_HIDE)
ProcessWaitClose($pid)
ConsoleWrite('Time to find entries unique to mstr = ' & Round(TimerDiff($st) / 1000, 3) & ' seconds' & @CRLF)
ConsoleWrite(StdoutRead($pid) & @CRLF)
ConsoleWrite('! out ' & @CRLF & FileRead($Out) & @CRLF)

FileDelete($Out)
FileDelete($in)

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites

For this kind of handling, which requires to be fast enough, I use an external program from the CoreUtils for Windows (http://gnuwin32.sourceforge.net/packages/coreutils.htm) : uniq.exe. This tool can quickly extract/remove duplicate lines in a text file. In your case, you just have to put the 2 file contents in 1 file and remove the duplicates (it requires the file to be sorted first)

So, using the same idea than BrewManNH with _ArrayUniq, you can do something like this :

#Include <Array.au3>

Local $sMainFile = @ScriptDir & "\mainFile.txt"
Local $sPartialFile = @ScriptDir & "\secondFile.txt"

Local $sTempFile = @ScriptDir & "\tempFile.txt", $sResultFile = @ScriptDir & "\resultFile.txt"

; Concatenate the 2 files in 1
Local $sfile = FileWrite($sTempFile, FileRead($sMainFile) & FileRead($sPartialFile) )

; Sort the file and then remove all duplicate lines
RunWait(@ComSpec & ' /c sort "' & $sTempFile & '" | uniq -u > "' & $sResultFile & '"', "C:\Program Files (x86)\GnuWin32\bin", @SW_HIDE)
FileDelete($sTempFile)

Local $aResult = FileReadToArray($sResultFile)
_ArrayDisplay($aResult)

I know the problem is solved, but maybe someone else could be interested by this method...

Edited by jguinch

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By roeselpi
      hello again,
      it has been a long time since i have been here and a long time since i last used autoit. ever so often when the time allows me to, then i follow up on an idea that i had a long time ago. i have done all the work on paper but now it is up to writing it in autoit and i keep stumbling over many little issues here and there. sometimes after a few days i will try again and get a step further but sometimes it just will not help no matter how long i try and think about a solution. for most of you it will be the basics but for me it is not all that easy, but at least i give it a try.
      right, down to business:
      here is my code:
      #include <MsgBoxConstants.au3> #include <StringConstants.au3> #include <Array.au3> #include <String.au3> ; ; PART 1: define replacements and check with msgbox ; Global $y, $z $y = "Yes" $z = "No" MsgBox(0,"replacements", $y & @CRLF & $z) ;the replacements in a message box ; ; PART 2: set the texts and check via console and msgbox ; Global $my1string = "abab" ;the first specified text MsgBox(0,"my1string", $my1string) ;the message box to output the first specified text Global $my2string = "icic" ;the second specified text MsgBox(0,"my2string", $my2string) ;the message box to output the second specified text ; ; PART 3: transform the strings to individual arrays ; $my1array = StringSplit($my1string, "") $my1array[0] = "" _ArrayDelete($my1array, 0) _ArrayDisplay($my1array, "my1array") ;the display of the first specified array $my2array = StringSplit($my2string, "") $my2array[0] = "" _ArrayDelete($my2array, 0) _ArrayDisplay($my2array, "my2array") ;the display of the first specified array ; ; PART 4: create an empty array for filling ; Global $OutputArray[4] $OutputArray[0] = "" _ArrayDisplay($OutputArray, "OutputArray") ;the display of the first specified array ; ; PART 5: compare & fill empty OutputArray with data after evaluation ; Global $i, $j, $k For $i = 0 to UBound($my1array) -1 For $j = 0 to UBound($my2array) -1 For $k = 0 to UBound($OutputArray) -1 If $my1array[$i] = "a" And $my2array[$j] = "i" Then $OutputArray[$k] = $y Else $OutputArray[$k] = $z EndIf Next Next Next _ArrayDisplay($OutputArray, "OutputArray") ;the display of the Newly filled Array In "Part 2" i make a string that is converted to an array in "Part 3" ... Now, I know that "a" and "i" are always in the exact same spot in both arrays and so i wanted to compare this and make a further array to document my findings by saying "yes" or "no" ... however my new array keeps saying just "no" allthough i can clearly see and know that it should say:
      yes no yes no my guess is that there is something wrong within my for-loops and that the counting is somehow "off" i guess that when the first for-loop is finished it reaches the second whilst the second for-loop is checking the first which would explain why it always says "no" instead of seeing the obvious.
      so my question would be: what is wrong with my for-loop? or where am i making an error that ultimately gives me the wrong results?
      help is much appreciated.
      kind regards
      roeselpi
       
       
      PS: sorry for my not so great english spelling ... stupid german sitting here trying out intermediate english skills.
    • By Jangal
      Hello friends
      This app is slow
      How to increase its speed?
       
      #include <Array.au3> #include <StringConstants.au3> #include <File.au3> #include <String.au3> Global $aWord[][2]  = [[1, "google"],[2,"hello"]]


        Global $sFileName = @ScriptDir & "\1.txt" ; 2MB Text File Local $sFileRead = FileRead($sFileName) Local $res = StringRegExp($sFileRead, "(*UCP)\b[\pL\d]{2,}", 3) _ArrayDisplay($res)   for $sWord in $res     $iIndex = _ArraySearch($aWord, $sWord, 0, 0, 0, 0, 1, 2)     ;MsgBox(0,0,$iIndex)     if $iIndex == -1 Then         Local $aFill = [[0,$sWord]]         _ArrayAdd($aWord,$aFill)        ;      Else         $aWord[$iIndex][0] +=1     EndIf   Next _ArrayDisplay($aWord)

      1.txt
    • By nacerbaaziz
      goodmorning; autoit team
      please their are any youtube search way working?
      because i was using the get source and split it to get the result but know it does not working any way.
      is the youtube disabled that? and is their any other simple way to do that?
      i tested all examples found in this post but also it don't work
      https://www.autoitscript.com/forum/topic/123945-youtube-search/
      and here is the example that i use to and it don't work any more
      local $hSearchOpenHNDL, $hSearchConnect, $sSearchGet local $a_UrlsArray[1][5] local $b_ButtonsDisabled = false, $b_SearchBTNFocus = false, $b_SearchListFocus, $h_SearchFocusHND local $Return = "0" local $s_OpenStringY = "/feed/trending" local $ChannelUrl = "", $channelName = "" if Not ($a_YoutubeSearchArray[0][0] = 0) then GUICtrlSetData($searchInp, $s_youtubeSearchLastSearch) $s_OpenStringY = "/results?search_query=" & StringReplace(GUICtrlRead($searchInp), " ", "+") $a_UrlsArray = $a_YoutubeSearchArray for $i = 1 to $a_UrlsArray[0][0] _GUICtrlListBox_AddString($SearchList, $a_UrlsArray[$i][0] & $a_UrlsArray[$i][2] & $a_UrlsArray[$i][3]) next _GUICtrlListBox_SetCurSel($SearchList, $I_youtubeSearchLastIndex-1) GUICtrlSetState($SearchList, $GUI_focus) else if Ping("youtube.com", 1000) > 1 then $hSearchOpenHNDL = _WinHttpOpen('') if not (@Error) then $hSearchConnect = _WinHttpConnect($hSearchOpenHNDL, "youtube.com") if Not (@Error) then $sSearchGet = _WinHttpSimpleRequest($hSearchConnect, "get", $s_OpenStringY) if not (@Error) then local $a_strings = _StringBetween($sSearchGet, '<a href="/watch', "<ul") local $title = "" local $url = "" local $length = "" local $result = "" GUICtrlSetData($SearchList, "") ReDim $a_UrlsArray[1][5] for $i = 0 to UBound($a_strings)-1 $url = _StringBetween($a_strings[$i], "?", '"') if @error then ContinueLoop $url = "https://www.youtube.com/watch?" & $url[0] $title = _StringBetween($a_strings[$i], 'dir="', '</a>') if @error then ContinueLoop $title = $title[0] $title = StringRegExpReplace($title, '(.*\"\>)', "") if StringRegExp($a_strings[$i], '[0-9]{1,2}\:[0-9]{1,2}\:[0-9]{1,2}', 0) = 1 then $length = StringRegExp($a_strings[$i], '[0-9]{1,2}\:[0-9]{1,2}\:[0-9]{1,2}', 2) elseIf StringRegExp($a_strings[$i], '[0-9]{1,2}\:[0-9]{1,2}', 0) = 1 then $length = StringRegExp($a_strings[$i], '[0-9]{1,2}\:[0-9]{1,2}', 2) else $length = "" endIf if IsArray($length) then $length = ": (" & $length[0] & ")" else $length = "" endIf $ChannelUrl = stringRegexpReplace($a_strings[$i], '(^.*?<a.*?\"(\/user|\/channel))+', "$2") $channelName = stringRegexpReplace($ChannelUrl, '(.*?\".*?>)(.*</a>)+', "$2") $ChannelUrl = stringRegexpReplace($ChannelUrl, '(\".*)+', "") $channelName = stringRegexpReplace($channelName, '(</a>.*)+', "") $result &= $title & @crlf & $url & @crlf ReDim $a_UrlsArray[UBound($a_UrlsArray)+1][5] $a_UrlsArray[UBound($a_UrlsArray)-1][0] = $title $a_UrlsArray[UBound($a_UrlsArray)-1][1] = $url $a_UrlsArray[UBound($a_UrlsArray)-1][2] = $length if not ($channelName = "") then $a_UrlsArray[UBound($a_UrlsArray)-1][3] = ", (" & $channelName & ")" if not ($channelUrl = "") then $a_UrlsArray[UBound($a_UrlsArray)-1][4] = "https://www.youtube.com" & $channelUrl $a_UrlsArray[0][0] = UBound($a_UrlsArray)-1 _GUICtrlListBox_AddString($SearchList, $a_UrlsArray[UBound($a_UrlsArray)-1][0] & $length & $a_UrlsArray[UBound($a_UrlsArray)-1][3]) next endIf endIf endIf endIf endIf  
      i hope any one can help me
      thanks in advance
    • By Colduction
      Hi dear friends!, i'm sorry for creating a new thread (a new problem), i have over than 9 lists that i want to combine them to be this (in this example, there are 3 test files):


      I've written a little code for splitting main information, but i really confused how to make results as "Output.txt", here is that code:
       
      $sRegex_1 = StringRegExp(FileRead("1.txt"), '(?s:(?<=\=\=\r\n)(.*?)(?=\r\n\=\=))', 3) $sRegex_2 = StringRegExp(FileRead("2.txt"), '(?s:(?<=\=\=\r\n)(.*?)(?=\r\n\=\=))', 3) $sRegex_3 = StringRegExp(FileRead("3.txt"), '(?s:(?<=\=\=\r\n)(.*?)(?=\r\n\=\=))', 3) For $i = 0 To UBound($sRegex_1) - 1 ConsoleWrite($sRegex_1[$i] & @CRLF) For $j = 0 To UBound($sRegex_2) - 1 ConsoleWrite($sRegex_2[$j] & @CRLF) For $k = 0 To UBound($sRegex_3) - 1 ConsoleWrite($sRegex_3[$k] & @CRLF) Next Next Next  
    • By nacerbaaziz
      hello evrybody
      here is an example about how to split your texts using a delimiter with the ability to select how much of delimiters shows in each colum  with $i_number
      e.g
      you have a long text and you want to split it in an array
      that evry colum have a number (n) of lines
      i made a function that do that for you
      just call it with a three params
      $s_text
      your text
      $i_number
      the number that you want to put in each col
      $s_siparator
      the siparator
      default is "|"
      here is the function with example
      i hope that it will be useful for you
       
      ****
       
      #include <Array.au3> $s_txt = "some text1some text2|some text3|some text4|some text5|some text6" $array = splitText($s_txt, 2) _ArrayDisplay($array) Func splitText($s_text, $i_number, $s_siparator = "|") Local $a_TXT = StringSplit($s_text, $s_siparator) Local $a_Return[$a_TXT[0] + 1] If ($a_TXT[0] <= $i_number) Or ($i_number <= 0) Then ReDim $a_Return[2] $a_Return[0] = 1 $a_Return[1] = $s_text Return $a_Return EndIf Local $i_Processed = 1, $i_arrayProcessed = 1 Do For $i = $i_Processed To ($i_Processed + $i_number) - 1 If ($a_TXT[0] < $i) Then ExitLoop If Not ($a_Return[$i_arrayProcessed]) Then $a_Return[$i_arrayProcessed] = $a_TXT[$i] Else $a_Return[$i_arrayProcessed] &= $s_siparator & $a_TXT[$i] EndIf $i_Processed += 1 Next $i_arrayProcessed += 1 Until ($a_TXT[0] < $i_Processed) ReDim $a_Return[$i_arrayProcessed] $a_Return[0] = $i_arrayProcessed - 1 Return $a_Return EndFunc ;==>splitText
      accept my greetings
      thanks to
      @Dan_555
      for his notes
       
×
×
  • Create New...