Command line tools to generate combinations?



Hello all,

      I am working on a project with AutoIt, and part of it needs to generate all 8-number combinations out of 70 numbers. I tried AutoIt's built-in function _ArrayCombinations, but it is very slow. Is there any command-line program that takes a text file as a parameter (or reads from the console), so that I can run it from AutoIt to generate the combinations? Thanks.


Regds
LAM Chi-fung


Unfortunately, I can't follow your description completely.

What do the 70 numbers look like, and according to which rule should the combinations be generated? Please describe your problem with examples and/or post your current code (even if it is slow).


51 minutes ago, Musashi said:

Unfortunately, I can't follow your description completely.

What do the 70 numbers look like, and according to which rule should the combinations be generated? Please describe your problem with examples and/or post your current code (even if it is slow).

I am using this code to generate combinations:

$ac=_arraycombinations($CombWork,8,",")

where $CombWork contains 70 numbers (all < 100).

 

Regds

LAM Chi-fung


I don't know if this helps you, but anyway...

Local $string = 'A,B,C,D,E,a,b,c,d,1,2,3,4,@,+'
Local $StrLength = StringLen($string)
Local $NoLoop = 5000
Local $No_ofDigits = 8
Local $sText
Local $index
For $i = 1 To $NoLoop
    $sText = ''
    For $k = 1 To $No_ofDigits
        $index = Random ( 1, $StrLength, 1) ; random position in $string
        If Mod($index, 2) = 0 Then $index = $index + 1 ; even positions are commas, so step to the next character
        $sText &= StringMid($string, $index, 1)
    Next
    ConsoleWrite( $sText & @CRLF )
Next

 


@CFLam You do realize that generating combinations of 8 out of 70 elements will produce 9.44035092 E+9 combinations (about 10,000,000,000)? And how fast do you want the result? And where will those combinations end up? A file? An RDBMS? Certainly not an array, I hope!

There is certainly a good reason for asking for help with this, but I am curious to know the purpose of the exercise.

I just noticed that you have worked on this before:

https://www.autoitscript.com/forum/topic/193388-generate-unique-8-numbers-combination-from-8-arrays-issue-slow/
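
To double-check those figures, here is a minimal sketch of the count itself. _BinomialCount is a hypothetical helper name (it is not part of AutoIt or of any UDF), and the 25 bytes per line is an assumed upper bound: eight numbers of at most two digits, seven commas and a CRLF.

```autoit
; Hypothetical helper, not a built-in: number of ways to pick $iR items out of $iN.
Func _BinomialCount($iN, $iR)
    If $iR < 0 Or $iR > $iN Then Return 0
    If $iR > $iN - $iR Then $iR = $iN - $iR ; symmetry: nCr = nC(n-r)
    Local $fCount = 1
    For $i = 1 To $iR
        ; After step $i the value equals C(n-r+i, i), so the division is always exact
        $fCount = $fCount * ($iN - $iR + $i) / $i
    Next
    Return Int(Round($fCount))
EndFunc

Local $iLines = _BinomialCount(70, 8)
ConsoleWrite("70C8 lines: " & $iLines & @CRLF)
; Assumed worst case of 25 bytes per line (see the note above)
ConsoleWrite("Approx. size in GB: " & Round($iLines * 25 / 1000000000, 1) & @CRLF)
ConsoleWrite("50C6 lines: " & _BinomialCount(50, 6) & @CRLF)
```

For 70C8 this works out to 9,440,350,920 lines, so roughly 236 GB per file under that assumption, which is why the destination of the output matters so much.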

 


3 hours ago, Nine said:

@CFLam You do realize that generating combinations of 8 out of 70 elements will produce 9.44035092 E+9 combinations (about 10,000,000,000)? And how fast do you want the result? And where will those combinations end up? A file? An RDBMS? Certainly not an array, I hope!


 

The end result will be put into a text file, and a later program will read the file line by line to process it. I need to perform around 4,000 nCr runs (i.e. 4,000 times 70C8), and if each one takes over an hour... That is why I am looking for a command-line substitute for _ArrayCombinations.

 

That post was for a project I worked on in 2018; the current issue is not related to the previous topic.

 

Regds

LAM Chi-fung


4 hours ago, jugador said:

I don't know if this helps you, but anyway...


 

Thanks, but I need to generate all combinations from a $string that has 70 numbers...


32 minutes ago, CFLam said:

I need to perform around 4,000 nCr of it (i.e. perform 4000 70C8)

So you need to create 4000 files containing around 10B lines each based on random selection of 70 numbers < 100 ?  Is that it ?


32 minutes ago, Nine said:

So you need to create 4000 files containing around 10B lines each based on random selection of 70 numbers < 100 ?  Is that it ?

In the worst case, yes. Actually, I do not need to perform all 4,000 runs at one time; I can analyse the results and proceed step by step, e.g. generate 200 files first, read them in (analyse them) and, if they are not satisfactory, generate the next 200 files.

I just found that I can reduce the 70C8 to a few (3~5) 50C6 for my project, i.e. from 4,000 70C8 runs to 12,000~20,000 50C6 runs, but _ArrayCombinations is still very slow for 50C6...

 

Regds
LAM Chi-fung


OK, here is my latest test (under 25 secs) for a 50C6.

Script 1 (needs to be compiled into PrintCombi.exe):

#include <Constants.au3>

; $iN = size of each combination, $iR = number of items to choose from
Const $iN = 6, $iR = 50
Local $aArray[$iR]
For $i = 0 To UBound($aArray) - 1
  $aArray[$i] = $i + 1
Next

; Range of first positions (1-based) handled by this process, taken from the command line
Local $iStart = $CmdLine[1], $iEnd = $CmdLine[2]
;Local $iStart = 1, $iEnd = 44

Local $sOutName = $iStart & ".txt"
Local $hOutFile = FileOpen($sOutName, $FO_OVERWRITE)

Local $iNum = 0

Local $hTimer = TimerInit(), $sResult

For $i = $iStart - 1 To $iEnd - 1
  $sResult = ""
  For $j = $i + 1 To $iR - $iN + 1
    For $k = $j + 1 To $iR - $iN + 2
      For $l = $k + 1 To $iR - $iN + 3
        For $m = $l + 1 To $iR - $iN + 4
          For $n = $m + 1 To $iR - $iN + 5
;            For $o = $n + 1 To $iR - $iN + 6
;              For $p = $o + 1 To $iR - $iN + 7
                $iNum += 1
                $sResult &= $aArray[$i] & "," & $aArray[$j] & "," & $aArray[$k] & "," & $aArray[$l] & "," & $aArray[$m] & "," & $aArray[$n] & @CRLF
;              Next
;            Next
          Next
        Next
      Next
    Next
  Next
  FileWrite($hOutFile, $sResult)
Next

FileClose($hOutFile)

; MsgBox ($MB_SYSTEMMODAL, $iStart, TimerDiff($hTimer) & "/" & $iNum) ; to check which process finishes when

Script 2 (Main.au3)

#include <Constants.au3>

Local $aPID[4]
$aPID[0] = Run("PrintCombi.exe 1 2")
$aPID[1] = Run("PrintCombi.exe 3 6")
$aPID[2] = Run("PrintCombi.exe 7 10")
$aPID[3] = Run("PrintCombi.exe 11 44")

Local $hTimer = TimerInit()

While True
  Sleep (500)
  For $i = 0 to 3
    If ProcessExists($aPID[$i]) Then ContinueLoop 2
  Next
  ExitLoop
WEnd

MsgBox ($MB_SYSTEMMODAL, "Over All", TimerDiff($hTimer))

I removed 2 loops in PrintCombi.au3 since it is now a C6. You can add more Run calls in Main.au3 if you have more processors (I have 4 on this machine).
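
The ranges passed to PrintCombi.exe decide how evenly the work is spread: low first elements produce far more combinations than high ones. Here is a rough sketch of how the cut points could be computed rather than picked by hand. _Binomial and _BalancedRanges are hypothetical names, and the greedy split is only one possible heuristic, not necessarily how the ranges above were chosen.

```autoit
; Hypothetical helper: number of ways to pick $iR items out of $iN.
Func _Binomial($iN, $iR)
    If $iR < 0 Or $iR > $iN Then Return 0
    Local $fCount = 1
    For $i = 1 To $iR
        $fCount = $fCount * ($iN - $iR + $i) / $i ; exact at every step
    Next
    Return Round($fCount)
EndFunc

; Hypothetical helper: splits the possible first elements (1-based) into at most
; $iParts ranges holding roughly the same number of combinations.
; Returns a 2D array [part][0] = first start value, [part][1] = last start value;
; @extended holds the number of parts actually used.
Func _BalancedRanges($iTotal, $iPick, $iParts)
    Local $iFirstMax = $iTotal - $iPick + 1 ; largest possible first element
    Local $aRanges[$iParts][2]
    Local $fAll = _Binomial($iTotal, $iPick), $fSeen = 0
    Local $iPart = 0, $iStart = 1
    For $i = 1 To $iFirstMax
        ; combinations whose smallest element is $i
        $fSeen += _Binomial($iTotal - $i, $iPick - 1)
        ; close the current part once it has reached its share, or at the very end
        If $fSeen >= $fAll * ($iPart + 1) / $iParts Or $i = $iFirstMax Then
            $aRanges[$iPart][0] = $iStart
            $aRanges[$iPart][1] = $i
            $iStart = $i + 1
            $iPart += 1
            If $iPart = $iParts Then ExitLoop
        EndIf
    Next
    Return SetExtended($iPart, $aRanges)
EndFunc

Local $aRanges = _BalancedRanges(50, 6, 4)
Local $iUsed = @extended
For $p = 0 To $iUsed - 1
    ConsoleWrite("PrintCombi.exe " & $aRanges[$p][0] & " " & $aRanges[$p][1] & @CRLF)
Next
```

For 50 items, 6 per combination and 4 processes this should suggest the ranges 1-3, 4-6, 7-10 and 11-45. Equal combination counts do not guarantee equal run times, so treat the result as a starting point.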

All that is left to do is to run Copy 1.txt+3.txt+7.txt+11.txt All.txt in a DOS console. Or you could include it in the main script if you wish.
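
A minimal sketch of doing that merge from the main script, assuming the part files sit in the current working directory. _MergeParts is a hypothetical helper name; it simply hands the same copy command to cmd.exe.

```autoit
; Hypothetical helper: concatenates the files listed in $aParts, in order, into $sOutFile.
Func _MergeParts(Const ByRef $aParts, $sOutFile)
    Local $sList = ""
    For $i = 0 To UBound($aParts) - 1
        If $i > 0 Then $sList &= "+"
        $sList &= '"' & $aParts[$i] & '"'
    Next
    ; /b = binary copy (no end-of-file character appended), /y = overwrite without asking
    Local $sCmd = @ComSpec & ' /c copy /b /y ' & $sList & ' "' & $sOutFile & '"'
    Local $iExit = RunWait($sCmd, @WorkingDir, @SW_HIDE)
    If @error Then Return SetError(1, 0, False)
    Return ($iExit = 0)
EndFunc
```

Call it once the wait loop has finished, passing an array with the part file names in the order they should appear in the merged file; it returns True when copy reports success.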


7 hours ago, CFLam said:

In the worst case, yes. Actually, I do not need to perform all 4,000 runs at one time; I can analyse the results and proceed step by step, e.g. generate 200 files first, read them in (analyse them) and, if they are not satisfactory, generate the next 200 files.

It would certainly help if you could fully explain the criterion you use to decide if a given subset satisfies your requirements. You probably could use that to constrain the generation to directly yield only subsets which you regard as "satisfactory".

Also AutoIt is probably not the best tool for such heavy lifting. For instance, 50C6 (with the items being integers) takes about one second to compute using Mathematica. Of course the full result (a huge list) would take very long to display, something I've omitted here:

In[1]:= Timing[Subsets[Table[x, {x, 1, 50}], {6}];]

Out[1]= {1.01563, Null}

You're likely to get the same kind of speed with similar (and much cheaper) CAS (computer algebra systems, i.e. mathematical software) like SageMath, Magma, GAP, PARI/GP and other free or low-cost packages. Commercial CASes include Maple, Mathematica, and many more. See https://en.wikipedia.org/wiki/Category:Computer_algebra_systems
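
To illustrate what constraining the generation could look like, here is a sketch with a made-up criterion, since the real one has not been described: only subsets whose sum does not exceed a limit are wanted. _CountBounded is a hypothetical name, and the sketch assumes the values are positive and sorted in ascending order.

```autoit
; Hypothetical example: counts the $iR-element subsets of $aVals whose sum is <= $iMaxSum.
; Assumes $aVals holds positive numbers in ascending order.
Func _CountBounded(Const ByRef $aVals, $iR, $iMaxSum, $iFrom = 0, $iSum = 0)
    If $iR = 0 Then Return 1 ; a complete subset that passed every check
    Local $iCount = 0
    For $i = $iFrom To UBound($aVals) - $iR
        ; Values ascend, so once one candidate overshoots, all later ones do too:
        ; the whole remaining branch is skipped instead of being generated and rejected
        If $iSum + $aVals[$i] > $iMaxSum Then ExitLoop
        $iCount += _CountBounded($aVals, $iR - 1, $iMaxSum, $i + 1, $iSum + $aVals[$i])
    Next
    Return $iCount
EndFunc

Local $aDemo[5] = [1, 2, 3, 4, 5]
; Pairs with a sum of at most 5: 1+2, 1+3, 1+4 and 2+3
ConsoleWrite(_CountBounded($aDemo, 2, 5) & @CRLF)
```

The same pattern applies to any criterion that can be checked on a partial subset: the earlier a branch can be rejected, the fewer lines ever need to be written to disk.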


8 hours ago, Nine said:

OK, here is my latest test (under 25 secs) for a 50C6

Kudos!

Modified your quite clever loop-arama :) (Hope you don't mind!)

I dropped the multi-process part, only because the 50C6 runs fast enough and I wanted to keep things simple so that performance is easier to compare.

I dropped the array because I wasn't sure what it was doing. Maybe it was left over from an _ArrayCombinations() version?

By the way, I think you might be missing the last line/combo on the PrintCombi 11-44 run (unless I'm mistaken):

45,46,47,48,49,50

It ran in 32 seconds for the whole 50C6!

#include <Constants.au3>

Local $hTimer = TimerInit(), $sResult=""
Const $iN = 6, $iR = 50
Local $hOutFile = FileOpen("sets.txt", $FO_OVERWRITE)

For $i = 1 To $iR - $iN + 1
  For $j = $i + 1 To $iR - $iN + 2
    For $k = $j + 1 To $iR - $iN + 3
      For $l = $k + 1 To $iR - $iN + 4
        For $m = $l + 1 To $iR - $iN + 5
          For $n = $m + 1 To $iR - $iN + 6
             $sResult &= $i & "," & $j & "," & $k & "," & $l & "," & $m & "," & $n & @CRLF
          Next
        Next
      Next
    Next
  Next
  FileWrite($hOutFile, $sResult)
  $sResult = ""
Next

FileClose($hOutFile)

ConsoleWrite("Run Time: "& Round(TimerDiff($htimer)/1000,3) &@CRLF)
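
The nested loops fix the combination size at six. Purely as a sketch, the same lexicographic order can be produced for any size by stepping an array of indices. _NextCombination is a hypothetical helper name, and the five values below are made up for the demonstration. It will most likely be slower than the hard-coded loops because of the function call per combination.

```autoit
; Hypothetical helper: advances $aIdx (ascending 0-based indices into a list of $iN items)
; to the next combination in lexicographic order. Returns False after the last one.
Func _NextCombination(ByRef $aIdx, $iN)
    Local $iR = UBound($aIdx)
    Local $i = $iR - 1
    ; Find the rightmost index that has not yet reached its maximum ($iN - $iR + $i)
    While $i >= 0
        If $aIdx[$i] < $iN - $iR + $i Then ExitLoop
        $i -= 1
    WEnd
    If $i < 0 Then Return False
    $aIdx[$i] += 1
    ; Reset everything to the right of it to the smallest valid values
    For $j = $i + 1 To $iR - 1
        $aIdx[$j] = $aIdx[$j - 1] + 1
    Next
    Return True
EndFunc

; Demo with made-up values: all 3-element combinations of 5 numbers
Local $aVals[5] = [3, 8, 21, 40, 77]
Local $aIdx[3] = [0, 1, 2] ; the first combination
Local $iCount = 0
Do
    ConsoleWrite($aVals[$aIdx[0]] & "," & $aVals[$aIdx[1]] & "," & $aVals[$aIdx[2]] & @CRLF)
    $iCount += 1
Until Not _NextCombination($aIdx, UBound($aVals))
ConsoleWrite($iCount & " combinations" & @CRLF)
```

Because the indices point into an array of values, the numbers themselves do not have to be consecutive.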

 


2 hours ago, JockoDundee said:

it ran in 32 seconds

Your posted script took 51 seconds on my aging i7-6700 @ 3.6 GHz. What kind of monster CPU do you have? I want one 😛


35 minutes ago, Werty said:

what kind of monster cpu do you have?

It’s an i9-10900K. I put off getting a new box for years because the CPUs didn’t seem to be making any dramatic progress, at least as far as single-threaded tasks are concerned (go AutoIt!). Then @Earthshine convinced me :)


13 hours ago, Nine said:

OK, here is my latest test (under 25 secs) for a 50C6.


Thanks for your help; it takes 15 seconds to run on my machine (I compiled it as x64). I am now trying to modify the program, since in my case the input values look like this: 1,2,4,5,6,8,9...,28,35,42,43,...77,78 (50 numbers in total), and I need to generate combinations of 6 from them.

 

Regds

LAM Chi-fung


8 hours ago, jchd said:

It would certainly help if you could fully explain the criterion you use to decide if a given subset satisfies your requirements. You probably could use that to constrain the generation to directly yield only subsets which you regard as "satisfactory".


My input values look like this:

1,2,3,5,6,....22,28,32,.....85,88... (50 numbers in total), and I need to generate combinations of 6 from them,

e.g. 1,2,3,22,28,32 or 6,22,28,32,85,88, etc.
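
A minimal sketch of one way to get lines like these from an arbitrary list: keep the nested loops over positions, as in Nine's script, and look the actual numbers up in an array. _Combos6ToFile is a hypothetical helper name, and the comma-separated input string is an assumption about how the numbers arrive.

```autoit
#include <FileConstants.au3>
#include <StringConstants.au3>

; Hypothetical helper: writes every 6-number combination of the comma-separated
; values in $sCsv to $sOutFile, one per line, and returns the number of lines written.
Func _Combos6ToFile($sCsv, $sOutFile)
    Local $a = StringSplit($sCsv, ",", $STR_NOCOUNT) ; 0-based array of the values
    Local $iLast = UBound($a) - 1
    Local $hOut = FileOpen($sOutFile, $FO_OVERWRITE)
    If $hOut = -1 Then Return SetError(1, 0, 0)
    Local $iCount = 0, $sBuf
    ; The loops run over positions in $a, so the values need not be consecutive
    For $i = 0 To $iLast - 5
        $sBuf = ""
        For $j = $i + 1 To $iLast - 4
            For $k = $j + 1 To $iLast - 3
                For $l = $k + 1 To $iLast - 2
                    For $m = $l + 1 To $iLast - 1
                        For $n = $m + 1 To $iLast
                            $sBuf &= $a[$i] & "," & $a[$j] & "," & $a[$k] & "," & $a[$l] & "," & $a[$m] & "," & $a[$n] & @CRLF
                            $iCount += 1
                        Next
                    Next
                Next
            Next
        Next
        FileWrite($hOut, $sBuf) ; one write per first position keeps the buffer bounded
    Next
    FileClose($hOut)
    Return $iCount
EndFunc
```

With seven input values, for example, the call should write the 7 possible lines; with 50 values it writes the full 50C6.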

 

Regds

LAM Chi-fung


Yes.

But how do you "analyse" and consider "satisfactory" (your words) the output combinations? Which criterion or algorithm do you use?


43 minutes ago, jchd said:

Yes.

But how do you "analyse" and consider "satisfactory" (your words) the output combinations? Which criterion or algorithm do you use?

It seems that modifying the program to generate combinations from non-consecutive numbers is not trivial...

The input of the 50C6 is the output of another program, i.e. the 50 numbers need not be consecutive and must be below 100. I need to combine each line of the 50C6 output from 3 (to 5) results into a file similar to this (remember that I broke it down from 70C8 into a few 50C6):

1,2,3,4,8,77,89,92

...

The project is related to "eight characters", i.e. a kind of Asian fortune telling based on a person's birthday (year, month, day and hour). What counts as satisfactory is an "eight characters" set belonging to a person who (should) have good fortune (based on some special criteria). Personally, I do not quite believe in this, but the client does. ~_~ Why work in AutoIt for this? Because the project needs to interact with Excel a lot (it needs to control Excel to work out the results in the previous stage).

 

Regds

LAM Chi-fung
