Sign in to follow this  
Followers 0
EndGame2013

Searching through massive array...

6 posts in this topic

#1 ·  Posted (edited)

Summary: I need to know the fastest way to search a very large array. Right now I am using a 2D array which may not be the fastest and restricts me to a lower amount of cells than I need. Some ideas I have had, but do not yet know how to implement, include possibly searching a very large excel sheet or mysql database instead. What is the fastest way of doing this. You can skip down to the code if want or read more if you like :)

Back to working on my lottery simulator/analyzer program. This is what I have so far, please ignore whatever sloppy messes you find. If there is a faster way of doing the search please let me know. The script basicaly allows you to simulate playing the megamillions lottery with however many numbers you want, however many times you want, and see the results. You can even simulate playing a set of numbers a hundred thousand times and see what you would have "won". Kind of fun to see how hard it is to win the jackpot, or anything close! So far its just a good tool to visualize playing thousands of draws of the lottery in just a few minutes.

The megamillions game will select 5 numbers out of a pool of 56, and one other number out of a different pool of 46. However with this "system" it is possible to play more than 5/1 numbers. For example if you want to you could play buy 5 tickets all with regular numbers 1 through 5 and each one with a different powerball number 1 through 5. I will not bother to explain the advantages/disadvantages to playing the lottery this way, but I will say that no matter what you do you can not increase your odds of winning in any way aside from buying more than one ticket with a different combination, no matter how you choose those combinations your odds will be the same. This type of tool is (or will be) designed to select some combinations of numbers (if say you wanted to buy 50 tickets in a lottery pool for your work group) that make sure to select combinations that will not be redundant.

For more information you can read this article (not written by me) http://www.colinfairbrother.com/GoldenLottoPlayingRules.aspx

Instructions:

At the top you just change the array size to match how many regular(white balls/first 5) numbers you want to cover (up to 56 which is the highest numbered regular ball in the game), you can change the actual picked numbers to whatever you want but I just use numbers in numerical order since the numbers you select don't really mean anything for analyzing the combinations, they are just used as "indexes" for combinations. If that makes no sense don't worry about it just select some numbers making sure you don't select the same number twice! Don't forget to change the size of the powerball numbers array to match how many powerballs you want to cover(up to 46 which is the highest numbered "powerball" in the game).

Then change the $rounds variable to match how many times you want to simulate playing the lottery with all possible combinations of those numbers. The more combinations in your "set" the longer it will take for each drawing to complete. This is the part that has do to with my question which I will get to in a moment. The script will then make all combinations of the numbers you picked in sets of 5, then it splits the results to make a 2D array, because I thought it would be easier to check each individual number that way (this may be part of the slowness). Then it tacks on the powerball numbers creating the final array of combinations, which can be very very big if you try and cover a lot of numbers.

When its done making this array it will display the number of tickets (combinations) you would have to buy to cover all possible combinations for the amount of regular numbers and powerball numbers you selected, after which it will begin simulating the lottery.

This is the part where I need help, in two different ways. First, taking what is there, how can I make this searching process faster? Would it be faster to leave the orginal array unsplit and then disect it in each loop? Or is it faster to just search a 2D array with many more elements, but be looking directly in the specific element with no disecting needed?

After running through all the drawings it tallying up your winnings (based on megamillions prizes) and compares it to the cost to buy the tickets each round. And finally you have an option to output it all to an excel sheet where you can look at each combination and see how many times it won each prize.

The second part of my question deals with the full set of combinations for the first 5 numbers of the game. Meaning all possible combinations of the numbers 1 through 56 in a set of 5. Which is 3,819,816. The maximum elements in an array is 16 million, and if I split the 3,819,816 five number combinations so I can search through it the way I did before, I would need 19,099,080 elements. Which is over 3 million more than the max for AutoIt.

So my problem arises from the need for a faster way to search a massive array... in fact possibly an array that is the near the maximum size. If I do not split the array and do the disecting of each 1D element in each loop (in order to compare each number in the combination) how much will it slow it down? Is there a faster way?

What I want to ultimately be able to do after resolving this problem is to be able to look at all possible combinations of those numbers, and remove any of them that have 3 or more numbers that are alike (except one). I am not sure which of the combinations that fit that criteria I would keep, and which to remove quite yet... I would for now just go with keeping the combination it is comparing with and removing any matches it compares against, then move down the array lexicographically and keep going till the end.

So here it is...

#include <Array.au3>
#include <RandUnique.au3>
#include <Excel.au3>
Dim $picks[7]=[20,9,17,23,50,33,7]
Dim $extraball[2]=[14,33]
Dim $rounds = 1000
ConsoleWrite("Creating regular number wheel..."&@lf)
Dim $fullwheel = _ArrayCombinations($Picks, 5, ",")
_ArrayDelete($fullwheel, 0)
Dim $splitted[UBound($fullwheel)][7]

for $i = 0 to UBound($splitted) - 1
$temp = StringSplit($fullwheel[$i], ",")
$splitted[$i][0] = $temp[1]
$splitted[$i][1] = $temp[2]
$splitted[$i][2] = $temp[3]
$splitted[$i][3] = $temp[4]
$splitted[$i][4] = $temp[5]
next


Global $complete[1][15]

$k = 0
$j = 0
$i = 0
ConsoleWrite("Creating full wheel with powerballs..."&@lf)
do
if $i = UBound($complete) then
ReDim $complete[$i+1000][15]
endif
$complete[$i][0] = $splitted[$j][0]
$complete[$i][1] = $splitted[$j][1]
$complete[$i][2] = $splitted[$j][2]
$complete[$i][3] = $splitted[$j][3]
$complete[$i][4] = $splitted[$j][4]
$complete[$i][5] = $extraball[$k]
If $k = UBound($extraball) - 1 Then
$j=$j+1
$k=0
Else
$k=$k+1
EndIf
$i = $i+1
until $j = UBound($splitted)


ConsoleWrite("Cleaning array please wait..."&@lf)
For $i = UBound($complete) - 1 To UBound($complete)- 1001 Step -1
;ConsoleWrite(" "&$i)
if $complete[$i][0] = "" Then
_arraydelete($complete, $i)
EndIf
next
ConsoleWrite("Total combinations using "&UBound($picks)&" regular numbers and "&UBound($extraball)&" power balls : "&UBound($complete)&@lf&"..."&@lf)
sleep(3000)
$match_5ball = 0
$match_5 = 0
$match_4ball = 0
$match_4 = 0
$match_3ball = 0
$match_3 = 0
$match_2ball = 0
$match_1ball = 0
$match_ball = 0
$done = 0
ConsoleWrite("Starting drawings..."&@lf)
sleep(2000)
for $k = 0 to $rounds

$draw = _RandomUnique(5, 1, 56, 1)
$powerball = _RandomUnique(1, 1, 46,1)
_arraydelete($draw,0)
_arraydelete($powerball,0)
ConsoleWrite("Drawing Number : "&$k&@lf&"Numbers Picked : "&$draw[0]&","&$draw[1]&","&$draw[2]&","&$draw[3]&","&$draw[4]&" * "&$powerball[0]&@lf)
for $i = 0 to UBound($complete)-1

if $complete[$i][0] = $draw[0] or $complete[$i][1] = $draw[0] or $complete[$i][2] = $draw[0] Or $complete[$i][3] = $draw[0] Or $complete[$i][4] = $draw[0] then
$first_draw = 1
Else
$first_draw = 0
EndIf

if $complete[$i][0] = $draw[1] or $complete[$i][1] = $draw[1] or $complete[$i][2] = $draw[1] or $complete[$i][3] = $draw[1] or $complete[$i][4] = $draw[1] then
$second_draw = 1
Else
$second_draw = 0
EndIf

if $complete[$i][0] = $draw[2] or $complete[$i][1] = $draw[2] or $complete[$i][2] = $draw[2] or $complete[$i][3] = $draw[2] or $complete[$i][4] = $draw[2] then
$third_draw = 1
Else
$third_draw = 0
EndIf

if $complete[$i][0] = $draw[3] or $complete[$i][1] = $draw[3] or $complete[$i][2] = $draw[3] or $complete[$i][3] = $draw[3] or $complete[$i][4] = $draw[3] then
$fourth_draw = 1
Else
$fourth_draw = 0
EndIf

if $complete[$i][0] = $draw[4] or $complete[$i][1] = $draw[4] or $complete[$i][2] = $draw[4] or $complete[$i][3] = $draw[4] or $complete[$i][4] = $draw[4] then
$fifth_draw = 1
Else
$fifth_draw = 0
EndIf

if $complete[$i][5] = $powerball[0] then
$power_draw = 1
Else
$power_draw = 0
EndIf

$regular_matches = $first_draw + $second_draw + $third_draw + $fourth_draw + $fifth_draw
Select
case $regular_matches = 5 and $power_draw = 1
$match_5ball = $match_5ball + 1
$complete[$i][6] = $complete[$i][6] + 1
$done = 1
case $regular_matches = 5 and $power_draw = 0
$match_5 = $match_5 + 1
$complete[$i][7] = $complete[$i][7] + 1
$done = 1
case $regular_matches = 4 and $power_draw = 1
$match_4ball = $match_4ball + 1
$complete[$i][8] = $complete[$i][8] + 1
case $regular_matches = 4 and $power_draw = 0
$match_4 = $match_4 + 1
$complete[$i][9] = $complete[$i][9] + 1
case $regular_matches = 3 and $power_draw = 1
$match_3ball = $match_3ball + 1
$complete[$i][10] = $complete[$i][10] + 1
case $regular_matches = 3 and $power_draw = 0
$match_3 = $match_3 + 1
$complete[$i][11] = $complete[$i][11] + 1
case $regular_matches = 2 and $power_draw = 1
$match_2ball = $match_2ball + 1
$complete[$i][12] = $complete[$i][12] + 1
case $regular_matches = 1 and $power_draw = 1
$match_1ball = $match_1ball + 1
$complete[$i][13] = $complete[$i][13] + 1
case $regular_matches = 0 and $power_draw = 1
$match_ball = $match_ball + 1
$complete[$i][14] = $complete[$i][14] + 1
EndSelect
$first_draw = 0
$second_draw = 0
$third_draw = 0
$fourth_draw =0
$fifth_draw = 0
$power_draw = 0
Next
if $done = 1 then
ExitLoop
EndIf

next
ConsoleWrite("Match 5+Ball : "&$match_5ball&" -- Match 5 : "&$match_5&" -- Match 4+Ball : "&$match_4ball&" -- Match 4 : "&$match_4&" -- Match 3+Ball : "&$match_3ball&" -- Match 3 : "&$match_3&" -- Match 2+Ball : "&$match_2ball&" -- Match 1+Ball : "&$match_1ball&" -- Match Ball : "&$match_ball&@lf)
$c1 = $match_5 * 250000
$c2 = $match_4ball * 10000
$c3 = $match_4 * 150
$c4 = $match_3ball * 150
$c5 = $match_3 * 7
$c6 = $match_2ball * 10
$c7 = $match_1ball * 3
$c8 = $match_ball * 2
$cash_back = $c1+$c2+$c3+$c4+$c5+$c6+$c7+$c8
$total_cash = $cash_back - UBound($complete)*$rounds
ConsoleWrite(" ---- Cash Back = "&$cash_back&" Total Cash = "&$total_cash&@lf)
ConsoleWrite("Creating Excel Sheet..."&@lf)
$make_sheet = MsgBox(4,"Make Excel Sheet?", "Make Excel Sheet?")
if $make_sheet = 6 then
_arrayinsert2d($complete, 0)
$complete[0][0] = "Pick 1"
$complete[0][1] = "Pick 2"
$complete[0][2] = "Pick 3"
$complete[0][3] = "Pick 4"
$complete[0][4] = "Pick 5"
$complete[0][5] = "PowerBall"
$complete[0][6] = "Match 5+Ball"
$complete[0][7] = "Match 5"
$complete[0][8] = "Match 4+Ball"
$complete[0][9] = "Match 4"
$complete[0][10] = "Match 3+Ball"
$complete[0][11] = "Match 3"
$complete[0][12] = "Match 2+Ball"
$complete[0][13] = "Match 1+Ball"
$complete[0][14] = "Match Ball"
$oExcel = _ExcelBookNew(1)
$b=$oExcel.transpose($complete)
$oExcel.Range("A1").Resize(UBound($complete),15) = $b
EndIf
;_arraydisplay($complete)

Oh by the way, I haven't really looked at it yet but why is my $done not working? It is supposed to stop the script if you win one of those higher jackpots (to see if your ahead) but it doesn't. Later I would also like to stop it when you have won a certain amount more than you have bet using similar methods.

I use random unique for the powerball even tho it only selects one just for the sake of using the same method of picking a random number.

Edited by EndGame2013

Share this post


Link to post
Share on other sites



EndGame2013,

You are missing functions _randomunique and _arrayinsert2d in your example code.

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites

#3 ·  Posted (edited)

Hmm that's weird. The forum popup thing must not of put it in correctly. I fixed the includes and added the functions.

Thx for pointing it out.

EDIT: I removed the code for those two functions because they are irrelevant. Random Unique just picks unique random numbers :idiot:, and array 2d insert just puts an empty row at a specific position and pushes everything else down.

I apologize for the long post, I essentially need to figure out a way to optimize the search part.

EDIT: Darn it... i just did this:

#include <Array.au3>
#include <RandUnique.au3>
#include <Excel.au3>
#include <WinAPI.au3>
#include <File.au3>
Dim $picks[56]
for $i = 0 to 55
$picks[$i] = $i + 1
next
ConsoleWrite("Creating regular number wheel..."&@lf)
Dim $fullwheel = _ArrayCombinations($Picks, 5, ",")
_ArrayDelete($fullwheel, 0)
ConsoleWrite("Outputting to Excel..."&@lf)
_ArrayToXLS($fullwheel, @ScriptDir & 'temp.xls')
ConsoleWrite("Done"&@lf)

Which took 52 minutes on my cheap laptop before I realized the maximum rows for excel 2010 is 65536... Does this apply to microsoft excels ability to use that many rows, or is that the max that the is actually written to that file? Can openoffice use more maybe Nevermind I just answered my own question... and Excel can only have about 16 million cells too! The same as the array maximum... Well I hope there isn't a maximum on mysql rows! I suppose I could use a text file to store the data, but how long would it take to load the data and split it into an array? Hopefully faster than creating the combinations initially.

Edit:

After doing some tests and then some math I found out that running the above script and outputting to and excel sheet takes a little over 52 minutes on my machine. Doing the same thing except outputting to a text file should take 16 minutes. So I will also assume that it will be faster to read from a text file than an excel sheet.

*Done making the text file.. which is 55.3 MB... :o now waiting for it to open in notepad....

Edited by EndGame2013

Share this post


Link to post
Share on other sites

To answer the original question, if the array can be sorted without a problem, then you can use _ArrayBinarySearch to do a lightning fast search of an array. If it is a 2D array, you can use the the UDF modification I posted that works with 2D arrays it's linked in my signature.


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Share this post


Link to post
Share on other sites

EndGame2013,

Just spit ballin'...have you considered an SQLite in memory DB solution?

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites

To answer the original question, if the array can be sorted without a problem, then you can use _ArrayBinarySearch to do a lightning fast search of an array. If it is a 2D array, you can use the the UDF modification I posted that works with 2D arrays it's linked in my signature.

I will check that out now. And yes the array is sorted in lexicographical order (for those who don't know what that means, think alphabetical for numbers) when it comes out of _ArrayCombinations.

EndGame2013,

Just spit ballin'...have you considered an SQLite in memory DB solution?

kylomas

I actually don't know what SQLite is. I know what MySQL is and have used it with PHP and AU3. Guess that ones "worth a google" too! :)

Thanks for setting me in the (hopefully) the right direction, both of you! I will update with results.

Note: I was finally able to load that text file in notepad last night and finally get a look at all 3.8 millionish combinations, albeit after about 5 minutes of "Not Responding" lockup.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0