Opened 17 years ago
Closed 17 years ago
#599 closed Feature Request (Completed)
_FileListToArray costs O(N^2) instead of O(NlogN)
| Reported by: | code65536 | Owned by: | Gary |
|---|---|---|---|
| Milestone: | 3.2.13.10 | Component: | Standard UDFs |
| Version: | Severity: | None | |
| Keywords: | Cc: |
Description
Memory re-allocation is an expensive operation and as such, should be avoided. For _FileListToArray to read a directory with N entries, it will need to perform N ReDim operations, and with each ReDim operation costing O(N) (assuming that ReDim is implemented efficiently), _FileListToArray will cost O(N2).
A very common way to mitigate this is to double the array size with each re-allocation instead of just incrementing it. Under such a system, you will greatly reduce the number of ReDim operations, down to log2(N), bringing the final asymptotic cost of the function down to O(NlogN), which is optimal.
I am attaching a patch that will fix this problem.
With large directories with about 300 files, implementing this simple change reduced the time spent by over half (from ~10s for 100 runs down to ~4.8s for 100 runs in my benchmark). Proportionally, the performance win will be even larger for directories with more entries.
Attachments (2)
Change History (9)
by , 17 years ago
| Attachment: | bug599.patch added |
|---|
comment:1 by , 17 years ago
Note that with this patch, UBound will no longer correspond to the number of entries found, but since the first element of the result array contains the number of entries, I don't think that this would be a problem.
If people do think that this is a problem--that people are using UBound instead of the first array element, you can add a final ReDim to the end, right before the return to "trim" away any excess size from the array. The function will still retain O(NlogN) performance.
comment:2 by , 17 years ago
The function must trim to the correct size. People do use UBound() because it's the correct way. The size in element 0 is a carry-over convention from before UBound() existed and is thus largely pointless and redundant.
comment:3 by , 17 years ago
| Type: | Bug → Feature Request |
|---|
comment:5 by , 17 years ago
| Milestone: | → 3.2.13.10 |
|---|---|
| Resolution: | → Fixed |
| Status: | new → closed |
comment:6 by , 17 years ago
| Resolution: | Fixed |
|---|---|
| Status: | closed → reopened |
comment:7 by , 17 years ago
| Resolution: | → Completed |
|---|---|
| Status: | reopened → closed |

patch