iahngy

file to array limit?

25 posts in this topic

#1 ·  Posted (edited)

Hi, I'm trying to use FileRead() and convert the string to an array. Is there a limit, or can I put a file of any size into an array?

My file is 18 MB or larger.

The whole point is that I want to search such a big file for a unit ID and its failing test as fast as possible. (I will have more than one unit, possibly a long list of units, to search for in the file.)

The file has a lot of other info between the unit ID and the failing test. I would like to know whether searching an array is the faster way, or whether you can give me a hint on how to search faster.

Edited by iahngy


Hi,

You can use StringRegExp to extract the info you want.

Br, FireFox.


 


The only limit I know of for arrays is this one:

Maximum number of elements for an array: 16,777,216
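
To put that limit next to a real file, here is a minimal sketch that splits the file into lines and prints the line count. The path "file.txt" and the variable names are placeholders, not something taken from the original poster's script.

```autoit
; Sketch: split a file into lines and compare the count with the array limit.
; "file.txt" is a placeholder path.
Local Const $MAX_ELEMENTS = 16777216 ; the limit quoted above

Local $sText = FileRead("file.txt")
If @error Then
    ConsoleWrite("Could not read file.txt" & @CRLF)
    Exit 1
EndIf

; StringStripCR removes @CR, so splitting on @LF handles CRLF and LF files alike.
Local $aLines = StringSplit(StringStripCR($sText), @LF)

; With the default StringSplit flag, element [0] holds the number of pieces.
ConsoleWrite("Lines: " & $aLines[0] & " (array limit: " & $MAX_ELEMENTS & ")" & @CRLF)
```

A file of a few hundred thousand lines stays well below the limit, so the array size itself is unlikely to be the bottleneck here.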



Hi FireFox,

I wish I could get a pattern working so I can use StringRegExp without converting the FileRead result to an array.

I have trouble with the pattern.

For example, the file has a lot of info until the unit ID line, which starts with: 2_unitid_(id1234) (end of line)

Then come tons of other info lines, followed by the failing test line, which starts with: 3_failtest_(VoltageRegulator) (end of line)

I don't know how to put two searches into one regex.

I would really appreciate any example or hint.


Thanks, Water. The files have 500,000 lines so far, so that's fine. :)

I use _ArraySearch on this big array. I don't think it is as fast as running a regex on the string from FileRead, but so far I have trouble writing a regex pattern that looks for the two kinds of lines.


Post a sample of your input data and show us exactly what you need to extract. RegExp patterns are very powerful, but they aren't magic wands: they need precise, exhaustive conditions to work satisfactorily.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Hi jchd :)

For example, the file looks like this:

other info line 1
other info line 2
other info line 3
....

At line 294, the line looks like this: 2_unitid_unit12345

other info line 295
....

At line 466, the line looks like this: 3_failtest_iovoltage

other info line 467
...

Then, maybe at line 700, another unit shows up: 2_unitid_otherunit

and so on.

I'll try to put some code together to post, but so far mine fails. I'll post it here tomorrow.


#8 ·  Posted (edited)

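The answer to "what exactly you need to extract" is still rather vague. Something like this?

#include <Array.au3>
; Read the whole file into one string
$file = FileRead("file.txt")
; Flag 3: return an array with every line of the form <digits>_unitid_... or <digits>_failtest_...
$lines = StringRegExp($file, '(?m)(^\d+_(?:unitid|failtest)_.*)$', 3)
_ArrayDisplay($lines)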

Edited by mikell


Mikell, you're awesome.

It takes only 0.7 sec to list 200 units and their failing tests.


+_(?:unitid|failtest)_.*)

You're using a lookbehind or lookahead there, I guess?


#11 ·  Posted (edited)

Sorry for forgetting the comments

In Melba's style :  :)

'(?m)(^\d+_(?:unitid|failtest)_.*)$'

(?m)   :   multiline, ^ and $ match at newline sequences within data (by default multiline is off)
^\d+_   :   at the beginning of the line '^', one or more digits and an underscore
(?:unitid|failtest)  :  non-capturing group (?:) matching either 'unitid' or 'failtest'
_.*   :   an underscore and 0 or more characters (up to '$', newline at end of line)
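
To see the breakdown above in action without a real log file, here is a small sketch that runs the same pattern on a made-up in-memory sample. Note that (?:...) only groups the alternatives without capturing them; it is not a lookahead or lookbehind. The sample data and variable names are invented for illustration.

```autoit
#include <Array.au3>

; Made-up sample data in the format described in this thread.
Local $sData = "other info line 1" & @CRLF & _
        "2_unitid_unit12345" & @CRLF & _
        "other info line 295" & @CRLF & _
        "3_failtest_iovoltage" & @CRLF & _
        "2_unitid_otherunit" & @CRLF

; Flag 3 returns an array of all matches (here: the single capturing group per match).
Local $aLines = StringRegExp($sData, '(?m)(^\d+_(?:unitid|failtest)_.*)$', 3)
If @error Then
    ConsoleWrite("No marker lines found" & @CRLF)
Else
    _ArrayDisplay($aLines)
EndIf
```

The display should list only the unitid and failtest lines, leaving out the "other info" lines.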
Edited by mikell


That's good. But if you only want pairs of unit IDs with their corresponding failing test(s) (that is, only the units for which one or more test failures are reported), then the job is considerably more complex.



#13 ·  Posted (edited)

In this data file there is by design only one failing test per unit, but there is a lot of other info between the unit ID and the fail test.

Mikell, is it possible to insert a known unit ID into that pattern so that it pulls out the fail test for that unit only?

I tried different variations of your pattern, but none of them returned that unit and its fail test.

So I converted the array $lines to another string and used StringRegExpReplace to search for that unit's fail test, and it works.

I just wonder whether there is a direct way with your pattern.

$lines = StringRegExp($file, '(?m)(^\d+_(?:unitid|failtest)_.*)$', 3)

$s1 = _ArrayToString($lines, ',')
$units = 'unit1345'

$m = StringRegExpReplace($s1, ".*2_unitid_" & $units & "\,3_failtest_(\d+)\,.*", "\1")
Edited by iahngy


 is it possible to insert a known unit ID into that pattern so it pulls out the fail test for that unit only?

Yes, provided the failtest line comes after the corresponding unitid line in the file (the number of lines between them doesn't matter).

I'm not sure what exactly you need to get, so choose your flavour  :)

$units = 'unit12345'

; Flavour 1: returns the unitid line, the failtest line and the fail test name
$lines = StringRegExp($file, '(?s)(\d+_unitid_' & $units & ').*?(\d+_failtest_(\V*))', 3)
_ArrayDisplay($lines)

; Flavour 2: returns the fail test name only
$lines = StringRegExp($file, '(?s)\d+_unitid_' & $units & '.*?\d+_failtest_(\V*)', 3)
_ArrayDisplay($lines)
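
Since the original question mentions a long list of units, the second flavour can be wrapped in a small function and called once per ID. This is only a sketch: _FindFailTest, the file path and the example IDs are hypothetical names, and the \Q...\E quoting and \b boundary are additions that are not in the pattern above.

```autoit
; Hypothetical helper: returns the fail test that follows $sUnit in $sText, or "" if none is found.
; Const ByRef: the text is only read, never modified.
Func _FindFailTest(Const ByRef $sText, $sUnit)
    ; \Q...\E quotes the ID literally; \b keeps "unit1" from matching "unit12".
    ; Assumes the ID ends with a word character (letter, digit or underscore).
    Local $sPattern = '(?s)\d+_unitid_\Q' & $sUnit & '\E\b.*?\d+_failtest_(\V*)'
    Local $aMatch = StringRegExp($sText, $sPattern, 1) ; flag 1: first match only
    If @error Then Return ""
    Return $aMatch[0]
EndFunc   ;==>_FindFailTest

Local $sFile = FileRead("file.txt") ; placeholder path
Local $aUnits[2] = ["unit12345", "otherunit"] ; example IDs

For $i = 0 To UBound($aUnits) - 1
    ConsoleWrite($aUnits[$i] & " -> " & _FindFailTest($sFile, $aUnits[$i]) & @CRLF)
Next
```

One caveat of this approach: because `.*?` may run across any number of lines, a unit that has no failtest line of its own will pick up the failtest of a later unit.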


#15 ·  Posted (edited)

Answering my own question about how we can extract only corresponding IDs with their test failure (hence ignoring units which don't fail), here's the pattern which fits the bill:

$lines = StringRegExp($file, "(?imx) ^2_unitid_ (\w+) .* \R (?: (?! 2_unitid_) .* \R )* 3_failtest_ (\w+)", 3)
_ArrayDisplay($lines)

Use this sample file to test and please report if result is incorrect with real-world input:

top of file

some stuff

2_unitid_id1    
5_gebho
89_fhnof
3_failtest_VoltageRegulator
2_unitid_id2    
5_gebho
3_failtest_FrontPanel
89_fhnof

2_unitid_id3    
5_gebho
89_fhnof
2_unitid_id4    
5_gebho
89_fhnof
2_unitid_id5    
5_gebho
89_fhnof
3_failtest_CurrentLimiter

2_unitid_id6
3_failtest_SurgeSuppressor
other stuff

end of file
Edited by jchd


#16 ·  Posted (edited)

Thank you again, Mikell, for the second code. It works awesomely.

jchd, yours gives an error: >Exit code: -1073741571    Time: 4.441

Well, I tried to learn regex to make the search faster. But for this kind of huge data file, $s = FileRead($path) takes about 30 seconds to complete for a 12.5 MB file, even when I tried the binary read flag.

I'm thinking of using Perl; I think it might be faster. I'll report back later. :)

I'm pretty much a newbie at Perl, and the testers at my work all use Perl for most things.

Edited by iahngy


Your assumption is incorrect and the code actually works. It's just that the nature of the search requires heavy backtracking which causes PCRE stack exhaustion with really large files. Try the same pattern on a smaller file (or try with Perl, I guess that it uses a bigger stack) and you'll see that it works as intended.
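
For large files, one possible way around the heavy backtracking is to split the work in two: a simple line-level pattern that extracts only the marker lines, followed by a plain loop that pairs each fail test with the most recent unit ID. This is a sketch of an alternative, not the pattern posted above; the path and variable names are placeholders, and it assumes at most one fail test per unit, as stated earlier in the thread.

```autoit
Local $sFile = FileRead("file.txt") ; placeholder path

; Pass 1: pull out only the marker lines. The pattern never spans more than one line.
Local $aMarks = StringRegExp($sFile, '(?m)^\d+_(unitid|failtest)_(\w+)', 3)
If @error Then
    ConsoleWrite("No marker lines found" & @CRLF)
    Exit 1
EndIf

; Pass 2: $aMarks is flat: [kind, value, kind, value, ...].
; Pair each failtest with the latest unit ID seen before it.
Local $sUnit = ""
For $i = 0 To UBound($aMarks) - 2 Step 2
    If $aMarks[$i] = "unitid" Then
        $sUnit = $aMarks[$i + 1]
    ElseIf $sUnit <> "" Then
        ConsoleWrite($sUnit & " -> " & $aMarks[$i + 1] & @CRLF)
        $sUnit = "" ; one fail test per unit
    EndIf
Next
```

Units that are followed by another unit ID before any fail test are simply overwritten in $sUnit, so they never get printed.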

What setup are you using? It's hard to believe that any AutoIt program needs 30 s to FileRead 12.5 MB off any disk, even from a sluggish 5400 rpm device allergic to caffeine.
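
To check where the time actually goes, the read can be timed on its own, separately from any regex work. A minimal sketch, with "file.txt" as a placeholder path:

```autoit
Local $sPath = "file.txt" ; placeholder path

Local $hTimer = TimerInit()
Local $sFile = FileRead($sPath)
Local $iErr = @error ; save @error before another function call resets it
Local $fMs = TimerDiff($hTimer) ; elapsed time in milliseconds

If $iErr Then
    ConsoleWrite("FileRead failed, @error = " & $iErr & @CRLF)
Else
    ConsoleWrite("Read " & StringLen($sFile) & " characters in " & Round($fMs) & " ms" & @CRLF)
EndIf
```

If this reports well under 30 s, the delay comes from a later step in the script rather than from FileRead itself.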



I'll work through your code tomorrow. I'll take a break now: I was too excited last night with different trials and didn't get enough sleep. Hehe


jchd, it takes 30 sec just for the FileRead.


I'll try your code tomorrow on a smaller file.
