Marlo Posted November 18, 2012 (edited)

So I have a ~6 MB file that is formatted like so:

{
"realm":{"name":"Someserver","slug":"someserver"},
"side1":{"data":[
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},
"Side2":{"data":[
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},
"Side3":{"data":[
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},
{"auc":9999999999,"item":01234,"owner":"SomeNáme","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]}
}

Bear in mind that the file can often contain 50k lines of this stuff. I started by reading the file line by line and parsing each line with a simple regular expression that extracts the basic info, then pushing the result into an in-memory SQLite database. Even so, it takes upwards of 30 seconds to process a whole file (at about 30-50% CPU usage).

Here is the RegExp I used:

^.*?{.?auc":(\d*).*?"item":(\d*),"owner":"([\w]+)","bid":(\d*),"buyout":(\d*),"quantity":(\d*),"timeLeft":"([a-zA-Z_]+)"}

I am new to RegExp, so my method is probably very bad :/ Does anyone know a better way for me to do this? My way feels way too clunky.
Edited November 18, 2012 by Marlo
KaFu Posted November 18, 2012 (edited)

Reading the file line by line is the bottleneck. A simple approach is to use _FileReadToArray(), which should be able to handle a 6 MB input file. Loop through the resulting array and apply your RegEx to each element. For larger files I would recommend writing your own parser that reads the file in chunks of, e.g., 1 MB: look for the last line break in the buffer (StringInStr() with an occurrence of -1) and parse the data up to that point, then carry the rest over into a new buffer and read the next 1 MB chunk. Splitting the lines with a RegExp should already be quite fast.

Edited November 18, 2012 by KaFu
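The chunked approach described above might look roughly like the following. This is a minimal sketch, not KaFu's code: the file name, $iChunk, and _ProcessBlock() are hypothetical, and FileOpen() is left at its default read mode, so any encoding flags the real file needs are an open assumption.

```autoit
; Sketch only: $sPath, $iChunk and _ProcessBlock() are hypothetical names.
Local Const $iChunk = 1024 * 1024 ; read roughly 1 MB per call
Local $sPath = @ScriptDir & "\auctions.json" ; assumed file name

Local $hFile = FileOpen($sPath, 0) ; 0 = read mode
If $hFile = -1 Then
    MsgBox(16, "Error", "Cannot open " & $sPath)
    Exit 1
EndIf

Local $sRest = "", $sBuf, $iPos
While 1
    $sBuf = FileRead($hFile, $iChunk)
    If StringLen($sBuf) = 0 Then ExitLoop ; nothing more to read
    $sBuf = $sRest & $sBuf ; prepend the partial line carried over
    $iPos = StringInStr($sBuf, @LF, 1, -1) ; last line break in the buffer
    If $iPos = 0 Then
        $sRest = $sBuf ; no complete line yet, keep accumulating
        ContinueLoop
    EndIf
    _ProcessBlock(StringLeft($sBuf, $iPos)) ; complete lines only
    $sRest = StringMid($sBuf, $iPos + 1) ; carry the partial line over
WEnd
If $sRest <> "" Then _ProcessBlock($sRest) ; text after the last line break
FileClose($hFile)

Func _ProcessBlock($sBlock)
    ; Placeholder: apply the regular expression to $sBlock here.
    ConsoleWrite(StringLen($sBlock) & " characters processed" & @CRLF)
EndFunc   ;==>_ProcessBlock
```

The loop stops on an empty read rather than on @error, so a final chunk shorter than $iChunk is still handed to _ProcessBlock().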
AZJIO Posted November 18, 2012 (edited)

#include <Array.au3>

$sText = _
        '{' & @CRLF & _
        '"realm":{"name":"Someserver","slug":"someserver"},' & @CRLF & _
        '"side1":{"data":[' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},' & @CRLF & _
        '"Side2":{"data":[' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]},' & @CRLF & _
        @CRLF & _
        '"Side3":{"data":[' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"},' & @CRLF & _
        '{"auc":9999999999,"item":01234,"owner":"SomeName","bid":999999,"buyout":999999,"quantity":999,"timeLeft":"VERY_LONG"}]}' & @CRLF & _
        '}'

; MsgBox(0, "Message", $sText)

; Flag 3 returns every captured group of every match in one flat array.
$aText = StringRegExp($sText, '(?m)^.*?{"auc":(\d*).*?"item":(\d*),"owner":"([\w]+)","bid":(\d*),"buyout":(\d*),"quantity":(\d*),"timeLeft":"(\w+)"}', 3)
If Not @error Then
    $n = UBound($aText)
    ; Element [0][0] holds the record count; the records start at row 1.
    Local $aText2D[$n / 7 + 1][7] = [[$n / 7]]
    For $i = 0 To $n - 1 Step 7
        $d = $i / 7 + 1
        $aText2D[$d][0] = $aText[$i]
        $aText2D[$d][1] = $aText[$i + 1]
        $aText2D[$d][2] = $aText[$i + 2]
        $aText2D[$d][3] = $aText[$i + 3]
        $aText2D[$d][4] = $aText[$i + 4]
        $aText2D[$d][5] = $aText[$i + 5]
        $aText2D[$d][6] = $aText[$i + 6]
    Next
    ; Display only when there were matches, so the array is always declared here.
    _ArrayDisplay($aText2D, 'Array')
EndIf

Edited November 18, 2012 by AZJIO
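For a real file rather than an inline test string, the same single-pass idea can be applied after reading the whole file with one FileRead() call. A minimal sketch, assuming a hypothetical file name. The pattern is a simplified variant, not the one posted in the thread: it uses [^"]+ for the owner, on the assumption that names may contain characters \w does not match, such as the á in the original sample.

```autoit
; Sketch only: the file name and the simplified pattern are assumptions.
Local $sPath = @ScriptDir & "\auctions.json" ; hypothetical file name

Local $sText = FileRead($sPath) ; read the whole file in one call
If @error Then
    MsgBox(16, "Error", "Cannot read " & $sPath)
    Exit 1
EndIf

; Seven capture groups per auction record.
Local $sPattern = '\{"auc":(\d+),"item":(\d+),"owner":"([^"]+)",' & _
        '"bid":(\d+),"buyout":(\d+),"quantity":(\d+),"timeLeft":"(\w+)"\}'

; Flag 3 = global match, all captured groups in one flat array.
Local $aFlat = StringRegExp($sText, $sPattern, 3)
If @error Then
    ConsoleWrite("No auction records matched" & @CRLF)
    Exit 2
EndIf

Local $iCount = UBound($aFlat) / 7
ConsoleWrite($iCount & " auction records found" & @CRLF)
```

Each run of seven consecutive elements in $aFlat is one record, so it can be copied into a 2D array the same way as in the post above.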