Mat Posted September 26, 2011
Anyone got any ideas/advice here? I'll be working with 1,200,000+ samples and a lot of noise. I'm looking at kNN (k-nearest neighbours) and LOF (local outlier factor) at the moment, with a view to maybe using both side by side and combining the results. Unfortunately I have no experience with data analysis other than a stats AS level (read: no experience at all). Thanks for any info, Matt
taietel Posted September 26, 2011
Mat, what format is the data?
Mat Posted September 26, 2011 (Author)
"Mat, what format is the data?"
You don't want to know. It's a directory of 20+ CSV files, each with 60,000 rows. The columns are a timestamp followed by 4 channels (floats between -10 and +10). It's a mess, because I was originally asked to output it in a format that could be opened in Excel. However, I don't think that's really important. The timestamps can be ignored (the intervals are roughly constant) and the channels can be analysed separately, so it's just a question of where the anomalous results are in the sample.
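For readers who want to try the suggestions in this thread, here is a minimal Python sketch of loading data laid out as Mat describes. It is not something posted in the thread: the function name `load_channels`, the assumption that file names sort in recording order, and the decision to skip rows that do not parse as numbers (such as a possible header row) are all illustrative guesses.

```python
import csv
from pathlib import Path


def load_channels(directory, n_channels=4):
    """Read every *.csv file in `directory` and return one list of floats
    per channel, concatenated across files.

    Assumptions (not confirmed by the thread): each row is
    timestamp, ch1, ch2, ch3, ch4, and sorting the file names gives the
    recording order. The timestamp column is skipped.
    """
    channels = [[] for _ in range(n_channels)]
    for path in sorted(Path(directory).glob("*.csv")):
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                try:
                    values = [float(cell) for cell in row[1:1 + n_channels]]
                except ValueError:
                    continue  # header row or malformed row: skip it
                if len(values) == n_channels:
                    for channel, value in zip(channels, values):
                        channel.append(value)
    return channels
```

Each returned list can then be analysed on its own, which matches Mat's remark that the channels are independent.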
taietel Posted September 26, 2011
If the data is in CSV format, you can easily convert it to XLS. You can then calculate the standard deviation and see which values are far from the mean.
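The standard-deviation idea can be sketched in a few lines of Python without going through a spreadsheet. This is only an illustration: the name `zscore_outliers` and the default threshold of 4 standard deviations are assumptions, and the approach only makes sense when the signal sits around a single level.

```python
from statistics import fmean, pstdev


def zscore_outliers(values, threshold=4.0):
    """Return the indices of samples lying more than `threshold` standard
    deviations from the mean of the whole series."""
    mean = fmean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # constant series: nothing stands out
    return [i for i, v in enumerate(values)
            if abs(v - mean) > threshold * spread]
```

Because the mean and standard deviation are computed over the whole series, a signal that moves between several levels inflates the spread and can hide a single bad sample.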
Mat Posted September 26, 2011 (Author)
The problem is that it's not a case of there being one level and a few anomalies. I should have mentioned that at the beginning. The level changes: it will start at 5 (±1), step up to 7 (±1), then go down to 2 (±1), and those changes are important as well as the anomalies.
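One way to cope with a signal that changes level is to compare each sample with its local neighbourhood instead of the global mean. The Python sketch below swaps in a rolling median with the median absolute deviation (MAD) for spikes, and a comparison of the medians before and after each point for level changes. Nobody in the thread proposed this; the function names, window sizes, and thresholds are all illustrative assumptions that would need tuning against the real data.

```python
from statistics import median


def find_spikes(values, half_window=25, threshold=6.0, min_spread=1e-9):
    """Flag samples that lie far from the median of their neighbours.

    Each sample is compared with up to `half_window` samples on either side
    (the sample itself is excluded). The local spread is the median absolute
    deviation, which a single spike barely affects. `min_spread` only guards
    against a zero spread in a perfectly flat window.
    """
    spikes = []
    for i, value in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        if not neighbours:
            continue
        centre = median(neighbours)
        mad = median(abs(v - centre) for v in neighbours)
        if abs(value - centre) > threshold * max(mad, min_spread):
            spikes.append(i)
    return spikes


def find_steps(values, window=50, min_jump=1.5):
    """Return one index near each level change.

    A candidate is any index where the median of the next `window` samples
    differs from the median of the previous `window` samples by more than
    `min_jump`. Candidates closer together than `window` are treated as one
    change, and the index with the largest jump is reported.
    """
    steps = []
    best_index, best_jump = None, 0.0
    last_candidate = None
    for i in range(window, len(values) - window + 1):
        jump = abs(median(values[i:i + window]) - median(values[i - window:i]))
        if jump <= min_jump:
            continue
        if last_candidate is not None and i - last_candidate > window:
            steps.append(best_index)  # previous group of candidates is done
            best_index, best_jump = None, 0.0
        if jump > best_jump:
            best_index, best_jump = i, jump
        last_candidate = i
    if best_index is not None:
        steps.append(best_index)
    return steps
```

The sketch favours clarity over speed: it re-sorts every window, so a run over 1,200,000 samples in pure Python will be slow, and the reported step index is only approximate (within a few samples of the change in simple test data).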
czardas Posted September 26, 2011 (edited)
Thinking about this in my rule-of-thumb way: if there is a large amount of data and a large number of anomalies to watch out for, I would start by taking smaller samples and looking for a wide range of anomalies. If you find something that might be a candidate for deeper scrutiny, take larger samples and test them for similar anomalies. A kind of dead-reckoning statistics. I don't know how relevant such an approach might be.
Mat Posted September 26, 2011 (Author)
Again, it's a one-sample anomaly in 100,000 that I'm looking for. Grrr.... Guess there's no miracle cure then. One day someone's going to reply and say: "Oh yes, someone wrote exactly what you wanted >here<" and I won't need to do it myself. I'm still waiting for that day, like a monkey at a typewriter.
czardas Posted September 26, 2011
Just throwing around whatever comes to mind. You'll probably have to get your hands dirty.
taietel Posted September 26, 2011
Find a way to represent the data graphically, then look for a change in the pattern. The human brain can spot the difference much faster than any written code.
MvGulik Posted September 26, 2011 (edited)
"look" ... I think you're ignoring some of the trailing zeros in the numbers Mat posted.
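If the data is to be inspected visually despite its size, one possible compromise is to plot a min/max envelope instead of every sample, so that a single-sample spike still shows up after the series is reduced to a few thousand points. The Python sketch below is an assumption about how that could be done, not something suggested in the thread; `minmax_envelope` and the default of 2000 buckets are illustrative.

```python
def minmax_envelope(values, n_buckets=2000):
    """Split `values` into at most `n_buckets` consecutive buckets and return
    a (start_index, minimum, maximum) tuple for each bucket."""
    if not values:
        return []
    size = max(1, -(-len(values) // n_buckets))  # ceiling division
    envelope = []
    for start in range(0, len(values), size):
        bucket = values[start:start + size]
        envelope.append((start, min(bucket), max(bucket)))
    return envelope
```

Plotting the minimum and maximum of each bucket as two lines keeps isolated extremes visible, whereas plotting every n-th sample would most likely drop them. Any bucket that looks odd can then be examined at full resolution.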