Posted

Anyone got any ideas/advice here?

I'll be working with 1200000+ samples and a lot of noise.

I'm looking at k-NN (k-nearest neighbours) and LOF (local outlier factor) at the moment, with a view to maybe running both side by side and combining the results. Unfortunately I have no experience with data analysis other than a stats AS level (read: no experience at all).

Thanks for any info

Matt
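
For reference, the k-NN idea mentioned above can be sketched for a single channel in plain Python. This is only a rough illustration, not a recommendation: the function name `knn_distance_scores` and the choice `k=5` are made up for the example, and it scores each sample purely by value (distance to its k-th nearest neighbouring value), ignoring where the sample sits in time.

```python
def knn_distance_scores(values, k=5):
    """Score each sample by the distance to its k-th nearest neighbour (by value).

    Hypothetical helper for illustration only. A large score means the value
    sits far away from the other values in the channel.
    """
    n = len(values)
    if n <= k:
        raise ValueError("need more than k samples")
    # Sort once; in one dimension the k nearest neighbours of a value are
    # always among the k entries on either side of it in sorted order.
    order = sorted(range(n), key=values.__getitem__)
    sorted_vals = [values[i] for i in order]
    scores = [0.0] * n
    for pos, value in enumerate(sorted_vals):
        lo = max(0, pos - k)
        hi = min(n - 1, pos + k)
        dists = sorted(
            abs(value - sorted_vals[j]) for j in range(lo, hi + 1) if j != pos
        )
        scores[order[pos]] = dists[k - 1]
    return scores
```

The samples with the largest scores would be the candidates to inspect. Because the score looks only at values, it is unlikely to be enough on its own if the signal moves between several levels.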

Posted

Mat, what format is the data?

You don't want to know.

It's a directory of 20+ CSV files, each with 60000 rows. The columns are a timestamp followed by 4 channels (floats between -10 and +10). It's a mess, because I was originally asked to output it in a format that could be opened in Excel.

However, I don't think that's really that important. The timestamps can be ignored (intervals are ~constant), and the channels can be analysed separately, so it's just a question of where the anomalous results are in the sample.
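
A minimal sketch of getting that layout into one list per channel, using only the Python standard library. Several things here are assumptions rather than facts from the description: that the files sort into chronological order by name, that the timestamp is the first column, and that any header row is non-numeric. The name `load_channels` is made up for the example.

```python
import csv
from pathlib import Path


def load_channels(directory, n_channels=4):
    """Read every *.csv in `directory` and return one list of floats per channel.

    Assumes (not confirmed by the original post) that sorting the file names
    gives chronological order and that column 0 is the timestamp.
    """
    channels = [[] for _ in range(n_channels)]
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.reader(handle):
                if len(row) < n_channels + 1:
                    continue  # blank or short row
                try:
                    values = [float(cell) for cell in row[1:n_channels + 1]]
                except ValueError:
                    continue  # header row or malformed line
                for channel, value in zip(channels, values):
                    channel.append(value)
    return channels
```

Each returned list can then be analysed on its own, with the timestamps dropped as described.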

Posted

The problem is that it's not a case of there being 1 line and a few anomalies. I should have added that at the beginning.

The level will change: it might start at 5 (±1), step up to 7 (±1), then go down to 2 (±1), and those changes are important as well as the anomalies.
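
One way to picture the two things being looked for (single-sample anomalies and changes of level) is a rolling-median sketch like the one below. This is a swapped-in technique for illustration, not k-NN or LOF, and every name, window size, and threshold in it is an assumption that would need tuning against the real data.

```python
from statistics import median


def find_spikes(samples, half_window=25, threshold=3.0):
    """Indices whose value is more than `threshold` from the local median."""
    spikes = []
    n = len(samples)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        if abs(samples[i] - median(samples[lo:hi])) > threshold:
            spikes.append(i)
    return spikes


def find_steps(samples, window=50, threshold=1.5):
    """Indices where the median level before and after differs by > threshold.

    For each run of consecutive flagged indices, only the index with the
    largest jump is kept.
    """
    steps = []
    best_index, best_jump = None, 0.0
    for i in range(window, len(samples) - window + 1):
        before = median(samples[i - window:i])
        after = median(samples[i:i + window])
        jump = abs(after - before)
        if jump > threshold:
            if jump > best_jump:
                best_index, best_jump = i, jump
        elif best_index is not None:
            steps.append(best_index)  # run ended, keep its strongest point
            best_index, best_jump = None, 0.0
    if best_index is not None:
        steps.append(best_index)
    return steps
```

The median is used rather than the mean so that a single wild sample does not drag the local level with it. Recomputing a median for every sample in pure Python is slow, so for a million-plus samples per channel this would probably need a faster implementation; the sketch is only meant to show the logic.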

Posted (edited)

Thinking about this in my rule of thumb way of going about things: if there is a large amount of data and a large number of anomalies to watch out for, then I would think about taking smaller samples to look for a wider range of anomalies. Then if you find something that might be a candidate for deeper scrutiny, take larger samples and test them for similar anomalies. A kind of dead reckoning statistics. I don't know how relevant such an approach might be.

Edited by czardas
Posted

Again, it's a single anomalous sample in 100000 that I'm looking for.

Grrr.... Guess there's no miracle cure then. One day someone's going to reply and say: "Oh yes, someone wrote exactly what you wanted >here<" and I won't need to do it myself :graduated: I'm still waiting for that day, like a monkey at a typewriter.

Posted (edited)

"look" ... :graduated:

I think you're ignoring some of the trailing zeros in the numbers Mat posted.

Edited by iEvKI3gv9Wrkd41u

