Tuesday, February 07, 2006

Data Mining and Wiretaps

Bob Cringely has been astounding my world for more than a year now. I thought I should go trolling for comments here on a snippet from his most recent article.

http://www.pbs.org/cringely/pulpit/pulpit20060202.html

The last part discusses using data mining techniques on massive traffic flows. He goes into this in more depth in the two previous articles, but this one really touches on my point: when you have enough traffic, you can find any pattern you want. It is a natural human tendency to look for patterns in chaos (not with a q).
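To make the "enough traffic" point concrete, here is a rough sketch of my own, not anything from Cringely's article. It builds one series of pure noise, then checks how well the best of N other noise series happens to correlate with it. The function names, the series length of 20, and the seed are all made up for illustration. Nothing in here has any real relationship to anything else, so any "pattern" that turns up is an accident of how many places we looked.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_correlation(num_candidates, length=20, seed=0):
    """Strongest |correlation| between one noise series and many others.

    Every series is independent Gaussian noise, so a high value here is
    purely a side effect of searching through many candidates.
    """
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    target = [rng.gauss(0, 1) for _ in range(length)]
    best = 0.0
    for _ in range(num_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(length)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    # More candidates searched means a stronger "match" found in pure noise.
    for n in (10, 100, 1000, 10000):
        print(n, round(best_spurious_correlation(n), 3))
```

Because the same seed is reused, each larger run searches a superset of the smaller run's candidates, so the best match can only stay the same or improve as the haystack grows. That is the whole problem in miniature.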

I posit that with correct mathematical techniques and an understanding of the limitations of data mining, new information can be found. Computer science is hammering away in this field right now, but from the class work I've taken and research I've done, it still seems a limited field. From my experience, unless you already know what you are looking for, it is difficult to find the correct patterns. I know this is not always the case, but in general I believe it is. Most people use traffic to justify a belief they already have, even though the traffic may not show that (WMDs in Iraq, for example).

So should we be creeped out by the fact that people are looking for patterns that may not exist?

I'm worried that this is being passed off as legitimate groundwork for starting criminal investigations.