Tag: Machine Learning
Tom Fawcett, Machine Learning Architect at Proofpoint, gave the San Francisco Bay Area ACM Data Mining SIG an insider’s view of email filtering on Monday, October 25th, 2010. Proofpoint has thousands of customers, large and small, and guarantees in its service level agreement that customers will receive no more than one spam message per 350,000 emails. Tom pointed out that research on spam filtering has little to do with what companies do in practice in the “real world,” and then he revealed a lot about how commercial spam filtering works.
Peter Farago and Sean Byrnes gave a juicy and surprising presentation about Flurry’s mobile app analytics at the SDForum Business Intelligence Special Interest Group meeting on October 19th, 2010 in Palo Alto. The title of their presentation was “Your Company’s Mobile App Blind Spot,” and it provided both business and technical insights.
Flurry made a big splash in the news when Steve Jobs got pissed off at them and called them out by name in an interview because they outed Apple’s iPad when it was still a closely guarded secret. (See a short video outtake of the interview at VentureBeat.) Apple responded by changing legal agreements to exclude some third party analytics and some advertising.
Peter Norvig gave a fascinating keynote presentation, “The Unreasonable Effectiveness of Data,” at the SDForum conference “The Analytics Revolution” on April 9th, 2010, focusing on a major lesson learned at Google and elsewhere in recent years. The lesson is that data can be surprisingly effective: more data can yield larger performance gains than improvements in algorithms.
Jeff Eastman gave a presentation on Mahout at the SDForum Business Intelligence Special Interest Group’s meeting on April 21st, 2009. Mahout is a collection of machine learning algorithms adapted for use on very large data sets using the Hadoop map-reduce platform. Jeff’s presentation “BI Over Petabytes: Meet Apache Mahout” gave a good introduction to Mahout and a snapshot of the current status. His slides are available here and in the SDForum Archives.