Peter Farago and Sean Byrnes gave a juicy and surprising presentation about Flurry’s mobile app analytics at the SDForum Business Intelligence Special Interest Group meeting on October 19th, 2010 in Palo Alto. Their presentation, “Your Company’s Mobile App Blind Spot,” provided both business and technical insights.
Flurry made a big splash in the news when Steve Jobs got pissed off at them and called them out by name in an interview because they had outed Apple’s iPad while it was still a closely guarded secret. (See a short video outtake of the interview at VentureBeat.) Apple responded by changing its legal agreements to exclude some third-party analytics and some advertising.
Ken Krugler gave an interesting presentation on elastic web data mining at the 2009 Silicon Valley Data Mining Camp. Ken is the founder of Bixo Labs, Inc. Ken’s session was part of the half-day “unconference” organized by the Bay Area ACM at the Hacker’s Dojo in Mountain View on Sunday, November 1st, 2009.
Jeff Eastman gave a presentation on Mahout at the SDForum Business Intelligence Special Interest Group’s meeting on April 21st, 2009. Mahout is a collection of machine learning algorithms adapted for use on very large data sets using the Hadoop map-reduce platform. Jeff’s presentation “BI Over Petabytes: Meet Apache Mahout” gave a good introduction to Mahout and a snapshot of the current status. His slides are available here and in the SDForum Archives.
Scale Unlimited held its first public “Hadoop Boot Camp” at the Plug and Play Center in Redwood City on March 5th and 6th, 2009. Hadoop is an Apache open source project that bundles related sub-projects supporting distributed computing with MapReduce. It is becoming a “virtual OS for your data center” for many large distributable problems. Yahoo is a major contributor and uses Hadoop extensively on large clusters. Yahoo and Hadoop won the Terabyte sort benchmark contest in 2008 (the first Java and open-source entrant to win) using 910 nodes with two quad-core Xeons per node. Hadoop has been used on a 2,000-node cluster, and the current design goal is 10,000 nodes.
Scale Unlimited is a new company specializing in Hadoop training, and principals Chris Wensel and Stefan Groschupf serve as friendly “Drill Sergeants.” Their two-day training session includes hands-on labs as well as lectures, and it is a great way to learn a lot about Hadoop Core and related technologies in a short period of time. They strike a nice balance, keeping everything compact and concentrated without making it indigestible, opaque, or overwhelming.
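For readers new to the MapReduce model that Hadoop implements, here is a minimal single-process sketch of the idea in plain Python. It is not Hadoop code and not material from the boot camp; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are made up for illustration. In a real cluster the framework would run the map and reduce steps on many nodes and handle the grouping in between.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input, chosen only to exercise the sketch.
counts = word_count([
    "Hadoop runs MapReduce",
    "MapReduce runs on Hadoop clusters",
])
```

Because each map call looks at only one document and each reduce call looks at only one key, the work can be split across machines without the pieces needing to coordinate.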
Michael Driscoll and Jim Porzak organized an excellent panel on “Case Studies in R” at Predictive Analytics World at the Hotel Nikko on Mason Street in San Francisco, February 18th, 2009. Actually, they couldn’t resist having fun yet again with the name “R” and the actual title was “The R and Science of Predictive Analytics: Four Case Studies in R.” Jim and Michael organize the Bay Area useR Group and this meeting was their 2009 kickoff. Michael is a principal at Dataspora, a business analytics startup. He chaired the session, and Jim, now at The Generations Network, gave a quick overview of R and served as one of the four panelists. The other three panelists were Bo Cowgill from Google, Itamar Rosenn from Facebook, and David Smith from Revolution Computing.
According to Michael Swaine, Editor at Large of Dr. Dobb’s Journal, a paradigm shift is underway and functional programming is “on the verge of becoming a must-have skill.” In the cover story of the January 2009 issue of Dr. Dobb’s Journal, “It’s Time to Get Good at Functional Programming,” Swaine argues that functional programming is better suited to parallel computation than procedural and object-oriented programming and will be needed to more fully exploit multi-core and multi-CPU computer systems.
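A small sketch may help show why side-effect-free code parallelizes easily. This is an illustration of the general argument, not an example from the article. The function digit_square_sum and the input range are made up. A thread pool is used here only to keep the sketch portable; for CPU-bound work in Python one would likely swap in ProcessPoolExecutor to actually use multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor


def digit_square_sum(n):
    """Pure function: the result depends only on n and touches no shared state."""
    return sum(int(d) ** 2 for d in str(n))


numbers = list(range(1, 1001))

# Ordinary sequential map.
sequential = list(map(digit_square_sum, numbers))

# The same map handed to a pool of workers. Because the function is pure,
# no locks are needed, and pool.map returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(digit_square_sum, numbers))
```

The two result lists are identical regardless of how the work is scheduled, which is the property that makes this style attractive on multi-core hardware.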