Musings from the World of Consulting

Month

March 2012

9 posts

Self-Service goes mainstream in Business Intelligence and Analytics


EMC’s recent acquisition of Pivotal Labs coincided with the release of Greenplum Chorus. The former seems to be driven by the need to inject talent and leadership around Agile delivery into EMC’s internal software organization; large software shops need a bit of cultural change from time to time to enhance productivity and remain nimble. For Pivotal, distribution and marketing just got a shot in the arm: EMC’s sales and marketing muscle, coupled with its penetration in F1000 companies, will help it compete better against the likes of Rally just as Agile delivery is starting to gain traction.

The main story, though, is around the notion of self-service. Just a few years ago, the idea of business users writing their own queries would have given their IT counterparts the shivers. IT has long had a centralized, locked-down mentality when it came to business intelligence and analytics. IT staff felt that only they were intimately familiar with the optimization of data access patterns, and that they understood the ramifications of data distribution better than the data's owners did. Times are changing, though: most analytically oriented databases can now manage ad hoc workloads better through smarter query optimization and processing, better-suited designs (e.g., columnar databases), and use of new infrastructure capabilities (e.g., SSD, in-memory architecture). In addition, the tools used by business users, such as MicroStrategy or Cognos, are getting better at pushing down operations to the database, allowing for better use of its processing power.

In the last year or so, the notion of self-service has expanded beyond queries. Chorus is a prime example of this trend. Some of its capabilities are foundational, like a federated metadata repository and search. On those dimensions, it addresses areas where Greenplum was lagging, and with this release Greenplum is closer to par with some of its competitors.

The more notable, leading capabilities are self-service provisioning (or ‘spinning out’) of data sets for studies and the ability for users to integrate their own data sources via REST or by uploading common file formats. If the underlying data is stored appropriately, this allows data scientists to be self-sufficient for most of their daily activities, without relying on IT support. In addition, it accelerates the integration of third-party data sources in the investigative phase, enhancing overall organizational learning and productivity.

Finally, a few of the capabilities are an implicit acknowledgement that researchers and data analysts are not the most organized bunch: the concept of shared libraries and code seems like a marketing euphemism for code management tools. For most developers these tools are second nature, but for analytics departments needing to scale as they grow, they become a necessity to preserve intrinsic knowledge and manage rapid iterations across a large team.

Like EMC, Microsoft has been focusing on self-service, though more heavily on the accessibility and integration of third-party data (versus ‘spinning out’ of datamarts). Its first foray was the launch of Data Marketplace on Azure, which offers applications in addition to data sets. Supposedly, building on SQL Server 2012, there will be a ‘private’ version of the marketplace for use within an organization that would allow users to collaborate on queries, data sets and visualizations.

It will be interesting to see how the database vendors (e.g., Teradata) react, and how the vertically integrated offerings (e.g., Cognos + DB2 from IBM) evolve, to address the growing awareness that social collaboration is key to unlocking the information potential of corporate data assets.

Mar 28, 2012
#EMC #Greenplum #Chorus #self-service #BI #Analytics #Microsoft
Mar 27, 2012 (25 notes)
Mar 27, 2012
#microserver #Interconnect #AMD #Intel
Mar 26, 2012 (1 note)
#teamdynamics #leadership
Mar 21, 2012
#bigdata #analytics #GLM #insurance #Allstate #machine learning
CNNMoney Tech Tumblr: About 5% of Facebook's users are fake → cnnmoneytech.tumblr.com

It is not surprising that there are some fake accounts, though the estimate seems a bit high. I guess in matters such as these, a conservative margin of error is the way to go.

cnnmoneytech:

Facebook filed an amended S-1 — the paperwork for an IPO — late on Wednesday. Our post-5pm brains are still picking through it, but the big takeaway is that Facebook estimates “false or duplicate accounts may have represented approximately 5-6%” of its monthly active users as of the end of 2011.

Mar 12, 2012 (313 notes)
Mar 4, 2012 (1 note)
#telematics #vehicle area network #social media
“The Starbucks boss—now comfortably a billionaire—was wiping a spill from the table with a napkin. Then he stood up to bus his mug to the counter. On the way, though, he paused: He had noticed an empty coffee cup that someone else had left behind, and so he grabbed that, too.” —FastCompany on Starbucks (http://www.fastcompany.com/most-innovative-companies/2012/starbucks)
Mar 1, 2012
#leadership
What does AMD's acquisition of SeaMicro portend? → bit.ly

The acquisition of SeaMicro by AMD is an affirmation that distributed processing is becoming more common. Though the microserver segment was initially touted by Intel (which found a great use for its Atom product line), as applications evolve and multi-threaded / parallel software architectures become more common, server environments will look more like SeaMicro’s lineup than the traditional servers of today.

As noted in an earlier post, Intel is addressing this marketplace via its MIC architecture, which, though initially targeting exascale scenarios, would be quite relevant here. Through this acquisition, AMD gets SeaMicro’s interconnect technology (and the supporting ASICs), which it needs to compete against Intel in this area.

All in all, it is a bold and timely move for AMD, and it will keep Intel on its toes.

Mar 1, 2012
#microserver #interconnect #cloud computing #AMD