The Data Mining Blog : Data Mining : Business Intelligence : Analytics : Marketing : Finance:

User Generated Content

Posted in Amazon, Business Intelligence, Data, Data Mining by Pankaj Gudimella on March 27, 2009

Dave Winer says

The thing I like best about shopping at Amazon are the user comments. They really are good. And I often base purchasing decisions on what the other users say. It got so bad that when I went shopping at Fry’s for some sound equipment I fumbled around until I realized what I was missing was the advice of other shoppers. I did the unfair thing, listened to a bunch of stuff and then went home and bought what I liked and what the others liked, from Amazon.

The gold mine of data Amazon collects from its users via their reviews has been boosting its bottom line for years now. Amazon is careful about how it uses this data and how it presents it to customers.

Facebook is the other company sitting on such a gold mine, and it will unleash its true potential soon. Here is a piece from Scoble about Facebook, Zuckerberg, and the phase the business is in.


History of Business Intelligence

Posted in Analytics, Business Intelligence, Microsoft by Pankaj Gudimella on March 26, 2009

A fascinating story from Nic at Microsoft BI

Hat tip: The BI Blog

Interact 09 Conference

Posted in Analytics, Business Intelligence, Data Mining, FICO, Risk Management by Pankaj Gudimella on March 3, 2009

Fair Isaac is hosting a decision management conference, Interact 09, in New York City from March 10-13, 2009.

Some of the topics being covered are very relevant to the turbulent economic times we are living in:

Managing Risk in Credit Crunch
How to learn from Bad Debt
How Lending has changed

If you are interested, go here to register and use the promo code FMH00 to get $350 off the price of attendance. Thanks to Chris from Fleishman.

Data sets from Amazon

Posted in Amazon, Business Intelligence, Data by Pankaj Gudimella on February 25, 2009

Amazon announced four new data sets available to the public yesterday. You can find more on this here at the Amazon Web Services Blog.

It would be interesting to see what findings and insights developers draw from these data sets.


Posted in Analytics, Business Intelligence, Data Mining by Pankaj Gudimella on February 12, 2009

An algorithm is a set of instructions that allows you to solve a problem.

Each instruction is simple and repeatable. It’s important to understand that the instructions work on all similar problems, not just one.

Here’s an algorithm for sorting any set of numbers, to get them into order. Start with 4,3,5,6,2 for example.

The bubble sort algorithm is simple. Compare two numbers. If the first number is higher than the second, switch them. So now it’s 3,4,5,6,2. Next step is to compare positions two and three. If the second is higher than the third (it’s not) switch them. Repeat for the whole string. Then start over. Do it over and over again until you can go the whole way with no switching. Done.
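
The steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from Seth's post; the function name `bubble_sort` and the choice to return a sorted copy are assumptions made for the example.

```python
def bubble_sort(numbers):
    """Return a sorted copy of `numbers` using repeated adjacent swaps."""
    items = list(numbers)  # work on a copy so the caller's list is untouched
    swapped = True
    while swapped:  # keep passing over the list until a pass makes no swaps
        swapped = False
        for i in range(len(items) - 1):
            # Compare each neighbouring pair; switch them if out of order.
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items


print(bubble_sort([4, 3, 5, 6, 2]))  # prints [2, 3, 4, 5, 6]
```

The same function works on any list of comparable values, which is the point of the definition: the instructions solve all similar problems, not just this one.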

More here from Seth.


Posted in Analytics, Book, Business Intelligence by Pankaj Gudimella on February 2, 2009

I recently finished reading The Numerati by Stephen Baker.

It is a good read for anyone looking for an introduction to analytics and how it is being used across industries today.


Data mining in the credit crisis

Posted in Analytics, Business Intelligence, Data Mining, NYTimes by Pankaj Gudimella on February 2, 2009

In recent months, American Express has gone far beyond simply checking your credit score and making sure you pay on time. The company has been looking at home prices in your area, the type of mortgage lender you’re using and whether small-business card customers work in an industry under siege. It has also been looking at how you spend your money, searching for patterns or similarities to other customers who have trouble paying their bills.

More here

Increased Usability gives an edge to SAS

Posted in Business Intelligence, Data Mining, SAS by Pankaj Gudimella on May 14, 2008

It’s been a busy spring for SAS Institute Inc., which recently unveiled version 9.2 release of its flagship business intelligence (BI) platform, picked up additional text mining and analytic technology (by acquiring Teragram), and announced an expansion of its relationship with data warehousing (DW) powerhouse Teradata Corp.

Despite a year of unprecedented consolidation in the BI market by a trio of BI giants (IBM Corp., Oracle Corp., and SAP AG), it’s business as usual at SAS, the Cary, N.C.-based BI, DW, and statistical analysis player, according to Ken Hausman, the company’s product marketing manager for data integration.

If anything, Hausman argues, rampant BI consolidation has only helped SAS refine its message. “There’s a certain part of our sales pitch that says SAS is a stable company, we’ve been around for 31 years, and we’ve had a fairly consistent focus over those years,” he comments. “With all that’s happened [with consolidation], that’s [a pitch that is] resonating with customers.” There’s also SAS’ focus on R&D, Hausman stresses. The company reinvests about one-fifth of its annual revenues into additional research and development activities.

Read more here.

Microsoft and Data Mining Dominance

Posted in Analytics, Business Intelligence, Data Mining, Microsoft, Predictive Analytics by Pankaj Gudimella on May 7, 2008

When it comes to data mining and predictive analytics, Microsoft Corp. might not be the first company that comes to mind.

That could change, however, especially if Donald Farmer, Redmond’s principal program manager for SQL Server Data Mining, has his way.

Microsoft has come a long way in the data mining and predictive analytics segment, Farmer says, and with a game-changing Excel 2007 release under its belt — and a promising SQL Server 2008 revision in the pipeline — Redmond hopes to challenge established powers SAS Institute Inc. and SPSS Inc. for data mining and predictive analytic bragging rights.

“[We don’t] have all the functionality of something like a SAS or an SPSS, because that’s just not our market,” he conceded.

It comes down to a difference of scale, according to Farmer. SAS and SPSS typically target larger, more expensive deployments, typically with users well-versed in the usage of their tools. Microsoft is targeting a different kind of data mining consumer: the Excel analyst, for example, who might not have much (if any) experience with data mining, predictive analytics or statistical analysis, for that matter.

Read more from RedmondMag here.

Programming Collective Intelligence – Toby Segaran

Posted in Book, Business Intelligence, Data Mining, Recommendation Engine by Pankaj Gudimella on April 18, 2008

Among the chief ideological mandates of the Church of Web 2.0 is that users need not click around to locate information when that information can be brought to the users. This is achieved by leveraging ‘collective intelligence,’ that is, in terms of recommendations systems, by computationally analyzing statistical patterns of past users to make as-accurate-as-possible guesses about the desires of present users. Amazon, Google and certainly many other organizations, in addition to Netflix, have successfully edged out more traditional competitors on this basis, the latter failing to pay attention to the shopping patterns of users and forcing customers to locate products in a trial and error manner as they would in, say, a Costco. As a further illustration, if I go to the movie shelf at Best Buy, and look under ‘R’ for Rambo, no one’s going to come up to me and say that the Die Hard Trilogy now has a special-edition release on DVD and is on sale. I’d have to accidentally pass the ‘D’ section and be looking in that direction in order to notice it. Amazon would immediately tell me, without bothering to mention that Gone With The Wind has a new special edition.
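
As a rough illustration of the idea in the paragraph above, a recommender can count which titles past shoppers bought together and suggest the most frequent companions. This is a toy sketch, not the method used by Amazon or described in the book; the purchase histories and the names `PAST_BASKETS` and `recommend` are invented for the example.

```python
from collections import Counter

# Hypothetical purchase histories, made up purely for illustration.
PAST_BASKETS = [
    {"Rambo", "Die Hard Trilogy"},
    {"Rambo", "Die Hard Trilogy", "Predator"},
    {"Rambo", "Die Hard Trilogy"},
    {"Gone With The Wind", "Casablanca"},
]


def recommend(title, baskets, top_n=2):
    """Rank other titles by how often they share a basket with `title`."""
    counts = Counter()
    for basket in baskets:
        if title in basket:
            counts.update(basket - {title})
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:top_n]]


print(recommend("Rambo", PAST_BASKETS))  # prints ['Die Hard Trilogy', 'Predator']
```

With this data a shopper looking at Rambo is pointed to the Die Hard Trilogy first, and Gone With The Wind is never suggested because no past basket pairs the two.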

Programming Collective Intelligence is far more than a guide to building recommendation systems. Author Toby Segaran is not a commercial product vendor, but a director of software development for a computational biology firm, doing data-mining and algorithm design (so apparently there is more to these ‘algorithms’ than just their usefulness in recommending movies?). Segaran takes us on a friendly and detailed tour through the field’s toolchest, covering the following topics in some depth:
Recommendation Systems
Discovering Groups
Searching and Ranking
Document Filtering
Decision Trees
Price Models
Genetic Programming
… and a lot more

More from Slashdot here.