The Data Mining Blog : Data Mining : Business Intelligence : Analytics : Marketing : Finance:

SuperNap – The largest data center ever!

Posted in Data by Pankaj Gudimella on May 29, 2008

Drive a couple of blocks past the Loose Caboose and the Carburetor Shop on E. Sahara Avenue in Las Vegas, and you’ll find one of the world’s leading technology companies. The name of the company – Switch Communications – will go unrecognized by almost all of you. That’s because it has operated in near total secrecy for the last few years. Switch has preferred to keep its gold mine a need-to-know type of affair. “Pay no attention to the secure fortress in the strip mall.”

A few months ago, word of Switch’s apparently fantastic operations started to reach my in-box. Most of the people who visited the Switch facility were bound by non-disclosure agreements, but that failed to stop them from leaking out a few choice details. “This is the most advanced computing center in the world,” I was told. “It’s like the internet superhighway wrapped up in one package. All the heavies are there.”

Ever a cynic, I struggled to match these claims with the total lack of public information available on Switch. Companies fall all over themselves to issue press releases about things as minor as cost-savings achieved by changing toilet paper suppliers. If a technology giant really existed in Las Vegas of all places, then it should be patting itself on the back and then letting city officials finish off the job with celebrations of their own.

As Switch’s CEO Rob Roy tells it, however, the company had good reason to avoid publicity.

Legend has it that the company managed to acquire what was once meant to be Enron’s broadband trading hub for a song. This gave Switch access to more than twenty of the primary carrier backbones in a single location. Switch tied this vast network to existing data center hosting facilities and attracted military clients, among others, to its Las Vegas shop.

Read more here.


Predictive Modeling 101

Posted in Data Mining, Predictive Modeling by Pankaj Gudimella on May 22, 2008

I read a very good article from MarketingSherpa that explains the basics of predictive modeling. It is a good read for anyone looking for an introduction to the art and science of predictive modeling. Enjoy the article!

How to Create a Predictive Model
A predictive model determines the probability of a certain outcome based on a target — what you want to predict. You use data-mining software to sift through your customer database.

Every category of customer information — age or favorite color or buying frequency or how many times a customer visited your store in the past year — is a variable collected as a predictor of future behavior. A predictor is your model’s central building block.

For example, you want to predict which customers will visit your store at least five times in the next 12 months. Here’s a simplified version of what you need to do:

-> Step #1. Prepare your data

Preparing data is the most difficult and complicated step in the process. We’ll talk about why and what you can do about it later.

“It’s estimated that 70% to 80% of the time devoted to an analytical project is devoted to data preparation. It’s just getting the data in the one place in the right form to actually start building models,” says Richard Hren, Director Product Marketing, SPSS.

-> Step #2. Set your target

Your target is the customers who will visit your store at least five times in the next year. For this example, the target matches one of the variables you already hold: customers who visited the store at least five times in the past year.

-> Step #3. Determine the most important variables

Determine which variables are most relevant to your target. Some types of data mining software will dig through data and tell you. Other packages depend on your judgment to determine which variables matter most. Some software will do both: tell you what it likes and allow a statistician to tweak it.

-> Step #4. Run program to get a model

The software weighs the importance of each variable and creates a model — think of it as an equation. You fill in each variable in the equation and then the model calculates and gives higher scores to customers with the greatest probability of visiting your store more times in the next year.

Usually, you don’t have to score one customer at a time. The model can score your whole database automatically and flag the higher-probability customers.
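The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name a modeling technique, so logistic regression fitted by plain gradient descent stands in for whatever the data-mining software would use, and the customer records, field names, and helper functions (`prepare`, `fit`, `score`) are all made up for the example.

```python
import math

# Hypothetical customer records (made-up data, not from the article).
customers = [
    {"id": "a", "age": 25, "purchases": 12, "visits_last_year": 8},
    {"id": "b", "age": 41, "purchases": 2, "visits_last_year": 1},
    {"id": "c", "age": 33, "purchases": 9, "visits_last_year": 6},
    {"id": "d", "age": 58, "purchases": 1, "visits_last_year": 0},
    {"id": "e", "age": 47, "purchases": 7, "visits_last_year": 5},
    {"id": "f", "age": 29, "purchases": 3, "visits_last_year": 2},
]


def prepare(rows, fields):
    """Step 1: rescale every predictor to the 0-1 range."""
    lows = {f: min(r[f] for r in rows) for f in fields}
    highs = {f: max(r[f] for r in rows) for f in fields}
    return [
        [(r[f] - lows[f]) / ((highs[f] - lows[f]) or 1) for f in fields]
        for r in rows
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Plug one customer's variables into the 'equation'."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit(X, y, epochs=2000, lr=0.5):
    """Step 4: learn one weight per variable by gradient descent."""
    weights, bias, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            err = score(weights, bias, xi) - yi
            grad_b += err
            for j, v in enumerate(xi):
                grad_w[j] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


fields = ["age", "purchases"]                    # Step 3: chosen by judgment here
X = prepare(customers, fields)                   # Step 1: prepare the data
y = [1 if c["visits_last_year"] >= 5 else 0 for c in customers]  # Step 2: target
weights, bias = fit(X, y)                        # Step 4: get the model

# Score the whole database at once and rank customers by probability.
scores = {c["id"]: score(weights, bias, x) for c, x in zip(customers, X)}
for cid in sorted(scores, key=scores.get, reverse=True):
    print(cid, round(scores[cid], 3))
```

Because the predictors are rescaled to the same range, the size of each learned weight gives a rough hint of how much that variable matters, which is a crude version of step 3. A real project would hold out data to check the model rather than score the same customers it was trained on.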

The next generation web and data mining

Posted in Data Mining, Web Analytics by Pankaj Gudimella on May 22, 2008

I truly believe the next battleground will be based on scaling the back end and more importantly mining all of that clickstream data to offer a better service to users. Those that can do it cheaply and effectively will win. The tools are getting more sophisticated, the data sizes are growing exponentially, and companies don’t want to break the bank nor wait for Godot to deliver results.

More here from Ed Sim from BeyondVC.


The art of logic

Posted in Puzzles by Pankaj Gudimella on May 20, 2008

I came across a very cool website with puzzles that, when solved, reveal hidden pixel-art pictures.

If interested, visit conceptispuzzles.


Online TV ads no longer afterthought

Posted in Online Marketing by Pankaj Gudimella on May 20, 2008

Conscious that millions of people are now watching TV shows online, marketers are likely for the first time this year to make digital-ad buys a key part of their “upfront” ad-purchase negotiations with TV networks, media buyers say. “The digital ads aren’t a throw-in after the main conversation is over. It’s now part of the main conversation,” says Alan Schanzer, managing partner at MEC Interaction North America, part of WPP Group’s media-buying and planning unit Mediaedge:cia. Major TV networks sell about 75% of their ad inventory for the coming fall season during the upfront. Digital ads historically haven’t made up a significant portion of these buys. But marketers say they increasingly are looking to purchase digital ads as part of a package with their standard TV commercials so that they can reach the audiences watching a show regardless of whether it is on TV or the Web. Web audiences have become sizable enough that they can’t be ignored.

Source: Wall Street Journal


Microsoft braces for major customer shift to cloud computing

Posted in Data, Microsoft by Pankaj Gudimella on May 20, 2008

Microsoft sees tens of millions of corporate email accounts moving to its data centers over the next five years, shifting to a business model that may thin profit margins but generate more revenue. In an interview, Chris Capossela, who manages Microsoft’s Office products, said the company will see more and more companies abandon their own in-house computer systems and shift to “cloud computing,” a less expensive alternative. Cloud computing is the trend by Internet powerhouses to array huge numbers of computers in centralized data centers to deliver Web-based applications to far-flung users.

Read more here.


Database Analytics Startup Aster Data launched

Posted in Analytics, Startup by Pankaj Gudimella on May 20, 2008

Grid-computing startup Aster Data Systems will officially launch today, three years after it was founded. Aster, which began in the Ph.D. program at Stanford, is a provider of “massively parallel processing databases” for organizations that have mammoth quantities of data that need to be stored and analyzed quickly. The Redwood City, California-based company is backed by Sequoia Capital, Cambrian Ventures, and First-Round Capital.

Aster’s nCluster software allows companies with large amounts of data to store it on commodity hardware and scale with one click, adding new servers as the data set grows. The company’s first major client is MySpace, which generates hundreds of terabytes of traffic data from its 110 million monthly unique users. Mining that data to understand how customers use and interact with the site requires some pretty robust architecture.

Read more here.


Guessing the Online Customer’s Next Want

Posted in Data Mining, NYTimes, Online Marketing by Pankaj Gudimella on May 19, 2008

Marketers have always tried to predict what people want, and then get them to buy it.

Among online retailers, pushing customers toward other products they might want is a common practice. Both Amazon and Netflix, two of the best-known practitioners of targeted upselling, have long recommended products or movie titles to their clientele. They do so using a technique called collaborative filtering, basing suggestions on customers’ previous purchases and on how they rate products compared to other consumers.

Figuring that out is not so easy. For one thing, people do not always buy what they like. Someone may buy a sweater for their grandmother even though they dislike it and would never get it again. Similarly, a person who rents a movie may actually detest it but knows her child likes it. Or a film that was seen on a small airplane screen may garner a lower rating than if it were seen at a large multiplex.
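For readers new to collaborative filtering, here is a minimal sketch of the idea. It is not how Amazon or Netflix actually implement it: the ratings, customer names, and the `similarity` and `recommend` helpers are hypothetical, and cosine similarity over shared titles is just one simple way to compare customers.

```python
import math

# Hypothetical star ratings (1-5); made-up data for illustration only.
ratings = {
    "ann": {"Alien": 5, "Heat": 4, "Big": 1},
    "bob": {"Alien": 4, "Heat": 5, "Jaws": 5},
    "cat": {"Big": 5, "Elf": 4, "Alien": 1, "Jaws": 2},
}


def similarity(a, b):
    """Cosine similarity over the titles both customers rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = math.sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)


def recommend(user, all_ratings):
    """Rank titles the user has not rated by similarity-weighted ratings."""
    mine = all_ratings[user]
    totals, weights = {}, {}
    for other, theirs in all_ratings.items():
        if other == user:
            continue
        sim = similarity(mine, theirs)
        if sim <= 0:
            continue
        for title, stars in theirs.items():
            if title not in mine:
                totals[title] = totals.get(title, 0.0) + sim * stars
                weights[title] = weights.get(title, 0.0) + sim
    predicted = {t: totals[t] / weights[t] for t in totals}
    return sorted(predicted, key=predicted.get, reverse=True)


print(recommend("ann", ratings))
```

Here "ann" rates films much like "bob" does, so bob's high rating for a film she has not seen counts for more than cat's low one. The sketch also shows the weakness the article describes: a rating is taken at face value, so a gift purchase or a film rented for a child would skew the suggestions.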

More here from NYTimes.


Posted in Analytics, NYTimes, Online Marketing by Pankaj Gudimella on May 19, 2008

In the past few years, Web publishers have made a big bet on booming online advertising revenues. But the economic slowdown may be throwing a wrench into those plans.

While search advertising remains strong, there are signs that the growth in online advertising — particularly in more elaborate display ads — is slowing down. In the past few weeks, major online-advertising players, like Yahoo and Time Warner, have posted mixed results.

And online publishers may be getting less money for the ad space they do sell. The prices paid for online ads bought through ad networks dropped 23 percent from March to April, according to PubMatic, an advertising-technology company in Palo Alto, Calif., that runs an online-pricing index. Large Web publishers fared the worst in PubMatic’s study, with the prices they received through networks dropping 52 percent.

More here from NYTimes.

Business Intelligence via Blackberry!

Posted in Uncategorized by Pankaj Gudimella on May 16, 2008

IBM said on Wednesday it has started selling software that lets customers access its Cognos business intelligence software via BlackBerry mobile devices.

The device maker, Research in Motion, is encouraging businesses to create software specifically for the BlackBerry so that it can boost usage beyond the e-mail, messaging, calendar and phone services for which it is best known.

The Cognos program, which sells at a list price of $300 per user, allows customers to view real-time analytics on the state of their business on their BlackBerrys.

The computing giant has also introduced programs that allow BlackBerry users to quickly locate and communicate with colleagues with expertise in specific business areas.

A third new product from IBM allows users to access personalized content from their corporate websites via the BlackBerry, IBM said.

Armonk, New York-based IBM acquired the business intelligence programs in January with its purchase of Canada’s Cognos for about $4.9 billion.

Source: Reuters