@johnlovett : If You Only Knew the Power of the Dark Side


Following the rules, per John Lovett’s tweet, sounds like solid advice . . . for those issues where pretty much anyone would agree rules are solid.
However, for just about everything there are few areas of clear delineation and vast areas of gray.

There is also tremendous value in having a comprehensive understanding of what is possible via technology.

This holds both for defending against attacks and for making better use of systems where the user may have imperfect information about the structure of the system.

Having a hacking mentality is something I view as a crucial component of high performance online optimization.


When Google started crawling the web there were grumblings that their manner of indexing wasn’t following the unwritten rules of the Internet at the time.

Google did their thing back then and now it’s all cool.

Analysts Answering Questions

Setting aside that and other overly familiar tales of rule breakers who succeed in business or in becoming the captain of the USS Enterprise, are people who follow the rules the ideal people to do analysis for you?

Reading between the lines of ‘The New Know‘, which is such an excellent book I purchased two copies for a vendor I work with, the response might be ‘no.’

Analysts should be able to solve what Thornton calls ‘Alpha Level’ problems, those problems which haven’t an existing solution and by extension lack rules.

By definition, the paradigm of a rules based environment is opposite the effort to solve new problems, or existing problems in new ways.

Analyst of the Simple Triangle

There is a study of geometry, yes geometry, which examines the non-empty spaces and how features change.

What’s important is the ability to conceptualize the coordinates for the triangle across the different shapes of the surfaces.

Reducing the complexity of the challenge from the sphere, to the saddle function, to the flat plane may create a more computationally friendly environment.

For many organizations, as data complexity is reduced the utility of that data across the organization grows.

Web Analytics Tools Reduce Complexity

There is a challenge when the analytics center of excellence itself relies on a reduced view of the world as its sole input, without an understanding of what is going on behind the scenes.

Any tool, take Google Analytics for example, reduces the complexity of what is the reality to a summation.

In this case it is a dashboard, the most summary of summations, which I guess has some utility to someone:

An individual who thinks about the world in terms of how to game a system might be able to interact with the Google Analytics interface, and over time have a concept of the behind the scenes plumbing which drives Google Analytics.

This individual with the hacker mentality might be able to visualize a model of things such as the data model behind Google Analytics and how its nominally opaque data transformations occur.

At this point, the importance of choosing to place a custom variable value for every visitor or not in Google Analytics would become apparent.

Mental Gymnastics

Gymnasts work out all the time to keep at peak efficiency, and so should analysts be working their mental routines constantly because you never know when you might be called upon to take a reduced view of the world and produce real results.

For the record, I didn’t actually create an auto-tweet robot, or some other scheme, to cheat the ACCELERATE contests going on.

Not because I couldn’t make an attempt, but because it was sort of uninteresting for the purposes of the conference.

Enter the Matrix

The challenge of understanding differences in the tools available, both from explicitly available information and otherwise obtainable resources, will always be interesting to me.

When I work with clients and see the Google Analytics reporting interface above as the Matrix, I know I’m on the right path.

This fundamental understanding enables analysts to exceed the limitations of the reality which they are placed into by their tools.


Make sure you come to ACCELERATE if you are signed up to attend; it’s a packed house and I hear seats may be given away if you are late.

Thursday night is Web Analytics Wednesday at ROE, sign up here.

I’ll be speaking in the afternoon at ACCELERATE and at least stopping by the Web Analytics Wednesday, so let me know if you want to connect.


Technorati Tags: ACCELERATE, John Lovett, Measure, WAW

Posted in ACCELERATE, Analytics, Conferences, Measure

The Future of #Measure is Bright: #ACCELERATE Nov 18

Create Solutions From Challenges

Digital Measurement Professional Outlook

At eMetrics NY I was on a panel with Chris Berry of Syncapse moderated by Gary Angel wherein Chris made the distinction between

  • Sentiment Analysis and
  • Opinion Mining or Topic Classification

During our introductions, prior to the distinction made by Chris, I had consciously chosen to simplify the discussion and discuss Topic Classification under the broad scope of Sentiment Analysis.

Chris’ distinction moved the conversation in a far more productive direction, toward how differently the two methods are used.

Fantastic stuff!

The Business Case For Over-Simplification

The tweet by @AndrewJanis, with responses and re-tweets, summarizes why I chose over-simplification:

I am reminded of the person who watches a baseball game, and when they see a home run somehow they aren’t impressed.

When people don’t know the complexity of what they don’t know, those unknowns sure look easy.

People and Skill Classification

The three basic buckets of skills needed for digital measurement are:

  • Quantitative
    • A Huge Bucket Including Math, Stats, Econ and Much More
  • Technology
    • The Ability to Operationalize Automation in a Timely Fashion
  • Business
    • The Understanding of Business Operations and Optimization

Hire 58% Not 50%

Consider the three individuals sourced to make up those buckets above. What would an overlay of those three skill sets look like for Quantitative, Technology and Business people?

Glad you asked

For the purposes of this overly generalized generalization, people are in columns and skill sets are in rows. Line up each specific person type with each skill bucket set for my completely unscientific estimation of the skill capacity.

Quantitative and Technology people have to cross train just to get things done, on top of which they all have at least some understanding of business skills.

The Business Person just may be the odd duck out.

Aside from those specific exceptions, MIT Business School I’m looking at you, Business People have a brief introduction to quantitative skills and would largely be self-taught in technology skills.

Why in the world does this matter?

The Future of Digital Optimization is Irreducibly Complex

During the same Sentiment Analysis panel I stayed far away from discussing the specific methods used because they are far beyond the scope of a panel.

That, and my experience from attending a lecture at Stanford. I sat two seats away from Dr. Andrew Ng and amongst Stanford grad students listening to Dr. David Blei explain LDA.

It makes me very happy that I wasn’t the first person to let out a sigh signalling information overload.

However, I was able to almost keep up and have since operationalized the use of LDA with some customized tools and techniques. Without a blend of quantitative and technology skills, and a dash of business skills, that would have been impossible.

Shortly it may be impossible to do much without serious quantitative and technology skills in digital measurement.

Adobe Moves Into New Territory

Adobe is looking for a Senior Researcher, Analytics in their Advanced Research Lab. Taleo sucks so I can’t post a link, but here’s the job description:

Before you rush out and apply for it, check your resume: does it say ‘Ph.D. in Machine Learning’ or some other quantitative field?


If not, don’t waste your time or Adobe’s; they aren’t interested.

At this point the Quantitative and Technology people are probably a little bummed if they lack a Ph.D., but very excited that Adobe may be operationalizing advanced technologies.

Some Business oriented people I know aren’t sure what to think. They aren’t familiar enough with the topics outlined in the job posting to fill in the blanks and get real excited.

So if you run a company and would like to be prepared for the future, your choice is clear:

  • Literacy
  • Illiteracy

The pace of development in the digital optimization arena is increasing every day; if you aren’t moving ahead you are probably falling behind.

ACCELERATE November 18th

My guess is that this will come up during the one day ACCELERATE on November 18th in San Francisco. If you haven’t signed up yet, make sure to sign up for tickets here.

I am presenting in the Super Accelerator session on any topic of my choosing, and I will choose wisely.


Technorati Tags: Career Development

Posted in ACCELERATE, Analytics, Career Development, Conferences, Machine Learning, Measure

Sentiment Analysis at eMetrics

Initial Sentiment of Sentiment Panel

Gary Angel, Chris Berry and I participated in a pretty darn cool conversation on the topic of sentiment analysis during eMetrics in NYC on October 19, 2011.

A component was my chart of the sentiment of tweets tagged with #emetrics more recently; that chart is reproduced below for the general audience.

I used R to complete the analysis, Python to query the API.
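For readers who have not seen tweet sentiment scored before, here is a minimal Python sketch of one simple approach: counting words from a positive list and a negative list. It is not the R analysis described above. The word lists, the function names and the sample tweets are all made up for illustration, and a real analysis would use a far larger lexicon or a trained model.

```python
import re

# Hypothetical mini-lexicons; a real analysis would use a much larger word list.
POSITIVE = {"great", "love", "awesome", "fantastic", "cool"}
NEGATIVE = {"boring", "bad", "hate", "awful", "sucks"}


def score_tweet(text):
    """Return (# positive words - # negative words) for one tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up sample tweets, standing in for results pulled from an API.
tweets = [
    "Great panel on sentiment analysis at #emetrics, love it",
    "The wifi at #emetrics sucks",
    "Heading to lunch #emetrics",
]

for tweet in tweets:
    print(label(score_tweet(tweet)), "|", tweet)
```

Word counting like this misses sarcasm and negation entirely, which is part of why the distinction between sentiment analysis and topic classification matters in practice.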

Posted in Analytics, Application Programming Interface, Conferences, eMetrics, Machine Learning, Natural Language Processing, Python, R

Remove the Execu-Speak Loop From Analytics

Competing in a Competitive World

All Skills To Port

James is a good friend of mine who worked his way through school, earned a degree in Chemistry and went looking for his next step.

At the time James was heavily recruited by pharmaceutical companies for a role in . . . sales. James is a lot of things: really smart, hard working and a great person.

Salesperson? No.

Something must be up if a big company was trying to fit a square peg into a round hole.

Analytics at Intel

Thomas Davenport, he of keynoting eMetrics San Francisco and the forthcoming New York City, has the following quote about Intel:

And, where analysts were once exclusively number-crunchers, they’re now more often expected to understand business operations and strategy, too. They’re expected to help “frame the decision” within the context of the organization’s strategy.

So it is proven that individuals can ramp up with business sense; those individuals whose training is more tactically oriented can grasp company strategy.

Number crunchers, and code crunchers, can indeed be brought along to understand how their tactical components plug into the organizational structure at large.

Onion Layers

Certainly there are individuals coming from the business point of view who bring a valuable voice to their organizations.

There are other individuals who frame their roles in terms of what they don’t do instead of what value they bring to their companies.

  • Programming
    • No
  • Quantitative Analysis
    • No

What they do is facilitate communication between business units that already eat their lunches together at the same side of the company cafeteria, or some such function.

The need for the built in functionality of someone to ‘translate’ code/quant into business terms is empirically disproven by the Intel experience, and crudely mapped out in the following org chart:

The black lines of communication are the existing model, the red lines are what is coming down the pipe.

Executives in this model function to translate tactical function status into the strategic level, so cross functional collaboration can be achieved. The executives are not necessarily inclined to be programmers/quants themselves, they just watch over them.

Once the quants and the programmers can communicate to each other at the strategic level, there isn’t much need for the execu-speak translation loop.

James Finds a Job

My friend James never took the sales job, and at some point I asked him why he, of all people, was being recruited to do sales. His answer:

It is cheaper to try and teach a guy like me to do sales than to teach a salesperson chemistry.

Will companies ever calculate that it is cheaper to hire a programmer/quant and work with them to frame their conversations in terms of organizational goals?

We shall see.

If your only skill set is potentially obsolete you’d probably defend it with vigor. And that, just maybe, is part of the answer to Jason’s question.

Note: A pseudonym is used for my friend to keep it all cool, ok?

Technorati Tags: Career Development, Jason Thompson

Posted in Analytics, Career Development, Measure

#Measure Career Development – Beyond Analytics Ninjas

“Dude, where’s my Job?”

Avinash Kaushik posted on his far more widely read blog a ‘Web Analytics Career Guide: From Zero to Hero In Five Steps!‘, which is well worth a read.

What about the slightly experienced web analyst? What are they supposed to think about companies being acquired, business units rolling up into BI and so forth?

How can a web analyst stay competitive in the job market today and in the future?

I am a big believer in continuing education; those same skills which get you well paid today won’t necessarily get you employed a couple years from now.

I know history is full of examples where strong dominant forces remained at the top of the food chain, where those forces were rewarded for being the same forever.

That being said, even an ‘Analysis Ninja’ might want to learn a new skill here or there.

  • What skills might the Ninja want to take a look at?
  • And, who the hell am I to give advice when I am ‘new’ to this industry?

Durable Skills

The first thing to keep in mind is that web analytics is not some secret dark art, which only a few people know and pass on after ritual initiation.

Web analytics is built on top of several different disciplines; having skill in the underlying discipline empowers people to use their portable skills in a new area.

To put it another way, if a salesperson stopped selling planes and started selling boats would you ask them if they changed careers?

My background is Econometrics and Linguistics, which are 100% durable and portable to web analytics. I use both absolutely every single day.

The hardest problem I ever solved was from my Cal State Econometrics class, and that was at least 10 times harder than anything I have thus far seen in web analytics.

Instead of rewarding durable skills, all too often perishable skills, such as tool specific skills, are the ultimate arbiter of employment. This is not only an absolute shame, but also has effects on how individuals choose to pursue advancement of their careers.

Building out and expanding upon their durable skills, as opposed to perishable tool specific skills, is what I would hope analysts with experience would focus on. Not only for their own career development, but also to advance web analytics as a profession.

If we are satisfied doing the work of glorified button pushers and paper shufflers, we are begging to be replaced.

Secretaries Will be Secretaries

Pulling reports of raw data rarely shows a business story which is relevant or actionable. Visits are up, great . . . right?

Performing routine tasks that can be automated has always proven to be a good career move.

Business units already want the ability to tie dollars to performance; move this lever and experience this effect.

If you don’t believe me, check out Eric T. Peterson’s presentation, direct download, on the SAS website. Good stuff.

While SAS generously provides Eric’s presentation, they aren’t the only solution in this realm. My Twitter followers already know . . . .

[Image: Revolution Analytics ‘I Heart R’ sticker on a stroller]

This type of analysis can be done, if you know:

Statistics/Econometrics/Linear Algebra

There are challenges to modeling, and working with a smart individual on these problems is a must. If the most advanced procedure you are regularly using is the CORREL function in Excel, that isn’t enough.

Having an understanding of the advanced techniques will shortly be, if it is not already, an essential part of the analyst’s toolkit. Building out a model of the levers which drive the business KPIs which you are studying will be a part of your daily tasks.
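To make the gap between CORREL and a model concrete, here is a minimal Python sketch that computes the same Pearson correlation and then fits a one-lever least squares line. The data, the variable names and the choice of ad spend as the lever are assumptions made up for illustration; a real model would have several levers and would need diagnostics this sketch skips.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def correl(xs, ys):
    """Pearson correlation, the same quantity Excel's CORREL returns."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Made-up weekly data: ad spend (the lever) and revenue (the KPI), in dollars.
ad_spend = [1000, 2000, 3000, 4000, 5000]
revenue = [5200, 6900, 9100, 10800, 13000]

r = correl(ad_spend, revenue)
intercept, slope = fit_line(ad_spend, revenue)

print(f"correlation: {r:.3f}")
print(f"each extra $1 of spend is associated with ${slope:.2f} of revenue")
print(f"predicted revenue at $6000 spend: ${intercept + slope * 6000:.0f}")
```

The correlation only says the two series move together. The slope and intercept are what let you say "move this lever and expect roughly this effect", which is the question business units actually ask.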

For me, it already is and I keep Kennedy’s ‘A Guide to Econometrics‘ on my desk at all times.

Start with the basics

  • How are discrete and continuous variables different?
  • How do I figure out if this distribution is normal?
  • What is a normal distribution?
  • Why is this important?
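As a starting point for the normality question above, here is a minimal Python sketch of two rough checks: skewness and excess kurtosis (both close to zero for normal data) and the share of points within one standard deviation of the mean (roughly 68% for normal data). The function names and the simulated samples are assumptions for illustration, and these checks are a first look rather than a formal test.

```python
import random
from math import sqrt


def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis); both are near 0 for normal data."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


def share_within_one_sd(values):
    """Fraction of points within one standard deviation of the mean."""
    n = len(values)
    mu = sum(values) / n
    sd = sqrt(sum((v - mu) ** 2 for v in values) / n)
    return sum(abs(v - mu) <= sd for v in values) / n


rng = random.Random(42)  # fixed seed so the sketch is repeatable
bell = [rng.gauss(100, 15) for _ in range(10_000)]        # simulated normal data
waits = [rng.expovariate(1 / 30) for _ in range(10_000)]  # simulated skewed data

for name, sample in (("normal-ish", bell), ("skewed", waits)):
    skew, kurt = skew_and_excess_kurtosis(sample)
    print(name, round(skew, 2), round(kurt, 2),
          round(share_within_one_sd(sample), 2))
```

This matters because many common techniques assume roughly normal errors; a strongly skewed metric such as time on page can quietly break them.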

If you lack this skill set, don’t worry! Lots of great free resources are out there; in particular, I recommend MIT Open Courseware.

At the end of Tim Ash’s ‘Landing Page Optimization‘ he has a nice introduction to a few statistical concepts. Re-read that section if you already own the book.


Recall the part of ‘The Joy of Stats‘ where Dr. Rosling is talking about the size of data in the Internet. (note: if you haven’t watched ‘The Joy of Stats’ already . . . this may not be the right industry for you)

1 zettabyte is the size of the data on the Internet. A zettabyte is 1 billion terabytes . . . that’s a lot of disk drives.
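The unit arithmetic is quick to check in Python. This sketch uses the decimal (SI) definitions of the units; the 2 TB drive size is an assumption chosen purely for illustration.

```python
TERABYTE = 10 ** 12   # bytes, decimal (SI) definition
ZETTABYTE = 10 ** 21  # bytes, decimal (SI) definition

terabytes_per_zettabyte = ZETTABYTE // TERABYTE

# Assumption for illustration only: every drive holds 2 TB.
DRIVE_SIZE_TB = 2
drives_needed = terabytes_per_zettabyte // DRIVE_SIZE_TB

print(f"{terabytes_per_zettabyte:,} TB in 1 ZB")
print(f"{drives_needed:,} drives of {DRIVE_SIZE_TB} TB each")
```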

When I see Jim Sterne (Jim Sterne!) taking part in a webinar talking about big data I feel confident that it’s going to be around for a while. (further note: Why isn’t Jim Sterne featured on the home page rotating images? Seems like that might help get people signed up . . .)

What does that mean for the web analyst?

Big Data

The existing web analytics tools largely do all the data querying for you. Select this check box, push that button there and OK data downloaded.

Building out all those cool models which are going to make your company more money can only be enabled with the correct implementation on the back end.

If, however, you let the DBAs of the world run it you might end up with an overly complex snowflake schema which produces very nice precision while recall and performance both suck.

Hadoop is free, for crying out loud; download the Virtual Machine from Cloudera and give it a go. It isn’t for everyone, but if you walk through the exercises you can at least get an understanding of big data.

The key benefits in my professional experience of the Hadoop platform are:

  • Flexibility of data analysis
    • I’m not stuck in a particular schema
  • Scale like crazy
    • Stuff it to the ears with data and keep going, still runs great!
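The "not stuck in a schema" point is easier to see with a toy version of the MapReduce model that Hadoop is built around. The sketch below runs in a single Python process and uses no Hadoop API; the record format, function names and data are made up for illustration.

```python
from collections import defaultdict

# Made-up, loosely structured records: no fixed schema is imposed up front.
records = [
    "2011-10-19 /home visit",
    "2011-10-19 /pricing visit referrer=twitter",
    "2011-10-20 /home visit",
    "malformed line",
]


def map_phase(record):
    """Emit (page, 1) pairs; how to read each record is decided at query time."""
    fields = record.split()
    if len(fields) >= 3 and fields[1].startswith("/"):
        yield fields[1], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


pairs = [pair for record in records for pair in map_phase(record)]
visits = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(visits)
```

Because the map step interprets each raw record when the question is asked, a record with an extra field or a malformed line does not require a schema change first. On a real cluster the same map and reduce logic is spread across many machines, which is where the scale comes from.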

Too lazy, er, not enough time to spin up the VM? Just take a look at Orbitz using R and Hadoop to optimize hotel search.

Get an idea of the trade offs of different solutions, so the next time an implementation rolls around you have something to contribute to the conversation.


1 zettabyte of data still blows me away, and my next thought is: how in the world is that data going to get into a database?

For my regular data, sure it isn’t a problem to set up a cron job which downloads and then performs the necessary ETL steps. That’s old school.

What about on a Tuesday afternoon, when I need some data which is available somewhere but not in my database?

In this instance, I wouldn’t describe creating an IT ticket as a ‘self-starter’ solution.

Programming Languages

ZDNet has a blog post on the fact that companies are ‘overwhelmed’ with unstructured data; this sounds like job security to those people who know how to apply structure to data on the fly.

As the data volume increases beyond human comprehension, the likelihood that data which is essential to your daily job function will be outside the existing structure is also increasing. There is only one reasonable way to get this done in a timely fashion:

Learn a programming language, which is certainly not a trivial task.

I taught myself Python for exactly this purpose and have reaped the rewards numerous times over.

Python is relatively easy to learn, and there is an entire book FREELY available to read on the Internet which will teach you the language if you start at the first page and work through to the end.
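As a taste of what applying structure on the fly looks like, here is a minimal Python sketch that turns raw text lines into structured rows with a regular expression and then summarizes them. The log lines, the pattern and the field names are all made up for illustration; real data would need a pattern written for its actual layout.

```python
import re
from collections import Counter

# Made-up lines in the style of a web server access log.
raw_lines = [
    '10.0.0.1 - - [19/Oct/2011:10:01:22 -0400] "GET /home HTTP/1.1" 200 5120',
    '10.0.0.2 - - [19/Oct/2011:10:01:25 -0400] "GET /pricing HTTP/1.1" 200 2048',
    '10.0.0.1 - - [19/Oct/2011:10:02:01 -0400] "GET /missing HTTP/1.1" 404 312',
    'garbage that does not match',
]

LINE_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def parse_line(line):
    """Turn one raw line into a dict, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["status"] = int(row["status"])
    row["size"] = int(row["size"])
    return row


# Keep only the lines that parsed; unparseable lines are skipped, not fatal.
rows = [row for row in map(parse_line, raw_lines) if row is not None]
status_counts = Counter(row["status"] for row in rows)
bytes_by_ip = Counter()
for row in rows:
    bytes_by_ip[row["ip"]] += row["size"]

print(status_counts)
print(bytes_by_ip)
```

A script like this is the Tuesday afternoon alternative to an IT ticket: the structure lives in a few lines of code that can be changed as soon as the question changes.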


  • Who has lots of free time at their work?
  • Who doesn’t have to work with non-quants?

Business Management

We are dealing with too many requests from too many business units, and the minute you start producing models those same business units just may start to have unrealistic expectations of the returns from those models.

Managing expectations, business relationships and all other aspects of a corporate existence is critical for every single analyst.

Aside from B-School there are some excellent resources on the MIT Open Courseware site, as well as numerous books on business management out there.

I also really like Thornton May’s ‘The New Know‘ for managing analytics specifically.


So you navigated the maze. You used quant methods on big data, which you queried with a programming language, on a project you guided to success.

Hoorah! Post the results on the wall of your workspace and then . . .

How the hell are you going to get the C level executives to listen to you?

Business Communication

Effectively communicating the insights to individuals who aren’t as excited about topic model classification as you are is required for you to advance.

Right around the same business section on management are solid communication books.

While you’re in the section, take a peek at a sales book or two. You know, if you want to sell your insights internally.

Just sayin.

Transformational Data

Of course people will experience success without mastery of these skills: people who have moved up in their organizations far enough that they supervise this sort of work.

I would argue even they need an understanding of what is possible, but not everyone would agree. If you are in that crowd at the top of the food chain, more power to you and enjoy the nice life.

If, on the other hand, you aren’t at the top of the food chain, keep in mind there is a population of individuals who are busting their ass on all of these points just hoping to catch a break and get into web analytics.

Still not sure?

Any company that needs to add a real go-getter should give Ed Fine a look; Ehren Cheung and Pandu Truhandito are both working right now, but bookmark them just in case.

Ed’s very interested in working on more projects related to web analytics, and he brings a very impressive skill set which is directly portable over to web analytics.


Feel free to holler at me on Twitter @MichaelDHealy, drop me an email at mdh [at] michaeldhealy [dot] com or post a comment.

Technorati Tags: Analytics, Career Development, Econometrics, Linguistics, Measure, Statistics

Posted in Analytics, Career Development, Excel, Python, R, Tools