Charting For Fun

Interesting Charts

Making Lemonade

If you are working toward the FREE eMetrics pass (and you really should be, if you need a free pass), here are some charts I created from the sample data. The data are limited in the insights they can provide, so just do your best and come up with something cool.

I used either R or Matplotlib to generate the charts below; both are fairly straightforward to learn and highly recommended.

Submittal Instructions

  • Send an email to emetrics@michaeldhealy.com
    • With the subject “Submittal”
    • ATTACH IMAGE FILE AND TEXT DOCS
    • On or before Friday February 11, 2011 12:00 Midnight PST
    • Winner notified Monday February 14 at some point BY EMAIL
  • Required In The Email:
    • Your Name
    • Your Email
    • Your Phone Number
    • Your Physical Address
    • FEEDBACK YES or NO
  • Optional:
    • Your Personal Website
    • Your Twitter Account
    • Your LinkedIn Profile
  • Sample Data Available Here

Visualizing Problems

What is nice about this chart, which shows the difference in Pageviews and Time on Site for the three Click Depth Visit segments, is how it illustrates potential segments or issues.

My first question is, why is there a separation in the High Click Depth Visit?

If our site is making money from ad impressions, we may want to figure out why some members of the High Click Depth Visit segment branch off to view fewer pages.

Not an answer, but that sure seems like an interesting question.
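For anyone who wants to try a chart like this in Matplotlib, here is a minimal sketch. It is not the code behind the chart above. The rows, the segment labels, the function names, and the output file name are all made up for illustration; swap in whatever columns the sample data actually has.

```python
from collections import defaultdict

# Hypothetical rows: (click_depth_segment, pageviews, time_on_site_seconds).
# These values are invented for illustration; they are not the contest data.
SAMPLE_ROWS = [
    ("Low", 2, 35.0),
    ("Low", 3, 50.0),
    ("Medium", 6, 140.0),
    ("Medium", 8, 210.0),
    ("High", 15, 600.0),
    ("High", 22, 380.0),
]


def group_by_segment(rows):
    """Return {segment: ([pageviews, ...], [time_on_site, ...])}."""
    grouped = defaultdict(lambda: ([], []))
    for segment, pageviews, seconds in rows:
        xs, ys = grouped[segment]
        xs.append(pageviews)
        ys.append(seconds)
    return dict(grouped)


def plot_segments(rows, path="click_depth.png"):
    """Scatter pageviews against time on site, one color per segment."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for segment, (xs, ys) in group_by_segment(rows).items():
        ax.scatter(xs, ys, label=segment)
    ax.set_xlabel("Pageviews")
    ax.set_ylabel("Time on Site (seconds)")
    ax.legend(title="Click Depth")
    fig.savefig(path)
    plt.close(fig)
```

Calling plot_segments(SAMPLE_ROWS) writes the image to disk. With real data, a split inside one segment should show up as two separate clouds of the same color.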

Time Parting

This chart compares the metrics (on a log scale, of course) for New and Returning visitors to the site by hour of the day. The New Visitor segment has some potentially interesting features, but the Returning Visitors are certainly interesting.

The Visits by Returning Visitors chart has two distinct bands, instead of a larger center that tapers out like the New Visits chart above it.

Add to that the Time on Site by Return Visitors, where there is a constriction in the chart from about 8 am until lunch time when it widens again, and the situation becomes worth investigating.
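The data preparation for this kind of time parting can be sketched in a few lines of Python. This is only an assumption about one way to do it, not the code used for the chart: the record layout, the values, and the function name are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical records: (visitor_type, hour_of_day, visits). Invented values.
RECORDS = [
    ("New", 8, 100),
    ("New", 8, 95),
    ("New", 13, 240),
    ("Returning", 8, 12),
    ("Returning", 8, 410),
    ("Returning", 13, 35),
]


def log_values_by_hour(records, visitor_type):
    """Return {hour: [log10(visits), ...]} for one visitor type."""
    by_hour = defaultdict(list)
    for kind, hour, visits in records:
        # log10 is undefined at zero, so empty observations are skipped.
        if kind == visitor_type and visits > 0:
            by_hour[hour].append(math.log10(visits))
    return dict(by_hour)


new_by_hour = log_values_by_hour(RECORDS, "New")
returning_by_hour = log_values_by_hour(RECORDS, "Returning")
```

Each hour then holds a list of log-scaled values that can be handed to whatever plot you prefer. Two distinct bands in a segment would appear as two clusters of values within the same hour.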

Box and Whisker Diagrams

Delving into the difference between New and Returning visitors, here are a pair of box plots of just the Visits metric.

Box plots are a plotting method that summarizes the distribution of a variable. What is nice is that comparing the box plots of the segments over the hours of the day can help move the analysis along.

The box covers the larger portion of the population (the middle half, between the first and third quartiles), and the whiskers cover the smaller portions in the tails. Looking at both the New and Return visits, it is clear that the boxes for New Visits are much smaller, which shows that the Return Visits vary more with the time of day.
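To make the box and whiskers concrete, here is a rough sketch of the numbers behind a single box, using only the Python standard library. It assumes the common 1.5 × IQR whisker rule, which is also Matplotlib's default. The function name and the input values are made up for illustration.

```python
import statistics


def box_stats(values):
    """Quartiles, whisker ends, and outliers for one box (1.5 * IQR rule)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Invented visit counts for a single hour of the day.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
```

Running this once per hour for each segment gives a quick numeric check: a small iqr means a short box, so comparing the iqr values for New and Return visits hour by hour mirrors what the box plots show.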

More Charts

Look for more charts tomorrow. I may even get so curious about the source of variation in the Return Visits that I will post the answer.

If that is your submission . . . better get it in soon!

I have updated the sample data again; the most current version is here.

Questions? Email me about the contest at emetrics@michaeldhealy.com, or at mdh@michaeldhealy.com for anything else.

