Explaining #Measure in 2010 – Limitations and Improvements of Twitter Content Valuation

We’re Only as Accurate as Our Least Accurate Measurement

Wax On, Wax Off

The Twitterati chart posted previously turned into something I didn’t quite expect, so I would like to take a moment to explain the whole situation to the sensei of the #measure world.

Web Analytics Association Spring Gala

My understanding is that all the prominent people involved are attending the WAA Spring Gala on March the 15th. Interested in catching up with us in person?

Grab your ticket on the Web Analytics Association site.


  • The data are limited to tweets tagged with the “measure” hashtag
    • Tagged it with “measure” immediately followed by punctuation?
    • May not be there
    • Tagged it with “socialmedia”?
    • Not there
  • The data are further limited to those collected by Twapper Keeper

Working with Twitter, I know 100% data coverage can be . . . challenging.

Data Definition

Tweets appropriately tagged with “measure” and successfully captured by Twapper Keeper during the date range of January 6, 2010 to December 31, 2010 comprise the data set.

Data Cleansing

My initial analysis indicated that Twapper Keeper captured both “#measure” and “measure”, the hashtag and the keyword, so I removed the tweets that did not carry the hashtag.
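A minimal sketch of a cleansing step like this one, assuming each tweet is available as a plain string. The pattern, the function name and the sample tweets are all hypothetical; the post does not say how the filtering was actually done.

```python
import re

# Hypothetical pattern: "#measure" as a whole hashtag, in any letter case.
# The lookbehind rejects "abc#measure"; the \b rejects "#measurement".
HASHTAG = re.compile(r"(?<!\w)#measure\b", re.IGNORECASE)


def keep_hashtagged(tweets):
    """Return only the tweets whose text carries the #measure hashtag."""
    return [text for text in tweets if HASHTAG.search(text)]


# Made-up sample tweets, for illustration only.
sample = [
    "New post on segmentation #measure",
    "How do you measure engagement?",
    "Great session today #measure.",
    "Thoughts on #measurement plans",
]
kept = keep_hashtagged(sample)
```

In this sketch a hashtag immediately followed by punctuation still counts, which only helps if the capture tool collected that tweet in the first place.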


Twitter measurement tools give some value to being re-tweeted, which is certainly a measurement of something. Whether that something is popularity, utility or something else is, I believe, undecided at this point.

Take, for example, a couple of recent tweets by prominent members of the #measure community:

BeyWebAnalytics: Episode 40 – Shaking the baby with Evan Lapointe | Beyond Web Analytics! http://ow.ly/3Wts1 #measure

Cool! A new episode with Rudi, Gary and Adam, with Evan Lapointe as a guest. This tweet was re-tweeted by two people.

EricTPeterson: Amazed at how @analysisxchange is taking off under Wendy’s guidance. If only there was an award I could nominate her for!

Re-tweeted a grand total of one time, by me, this tweet is complex for machines to read: it adds Eric’s opinion on who you should vote for in an award . . . without explicitly mentioning the award!

Re-Tweeting to Success

I’m not sure how re-tweets are taken into account. If they are scored on a linear scale, how does the value of the @BeyWebAnalytics tweet compare to the value of the tweet by @EricTPeterson? Is it double the value because two people re-tweeted it?
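To make the arithmetic concrete, here is a small sketch of what a linear scale would imply for these two tweets. Both scoring functions are hypothetical; I don't know how any real tool weights re-tweets. The damped (logarithmic) version is included only as one possible contrast.

```python
import math


def linear_value(retweets, weight=1.0):
    """Value grows in direct proportion to the re-tweet count."""
    return weight * retweets


def damped_value(retweets, weight=1.0):
    """Hypothetical alternative: each extra re-tweet adds a little less."""
    return weight * math.log1p(retweets)


podcast_retweets = 2  # the @BeyWebAnalytics tweet
opinion_retweets = 1  # the @EricTPeterson tweet

# How many times more valuable the podcast tweet looks under each scheme.
linear_ratio = linear_value(podcast_retweets) / linear_value(opinion_retweets)
damped_ratio = damped_value(podcast_retweets) / damped_value(opinion_retweets)
```

On the linear scale the podcast tweet comes out at exactly twice the value; the damped scale narrows the gap, but it still ranks the podcast tweet higher.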

I like the @BeyWebAna podcast; however, in terms of what I want to see in real time, I value @EricTPeterson’s tweet about the Analysis Exchange much more highly than the @BeyWebAnalytics podcast tweet.

Finding a way to value tweets that are closer to real interactions, over tweets that are primarily marketing in nature, is what I am ultimately curious about.


Another interest is plagiarism, which Stéphane Hamel recently blogged about. Part of my comment on his post was about my perception that plagiarism on Twitter is increasing: in my own observations, use of the ‘via’ indicator has decreased.

I have seen content, whether links to articles or the text of tweets themselves, shared without attribution. This doesn’t happen often, but even when it is infrequent it isn’t OK.

Valuation of Content

Figuring out how we, as a community, can value content on Twitter within the extreme limitation of 140 characters is a positive move forward. Whether Twitter influence stems directly from good Twitter content was not proven in my analysis, so, as a good scientist, I probably could have made a different claim.

The spiral of over-promotion has spread to what should be our addition to social media, you know, just one of the most important technological developments in the history of mankind.

Variety of Content

I used entropy as a measure of the variety of topics. Commonly used in information theory, entropy can be an effective measure of disorder in a container.

I made no qualifications, predictions or guarantees regarding the semantic profile of the content. I just made the claim that people talk about different topics, and that some people talk about different stuff more frequently.
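For readers who have not met entropy before, here is a minimal sketch of Shannon entropy over topic labels. It assumes each tweet has already been assigned a single topic label; the labels, the function name and the choice of base 2 (bits) are illustrative assumptions, not a description of the analysis behind the chart.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of a list of topic labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(p) over the observed share p of each topic, then negate.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up labels: one person sticks to one topic, another spreads evenly.
one_topic = ["analytics"] * 4
four_topics = ["analytics", "testing", "tagging", "seo"]

low = shannon_entropy(one_topic)
high = shannon_entropy(four_topics)
```

Someone who always tweets about the same topic scores zero, and the score rises as the tweets spread more evenly across more topics. It says nothing about whether the content is any good.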

Consistency in Self-Tagging

The big winner was @Ulyssez, and people were taken aback that he could be the winner.

Simple: my guess is that he self-tagged with the “#measure” hashtag more consistently than anyone else in the community.

Development of Networks

I actually thought the more inflammatory portion of the chart (click through the image above for the high-resolution copy) would be the networks that developed.

The arrows leading in or out of a person are messages to or from that person.
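A rough sketch of how arrows like these could be derived, assuming each tweet is an (author, text) pair and that an @mention counts as a message to that person. The usernames, the function name and the mention rule are all hypothetical; the chart itself may have been built differently.

```python
import re
from collections import Counter

# Assumption: any "@name" in the text is a message to that person.
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed (sender, recipient) edges from (author, text) pairs."""
    edges = Counter()
    for author, text in tweets:
        for recipient in MENTION.findall(text):
            # Ignore self-mentions; compare names case-insensitively.
            if recipient.lower() != author.lower():
                edges[(author.lower(), recipient.lower())] += 1
    return edges


# Made-up tweets, for illustration only.
sample = [
    ("alice", "@bob great post on tagging #measure"),
    ("bob", "@alice thanks! #measure"),
    ("carol", "Reading about dashboards #measure"),
    ("alice", "@bob one more question #measure"),
]
edges = mention_edges(sample)
```

Each key is one arrow and its count is how often that arrow was used. Someone like the made-up "carol", who tags tweets but addresses no one, ends up with no arrows at all.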

  • The groups on the side of the chart, why are they on the side?
  • Do they realize that they are outside the core conversation?
  • The groups towards the center of the chart, what are they talking about?
  • Is there anything one of the outlier groups is talking about that the inside groups should listen to?

Valuing the Community

Eric T. Peterson and Jeff Katz have done an excellent job moving us in that direction with Twitalyzer; my hope is that companies such as Twitalyzer continue to evolve and succeed.

It is incumbent upon us, the community, to encourage the use of tools which are based on data and not on the awesomeness of a GUI.

The sooner we draw a demarcation between the companies we know provide quality product, and those that do not, the sooner we can move forward measuring all those things we’d like to.

