
May 22, 2010


Comments on "I'm 100% Convinced that 50% of PR and Social Media Metrics Are Wrong":

Anna Miotk

Hello Katie, I found your post very interesting, but I have some questions. Do you mean solutions where the client first defines the search criteria and the content is gathered afterwards? I ask because our solutions, NewsPoint and NewsPoint Social Media, follow a different philosophy: we collect the content first and then search within our database. NewsPoint (which is the Opoint Norway solution) also offers a so-called intelligent search that gathers only relevant information. But I agree that it should be checked from time to time to verify the search.
The problem of omitted results is also not easy to solve. New pages are constantly being created on a given website, and spiders cannot add them to the collection of crawled pages automatically. A human must do this; if we let the spider do it by itself, it will also add parts of the site that contain no articles, for example pages about the advertising offer. When the collection of websites is large and some sites are complex and keep changing their structure (the websites of national newspapers, for example), omissions happen often. As far as I know, no solution yet fixes this problem 100%.
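The spider problem Anna describes, a crawler that would otherwise sweep in advertising and other non-article pages, is often handled with hand-maintained URL-pattern filters. A minimal sketch (the patterns and URLs are invented for illustration, not taken from NewsPoint):

```python
import re

# Hypothetical patterns: on many news sites, article URLs share a
# recognizable shape (a date path, an /article/ or /news/ segment),
# while advertising and "about" pages do not.
ARTICLE_PATTERNS = [
    re.compile(r"/\d{4}/\d{2}/\d{2}/"),   # date-based paths
    re.compile(r"/article/"),
    re.compile(r"/news/"),
]

def looks_like_article(url):
    """Return True if the URL matches any known article pattern."""
    return any(p.search(url) for p in ARTICLE_PATTERNS)

crawled = [
    "http://example.com/2010/05/22/markets-rally",
    "http://example.com/advertise-with-us",
    "http://example.com/article/12345",
]
articles = [u for u in crawled if looks_like_article(u)]
```

The catch is exactly the one Anna names: every site redesign silently breaks the patterns, so a human has to keep them current, and on a large, changing collection that maintenance is never fully done.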
I agree with your opinion on automated content analysis. As I have noticed, some media monitoring companies let the user change the automated results, which is one way to address the problem. Of course, we can also do the analysis manually (or order it from a media monitoring company), but what if we have a large number of articles and want to analyze all the information sources?
I also agree that some clients expect content analysis to yield information that cannot be obtained with this method. I had a client who wanted to know which problems connected with his area of interest people from a chosen industry discuss on the Internet. Our analysis showed that those people do not discuss it on the Internet at all, and the client blamed us. We told him that this finding is also relevant (for example, for planning the PR tools in a future campaign), and we advised in-depth interviews with selected people from that industry as the better research method.
(I linked to this post on my blog, in Polish, with the same comments I placed here.)


Ignoring the obvious bias of your cause (which you very honestly point out, so kudos for that), the point you make is very valid - though I'm not sure that *anyone* has got it completely nailed yet.

The complexities of language are hard enough for any algorithm as it is. It gets even worse though when you're looking internationally. You make the point about your 'Wicked Cool' New England phrasing - just TRY to get a sentiment tool to pick up sarcasm, or dry humour, or any kind of British sentiment. As a company trying to get a good sentiment tool here in the UK, it's almost impossible to get one that can cope with the Queen's English in anything but the most dry, corporate language.

Don't even get me started on companies that have mistakable or dual-meaning names...!


Thanks for continuing to beat the drum about the very common and disturbing challenges associated with automated monitoring tools. I originally noted the issues of spam, incomplete retrieval and bad sentiment analysis in a 2007 post that explained what “buzz/brand monitoring” tools do and how they collect, process and analyze data. I got some interesting responses to that post, most of which boiled down to two key themes: 1.) We’re working on it (I live in Washington, D.C. so I’ve heard this one before) and 2.) BTN (It’s better than nothing). Three years later, the problems really haven’t improved all that much and….is it really better than nothing when:

1.) You believe your "share of voice" is 3x higher than it really is because your buzz-monitoring tool failed to pick up a large portion of the conversation (making share impossible to calculate)

2.) ½ of the neutral mentions of your brand (which made up the bulk of the records returned) weren’t really neutral at all but negative

3.) The other half of the neutral mentions were mostly spam (non-relevant) which weren’t filtered out of the data set as they should have been

4.) The most important prospective customers are talking about your brand and your competitors’ in three key forums that are behind a firewall, meaning that NONE of these high value conversations were included in your data set

Given the above (which I’ve seen happen repeatedly), how useful is the data set you are using as a basis for decision-making?
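Point 1 above is easy to see with some made-up numbers (a sketch, not anyone's real data):

```python
# Hypothetical counts: the tool captures all of your brand's mentions
# but misses most of the competitors' side of the conversation.
brand_captured = 30
competitor_captured = 70
reported_share = brand_captured / (brand_captured + competitor_captured)  # 0.30

# Suppose the tool actually missed 200 competitor mentions.
competitor_actual = competitor_captured + 200
actual_share = brand_captured / (brand_captured + competitor_actual)      # 0.10

inflation = reported_share / actual_share   # ~3.0: "3x higher than it really is"
```

The reported 30% share looks healthy; the real figure is 10%, and nothing in the tool's output hints that anything is missing.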

Linguistic analysis just isn’t easy and no one has this figured out yet, so these problems are not likely to go away in the near future. At Serengeti, we are experimenting with various coping strategies, one of which is to create smaller “custom neighborhoods” for the purpose of monitoring and measurement. This is almost like ongoing sampling – it reduces the data set, so it allows for cost-efficient human analysis and less spam/irrelevant records clogging up the data in the first place. Can you measure share of voice by doing it this way? No. But you’re not really measuring it (not accurately anyway) by using an automated tool either.
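The "custom neighborhood" idea described above resembles ongoing sampling, which might be sketched like this (a minimal illustration with invented names and sizes, not Serengeti's actual process):

```python
import random

# Hypothetical: 10,000 retrieved records, far too many for human coding.
# A fixed-seed random sample gives a reviewable subset sized to the
# analysis budget.
random.seed(42)                       # reproducible sample for auditability
records = [f"mention-{i}" for i in range(10_000)]
sample = random.sample(records, 200)  # 2% sample for human sentiment coding

# Human coders score only the sample; findings are reported as
# estimates with sampling error, not as a census of every mention.
```

A defined, smaller pool also means less spam and fewer irrelevant records enter the data set in the first place, which is the real cost saving.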

Making good marketing and PR decisions at this stage of the game requires multiple tools and more often than not, a somewhat customized yardstick. We’re not anywhere close to perfection in the measurement game, but we can still go a long way for our clients with some thoughtful combinations of tools and metrics, and – as you point out -- planning for the variables that actually mean something.

Thanks again for a great post.


I hope I’m the sassy intelligent chick you’re doing business with :-) – I do know I’m the one that triggered that example. I won’t subject your readers to my growing list (54 and rising) of “SAS” acronyms, but suffice it to say that I’ve been shocked for years at the low play the data quality issue gets in public discussions about measurement. The dilemma is huge – do you gather everything and face masses of unqualified content, or put on your best Boolean hat and risk filtering out important information?
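The Boolean dilemma in that last sentence can be made concrete with a toy example (hypothetical posts and queries):

```python
# Three hypothetical posts about a brand called "Acme Corp".
posts = [
    "Acme Corp earnings beat expectations",
    "Buy cheap acme watches now!!!",                       # spam
    "That new gadget from the Acme folks is wicked cool",  # relevant, but phrased loosely
]

# Broad query: catches everything, spam included.
broad = [p for p in posts if "acme" in p.lower()]          # 3 hits

# Narrow query: drops the spam, but also drops the real mention
# that never uses the exact phrase "Acme Corp".
narrow = [p for p in posts if "acme corp" in p.lower()]    # 1 hit
```

Neither query is "right": the broad one floods the analyst with unqualified content, and the narrow one silently filters out the very conversations worth finding.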

It’s inevitable that this would become a discussion among vendors, as we’re the ones who experience the pains across multiple searches, sources, targets and topics. And we cringe at the claims of the automated systems that minimize the effort that goes into building a system that comes close to accuracy.

I agree with Anna – it’s vital to let users adjust and correct the automated judgments of software. The black-box approach may be cheap, but it’s irresponsible. As a humble plug for SAS’ approach, SAS Social Media Analytics not only lets users change the assessment of an individual article, it lets you actually change the rules the software uses to make its decisions – continuous improvement that can then be applied to previous data, so you’re always comparing apples to apples.
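As a toy illustration of the user-adjustable-rules idea (emphatically not SAS's actual engine, just the general pattern): when the rules live in an editable table, re-running the scorer over stored articles applies the corrected rules to old data too, keeping comparisons apples-to-apples.

```python
# A hypothetical rule table the user can edit: phrase -> sentiment.
rules = {"love": "positive", "broken": "negative"}

def score(text, rules):
    """Return the sentiment of the first matching rule, else 'neutral'."""
    lowered = text.lower()
    for phrase, sentiment in rules.items():
        if phrase in lowered:
            return sentiment
    return "neutral"

articles = ["This camera is wicked cool", "My camera arrived broken"]

before = [score(a, rules) for a in articles]   # ['neutral', 'negative']

# A user teaches the engine the regional idiom, then re-scores the
# archive so historical data reflects the corrected rules.
rules["wicked cool"] = "positive"
after = [score(a, rules) for a in articles]    # ['positive', 'negative']
```

Real systems use far richer linguistic rules than substring matching, but the workflow is the point: correct the rule once, then re-apply it everywhere, past and present.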

Which brings up another point that seems frequently overlooked – you need to view your data over time, the longer the better, to see trends and patterns. Many of the black-box low-cost systems limit you to a 45-day view.

Robert Austin, APR

The example of audience identification is, I feel, the most important point here. If you haven't laid out your (measurable) objectives clearly and properly in the first place, you are never going to measure what matters.
