The Measurement Standard

May 22, 2010

I’m 100% Convinced that 50% of PR and Social Media Metrics Are Wrong

...and if this were accounting, you'd be in jail.

by Katie Delahaye Paine

Having recently attended a number of measurement presentations at a variety of conferences, I’m now convinced that most marketers and communications professionals are cheerfully going through life with blinders on. Those blinders are made of a flimsy gauze of questionable accuracy, incomplete variables, and general apathy. Today’s marketers have taken “fuzzy math” to a whole new level.

The most egregious example of today's inaccurate public relations and social media measurement is the use of free automated sentiment analysis. The vast majority of sentiment analysis tools get it right about 45% of the time. Which means that if you use those “measurement” tools, your results are more than half wrong. And if this were accounting, you'd be in jail. (In the interest of transparency and full disclosure, I work with SAS, which has a sentiment analysis tool that is 90+% accurate and is tested against human coders.)

No one seems to mind this sloppy work because it’s “just PR” or “just marketing.” Well, I’m here to tell you that it’s your job, our industry, and our credibility that are on the line. The only way we in PR and communications can be credible is to at least attempt to base our decisions on reliable, complete, and accurate data. Which is why I created Katie Delahaye Paine's Accuracy Checklist for Public Relations Measurement and Social Media Measurement. Go get your free copy right now.

There are four areas where I think most of the industry gets it wrong:

  1. Relevancy of content
  2. Commercial services omit results
  3. Accuracy of content/sentiment analysis
  4. Incomplete assessment of variables

Here we go:

1. Spiders Aren't Smart Enough to Pick Your Content

Back in the old days, I’d have a team of people physically looking at publications and selecting only those articles that matched the client's criteria. In other words, the content was actually about the company and/or the product and had some bearing on a customer’s purchasing decisions. Today's electronic searches are a big help, but we still need human reviewers to check up on things.

Unfortunately, most spiders today just aren’t very smart. They aren't smart enough, for instance, to determine that an article that talks about a tax bill to which “small business objects” has nothing to do with the database company Business Objects. And they can’t tell the difference between a spike in coverage because of good PR for “Visa, a sponsor of the Olympics” and “I need a visa to go to the Olympics.”

In some cases up to 90% of what we collect with an electronic search can be irrelevant. You need a very sophisticated Boolean search string to even get close to accurate results, and those still need to be checked by humans. Or else you end up with “I met a really sassy intelligent chick in the Business School,” when you search for “SAS business intelligence.”
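
To make this concrete, here’s a minimal sketch, in Python, of the kind of context filtering a search needs before a human reviewer ever sees the results. The context words and the false-positive pattern are invented for illustration; a real system would carry far more rules, and you’d still spot-check the output by hand.

    import re

    # Hypothetical disambiguation rules for "Business Objects" mentions.
    # A real search would need far more of these, plus human spot-checks.
    CONTEXT_WORDS = {"software", "database", "report", "query"}
    FALSE_POSITIVE = re.compile(r"small\s+business\s+objects?\b", re.IGNORECASE)

    def is_relevant(text, brand="business objects"):
        """True if the text plausibly mentions the brand, not a coincidental phrase."""
        lowered = text.lower()
        if brand not in lowered:
            return False
        if FALSE_POSITIVE.search(text):
            return False  # e.g. "a tax bill to which small business objects"
        return bool(set(re.findall(r"[a-z]+", lowered)) & CONTEXT_WORDS)

    mentions = [
        "Business Objects released a new database query tool today.",
        "The tax bill is one to which small business objects strongly.",
    ]
    print([m for m in mentions if is_relevant(m)])  # only the first survives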

2. Commercial Services Omit Results

Then there’s the issue of omission. The average content provider picks up just a fraction of actual Tweets and an even smaller selection of Facebook threads. If they say they can do better, do your own search on search.twitter.com or just compare with your average Google search. In about 5 out of 6 systems we tested, Google and Twitter outperformed the commercial services.
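
If you want to run that check yourself, the arithmetic is trivial: treat your own search.twitter.com or Google results as the baseline, and see what share of them the commercial service actually returned. A toy version in Python, with made-up IDs standing in for real tweets:

    # Made-up result IDs; substitute the tweet IDs or URLs from your own searches.
    baseline_ids = {"t1", "t2", "t3", "t4", "t5", "t6"}  # from your own search.twitter.com query
    provider_ids = {"t1", "t4"}                          # what the commercial service delivered

    coverage = len(provider_ids & baseline_ids) / len(baseline_ids)
    print("Provider coverage: {:.0%}".format(coverage))  # 33% in this invented case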

3. Accuracy of Content Analysis

After you’ve screened out all the crap and have a solid database of mentions, you then need a way to accurately analyze that content. As I said above, the solution for everyone today seems to be automated sentiment analysis. There’s a good reason it is so popular: Wouldn’t it be wonderful to simply hit a few buttons to determine what customers actually thought about your brand? Well, dream on. Most sentiment analysis doesn’t even come close.

First of all, most sentiment analysis systems get it right about 50% of the time, and you get what you pay for. A cheap system will get it wrong even more often. You need a sophisticated system supplemented with human coders to get anywhere close to accurate results.
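
What does “supplemented with human coders” look like in practice? One common arrangement is to keep only the machine’s confident calls and route everything else to people. Here’s a sketch in Python; classify() is a deliberately crude stand-in for whatever sentiment tool you actually use:

    POSITIVE = {"great", "love", "cool"}
    NEGATIVE = {"awful", "hate", "broken"}

    def classify(text):
        """Toy stand-in for a real sentiment tool: returns (label, confidence)."""
        words = set(text.lower().split())
        pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
        if pos == neg:
            return "neutral", 0.5  # no clear signal: low confidence
        label = "positive" if pos > neg else "negative"
        return label, min(0.6 + 0.2 * abs(pos - neg), 0.95)

    def triage(mentions, threshold=0.8):
        """Keep confident machine calls; send the rest to a human coder."""
        auto, for_humans = [], []
        for text in mentions:
            label, confidence = classify(text)
            if confidence >= threshold:
                auto.append((text, label))
            else:
                for_humans.append(text)  # a human coder makes the final call
        return auto, for_humans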

Secondly, no amount of automated sentiment analysis can tell you what people think. You either need to ask them their thoughts, or hook them up to a sophisticated brain scanner that will ferret out the information. What sentiment analysis does is report back to you the words associated with your brand, and how people are discussing your product or services.

Lots of times computers misinterpret those words. So if I say I found a wicked cool restaurant, the computer has no way of knowing that I’m from New England and that’s a compliment. Worse still, if I mentioned that I saw the play Wicked after eating at that wicked cool dining spot, it might suggest burning the restaurant and all its occupants at the stake. Most computers don’t understand the irony and sarcasm of today’s conversations.

So what’s an acceptable level of accuracy? If you can get computers to agree with human coders 80% of the time, you’re doing really well.
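
That 80% figure is plain percent agreement: code a sample by hand, run the tool over the same sample, and compare. Computing Cohen’s kappa alongside it tells you how much of that agreement is better than chance. The labels below are invented for illustration:

    from collections import Counter

    # Invented labels for ten mentions, coded once by a human, once by the machine.
    human   = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "pos", "neg", "pos"]
    machine = ["pos", "neg", "neu", "neg", "neg", "pos", "pos", "pos", "neg", "pos"]

    n = len(human)
    agreement = sum(h == m for h, m in zip(human, machine)) / n

    # Cohen's kappa corrects for the agreement you'd expect by chance alone.
    h_counts, m_counts = Counter(human), Counter(machine)
    chance = sum(h_counts[lab] * m_counts[lab] for lab in h_counts) / n ** 2
    kappa = (agreement - chance) / (1 - chance)

    print("Percent agreement: {:.0%}".format(agreement))  # 80% here
    print("Cohen's kappa: {:.2f}".format(kappa))          # about 0.67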

4. Incomplete Assessment of Variables

The biggest blinders of all are the assumptions we all make about what “causes” something to happen. So you put a whole lot of effort and energy into a program and you expect web traffic, or registrations, or whatever, to increase. And many times it does. But not always. And most of the time you don’t know why, because you’ve left out some key variable in your analysis.

Take for instance some work my company, KDPaine & Partners, did for a major national charity. After they did a fabulous PR job and saw overall exposure triple, we surveyed the national audience and found zero increase in awareness. Some would conclude that the entire PR program was a colossal failure. Except that the target audience wasn’t “everyone in America,” it was people with a connection to the military. And when we narrowed our analysis to that target audience, awareness and relationship scores went up, as did likelihood to contribute and volunteer.

We had enough foresight to include a question about military affiliation in the national survey. But if we hadn’t, we’d never have known that the program was successful only among those groups who were actively being targeted.
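
The arithmetic behind that save is nothing fancy; it’s just breaking the results out by the screening question instead of averaging over everyone. A toy sketch in Python, with invented survey records standing in for the real data:

    # Invented records; "military" stands in for the screening question we asked.
    surveys = [
        {"aware": True,  "military": True},
        {"aware": True,  "military": True},
        {"aware": True,  "military": True},
        {"aware": False, "military": False},
        {"aware": False, "military": False},
        {"aware": False, "military": False},
        {"aware": False, "military": False},
        {"aware": False, "military": False},
    ]

    def awareness(rows):
        return sum(r["aware"] for r in rows) / len(rows) if rows else 0.0

    overall = awareness(surveys)
    target = awareness([r for r in surveys if r["military"]])

    print("Overall awareness: {:.0%}".format(overall))  # 38%: looks like a flop
    print("Target audience: {:.0%}".format(target))     # 100%: the program worked where it was aimed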

AT&T and Bruce Jeffries-Fox have done a great study on the importance of the interaction of variables, finding that PR and certain key messages actually impact sales and loyalty far more than previously thought.

Frequently it’s the presence (or absence) of a key message that has the greatest impact on consumer behavior. But if you’re not tracking your key messages, you have no way of knowing which message is driving behavior.

And just as frequently, it is the presence of conversations about the competition that drives behavior concerning the organization you are interested in. Again, if you’re not tracking the competition, you’ve left out a key variable that you will need if you want your research to be accurate.


Comments

TamarUK

May 25, 2010

Ignoring the obvious bias of your cause (which you very honestly point out, so kudos for that), the point you make is very valid – though I’m not sure that *anyone* has got it completely nailed yet.
The complexities of language are hard enough for any algorithm as it is. It gets even worse, though, when you’re looking internationally. You make the point about your ‘Wicked Cool’ New England phrasing – just TRY to get a sentiment tool to pick up sarcasm, or dry humour, or any kind of British sentiment. As a company trying to find a good sentiment tool here in the UK, we’ve found it almost impossible to get one that can cope with the Queen’s English in anything but the most dry, corporate language.
Don’t even get me started on companies that have mistakable or dual-meaning names…!
