Fortune 500 Corporate Blogs: Disclaimers, Caveats and Limitations
First Things First
All data related to Fortune 500 corporate blogs was collected as part of my Master’s thesis. I realize by itself that doesn’t mean much to anyone except me, but what it does mean is that the data presented here was collected for a specific purpose. And that in turn places limits on its usefulness in a more general context.
For example, strict time and resource limitations affected how the research was designed, which data points were collected, how they were measured and what was omitted. That said, all data was collected and analyzed in good faith, and everything on this blog is presented in the same spirit.
Corporate Blogs
To my knowledge there is no objective or widely accepted way to classify ‘corporate’ blogs. Each study has tended to use its own criteria and that was the case here. The criteria used to define corporate blogs for my study may be one of many reasons why data on the companies that blog and the number of blogs differ from other sources.
There were three special cases amongst the corporate blogs I identified. Microsoft, Oracle and Sun all published portals that aggregate anywhere from a dozen to several thousand employee blogs. Each portal was counted as a single blog in the total, and due to time constraints no posts were collected for the Microsoft and Sun portals.
Timing
Everything presented here represents a snapshot in time. All data is based on corporate blog posts published between July 1st, 2008 and August 31st, 2008. Blogs launched or shut down since then are not accounted for. In Internet time, that is already a long while ago, so I expect a lot has changed. I realize this affects the usefulness of the data, but I wanted to share it anyway.
Methods and Tools
Blog memes are dynamic and change over time. As a result, data on blog meme size and structure also represents a snapshot in time. Blog memes were derived using Blogpulse, so the data is subject to its capabilities, nuances and limitations in collecting memes. Similarly, sentiment scores were derived by Sentimine using a custom configuration, so the same caveats apply.
Blog comments were not included in the study. I believe comments form an interesting and integral part of blog memes and I would have liked to have captured them. But, like I said at the top, this was for a Master’s thesis and it was not possible to include everything.
Analysis and Conclusions
It’s hard to generalize findings on corporate blogs, corporate blogging activity, or the size, structure and sentiment of corporate blog memes. Companies found to publish corporate blogs skew fairly heavily towards technology and tech-related companies, so the sample is not representative of all companies or even of the Fortune 500.
At present, it is only possible to measure sentiment on a coarse three-point scale (positive, negative or neutral), which means varying extremes of sentiment are not reflected in the data. In addition, sentiment was assigned at the document level, rather than related to a specific topic or object. And finally, sentiment detection remains a very difficult problem to solve in general, so no sentiment score is perfectly representative.
And Finally…
There are (probably many) more limitations than those listed on this page. I may add more as I think of them or if I receive questions on particular issues.
My name is Phillip Baker and this is my personal blog about finding value in a world of free information.