Techmeme: Web 2.0 Discovery, with a Web 1.0 twist!
Jeremiah Owyang wrote an interesting post yesterday: The Five Members of the Techmeme Family - in which he lists the different types of bloggers that end up on Techmeme. I think he's right on the money; as an avid follower of the site, I've seen the same dynamics at play.
For technology watchers and bloggers, Techmeme is a gold mine, an invaluable resource that constantly highlights breaking news, unique perspectives and interesting blog posts. Through the site, I've discovered some amazing writers and their high-quality work: Scott Karp on Can Blogs Do Journalism?, Fred Wilson's incisive post - What My Kids Tell Me About The Future of Media, Jeremy Liew's ongoing series about the Semantic Web - Meaning = Data + Structure, Dale Dougherty's wonderful post on Journalism is Burning Or How Breaking News is Broken, and so many others.
In his post, Owyang also looks at how posts are ranked on Techmeme. What's interesting is that the person who breaks the story does not necessarily get the lead; a more mainstream news source or blogger often becomes the "top node", even when he or she merely repeats the story without adding any content or unique insight. This is a reasonable approach from an automated content discovery perspective, but it sometimes gives funny results.
As Owyang says:
...
The Breaker: This can be mainstream news source or a mainstream blogger that discovers the story from the Original News Source and blogs it, as a result, they often become the top node, even if they aren’t the original source. It seems as if some websites are naturally geared to be an “H1” even if they are resonators.
The Resonator: Also referred to as those who echo or copy, they repeat what was already said, adding little or no additional content, news or opinion.
...
As an example, consider this Techmeme snapshot from 5:55 PM ET, December 31, 2007 - the image below shows a fragment of that page.

At that time, the big news of the moment was about an executive defection, er, employment change - Steve Souders, Chief Performance Yahoo, left his post at Yahoo! to join Google.
What is interesting to note is the ordering of the various stories on the Techmeme web site.
The lead story on this topic is the Silicon Alley Insider post by Henry Blodget - an A-list blogger. Now, Mr. Blodget is a fine writer and SAI is a great blog, but this particular lead story is written mostly as a breaking-news flash, with minimal opinion and no startling insights. (Where is the story behind the story?)
However, the story had already been broken by techno.blog on the previous day (according to the respective blog post time stamps), so it wasn't really breaking news by the time it appeared on Silicon Alley Insider. And others - for example, Donna Bogatin and Ashkan Karbasfrooshan - provide a lot more content and, arguably, much more insight. So how did the big-T pick Blodget's post as the lead?
My belief is that the Techmeme algorithms choose their lead based on the prominence of the source and on the number of links to a given post (two factors that are generally highly correlated in any case).
This is fine and generally works well. Are there other options, other algorithms that can be used to choose the lead for a developing story, that could highlight the more meaty posts? A few possibilities come to mind:
- Reader Votes: Within the set of posts for a developing story, allow readers to vote for the ones they like best, so that the most popular ones rise to the top.
- Link Count: Examine the cross-linking between posts to leverage the implicit knowledge therein, similar to Google's PageRank algorithm. I believe Techmeme already incorporates this to some extent.
- Bookmark Count: Examine how often different posts are bookmarked on popular social bookmarking services like del.icio.us.
- Human Editors: Use human editors to select the top leads. Of course, this may prove too expensive and/or cumbersome.
- Author Markup: Enable authors to include metadata in some standard format for their posts. By using markup or tags such as "news", "opinion", "analysis", "multi-idea" and so on, authors could indicate the type of their post to the selection engine. Admittedly, this approach is susceptible to gaming, although it could be combined with voting to improve quality.
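
To make the idea concrete, here is a minimal sketch of how several of these signals might be blended into a single score for picking a lead. Everything in it is an assumption made for illustration: the `Post` fields, the weights, the type bonuses and the sample numbers are all hypothetical, and none of it reflects how Techmeme actually works. Human editors are left out, since an editor's pick would simply override the score.

```python
from dataclasses import dataclass

# All names, weights and numbers here are illustrative assumptions,
# not Techmeme's actual data or algorithm.

@dataclass
class Post:
    source: str
    reader_votes: int = 0     # votes cast by readers within the story cluster
    inbound_links: int = 0    # links from other posts on the same story
    bookmarks: int = 0        # social bookmark count
    post_type: str = "news"   # author-supplied tag: "news", "opinion", "analysis"

# Hypothetical weights for each signal.
WEIGHTS = {"reader_votes": 1.0, "inbound_links": 2.0, "bookmarks": 0.5}

# Hypothetical bonus that favors meatier post types over bare news flashes.
TYPE_BONUS = {"news": 0.0, "opinion": 1.0, "analysis": 3.0}

def score(post: Post) -> float:
    """Weighted sum of the signals plus a bonus for the declared post type."""
    base = (WEIGHTS["reader_votes"] * post.reader_votes
            + WEIGHTS["inbound_links"] * post.inbound_links
            + WEIGHTS["bookmarks"] * post.bookmarks)
    return base + TYPE_BONUS.get(post.post_type, 0.0)

def pick_lead(posts: list[Post]) -> Post:
    """Return the highest-scoring post in a story cluster."""
    return max(posts, key=score)

# Made-up cluster: a heavily linked news flash versus a well-liked analysis.
cluster = [
    Post("news-flash", reader_votes=2, inbound_links=5, bookmarks=4,
         post_type="news"),
    Post("in-depth-analysis", reader_votes=9, inbound_links=3, bookmarks=10,
         post_type="analysis"),
]

lead = pick_lead(cluster)
print(lead.source)
```

With these made-up numbers the analysis piece wins even though the news flash has more inbound links. Because the post type is self-declared, the type bonus is the easiest part to game, which is one reason to keep it small relative to the vote-based signals.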
Over time, the significance of "prominence" as a measure of content quality is eroding - especially for blog posts. As the web evolves, Techmeme and other sites are sure to experiment with these and other alternative approaches; it will be interesting to see which ones emerge as the winners.