Hyping Web 2.0: Techcrunch on Repetition

I’ve been looking for a way to organize and present, in a meaningful form, what people talk about. Blogs, for example, are a fascinating word-of-mouth tool, but such unstructured data isn’t easy to work with. So I began building something to help me better understand what blogs talk about.

“Web 2.0” is a nebulous term that not even O’Reilly could define well. Yet it’s now in our vocabulary: it’s difficult to define, but we use it and, somehow or other, people know what each other is talking about. If Ludwig Wittgenstein were alive today, he would have a field day with this: rather than “things of which we do not speak”, the phrase “Web 2.0” becomes “the thing of which we speak much yet do not know.”

Anyhow, I was curious to see how often the term “web 2.0” is used in blogs. My first test case is the main cheerleader of the web 2.0 generation: Techcrunch.

Here’s what I wrote in Ruby:

+++++

# @author: Pete Abilla
# @date: 27 September 2006
# @function: crawls a hard-coded feed URL and computes basic
# keyword statistics over the posts in that feed.

require 'rubygems'
require 'feed_tools'

feed = FeedTools::Feed.open('http://www.techcrunch.com/feed')
keyword = 'web 2.0'
total_occurrences = 0
total_posts = 0

# Match "web 2.0" or "web2.0" in any case. The dot is escaped so it
# matches only a literal period, not any character.
keyword_pattern = /web\s*2\.0/i

feed.entries.each do |entry|
  puts "Entry: #{entry.title}"
  # scan always returns an array (never nil); to_s guards against
  # entries that have no content.
  matches = entry.content.to_s.scan(keyword_pattern)
  puts "Occurrences of '#{keyword}': #{matches.size}"
  total_occurrences += matches.size
  total_posts += 1 if matches.size > 0
end

post_count = feed.entries.size
# Avoid dividing by zero when the feed returns no entries.
percentage = post_count.zero? ? 0 : (100.0 * total_posts / post_count).round

puts "Total number of posts in feed: #{post_count}"
puts "Total occurrences of '#{keyword}': #{total_occurrences}"
puts "Total number of posts in which '#{keyword}' appeared: #{total_posts}"
puts "Percentage of posts with the phrase '#{keyword}' in it: #{percentage}% (#{total_posts}/#{post_count})"

+++++
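The counting step can also be tried without fetching a feed at all. Here is a minimal sketch, assuming the post bodies are already available as plain strings; the `keyword_stats` helper and the sample posts are made up for illustration and are not part of the script above.

```ruby
# Hypothetical helper: counts keyword matches across an array of post bodies.
def keyword_stats(posts, pattern)
  # to_s guards against posts with no body (nil).
  counts = posts.map { |body| body.to_s.scan(pattern).size }
  with_keyword = counts.count { |n| n > 0 }
  percentage = posts.empty? ? 0.0 : 100.0 * with_keyword / posts.size
  {
    total_posts: posts.size,
    total_occurrences: counts.sum,
    posts_with_keyword: with_keyword,
    percentage: percentage
  }
end

# Made-up sample data, not real feed content.
sample_posts = [
  'Another Web 2.0 startup launches. Is web2.0 over yet?',
  'A post about something else entirely.',
  nil
]

stats = keyword_stats(sample_posts, /web\s*2\.0/i)
puts "Posts with keyword: #{stats[:posts_with_keyword]} of #{stats[:total_posts]}"
puts format('Hype percentage: %.1f%%', stats[:percentage])
```

Keeping the counting in its own method makes it easy to check the pattern against a few hand-written strings before pointing it at a live feed.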

Now, here are the results:

[Image: shmula.com, ruby code]

So, based on only the last 15 posts that my code could grab, Techcrunch used the phrase “web 2.0” in 10 posts, bringing its hype percentage to about 67% (10/15).

Eventually, I want to grab feeds daily and store them in a MySQL database, then run queries against that. That would give me a larger set to play with.
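A rough sketch of how the daily-grab idea might look, using an in-memory Hash as a stand-in for the MySQL table so the example runs on its own. The `PostStore` class, its method names, and the sample entries are all hypothetical; a real version would swap the Hash for database inserts and queries.

```ruby
# Hypothetical in-memory stand-in for a posts table keyed by URL.
class PostStore
  def initialize
    @posts = {}
  end

  # Adds entries not seen before; returns how many were new.
  def add(entries)
    added = 0
    entries.each do |entry|
      next if @posts.key?(entry[:url])
      @posts[entry[:url]] = entry
      added += 1
    end
    added
  end

  def size
    @posts.size
  end

  # Rough equivalent of querying for posts whose content matches.
  def matching(pattern)
    @posts.values.select { |entry| entry[:content].to_s =~ pattern }
  end
end

# Made-up entries for two consecutive daily grabs; one post overlaps.
day_one = [
  { url: 'http://example.com/a', content: 'Web 2.0 again' },
  { url: 'http://example.com/b', content: 'Nothing to see' }
]
day_two = [
  { url: 'http://example.com/b', content: 'Nothing to see' },
  { url: 'http://example.com/c', content: 'More web 2.0' }
]

store = PostStore.new
store.add(day_one)
store.add(day_two)
puts "Stored #{store.size} posts, #{store.matching(/web\s*2\.0/i).size} mention the keyword"
```

Keying on the post URL keeps a post that shows up in two consecutive grabs from being counted twice, which matters once the feed is polled daily.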

Still, 10 of Arrington’s last 15 posts contain “Web 2.0”. To be sure, that’s a lot of hype.
