YouTube’s Queueing Properties

This entry is part 12 of 21 in the series queueing theory

YouTube has many intriguing queueing properties.  This article will primarily look at the mpeg-to-swf conversion and study out the queueing properties of that process.

Assumptions and Data

Below is a very simple, high-level process map of the steps to upload a video on YouTube:

image source

In summary, the user does the following:

  1. User uploads video
  2. User waits while video is uploaded
  3. Upload is completed
  4. User waits for video to be converted from MPEG to SWF
  5. Conversion from MPEG to SWF completes

Based on a small sample of 20 uploaded videos of an average file size of 3.5 MB, I calculate the mean for uploads to be ~180 seconds per MB and ~240 seconds per MB for the conversion.

Based on YouTube’s own disclosure, we also know that there are on average 65,000 video uploads per day.  We do not know the average file size.

Queueing Properties

Important items to note when studying the queueing properties of a system are the following:

  1. λ = Arrival Rate, or more specific, the time between arrivals.  For most queues, we can assume that the arrival distribution can be approximated by a Poisson distribution; which means that the time between arrivals are not deterministic, but random.
  2. μ = Service Rate, or more specific the time for a arrival to be serviced.

A poisson distribution typically looks skewed to the left or to the right — that is because the mean and the standard deviation is the same.  Here’s a standard picture of a poisson for server utilization:

What we see above is that as there are more simultaneous connections, there is a subsequent arrival rate batching — represented by the poisson curve above.

Given the data and notation, we can now attempt to better understand the queueing properties of the MPEG-to-SWF conversion.  Remember: the data I have assumes several things and, is most likely, completely off the mark.  But, it’s an attempt and, if anything, it’s fun to try.

Presuppostions Redux

So, 65,000 video uploads on a 12 hour day, gives us the following:

λ = 65,000 / 720 minutes = (90 / minute)
μ = 3.5 * 240 seconds = 840 seconds; 840 seconds / 60 = (14 conversion / minute)

Arguably, 14 conversion / minute is very low.  Let’s just assume that YouTube average service rate is 200 conversions / minute.   Given that, we can now learn about the queueing properties, which I describe below.

Average Number of Videos Waiting to be Converted

The equation to learn about the average number of files in the conversion process is the following:

Cw = (λ2 / μ(μ – λ))

So,

Cw = [(8100) / (200(200 - 90)] = .36

So, given the assumptions above, not even 1 video is waiting to be converted from MPEG-to-SWF.

Average Number of Videos in the Conversion Process

Cs = (μ – λ)

So,

Cs = [(90) / (200 - 90)] = .81

This means, given the assumptions above, that an any point in time, there is 1 video in the conversion system.

Average Time Spent Waiting

Tw = (λ / μ(μ – λ))

So, we get:

Tw = [(90) / (200(200 - 90))] = .004

This means as videos enter the conversion process, there is hardly any waiting — they are served almost immediately.

Average Time Spent in the System

Ts = (1 / (μ – λ))

So, we get,

Ts = (1 / (200 – 90)) = .009

This means, then, that as videos are uploaded and enter the conversion queue, they are served almost immediately, without any waiting.

Weaknesses in Analysis

Okay, the numbers above are pretty much pulled out of the clouds.  But, if we had real data, then you could just plug them into the equations above.  My guess is, though, that even if we had real numbers, the results would be really close to what I show above.  Why?  Well, for each input into the YouTube system, one could argue that it has very little impact on resources — this is a common property in telephony and in server modeling.  I see the same thing going on with YouTube.  The biggest challenge for YouTube is not computing resources, but storage capacity.

+++++

This post is part of a series on Queueing Theory. The other articles can be found here:

  1. Queueing Theory: Part 1
  2. Queueing Theory: Part 2
  3. Queueing Theory: Part 3
  4. Queueing Theory: Part 4
  5. What is Waste?
  6. On Time-Traps and Waste
  7. Call Centers as Queueing Systems
  8. Travel Time & Waste
  9. Little’s Law for Product Development
  10. YouTube’s Queueing Properties
Series Navigation«What is Waste?Queueing Theory and Terrorism»

Short URL: http://bit.ly/tjd7

Share This Post:



  • Digg
  • Twitter
  • Facebook
  • Google Bookmarks
  • del.icio.us
  • StumbleUpon
  • LinkedIn
  • MySpace
  • HackerNews
  • Reddit
  • Live
  • email

2-pizza teams (10)
3 C's (3)
37signals (1)
5S (38)
A3 Report (9)
adoption (7)
agile/software (59)
ajax (4)
amazon (53)
apple (3)
apple iphone (7)
axiom (3)
Aza Raskin (9)
backcountry.com (2)
berlin (1)
bill gates (1)
bill marriott (1)
blog tag (1)
book reviews (4)
bullwhip effect (5)
business (397)
business plans (3)
busm361 (13)
BzzAgent (12)
call center and queueing (11)
car buying (2)
Carbonite (1)
change management (5)
chicago (1)
click fraud (1)
click-to-ship (21)
clocky (2)
colin powell (2)
community (2)
company interviews (18)
company interviews (6)
complexity (32)
costs (8)
culture (7)
customer experience (10)
customer obsession (52)
customer recovery function (1)
customer segmentation (8)
customer service (17)
design thinking (14)
digg (4)
drum-buffer-rope (38)
dublin (1)
dynamic systems (24)
eBay (6)
economics (3)
efficiency (4)
ethnography (29)
family (18)
featuritis (15)
flexibility (1)
forecasting (2)
four performance dimensions (2)
Fun With The 2×2 Matrix (1)
game theory (7)
Gemba (67)
genchi genbutsu (68)
general (135)
germany (1)
google (15)
Gretchen Rubin (1)
heijunka (65)
holidays (1)
hoshin kanri (1)
how to be a human (1)
IDEO (2)
image uploading (1)
interviews (4)
iphone (5)
ishikawa (69)
IT at Toyota (67)
jason fried (1)
just-in-time (4)
kaizen (4)
kanban (46)
law of instinct (1)
Leadership (46)
lean (167)
Lean Consumption Maps (98)
learning curve (1)
licketyship (1)
mark cuban (1)
martin luther king (1)
mary poppendieck (1)
metrics (73)
microsoft (6)
milton friedman (1)
moving average (1)
muda (68)
nba fines (1)
net promoter score (nps) (1)
obeya (39)
Off-Topic (1)
onstar (1)
operations (108)
pageviews (3)
pareto principle (39)
patent (1)
peanut butter manifesto (2)
philosophy (3)
Poka-Yoke (6)
poppendieck (3)
powerpoint sucks (2)
private equity (4)
process measures (6)
product development (20)
productivity (4)
quality (41)
quasimodal design (1)
queueing theory (41)
Raffle (1)
rational choice (2)
regression analysis (18)
respect for people (6)
root cause analysis (60)
sarah+palin (2)
seth godin (1)
simplicity principle (10)
six sigma (128)
snowboarding (2)
social media (3)
spam (1)
statistical process control (46)
strategy (46)
suburban (1)
supply chain (24)
takt time (8)
teaching (2)
team size (9)
technology (104)
the beer distribution game (1)
The Happiness Project (1)
the profit tree (7)
The Visual Factory (11)
theory of constraints (41)
time (2)
timeline (3)
tony+hsieh (11)
toyota (75)
travel (1)
trump bankruptcy (1)
turnaround (5)
twitter (8)
uspto (1)
utah deal flow (2)
variation (69)
venture capital (1)
Visual Management (11)
waste (59)
website traffic (2)
Wing Chun (2)
wisdom of crowds (1)
wisdom teeth (1)
word-of-mouth marketing (18)
yahoo (2)
zappos.com (12)
zero defects (3)

WP Cumulus Flash tag cloud by Roy Tanck and Luke Morton requires Flash Player 9 or better.


If you enjoyed this post, please consider to leave a comment or subscribe to the feed and get future articles delivered to your feed reader.

Comments

[...] YouTube’s Queueing Properties [...]

[...] YouTube’s Queueing Properties [...]

[...] YouTube’s Queueing Properties [...]

I am studying this area as part of my MBA program. The potentially novel use is to apply it to the area of Information security server availability and data integrity modeling.

Do you have suggestions on how successive queue modeling might be done between computer networking inputs and Operating System resource management functions? If so, defenses against distributed denial of service attacks against servers could be devised.

Sorry for geeking out, I hope this still made sense.

Best Wishes,

Don Turnblade
CISSP
MBA (Expected 2009)

Leave a comment

(required)

(required)


Additional comments powered by BackType