WHAT WE'RE GONNA TALK

Just like in a cafe, we talk about everything. Nothing heavy. Just talk over a cup of coffee.


Sunday, April 8, 2012

HOW BIG ARE THE PORN SITES?

It is a truth universally acknowledged, that a person in possession of a fast internet connection must be in want of some porn.

While it’s a difficult domain to penetrate (hard numbers are few and far between), we know for a fact that porn sites are some of the most trafficked parts of the internet. According to Google’s DoubleClick Ad Planner, which tracks users across the web with a cookie, dozens of adult destinations populate the top 500 websites. Xvideos, the largest porn site on the web with 4.4 billion page views per month, is three times the size of CNN or ESPN, and twice the size of Reddit. LiveJasmin isn’t much smaller. YouPorn, Tube8, and Pornhub are all vast, vast sites that dwarf almost everything except the Googles and Facebooks of the internet.

While page views are a fine starting point, they only tell you that X porn site is more popular than Y non-porn site. Four billion page views sure sounds like a lot, but it’s only when you factor in what those porn surfers are actually doing that the size and scale of adult websites truly comes into focus.

We’ll start by laying the groundwork, and then on the second page we have some real-world figures from YouPorn [1], the second largest porn site on the web. If you like, take a moment to try to estimate the amount of traffic that YouPorn handles every second. Let us know in the comments if your guess is anywhere near.

Scale

[2] The main difference between porn and non-porn sites is the average duration of a visit: For a news site like Engadget or ExtremeTech, an average visit is usually between three and six minutes; enough time to read one or two stories. The average time spent on a porn site, however, is between 15 and 20 minutes.

Then you need to factor in that most websites are predominantly text and images, while the largest porn sites push streaming video. When you load the ExtremeTech home page, you’re talking about a couple of megabytes, and then maybe 500 kilobytes if you load an article. When you stream porn, assuming a low resolution of 480×200, you’re looking at around 100 kilobytes per second — which, over 15 minutes, is around 90 megabytes.

Then you need to multiply 90 megabytes by the number of monthly visits, which is around 350 million for Xvideos. This comes to around 29 petabytes of data transferred every month, or 50 gigabytes per second. For comparison, your home internet connection is probably capable of transferring a couple of megabytes per second, which is about 25,000 times smaller.
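The arithmetic above can be checked in a few lines of Python. This is only a sketch: the decimal units (1 MB = 1,000,000 bytes) and the 30-day month are our assumptions, and the variable names are made up for illustration. With decimal units the monthly total lands at about 31.5 petabytes; counting in binary units gives roughly the 29 quoted above. Spread evenly over the whole month, the same total works out to about 12 gigabytes per second, so the 50 gigabytes per second figure may be better read as a busy-period rate than a strict average.

```python
# Back-of-envelope check of the Xvideos estimate.
# Assumptions (not from the article): decimal units and a 30-day month.
KB, MB, GB, PB = 10**3, 10**6, 10**9, 10**15

STREAM_RATE = 100 * KB              # bytes per second for low-resolution video
VISIT_SECONDS = 15 * 60             # a 15-minute visit
MONTHLY_VISITS = 350_000_000        # Ad Planner estimate quoted in the text
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumed month length

bytes_per_visit = STREAM_RATE * VISIT_SECONDS
monthly_bytes = bytes_per_visit * MONTHLY_VISITS
flat_average = monthly_bytes / SECONDS_PER_MONTH  # bytes per second

print(f"per visit: {bytes_per_visit / MB:.0f} MB")
print(f"per month: {monthly_bytes / PB:.1f} PB")
print(f"flat average: {flat_average / GB:.1f} GB/s")
```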

In short, porn sites cope with astronomical amounts of data. The only sites that really come close in terms of raw bandwidth are YouTube or Hulu, but even then YouPorn is something like six times larger than Hulu.

Infrastructure

Serving up videos requires a lot more resources than plain text and images, in terms of storage, CPU cycles, internal I/O, and bandwidth.

[3] While it obviously varies from site to site, most adult sites will probably store in the region of 50 to 200 terabytes of porn. This is quite a lot for a website (only something like Google, Facebook, Blogger, or YouTube would store more data), but in a world where 2TB drives are cheap and plentiful, this isn’t ultimately a very large amount. Last year we wrote about a Backblaze storage pod that can store 135TB in a 4U case, for just $7,400 [4].
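To get a feel for how small that storage bill is, here is a rough sketch that counts how many 135TB pods a library of a given size would need. It assumes a single copy of every file and treats the quoted 135TB as fully usable, so it ignores RAID overhead, replication, and spare capacity; the function name is ours.

```python
import math

POD_CAPACITY_TB = 135   # Backblaze pod capacity quoted in the text
POD_COST_USD = 7_400    # rounded pod cost quoted in the text


def pods_needed(library_tb):
    """Whole pods required to hold library_tb terabytes (single copy, no overhead)."""
    return math.ceil(library_tb / POD_CAPACITY_TB)


for library_tb in (50, 100, 200):
    pods = pods_needed(library_tb)
    print(f"{library_tb} TB -> {pods} pod(s), about ${pods * POD_COST_USD:,}")
```

Even at the top of the 50 to 200 terabyte range, that is only two pods before any redundancy is added.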

CPU cycles and I/O will be a function of the bitrate of the streaming video and the number of page views. First the porn site has to serve up a dynamic, searchable database of thousands of videos, and then, when someone clicks on a video, that file needs to be read from a hard disk and streamed over the internet. If you’ve ever transferred a lot of big files over a local network (i.e. stressed both your hard drive and Ethernet port) you will know how taxing this is.

Actual hardware requirements are almost impossible to derive (they’re not publicized), but in the case of a large porn site we’re probably talking about racks of quad-CPU servers, gigabit switches, and load balancers. Software-wise, most large porn sites will use a very-high-throughput data store such as Redis [5] to hold the video catalog and its metadata (the video files themselves are read from disk), and a lightweight HTTP server like Nginx [6] to serve up the web pages.

Finally, bandwidth. Referring back to our Xvideos example (based on an Ad Planner estimate), a large porn site will have to have enough connectivity to serve up 50 gigabytes per second, or 400Gbps. Bear in mind this is an average data rate, too: At peak time, Xvideos might burst to 1,000Gbps (1Tbps) or more. To put this into perspective, there’s only about 15Tbps of connectivity [7] between London and New York.
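The conversion from bytes to bits is easy to trip over, so here is a small sketch of it, along with how the Xvideos figures compare to the London to New York capacity mentioned above. The function and constant names are ours; the numbers are the ones quoted in the text.

```python
TRANSATLANTIC_GBPS = 15_000  # about 15Tbps between London and New York, per the text


def gbytes_to_gbits(gigabytes_per_second):
    """Convert gigabytes per second to gigabits per second (8 bits per byte)."""
    return gigabytes_per_second * 8


average_gbps = gbytes_to_gbits(50)  # the average rate used in the text
peak_gbps = 1_000                   # the 1Tbps burst suggested in the text

for label, gbps in (("average", average_gbps), ("peak", peak_gbps)):
    share = gbps / TRANSATLANTIC_GBPS
    print(f"{label}: {gbps} Gbps, {share:.1%} of London-New York capacity")
```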

There are only so many ways of coping with this much traffic: You set up your own data center, rent a few racks in a very large data center, or use a cloud provider like Amazon AWS or Microsoft Azure.

A real-world example

The second largest porn site on the web, YouPorn, was kind enough to furnish us with some real-world facts and figures. You’ll be glad (or scared) to know that the estimated DoubleClick Ad Planner figures are actually quite a lot lower than reality.

YouPorn hosts “over 100TB of porn”, and serves “over 100 million” page views per day. All told, this equates to an average of 950 terabytes of data transfer per day, almost all of which is streaming video. This is around 28 petabytes per month, which means our 29PB estimate for Xvideos is on the low side; it probably serves 35 to 40PB per month.

It gets better! At peak time, YouPorn serves 4,000 pages per second, equating to burst traffic in the region of 100 gigabytes per second, or 800Gbps. This is equivalent to transferring more than 10 dual-layer DVDs every second.
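The YouPorn figures can be cross-checked against each other in the same way. The sketch below assumes decimal units, a 30-day month, and a nominal 8.5GB for a dual-layer DVD; none of the variable names come from YouPorn.

```python
# Cross-check of the YouPorn figures quoted above.
# Assumptions: decimal units, a 30-day month, 8.5GB per dual-layer DVD.
MB, GB, TB, PB = 10**6, 10**9, 10**12, 10**15

DAILY_BYTES = 950 * TB            # average daily transfer
PEAK_BYTES_PER_SEC = 100 * GB     # burst traffic at peak time
PEAK_PAGES_PER_SEC = 4_000        # pages served per second at peak
DVD_DL_BYTES = 8.5 * GB           # nominal dual-layer DVD capacity

average_rate = DAILY_BYTES / 86_400                 # bytes per second
monthly_pb = DAILY_BYTES * 30 / PB
peak_to_average = PEAK_BYTES_PER_SEC / average_rate
bytes_per_page = PEAK_BYTES_PER_SEC / PEAK_PAGES_PER_SEC
dvds_per_second = PEAK_BYTES_PER_SEC / DVD_DL_BYTES

print(f"average rate: {average_rate / GB:.1f} GB/s")
print(f"monthly transfer: {monthly_pb:.1f} PB")
print(f"peak is {peak_to_average:.1f}x the average")
print(f"data per page view at peak: {bytes_per_page / MB:.0f} MB")
print(f"dual-layer DVDs per second at peak: {dvds_per_second:.1f}")
```

On these numbers the daily average is around 11 gigabytes per second, so the quoted peak is roughly nine times the average, and each page view at peak carries about 25 megabytes.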

On the software side of things, YouPorn’s primary data store is 100% Redis, with MySQL used as an admin tool to manage and add data to the Redis cluster. The site used to be primarily programmed in Perl with a MySQL backend, but in 2011 Perl was switched out for PHP and MySQL was replaced with Redis. Nginx acts as the HTTP server, with HAProxy and Varnish both used to load balance.

The Redis server deals with 300,000 queries per second, and between 8 and 15GB of data is logged every hour (visitor logs, behavior data, and so on). We’re told that this software stack should be capable of scaling up to 200 million views per day.
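A few derived numbers help put the software figures in context. This sketch assumes the 300,000 queries per second coincide with the 4,000 pages per second peak, which YouPorn did not confirm, so the queries-per-page result is only indicative. The names are ours.

```python
REDIS_QUERIES_PER_SEC = 300_000      # quoted Redis load
PEAK_PAGES_PER_SEC = 4_000           # quoted peak page rate
LOG_GB_PER_HOUR = (8, 15)            # quoted logging range
CURRENT_VIEWS_PER_DAY = 100_000_000  # "over 100 million" page views per day
STACK_LIMIT_PER_DAY = 200_000_000    # what the stack should scale to

# Assumption: the query rate and the page rate are measured at the same moment.
queries_per_page = REDIS_QUERIES_PER_SEC / PEAK_PAGES_PER_SEC
log_gb_per_day = tuple(rate * 24 for rate in LOG_GB_PER_HOUR)
headroom = STACK_LIMIT_PER_DAY / CURRENT_VIEWS_PER_DAY

print(f"Redis queries per page view: {queries_per_page:.0f}")
print(f"logs per day: {log_gb_per_day[0]} to {log_gb_per_day[1]} GB")
print(f"scaling headroom: {headroom:.0f}x current traffic")
```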

Sadly, YouPorn couldn’t tell us about its hardware infrastructure. Judging by the IP addresses of the YouPorn content delivery network (CDN), it’s probably not hosted by a cloud provider like Amazon, but rather in a large data center somewhere, with peering provided by Level 3.

To put that 800Gbps figure into perspective, the internet only handles around half an exabyte of traffic every day [10], which equates to around 50Tbps — in other words, a single porn site accounts for almost 2% of the internet’s total traffic. There are dozens of porn sites on the scale of YouPorn, and hundreds that are the size of ExtremeTech or your favorite news site. It’s probably not unrealistic to say that porn makes up 30% of the total data transferred across the internet.
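The share-of-the-internet claim can be reproduced with the sketch below, which assumes decimal units and an 86,400-second day. Note that, like the text, it sets one site’s peak rate against the internet’s daily average, so it flatters the site somewhat. The names are ours.

```python
EXABYTE = 10**18
SECONDS_PER_DAY = 86_400

INTERNET_BYTES_PER_DAY = 0.5 * EXABYTE  # half an exabyte per day, per the text
SITE_PEAK_GBPS = 800                    # YouPorn's quoted peak

# Convert the daily byte count to an average rate in terabits per second.
internet_tbps = INTERNET_BYTES_PER_DAY * 8 / SECONDS_PER_DAY / 10**12
share = (SITE_PEAK_GBPS / 1_000) / internet_tbps

print(f"internet average: {internet_tbps:.1f} Tbps")
print(f"site peak as a share of that: {share:.1%}")
```

The unrounded average comes out a little under the 50Tbps used above, and the share a little under 2%.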

The internet really is for porn.

By Sebastian Anthony 


****

Read more about the world of submarine fiber optic cables [11] (which carry all of that porn)


Endnotes

  1. some real world figures from YouPorn: http://www.extremetech.com/computing/123929-just-how-big-are-porn-sites/2
  2. : http://www.extremetech.com/wp-content/uploads/2012/04/xvideos-ad-planner.jpg
  3. : http://www.extremetech.com/wp-content/uploads/2011/07/backblaze-storage-pod-case.jpg
  4. store 135TB in a 4U case, for just $7,400: http://www.extremetech.com/computing/90634-how-to-build-your-own-135tb-raid6-storage-pod-for-7384
  5. Redis: http://redis.io/
  6. Nginx: http://en.wikipedia.org/wiki/Nginx
  7. there’s only about 15Tbps of connectivity: http://www.extremetech.com/computing/96827-the-secret-world-of-submarine-cables
  8. Real-world numbers from YouPorn: http://www.extremetech.com/computing/123929-just-how-big-are-porn-sites/2
  9. : http://www.extremetech.com/wp-content/uploads/2012/04/youporn-blurred-out.jpg
  10. only handles around half an exabyte of traffic every day: http://www.extremetech.com/extreme/124561-ibm-to-build-exascale-supercomputer-for-the-worlds-largest-million-antennae-telescope
  11. Read more about the world of submarine fiber optic cables: http://www.extremetech.com/computing/96827-the-secret-world-of-submarine-cables
----------
http://www.extremetech.com/computing/123929-just-how-big-are-porn-sites?print

