How Static Resources (CSS, JS) are Served at Facebook.com?

This article talks about the techniques used at Facebook.com to serve static resources such as CSS, JS and image files when someone accesses Facebook.com. If you are one of the developers at Facebook who worked on the related modules and you disagree with one or more aspects of this article, please drop a message and I shall change it appropriately. The article aims to present a perspective on how to handle static web resources, based on how they are handled at Facebook.com. Thank you for reading further.

Back in February 2004?

Well, like most other startups, Facebook launched in February 2004 with the usual manner of serving CSS & JS files: as independent, separate files. As per its Wikipedia page, Zuckerberg (picture below) wrote the software for the Facemash website (Facebook's predecessor) when he was in his second year of college, and the website launched in October 2003. A few months later, in January 2004, Zuckerberg began writing the code for a new website, known as ‘theFacebook’, which launched in February 2004.

Mark Zuckerberg (Source: Wikipedia)

The following shows how the code looked for a few years after the launch:

How Static Resources Got Managed in Initial Years of Facebook (source: Phabricator.com)

 

However, as Facebook started growing, the above way of managing the CSS & JS files needed to change for the following reasons:

  1. Management nightmare: It was difficult to manage all the CSS & JS files across the various web pages, as the right files had to be included in the right web pages in the right order. Errors started to creep in, in the form of many unneeded resource files being included in one or more web pages.
  2. Performance issue: A separate HTTP request had to be made for every CSS & JS file, and the large number of requests slowed down page loads.

Most startup websites adopt the above-mentioned strategy because, in the initial days, one is least bothered about performance bottlenecks and management issues and more concerned about validating the idea in general. Fair enough!

Haste System & Other Optimization Techniques from 2007 Onwards!

The CSS & JS files started getting managed by what is called the Haste system. As per the documentation, the Haste system scans the directories, reads the package.json file for configuration changes, gathers the dependencies and updates a map of static resources for the given webpage. This solved the issue of manually managing which files each page includes. The following sample code shows how the Haste system used to manage the dependencies and bundle them by updating the map with the bundled data.

Haste System sample code for loading js, css dependencies
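The scan-and-map idea described above can be sketched in a few lines of Python. This is only an illustration, not Haste's actual code: the `@provides` and `@requires` comment annotations, the function name `build_resource_map` and the shape of the resulting map are all assumptions made for this example.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical annotation syntax, assumed for illustration only.
PROVIDES = re.compile(r"@provides\s+(\S+)")
REQUIRES = re.compile(r"@requires\s+([\w\- ]+)")


def build_resource_map(root):
    """Scan root for .js/.css files and map each provided name
    to its relative path and the names it depends on."""
    root = Path(root)
    resource_map = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".js", ".css"}:
            continue
        text = path.read_text(encoding="utf-8")
        provides = PROVIDES.search(text)
        if provides is None:
            continue  # files without a declared name are ignored
        requires = []
        for match in REQUIRES.finditer(text):
            requires.extend(match.group(1).split())
        resource_map[provides.group(1)] = {
            "path": path.relative_to(root).as_posix(),
            "requires": requires,
        }
    return resource_map


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "base.css").write_text(
            "/**\n * @provides base-css\n */\n", encoding="utf-8")
        (root / "button.js").write_text(
            "/**\n * @provides button-js\n * @requires base-css\n */\n",
            encoding="utf-8")
        print(build_resource_map(root))
```

With a map like this, a page only needs to name the resources it uses; the dependencies no longer have to be tracked by hand in every page.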

Along with the Haste system, the following additional optimization techniques were adopted around the same time:

  1. Bundling all the JS files into one JS file and all the CSS files into one CSS file, and sending those over.
  2. Loading the resource files at the end of page rendering.
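The bundling technique can be illustrated with a small Python sketch. It assumes a hypothetical in-memory resource map in which each entry lists its `requires` and carries its `source` text; the names `resolve_order` and `bundle` are made up for this example and are not taken from Facebook's code.

```python
def resolve_order(names, resource_map):
    """Return the requested resources plus everything they depend on,
    with each dependency placed before the resources that need it."""
    ordered = []
    visiting = set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError(f"circular dependency at {name}")
        visiting.add(name)
        for dep in resource_map[name]["requires"]:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for name in names:
        visit(name)
    return ordered


def bundle(names, resource_map):
    """Concatenate the sources in dependency order into a single payload."""
    order = resolve_order(names, resource_map)
    return "\n".join(resource_map[name]["source"] for name in order)


if __name__ == "__main__":
    # Hypothetical resources, for illustration only.
    resources = {
        "base": {"requires": [], "source": "/* base */"},
        "button": {"requires": ["base"], "source": "/* button */"},
        "dialog": {"requires": ["button", "base"], "source": "/* dialog */"},
    }
    print(bundle(["dialog"], resources))
```

The browser then makes one request for the bundle instead of one request per file, which addresses the HTTP request overhead mentioned earlier.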

Finally, Static Resources Delivered from Database!

Facebook kept growing across geographies, as a result of which it started delivering webpages using 1000s of web servers. This brought the following challenges:

  1. How to release static resources (CSS, JS, images) on these 1000s of servers while making sure all the users get the latest copies? There may always be a lag in the release, and a resource version mismatch could become a critical issue.
  2. Version management of these static resources, in general.
  3. How to have users always get the fresh/latest copies of static resources without needing to clear their browser cache? A user might access the page from a server that already serves the webpage with the most up-to-date resource file paths. When the requests for these resource files are then sent, they could land on a web server where the latest files have not yet been released/pushed. The user could end up with a stale resource file for the latest page and, thus, a poisoned cache.

The following is a pictorial representation of the problem: users end up with stale copies of resources in their browser cache and, thus, do not get a consistent look and feel of the page:

fig: Representing inconsistencies with static resources served from Web Servers

In the figure above, you can see that when a user requests the resource files, some of them may not yet have been pushed to the servers where the requests land. This leads to what is termed a poisoned cache: stale resources cached for the new version of the page. To fix the above issue, Facebook moved to the following technique:

  • Publish all the static resources in the database before pushing the updated webpages.
  • Have a PHP file, named rsrc.php, query the database to get the appropriate version.
  • Reference the static resources in the web page with links like the following:

    fig: Serving resource files with rsrc.php

    In the above example, note the file rsrc.php, the version number v2 and the CSS files with cryptic names. The diagram below shows how the request is processed and the resource files are delivered from the database.

    fig: Resources files are retrieved from the database
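To make the publish-then-look-up flow concrete, here is a minimal Python sketch that uses an in-memory SQLite database in place of Facebook's database. Everything specific in it is an assumption for illustration: the table layout, the helper names `open_store`, `publish` and `serve`, the use of a truncated SHA-256 content hash as the "cryptic" file name, and the `/rsrc.php/v2/<hash>.<ext>` URL shape.

```python
import hashlib
import sqlite3


def open_store():
    """Create an in-memory database standing in for the shared resource store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resources (hash TEXT PRIMARY KEY, name TEXT, body BLOB)")
    return conn


def publish(conn, name, body):
    """Store one version of a resource and return the URL a page should embed.
    The URL layout is a guess made for this example."""
    digest = hashlib.sha256(body).hexdigest()[:12]
    conn.execute(
        "INSERT OR IGNORE INTO resources (hash, name, body) VALUES (?, ?, ?)",
        (digest, name, body))
    conn.commit()
    extension = name.rsplit(".", 1)[-1]
    return f"/rsrc.php/v2/{digest}.{extension}"


def serve(conn, url):
    """What an rsrc.php-style handler would do: look the content up by hash.
    Returns None when the hash is unknown."""
    filename = url.rsplit("/", 1)[-1]
    digest = filename.split(".", 1)[0]
    row = conn.execute(
        "SELECT body FROM resources WHERE hash = ?", (digest,)).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    store = open_store()
    old_url = publish(store, "base.css", b"body { color: black; }")
    new_url = publish(store, "base.css", b"body { color: navy; }")
    print(old_url, serve(store, old_url))
    print(new_url, serve(store, new_url))
```

Because every version lives in the shared store under its own name, it does not matter which web server receives the request: a page generated with the new URL gets the new content, and a page still carrying the old URL keeps getting the old content.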

 

References:

http://www.phabricator.com/docs/phabricator/article/Things_You_Should_Do_Soon_Static_Resources.html

 


Ajitesh Kumar

Ajitesh is passionate about various different technologies including programming languages such as Java/JEE, Javascript, PHP, .NET, C/C++, mobile programming languages etc and, computing fundamentals such as application security, cloud computing, API, mobile apps, google glass, big data etc.


5 Comments

  1. Really, a great article. Enjoyed reading it. Thanks for sharing.

    One small doubt: don't you think it's costlier to load a resource from the DB rather than loading it from the web server cache? Do you have any benchmark results for the latency of loading resources with this approach?

    • The problem with the web server cache is syncing the new static resources across 1000s of web servers at the time of release, and the possibility of end users ending up with a stale/poisoned cache. The static resources are obtained from the database, and this is further optimized by putting rsrc.php behind a CDN (content delivery network), which covers up the latency loss to a great extent. The trade-off is surely there between living with a stale cache at the user end and fetching content from the database.
