This article describes the techniques used at Facebook.com to serve static resources such as CSS, JS, and image files when someone accesses the site. If you are a developer at Facebook who has worked on the related modules and disagree with one or more aspects of this article, please drop a message and I shall correct it accordingly. The article aims to present a perspective on how to handle static web resources, based on how it is done at facebook.com. Thank you for reading further.
Well, like most other startups, Facebook launched in February 2004 serving its CSS & JS files in the usual manner: as independent, separate files. As per the Wikipedia page, Zuckerberg wrote the software for the Facemash website (Facebook's predecessor) when he was in his second year of college, and that site launched in October 2003. A few months later, in January 2004, Zuckerberg began writing the code for a new website, known as 'theFacebook', which launched in February 2004.
The following is how the code used to look for a few years after the launch:
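Roughly speaking, the markup looked something like the sketch below (the file names are purely illustrative): every CSS and JS file referenced on the page as its own, separate file.

```html
<!-- Each stylesheet and script referenced as its own, separate file;
     file names here are illustrative, not Facebook's actual files -->
<html>
<head>
    <link rel="stylesheet" type="text/css" href="/css/common.css" />
    <link rel="stylesheet" type="text/css" href="/css/profile.css" />
    <link rel="stylesheet" type="text/css" href="/css/friends.css" />
    <script type="text/javascript" src="/js/common.js"></script>
    <script type="text/javascript" src="/js/profile.js"></script>
</head>
<body>
    ...
</body>
</html>
```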
However, as Facebook started growing, the above way of managing the CSS & JS files needed to change for the following reasons:
Most startup websites adopt the above-mentioned strategy: in the initial days one is least bothered about performance bottlenecks and resource-management issues and more concerned about validating the idea in general. Fair enough!
The CSS & JS files then started being managed by what is called the Haste system. As per the documentation, the Haste system scans the directories, reads the package.json file for configuration changes, gathers the dependencies, and updates a map of static resources for the given webpage. This solved the issue of manually tracking and wiring up each resource file for every page. The following sample code illustrates how the Haste system manages the dependencies and bundles them by updating the map with the bundled data.
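Below is a minimal, hypothetical sketch (in PHP, given Facebook's PHP stack) of the general idea described above: scan the resource directories, read each file's annotations, gather dependencies, and keep a map of the static resources a page needs. The function names and the @provides/@requires annotations are illustrative assumptions, not the actual Haste API.

```php
<?php
// Hypothetical sketch of a Haste-style dependency map builder.
// Scans resource directories, reads each file's header annotations,
// and records which resources (and dependencies) exist.

function build_haste_map(array $resource_dirs): array {
    $map = array();
    foreach ($resource_dirs as $dir) {
        foreach (glob($dir . '/*.{css,js}', GLOB_BRACE) as $path) {
            $source = file_get_contents($path);
            // e.g. a docblock like:  /** @provides profile-css  @requires common-css */
            preg_match('/@provides\s+(\S+)/', $source, $provides);
            preg_match_all('/@requires\s+(\S+)/', $source, $requires);
            if ($provides) {
                $map[$provides[1]] = array(
                    'path'     => $path,
                    'requires' => $requires[1],
                    'hash'     => md5($source),   // usable later for versioned URLs
                );
            }
        }
    }
    return $map;
}

// Resolve everything a page needs (its direct resources plus transitive
// dependencies), so it can be served as one consolidated, ordered list.
function resources_for_page(array $map, array $page_resources): array {
    $resolved = array();
    $queue = $page_resources;
    while ($queue) {
        $name = array_shift($queue);
        if (isset($resolved[$name]) || !isset($map[$name])) {
            continue;
        }
        $resolved[$name] = $map[$name];
        $queue = array_merge($queue, $map[$name]['requires']);
    }
    return $resolved;
}
```

The point of such a map is that it can be regenerated automatically on every build or deployment, so no page has to hand-maintain its own list of CSS & JS includes.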
Along with the Haste system, the following additional optimization techniques were adopted around the same time:
Facebook then grew further across geographies, and as a result it started delivering webpages from thousands of web servers. With that scale, the following challenges started to appear:
The following is a pictorial representation of the problem/inconsistency: users end up with stale copies of resources in their browser cache and, thus, do not get a consistent look and feel of the page:
In the figure above, you can see that as a user tries to access the resource files, some of them may not yet have been pushed to the particular servers that receive the requests. This leads to what is termed a poisoned cache: stale resources being cached and served against the new version of the page. To fix this issue, Facebook moved to the following technique:
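The markup below is a hedged reconstruction of that pattern; the host name and hashes are illustrative, not Facebook's actual URLs. Every resource is requested through a single endpoint (rsrc.php), carries an explicit version segment, and uses a content-derived ("cryptic") file name, so a new release produces new URLs and old cached copies can never be served against the new page.

```html
<!-- Resources served through rsrc.php with a version segment and hashed names;
     the host and hashes below are illustrative, not Facebook's actual URLs -->
<link rel="stylesheet" type="text/css"
      href="https://static.example-cdn.net/rsrc.php/v2/yA/r/i7wPSCDWqPo.css" />
<link rel="stylesheet" type="text/css"
      href="https://static.example-cdn.net/rsrc.php/v2/y8/r/zt3lYFI3BLR.css" />
<script type="text/javascript"
        src="https://static.example-cdn.net/rsrc.php/v2/yU/r/I5eOQ1JAv4n.js"></script>
```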
In the above example, note the file rsrc.php, the version number v2, and the CSS files with cryptic names. The diagram below represents how the request is processed and the resource files are delivered from the database.
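To make that flow concrete, here is a minimal, hypothetical sketch of what such an endpoint could do: look up the hashed resource name from the request URL in the resource database and serve the content back with long-lived cache headers. The table and column names, the connection details, and the exact headers are assumptions for illustration, not Facebook's actual implementation.

```php
<?php
// Hypothetical sketch of an rsrc.php-style endpoint: map the hashed name in the
// request URL to a row in the resource database and serve it with long-lived
// cache headers. Table/column names and the DSN are illustrative assumptions.

$pdo = new PDO('mysql:host=localhost;dbname=static_resources', 'user', 'password');

// e.g. /rsrc.php/v2/yA/r/i7wPSCDWqPo.css  ->  hash "i7wPSCDWqPo", extension "css"
$path = $_SERVER['PATH_INFO'] ?? '';
if (!preg_match('#/([^/]+)\.(css|js)$#', $path, $m)) {
    http_response_code(404);
    exit;
}
list(, $hash, $ext) = $m;

$stmt = $pdo->prepare('SELECT content FROM resources WHERE hash = ?');
$stmt->execute(array($hash));
$content = $stmt->fetchColumn();

if ($content === false) {
    http_response_code(404);
    exit;
}

// Because the URL changes whenever the content changes, the response can be
// cached "forever" by browsers and CDNs without any risk of staleness.
header('Content-Type: ' . ($ext === 'css' ? 'text/css' : 'application/javascript'));
header('Cache-Control: public, max-age=31536000, immutable');
echo $content;
```

In practice, a CDN would sit in front of such an endpoint (as discussed in the comments below), so most requests never reach the database at all.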
References:
http://www.phabricator.com/docs/phabricator/article/Things_You_Should_Do_Soon_Static_Resources.html
Comments:
Really, a great article. Enjoyed reading it. Thanks for sharing.
One small doubt: don't you think it is costlier to load a resource from the DB rather than loading it from the web server cache? Do you have any benchmark results for the latency of loading resources with this approach?
The problem with the web server cache is syncing the new static resources across thousands of web servers at release time, and the possibility of end users being left with a stale/poisoned cache. So the static resources are obtained from the database instead, which is further optimized by putting rsrc.php behind a CDN (content delivery network), covering the latency loss to a great extent. The trade-off is really between living with stale caches at the user's end and getting content from the database.
Interesting. I would love to know what else rsrc.php does other than retrieving the resources from the DB. Special thanks to you @Kumar for sharing such wonderful information with all of us.
Thanks Sharath for your comments.