02/07/2014 08:56 pm ET Updated Feb 10, 2014

Breaking Phoenix: Identifying The Flaws Of The New Huffington Post


Everyone knows how hard it is to develop a high-performance, web-scale project, but it is equally hard to ensure that the project is not unwittingly compromising itself.

"IE8 bugs can only be reproduced on IE8, not on IE10 mimicking IE8."

This was a phrase we heard from Helena, our test engineering intern, time and time again when Phoenix tickets were marked unreproducible. The developers thought they could use IE10 in IE8 mode to reproduce the exact behavior we were reporting.

It all began in July 2013, when a couple of developers began working on a performance improvement project. The goal was to reduce entry page load times from ~13 seconds to under 5. From day one it was obvious that the project should be called 'Phoenix': the new front-end code was being built from scratch on the ashes of the previous codebase to make it more extensible.

Our challenge in establishing Phoenix's quality was the sheer number of OS/browser and template combinations on which we had to ensure every little quirk worked.

Initially, a couple of us started exploratory testing and were reporting about 20 issues a week, mostly about missing styling. Later we pulled test engineers from other projects, and all eight members of our team focused solely on Phoenix. We dove into all modules and adopted a more methodical divide-and-conquer testing approach. We were then opening about 65 new bug tickets a week. At this point more developers were added to catch up with our bug reports.

One of our favorite series of tickets concerned missing custom banners. We had hundreds of those banners, and many weren't showing up on Phoenix. Those tickets kept one developer busy for two weeks, replacing 10 large SQL queries that took 3 seconds and still failed to render the right data with 2 new queries that returned accurate results in 60ms.

IE CSS issues were another one of our favorite ticket series. On any given day we could simply open Phoenix on IE and find new issues. We ended up writing 65 tickets for IE (all versions) alone.

Although we started with manual tests, we knew we could not possibly verify every feature on 12+ OS/browser combos across our 55 verticals. Even though we had been using Selenium to verify UI functionality, it was tedious to code each test to run in all combinations, especially against all the verticals.

Enter BirdKeeper.

We were able to extract all the logic that runs a test on all verticals and OS/browser combinations into a common place. That way each test engineer could focus on a generic test while BirdKeeper ran that test on all OS/browser combinations. To utilize BirdKeeper, an engineer simply had to extend BirdKeeper and override the preen function to run the test suite they had created. For example, if we wanted to ensure that the Twitter share functionality worked on every entry page, the code would be as follows:
public class TweetChecker extends BirdKeeper {
    public void preen() {
        // verify the Twitter share button renders and works on this page
    }
}

BirdKeeper was also used for some semi-automated tests, such as visually confirming the alignment of banners in the browser instances it opened automatically for a variety of pages.
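The snippet above extends BirdKeeper; a minimal sketch of what the base class itself might look like is below. The combo list and every member other than preen() are assumptions for illustration (a real version would launch a Selenium session for each combination rather than just record it).

```java
import java.util.Arrays;
import java.util.List;

// Sketch of a BirdKeeper-style base class. The combo list and every
// member other than preen() are assumptions for illustration.
abstract class BirdKeeper {
    static final List<String> COMBOS = Arrays.asList(
            "Windows7/IE8", "Windows7/IE10", "OSX/Chrome", "OSX/Firefox");

    protected String currentCombo;   // combo the current run targets
    private int runs = 0;            // combos exercised so far

    // Each test engineer overrides preen() with a generic test.
    public abstract void preen();

    // Drive the overridden test once per OS/browser combination;
    // a real version would launch the matching browser before each call.
    public int runEverywhere() {
        for (String combo : COMBOS) {
            currentCombo = combo;
            preen();
            runs++;
        }
        return runs;
    }
}
```

The point of the design is that the per-combination plumbing lives in one place, so a test like TweetChecker stays a handful of lines.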

We huddled together to collaborate and focus on the task at hand, and we were able to report ~110 new tickets each week, but we still had a ways to go.

We had to make sure all the analytics were in place, i.e., every page view was reported properly to comScore, Omniture, etc., and every user interaction was being tracked properly. Manually clicking everything on all the different templates and verticals and verifying it in the backend would have taken a very long time: we had to test about 30 interactions per page, with a page from each of our 53 verticals, in the 12 OS/browser combinations.

There were two steps to solving this challenge. The first was a frontal assault: a program that leveraged Selenium to drive traffic and pass it through BrowserMob, an embedded proxy server that allowed us to capture all outgoing tracking calls. The second was a mixture of viewing reports and querying the database to compare current production traffic with Phoenix traffic and uncover inconsistencies between the two. This gave us our next batch of bug reports.
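The first step boils down to a simple check once the proxy has done its work: given the list of outgoing request URLs captured during a page load, make sure every expected tracker actually fired. A sketch of that check is below; the host fragments are illustrative stand-ins, not HuffPost's actual beacon list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Given the outgoing request URLs captured by the proxy during a page
// load, report which expected analytics beacons never fired.
// The host fragments below are illustrative stand-ins.
class TrackingAudit {
    static final List<String> EXPECTED_HOSTS = Arrays.asList(
            "sb.scorecardresearch.com",  // comScore beacon
            "metrics.example.com");      // stand-in for the Omniture beacon host

    // Return the expected trackers that never appeared in the captured traffic.
    static List<String> missingTrackers(List<String> capturedUrls) {
        List<String> missing = new ArrayList<>();
        for (String host : EXPECTED_HOSTS) {
            boolean fired = capturedUrls.stream().anyMatch(u -> u.contains(host));
            if (!fired) missing.add(host);
        }
        return missing;
    }
}
```

Running this after each scripted interaction turns "did the click get tracked?" into a mechanical pass/fail instead of a manual backend lookup.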

The principal goal of the project was to reduce page load time for all entry pages, and we had to stay on top of it to make sure new code being committed wasn't ruining the page load. We wrote a program called Footprints that uses the API provided by to fetch the load times and chart them in Google Docs internally.

The page load time we cared about was the repeat load time, because after the initial load everything is cached. This time was tracked for every vertical on every deploy we made, across about 50 pages run on 6 location/browser combinations. We monitored the load times daily, and developers took appropriate action when needed. Below is one of the load-time charts we were tracking (MV means Mountain View).
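The daily monitoring described above amounts to a regression gate: after each deploy, flag any vertical whose repeat-load times drift over the 5-second goal. The sketch below shows one way Footprints-style data could be checked; the data shape, method names, and use of the median are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch of a per-deploy load-time check: flag verticals whose median
// repeat-load time exceeds the budget. Data shape and names are assumed.
class LoadTimeMonitor {
    // Median of a list of load times in seconds.
    static double median(List<Double> times) {
        List<Double> sorted = new ArrayList<>(times);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1 ? sorted.get(n / 2)
                          : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    // Return the verticals whose median repeat-load time is over budget.
    static List<String> overBudget(Map<String, List<Double>> byVertical, double budget) {
        List<String> slow = new ArrayList<>();
        for (Map.Entry<String, List<Double>> e : byVertical.entrySet())
            if (median(e.getValue()) > budget) slow.add(e.getKey());
        return slow;
    }
}
```

A chart makes trends visible to humans; a check like this makes the 5-second goal enforceable on every deploy.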

All good things come to an end, and we eventually ran out of new permutations to try to break our code. We'll have to be satisfied with finding 604 bugs after trying 44,527+ use cases. Finally, on February 4th, we let the bird fly out of the nest, as we could not find any new launch blockers.