Analyzing Page Cycler Results


The page cyclers dump lists of test timings in their output (you can grab it from the "stdio" of the bots if necessary).  They look like this:

*RESULT times: t= [2775,48,1067,65,162,376,315, ...other numbers ...,153,262] ms 
[ OK ] PageCyclerTest.Intl1File (153717 ms)

Each number in the list is how long, in milliseconds, one page in the set took to render.
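
Rather than copying the numbers by hand, the list can be pulled out of a saved log line.  What follows is a minimal sketch in R (the tool used for the rest of this page), assuming the line has the `*RESULT times: t= [...] ms` shape shown above.  `parse_times` is a hypothetical helper name, not something the page cycler provides, and the sample line is made up.

```r
# Hypothetical helper: extract the comma-separated timings from one
# "*RESULT times: t= [ ... ] ms" line and return them as a numeric vector.
parse_times <- function(line) {
  # Locate the first "[" and the first "]" as literal characters.
  start <- regexpr("[", line, fixed = TRUE)
  end <- regexpr("]", line, fixed = TRUE)
  # Keep only the text between the brackets.
  inside <- substr(line, start + 1, end - 1)
  # Split on commas, trim whitespace, and convert to numbers.
  as.numeric(trimws(strsplit(inside, ",", fixed = TRUE)[[1]]))
}

# Made-up sample line in the same shape as the bot output.
line <- "*RESULT times: t= [2775,48,1067,65,162] ms"
before <- parse_times(line)
print(before)
```

The resulting vector can be used wherever a hand-pasted `c(...)` list would be.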

How can we analyze this, especially across multiple runs?  Here's how I do it using The R Project for Statistical Computing:
  1. Grab the interesting test timings into variables.  For example, if you are comparing two runs:

    > before = c(2775,48,1067, ...pasted from above... ,153,262)
    > after = c(...other timings...)

  2. Grab the list of test pages.  For example, look in data/page_cycler/intl1/pages.js; those are the page names, in the order they are run.  Note that the list of test pages is 10x shorter than the list of timings, because the test runs each page ten times.

    > tests = c("126.com", "2ch.net", "6park.com", ...)

  3. Combine these into a data frame (R implicitly recycles the shorter test list to match the length of the timings):

    > data = data.frame(before, after, tests)

  4. Compute, e.g., the per-page medians, using the test list as the grouping factor, along with the change in milliseconds and in percent:

    > meds = aggregate(data[,1:2], by=list(test=data$tests), median)
    > meds$worse_ms = meds$after - meds$before
    > meds$worse_pct = round(100 * meds$worse_ms/meds$before, 1)

  5. Now R can dump that median data in a nice tabular list, or sort it so the largest percentage regressions come first.

    > meds[1:5,] 
              test before after worse_ms worse_pct
    1      126.com   50.0  75.0     25.0      50.0
    2      2ch.net   58.5  81.0     22.5      38.5
    3    6park.com  185.5 282.5     97.0      52.3
    4   affili.net   47.5  49.0      1.5       3.2
    5   allegro.pl   66.5  68.5      2.0       3.0

    > sorted_meds = meds[order(-meds$worse_pct),]

    > sorted_meds[1:5,]
               test before after worse_ms worse_pct
    23    goo.ne.jp   95.0 236.5    141.5     148.9
    22     golem.de   58.5 123.0     64.5     110.3
    18 excite.co.jp   61.0 110.0     49.0      80.3
    3     6park.com  185.5 282.5     97.0      52.3
    1       126.com   50.0  75.0     25.0      50.0

  6. You can also plot the medians together with the line before==after.  The plot below shows results for a different dataset, in which many tests were unchanged (those along the line), while others regressed horribly, from 120ms to 4000ms.  (This change was reverted.)

    > plot(meds$before, meds$after)
    > abline(a=0, b=1)
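
To try steps 2 through 5 without real bot output, here is a small self-contained sketch.  All numbers are made up: three pages, each run twice, in the cycle order the steps above assume (page 1, page 2, page 3, then again).  The length check is an addition, not part of the original workflow; it is there because data.frame() only recycles a shorter vector a whole number of times.

```r
# Made-up timings: 3 pages, each run twice, in cycle order.
before <- c(50, 60, 180, 52, 58, 190)
after  <- c(75, 80, 280, 77, 82, 286)
tests  <- c("126.com", "2ch.net", "6park.com")

# Check that the timings are a whole number of passes over the page list
# before relying on data.frame() to recycle the page names.
stopifnot(length(before) == length(after),
          length(before) %% length(tests) == 0)

data <- data.frame(before, after, tests)

# Median per page, then the absolute and percentage change.
meds <- aggregate(data[, 1:2], by = list(test = data$tests), median)
meds$worse_ms  <- meds$after - meds$before
meds$worse_pct <- round(100 * meds$worse_ms / meds$before, 1)

# Largest percentage regression first.
sorted_meds <- meds[order(-meds$worse_pct), ]
print(sorted_meds)
```

If the length check fails, the timing list is probably truncated or the page list does not match the run, and the per-page medians would be attributed to the wrong pages.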
