We have this amazing and fancy graph-displaying utility called Graphite running on chromeos-stats. It's beautiful. You all should use it. This doc is about how to get data into the system so that you can view it in Graphite.
There are two different ways to get data into the system:
The first is to write data to the raw backend of Graphite, which is called carbon. It accepts data in the format of
The second is to write data to a service which will calculate statistics over the data you're sending, and then forward it on to carbon. This service is called statsd. It provides better information, since it calculates min/mean/max and standard deviation, and it offers a more intelligible interface. It also allows for better horizontal scaling in case we ever start logging a truly hilarious amount of stats. (Which we should!)
I would highly recommend using statsd over carbon unless you have a specific reason to be sending data directly to carbon.
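As a rough sketch of what the two routes look like on the wire: Graphite's documented plaintext protocol for carbon takes one data point per line in the form `<metric path> <value> <timestamp>`, while the usual statsd datagram is `<name>:<value>|<type>`. The helper names and metric names below are made up for illustration, and the statsd service running here may differ in its details.

```python
import time


def carbon_line(path, value, timestamp=None):
    """Format one data point for carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    # One data point per line: "<metric path> <value> <unix timestamp>\n"
    return '%s %s %d\n' % (path, value, timestamp)


def statsd_datagram(name, value, stat_type):
    """Format one statsd datagram; stat_type is e.g. 'ms', 'c' or 'g'."""
    return '%s:%s|%s' % (name, value, stat_type)


# Hypothetical metric names, purely for illustration.
print(repr(carbon_line('example.scheduler.tick', 42, 1370000000)))
print(statsd_datagram('example.scheduler.tick', 320, 'ms'))
```

With carbon you supply the timestamp and a fully aggregated value yourself; with statsd you send individual samples and let the service do the aggregation.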
We have in
This guide is meant to be copy-paste-able, so you should be able to take any snippet out of this doc and run it. Therefore, here's the import boilerplate you'll need when messing around with this code from within autotest:
If you prefer, you can find all the code listed in this doc (as of when this was published) in CL 45286.
As you go through and add some stats, or mess with the code shown here, at some point you're going to want to see how the data is shown on Graphite. Navigate to chromeos-stats. Drill down into
The first stat to examine is a timer, which logs how long a function takes to run. The easiest target for this is the scheduler tick. Let's define a fake scheduler tick function:
And now we have a few different ways that we can get the runtime of this function.
We can manually create a timer, and call
We can also take advantage of the decorator that is attached to the
Statsd timers report their value in milliseconds, so if you report a value by hand using
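All of the timer approaches come down to measuring elapsed wall-clock time and reporting it in milliseconds. The following is a library-independent sketch of that idea using only the standard library. The names `elapsed_ms`, `timed` and `fake_tick` are hypothetical and are not the autotest stats API; the `samples` list stands in for a real stats client.

```python
import functools
import time


def elapsed_ms(start, end):
    """Convert a pair of time.time() readings (seconds) to milliseconds."""
    return (end - start) * 1000.0


def timed(report):
    """Decorator: time each call and pass (function name, ms) to report."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                # Report even if the function raised.
                report(func.__name__, elapsed_ms(start, time.time()))
        return wrapper
    return decorator


samples = []  # stand-in for a real stats client


@timed(lambda name, ms: samples.append((name, ms)))
def fake_tick():
    time.sleep(0.01)  # pretend to do scheduler work


fake_tick()
print(samples)
```

The important detail is the multiplication by 1000: `time.time()` returns seconds, so a hand-reported value has to be converted before it is comparable with other timer data.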
If you're looking to keep track of how frequently something occurs, a counter is a good choice. Statsd receives the counter stat and tallies it over time. Once every ten seconds it flushes the value to carbon as events per second and resets the counter to zero. With counters, there are no extra statistics for statsd to compute; the usual min, max, std_dev, etc. make no sense in this context.
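To make the tally-then-flush behaviour concrete, here is a toy model of how a statsd-style server might aggregate a single counter. The `CounterBucket` class is hypothetical and only illustrates the arithmetic; it is not the code statsd actually runs.

```python
class CounterBucket(object):
    """Toy model of how a statsd-style server aggregates one counter."""

    def __init__(self, flush_interval=10.0):
        self.flush_interval = flush_interval  # seconds between flushes
        self.count = 0

    def increment(self, delta=1):
        self.count += delta

    def flush(self):
        """Return events per second for this interval, then reset to zero."""
        rate = self.count / float(self.flush_interval)
        self.count = 0
        return rate


bucket = CounterBucket()
for _ in range(50):
    bucket.increment()
print(bucket.flush())  # 50 events over 10 seconds
print(bucket.flush())  # nothing was counted since the last flush
```

Because the flushed value is a rate, a graph of a counter shows events per second rather than a running total.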
There also exists a
If you just want to send in a number, or if your stat doesn't really make sense as a timer or counter, then you should probably use a gauge. A gauge allows you to just report a number. The benefit of using a gauge over just sending raw data is that statsd will still compute the statistics about the stats you're sending like it normally does.
Values submitted through an average are automatically averaged against the other values in the same bucket at the end of the flush interval. The only use case I can think of for this is if you're trying to measure something in a gauge that's very flaky, which is messing up all of the statistics that are being calculated. However, I can't even think of an example to use in our codebase, so I'm just mentioning this for completeness.
If all else fails, and you don't want any fancy statsd features, you can get statsd to send your data to Graphite "pretty much unchanged". Note that your hostname is still prefixed to the stat name (assuming you didn't turn that off).
One could use this to log the fact that something happened. Logging something so that there's an obvious spike when you're overlaying graphs doesn't need any sort of statistics calculated about it.
whisper-fetch can also output JSON, and you can specify the range of data you wish to view via the --from and --until command-line options. The default is to look at a time slice of 24 hours.
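A minimal sketch of building such a command line from Python follows. It assumes that the JSON switch is spelled `--json` and that `--from` and `--until` take Unix epoch seconds, as in the whisper-fetch script shipped with Graphite; the executable name on your machine (it is often installed as `whisper-fetch.py`) and the `.wsp` path are placeholders.

```python
import time


def whisper_fetch_argv(wsp_path, hours_back=24, now=None):
    """Build an argv list asking for the last `hours_back` hours as JSON."""
    now = int(time.time()) if now is None else int(now)
    start = now - hours_back * 3600
    return [
        'whisper-fetch',        # assumed executable name
        '--json',               # assumed spelling of the JSON switch
        '--from=%d' % start,    # epoch seconds
        '--until=%d' % now,     # epoch seconds
        wsp_path,
    ]


# Hypothetical whisper file path, purely for illustration.
argv = whisper_fetch_argv('/path/to/example.wsp', hours_back=6, now=1370000000)
print(' '.join(argv))
```

The resulting list can be handed to `subprocess.check_output` on a machine that has the whisper tools installed.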
Create the graph of the information you are interested in and copy its URL. Append &format=json to the URL and you will receive JSON-formatted output containing the time slice and data.
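As a sketch of consuming that output programmatically: the helper below tacks `format=json` onto a graph URL, and the loop parses a response in the shape Graphite's render API documents, a list of series that each carry a `target` and a list of `[value, timestamp]` datapoints. The URL, metric name and sample payload are invented for illustration.

```python
import json


def with_json_format(url):
    """Append format=json to a Graphite graph URL."""
    separator = '&' if '?' in url else '?'
    return url + separator + 'format=json'


# Hypothetical graph URL copied from the Graphite UI.
url = with_json_format('http://chromeos-stats/render?target=example.metric')
print(url)

# Invented sample of what the JSON response looks like.
sample = ('[{"target": "example.metric", '
          '"datapoints": [[1.5, 1370000000], [null, 1370000010]]}]')

for series in json.loads(sample):
    # Each datapoint is [value, timestamp]; null marks a gap with no data.
    points = [(ts, value) for value, ts in series['datapoints']
              if value is not None]
    print(series['target'], points)
```

Note that gaps come back as `null` values, so filter them out before doing any arithmetic on the series.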