many ages ago i was using mrtg (more or less happily) to collect
service and performance data for my personal infrastructure, and
created a collector script then called bigstat which worked both
locally on the box with mrtg and remotely on clients. naturally that's
all ancient non-news: mrtg is dead, all hail rrdtool.
just a little less long ago (2011) i finally also switched to rrdtool, and found an inspiration for a graphing layout that i liked and extended for my own purposes: one honking big overview page with small graphs of the last 10 hours, each of which is clickable and expands to a graph quartet (showing a day/week/month/year plus a bit).
here is an example (the real overview page is about 12 rows long, not just 2):
initially i built this with HTML::Mason (and mod_perl), but when i started at opmantek i got exposed to mojolicious - and learned to like it, quite a lot actually. so here's the newest incarnation of what i started more than 10 years ago.
i've just started putting my contraption on github, and you'll find this one there: https://github.com/az143/rrdstat
data gathering
one component, the data gatherer, is just plain standalone perl:
rrdstat is the tool (in /scripts/rrdstat) and it wants a config file in
YAML, e.g. /etc/rrdstat.yml. you'll want to tell it where the rrd
server sits, what to collect, and you'll likely want to run it every X
minutes from cron. lots of generic and semi-custom goodness, but you'll
have to Use The Source, Luke! as i haven't got enough time to document
everything for public consumption (after all that was a project to
satisfy primarily my needs).
if you want to collect your rrd data centrally, then you will likely want to use rrdtool in tcp mode:
i run mine like this, from /etc/inetd.conf, on a custom port i called
rrdtool (just add it to /etc/services):
rrdtool stream tcp nowait root /usr/sbin/tcpd /usr/bin/rrdtool - /var/lib/rrd
(naturally i have an /etc/hosts.allow/deny config that lets only legit
clients talk to that server.) this way, your distributed instances of
rrdstat can tell rrdtool on the central box that they have updates for
particular rrd files, and the central box can do the storage and
graphing. (also check out persistfile in the example
config/rrdstat.yml: if your rrd server isn't reachable then rrdstat can
accumulate readings in this file and just flush them when the server
reappears.)
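the buffer-and-flush idea behind persistfile can be sketched roughly like this. this is a minimal illustration in python, not code from rrdstat (which is perl): the function names, the host and the port number are all made up for the example, and it assumes the server end is rrdtool running in its remote-control ("rrdtool -") mode as in the inetd line above, where every command is answered with a line starting with OK or ERROR.

```python
import os
import socket


def format_update(rrd_file, timestamp, values):
    """Build one rrdtool 'update' command with an explicit timestamp.

    An explicit timestamp (rather than 'N' for now) matters here, because
    a buffered reading may only reach the server much later.
    """
    return f"update {rrd_file} {int(timestamp)}:" + ":".join(str(v) for v in values)


def send_lines(lines, host="rrd.example.net", port=13900):
    """Send command lines to a remote rrdtool (host and port are placeholders)."""
    with socket.create_connection((host, port), timeout=10) as sock:
        stream = sock.makefile("rw", encoding="ascii", newline="\n")
        for line in lines:
            stream.write(line + "\n")
            stream.flush()
            # read until rrdtool's closing status line for this command
            while True:
                reply = stream.readline()
                if not reply:
                    raise ConnectionError("rrd server closed the connection")
                if reply.startswith("ERROR"):
                    raise ValueError(reply.strip())
                if reply.startswith("OK"):
                    break


def deliver(new_lines, persist_path, send=send_lines):
    """Try to send backlog plus new readings; buffer everything on failure."""
    backlog = []
    if os.path.exists(persist_path):
        with open(persist_path) as fh:
            backlog = [l.rstrip("\n") for l in fh if l.strip()]
    pending = backlog + list(new_lines)
    try:
        send(pending)
    except OSError:
        # server unreachable: keep everything for the next run
        with open(persist_path, "w") as fh:
            fh.writelines(l + "\n" for l in pending)
        return False
    if os.path.exists(persist_path):
        os.remove(persist_path)
    return True
```

a cron-driven collector would call deliver() once per run with the freshly gathered readings. note that this sketch glosses over partial delivery: if the connection drops halfway through a flush, lines that were already stored get resent next time and rrdtool will reject them as too old.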
there is one other small helper script that sucks solar radiation data from the bureau of meteorology site, for graphing together with my solar panel data.
data graphing
the rest of the stuff provides the graphing application. it also needs
a config file (by default /etc/mrrd_sections.conf, plain perl, example
in config/) which specifies what graphs should be shown in which order
and what magic options should be used (e.g. whether temperature and fan
speed data should be combined on a graph, and with what scale ratio for
left and right axes).
figuring out what's possible is again an exercise in UTSL! the
reasonably well-documented code that does all the work is in
lib/Mrrd.pm.
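to make the "scale ratio for left and right axes" idea concrete, here is a rough sketch of how such a combined graph can be expressed as rrdtool graph arguments. it is not how Mrrd.pm does it (that's perl, and its option names aren't shown here). the data source names temp and fan, the colours, the labels and the function name are all assumptions for illustration. the trick is to divide the second series by the ratio so it fits the left axis, and then tell rrdtool to label the right axis with that same ratio.

```python
def dual_axis_args(rrd_file, out_png, ratio=100.0, start="-10h"):
    """Build an 'rrdtool graph' argument list with temperature on the left
    axis and fan speed on the right axis.

    ratio is the right/left scale: a fan reading of 3000 rpm with ratio 100
    is drawn at height 30 on the left axis and labelled 3000 on the right.
    """
    return [
        "graph", out_png,
        "--start", start,
        "--vertical-label", "temperature",
        # right axis value = left axis value * ratio + 0
        "--right-axis", f"{ratio:g}:0",
        "--right-axis-label", "fan rpm",
        # hypothetical data source names inside the rrd file
        f"DEF:temp={rrd_file}:temp:AVERAGE",
        f"DEF:fan={rrd_file}:fan:AVERAGE",
        # scale the fan series down so it shares the left axis range
        f"CDEF:fanscaled=fan,{ratio:g},/",
        "LINE1:temp#cc0000:temperature",
        "LINE1:fanscaled#0000cc:fan speed",
    ]
```

the resulting list can be handed to the rrdtool binary, e.g. via subprocess.run(["rrdtool"] + dual_axis_args("box.rrd", "box.png")).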
you can run the application via PSGI and mod_perl from apache (as i
do), or start it up manually: ./script/mrrd and it'll listen on
http://localhost:3000. rrd graph images will be saved in public/img, so
the web user (or whoever owns the mrrd process) needs write permission
for that directory.
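the overview-plus-quartet layout boils down to rendering the same graph over several time spans into public/img. a small sketch of that bookkeeping follows. it is not taken from the application: the file naming scheme, the function name and the exact amount of "plus a bit" per span are guesses made for the example.

```python
from pathlib import PurePosixPath

HOUR = 3600
DAY = 24 * HOUR

# span lengths in seconds; the padding beyond a day/week/month/year is assumed
SPANS = {
    "overview": 10 * HOUR,
    "day": 30 * HOUR,
    "week": 8 * DAY,
    "month": 35 * DAY,
    "year": 400 * DAY,
}


def graph_targets(graph_name, img_dir="public/img"):
    """Map each span label to the leading 'rrdtool graph' arguments for it.

    A negative --start value is interpreted by rrdtool as seconds relative
    to the current time.
    """
    targets = {}
    for label, seconds in SPANS.items():
        png = PurePosixPath(img_dir) / f"{graph_name}-{label}.png"
        targets[label] = ["graph", str(png), "--start", f"-{seconds}", "--end", "now"]
    return targets
```

the small overview image would use the "overview" entry, and clicking it would show the other four. the DEF and LINE arguments for the actual data still need to be appended to each list.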
as always: feel free to contact me if you've been inspired by my stuff, got things to work but changed them totally, or hated what i did etc. etc. sharing stuff is more fun if you hear back from others.
postscriptum
you might wonder why i, while working for a company that makes network
management software, am not using our FOSS product, NMIS? the reason
for that is pretty simple: NMIS is at this point somewhat biased
towards handling stuff via SNMP, while my homegrown rrdstat script
doesn't need or use SNMP. i/we do plan to improve NMIS wrt. more
flexible service monitoring in the very near future, but for now my
homegrown system (plus a bit of mon) has been sufficient.