Posted 3 months ago by Paul Millar
With some unfortunate timing, it looks like the "Axis of Openness" webpages (SourceForge, Slashdot, Freshmeat, ...) have gone for a burton. There seem to be some networking problems with these sites, with web traffic timing out. Assuming traceroute output is valid, the problem appears soon after traffic leaves the Santa Clara location of the Savvis network [dead router(s)?]
This is a pain because we've just done the v0.10 release of MonAMI and both the website and the file download locations are hosted by SourceForge. Whilst SourceForge is down, no one can download MonAMI!
If you're keen to try MonAMI in the meantime, you can download the RPMs from the (rough and ready) dev. site:
http://monami.scotgrid.ac.uk/
The above site is generously hosted by the ScotGrid project [their blog].
Thanks guys!
Posted 3 months ago by Paul Millar
After many months of work, v0.10 has been tagged and source-/binary-RPMs and tar-balls are available.
This is a major release with many enhancements to MonAMI. Perhaps the two improvements that top the list are:
- adaptive monitoring,
- writing monitoring data into a database.

Some other note-worthy changes include:

New plugins:
- varnish (for monitoring a Varnish server),
- grmonitor (for reporting to gr_monitor, as previously mentioned).

Updates to existing plugins:
- maui:
  - support for QoS (a maui term) monitoring added,
  - added a timeout option (maui can take ages to reply sometimes).
- Torque:
  - better error handling (the library has a somewhat amusing way of reporting problems),
  - enforce thread-safety (some of the torque library API isn't thread-safe).
- Ganglia:
  - fixed gmond.conf parser,
  - transmission now less bursty (reduces likelihood of overloading gmond),
  - unicast support: sending data to just the one gmond. Support for multiple gmonds (for failover in unicast deployments) is pencilled in for the next release.
- null:
  - adjustable time delay (useful when playing with adaptive monitoring).
- MySQL:
  - added per-table monitoring statistics (also can now act as a reporting plugin).

Other changes:
- Added the "MonAMI by Example" tutorial (has been available from the web for a while).
- MonAMI-core will use the recent history of a monitoring target's response time when estimating how long future requests will take. This uses quite a nice algorithm, which responds quickly to a service suddenly taking a longer time to respond, but isn't fooled if a service responds very quickly.
- Added per-thread CPU profiling. This is so, if someone says "MonAMI is consuming vast amounts of CPU", we can figure out why.
- Spring-clean of user-guide and tutorial: lots of effort has gone into this, mostly in ensuring consistency in the typesetting. The document should look a lot nicer now and hopefully be easier to read.
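The release notes don't spell out the response-time algorithm, so the following is only a rough sketch of the behaviour described above (react quickly to a slow response, don't be fooled by a single fast one). It uses an asymmetric exponential moving average, which is my own choice of technique; the class name, the two smoothing factors and the sample timings are all assumptions for illustration, not MonAMI's actual code.

```python
class ResponseTimeEstimator:
    """Asymmetric exponential moving average (illustrative; not MonAMI's code)."""

    def __init__(self, rise=0.8, fall=0.1):
        # Hypothetical smoothing factors: a large weight when a sample is
        # slower than the current estimate, a small one when it is faster.
        self.rise = rise
        self.fall = fall
        self.estimate = None

    def update(self, sample):
        """Fold one observed response time (in seconds) into the estimate."""
        if self.estimate is None:
            self.estimate = sample
        else:
            alpha = self.rise if sample > self.estimate else self.fall
            self.estimate += alpha * (sample - self.estimate)
        return self.estimate


est = ResponseTimeEstimator()
for seconds in [1.0, 1.0, 5.0, 0.1]:
    print(f"observed {seconds:.1f}s -> estimate {est.update(seconds):.2f}s")
```

With these made-up factors, the slow 5.0s response pulls the estimate most of the way up in one step, while the very fast 0.1s response that follows only nudges it back down a little.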
You can download MonAMI from the SourceForge page:
http://sourceforge.net/project/showfiles.php?group_id=151885
or configure your YUM to download it automatically. Details are available here:
http://monami.scotgrid.ac.uk/
Enjoy!
Posted 9 months ago by Paul Millar
Ladies and Gentlemen, MonAMI now has a new output plugin: grmonitor. This allows the latest version of gr_monitor (available from the project's home page) to connect to MonAMI and fetch the data it then plots.
gr_monitor plots data in 3D using an OpenGL library (e.g. the open-source Mesa). This allows you to pan around and see the live data from different points of view. On the right is a screen snapshot showing several Torque metrics.
gr_monitor expects data in a series of regular n-by-m grids. This is quite different to how MonAMI sees data (a tree structure) so the configuration has to map between the two. This makes it slightly verbose, but I'm hoping to add a few tricks to improve this.
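To give a feel for the tree-to-grid mapping, here is a minimal sketch in Python. It is not the grmonitor plugin's configuration syntax: the function name, the nested-dict datatree and the queue and metric names are all hypothetical, chosen only to show why each row and column has to be named explicitly.

```python
def tree_to_grid(tree, rows, columns, missing=0.0):
    """Build a len(rows)-by-len(columns) grid from a nested-dict tree.

    Cell (r, c) is looked up as tree[rows[r]][columns[c]]; anything
    absent from the tree is filled with `missing` so the grid stays regular.
    """
    return [
        [tree.get(row, {}).get(col, missing) for col in columns]
        for row in rows
    ]


# Hypothetical datatree: one branch per queue, one leaf per metric.
datatree = {
    "short": {"running": 12, "queued": 3},
    "long": {"running": 40, "queued": 17},
}

grid = tree_to_grid(
    datatree,
    rows=["short", "long"],
    columns=["running", "queued", "held"],
)
for line in grid:
    print(line)
```

Listing every row and column by hand is what makes this kind of mapping verbose: the tree can have any shape, but the grid must always be regular.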
Posted 9 months ago by Paul Millar
The recent HEP-SysMan workshop was dedicated to monitoring: what software is available and how to configure it. I was honoured and delighted to be asked to give a presentation on MonAMI.
Well, given the meeting was a "workshop", I wanted to get people working! What better way than a hands-on tutorial: a step-by-step guide that walks you through increasingly complex examples.
Pete and I had previously started something similar as a GridPP wiki page. I wanted to convert this to DocBook so people had a good-looking tutorial to work from. Since I wasn't too sure how long people would take, some extra material was added (e.g. using the MySQL plugin to save monitoring data). It took a surprisingly long time to get the tutorial good, which is one of the reasons things have been so quiet recently.
This also finally forced me to figure out how to produce diagrams of datatrees. Thanks to GraphViz and some XSLT, the tutorial sports some nice diagrams. (Just need to add some to the user-guide now!)
The logistics were fun. Everyone needed their own environment to play with. Some people were able to use spare machines at their home institutes, but the rest used some 20 virtual machines that Ewan MacMahon managed to throw together. Each VM had its own install of Torque, maui and MySQL. Big thanks to Ewan!
Many people helped in getting this tutorial together. Mike Kenyon, Andrew Elwell, Caitriana Nicholson, Graeme Stewart and Tom Doherty (sorry if I've forgotten anyone!) all helped in proof reading and a big thanks also to Mona Aggarwal for organising the printed versions.
The meeting went well and people were happy with what they were doing.
Posted 10 months ago by Paul Millar
Ganglia is a monitoring system that uses RRDTool for its storage and graphs. This provides an excellent solution for monitoring, but suffers from data becoming less detailed ("averaged out") when you look further back in time. This is deliberate, but does make later analysis of the data difficult.
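As a toy illustration of that "averaging out" (not RRDTool's actual code, and with made-up load figures), consolidating several samples into a single average hides short-lived spikes:

```python
def consolidate(samples, step):
    """Average consecutive groups of `step` samples.

    A trailing group shorter than `step` is averaged as well.
    """
    chunks = (samples[i:i + step] for i in range(0, len(samples), step))
    return [sum(chunk) / len(chunk) for chunk in chunks]


# Hypothetical load readings with one brief spike.
load = [1, 1, 9, 1, 1, 1]
averaged = consolidate(load, 3)
print("detailed:", load)
print("averaged:", averaged)
```

The spike of 9 is plainly visible in the detailed readings, but the averaged series never gets anywhere near it, and that lost detail is what a later analysis would have needed.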
If you want to keep detailed records of monitoring data with MonAMI that don't degrade over time, now you can: I've committed changes to the mysql plugin in CVS. In addition to monitoring a MySQL database, the plugin can now store information. You tell it which table to use and how to map the information into that table, and it does the rest. It'll even create the table if it doesn't exist.
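The idea of "name a table, give a mapping, and the table is created on demand" can be sketched in a few lines of Python. This is not the mysql plugin's code or configuration: it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the function name, table name, column names and metric keys are all hypothetical.

```python
import sqlite3


def store_sample(conn, table, column_map, sample):
    """Insert one sample, creating the table first if it is missing.

    column_map maps a column name to the metric key in `sample`.
    Table and column names are interpolated into the SQL, so they
    must come from trusted configuration, never from user input.
    """
    columns = list(column_map)
    col_defs = ", ".join(f'"{c}" REAL' for c in columns)
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" '
        f"(collected TIMESTAMP DEFAULT CURRENT_TIMESTAMP, {col_defs})"
    )
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [sample[column_map[c]] for c in columns],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Hypothetical mapping: table column -> metric name in the sample.
mapping = {"jobs_running": "torque.running", "jobs_queued": "torque.queued"}
store_sample(conn, "torque_history", mapping, {"torque.running": 52, "torque.queued": 20})
store_sample(conn, "torque_history", mapping, {"torque.running": 49, "torque.queued": 25})

rows = conn.execute(
    "SELECT jobs_running, jobs_queued FROM torque_history ORDER BY rowid"
).fetchall()
print(rows)
```

Because every sample becomes its own row, nothing is averaged away; the cost is that the table keeps growing, so old rows have to be pruned or archived by hand.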