Monitoring NetApp Devices in GroundWork
February 10, 2016
NetApp devices are a great way to add network-attached storage to an IT infrastructure in a flexible, dynamic way. Handling storage reliably and repeatably is especially important with virtualized systems like OpenStack and VMware, and in fact NFS mounts from NetApps are often used in OpenStack deployments.
Monitoring the availability and performance of NetApp devices can be tricky, however. As the trend toward virtual servers using dynamically assigned storage continues, it’s important to stay ahead of the curve to avoid over-committing your storage resources. You can use SNMP, of course, and there are Nagios plugins and other tools that can leverage SNMP MIBs. These work, but simply polling against known, static volume parameters may not be practical: checks like that are not dynamic, so they generate false positives when volumes change.
You need dynamic monitoring if you are going to stay sane.
Of course you can use the NetApp management console, and it will work just fine for monitoring and management, but it will also silo the information collected into just that tool and skill domain.
That might be OK if all you do is run storage, but for most of us, the NetApp is just one important component of our infrastructures. We are looking at VMs, containers, software-defined (and real!) networks and even public cloud resources, all as part of the infrastructure supporting the same applications.
What you want, therefore, is to deliver current availability and performance information to the operations team, who need it to fix problems in which the NetApp is only one part. In short, you also need unified monitoring.
Virtualization makes unified monitoring across the infrastructure, storage volumes included, hard to maintain, let alone set up in the first place. You need a way to detect volumes and show their status as they appear and disappear, change size, and even change names and connections.
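To make the "appear and disappear, change size, change names" idea concrete, here is a minimal sketch of how a monitor could compare two successive polls of a volume inventory. It is an illustration only, not how any GroundWork or NetApp component is implemented. The `Volume` record, its field names, and the assumption that each volume carries a stable identifier that survives a rename are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    # Hypothetical record: field names are illustrative, not NetApp's schema.
    uuid: str        # assumed stable across renames
    name: str
    size_bytes: int


def diff_inventories(previous, current):
    """Compare two polls, each a dict mapping volume UUID -> Volume."""
    added = [current[u] for u in sorted(current.keys() - previous.keys())]
    removed = [previous[u] for u in sorted(previous.keys() - current.keys())]
    resized, renamed = [], []
    # Volumes present in both polls may still have changed size or name.
    for u in sorted(previous.keys() & current.keys()):
        old, new = previous[u], current[u]
        if old.size_bytes != new.size_bytes:
            resized.append((old, new))
        if old.name != new.name:
            renamed.append((old, new))
    return {"added": added, "removed": removed,
            "resized": resized, "renamed": renamed}
```

A monitor built this way would add checks for anything in `added`, retire checks for anything in `removed`, and update thresholds for `resized` volumes, rather than alerting on a volume that was removed on purpose.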
You can also send logs to Elasticsearch or Splunk for an event-based view of things. This is great for seeing when things are going really wrong and spewing out errors. Still, it might not help you do your job. Deeper analysis, while possible with such systems, is really more the province of data scientists than of operations teams charged with keeping systems up and running, or of administrators trying to predict when the overcommitted volume space will all be claimed.
Alternatively, you can use the API to dynamically pull data out of the NetApp systems you want to monitor, like the NetApp operations manager does. That’s the approach we decided to take. Using the lightweight GroundWork Cloud Hub™, which can run on any standard Java platform, we built a NetApp connector that connects to the NetApp API, collects data at regular intervals, and sends it on to the GroundWork API on any running GroundWork server.
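The collect-and-forward pattern described here can be sketched in a few lines. This is not the Cloud Hub connector's code (which runs on Java), and it calls neither the NetApp API nor the GroundWork API. The `fetch_metrics` and `send_results` callables, the payload shape, and the 300-second default interval are hypothetical stand-ins for whatever the real endpoints expect.

```python
import time


def collect_once(fetch_metrics, send_results):
    """Pull one batch of metrics and forward it; returns the number sent."""
    metrics = fetch_metrics()  # assumed to return {metric_name: value}
    # Hypothetical payload shape, not the GroundWork API's actual format.
    payload = [{"service": name, "value": value}
               for name, value in sorted(metrics.items())]
    send_results(payload)
    return len(payload)


def run_connector(fetch_metrics, send_results, interval_seconds=300.0,
                  max_cycles=None, sleep=time.sleep):
    """Poll at a fixed interval; max_cycles=None means run until stopped."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            collect_once(fetch_metrics, send_results)
        except Exception as exc:
            # One failed poll should not stop the connector.
            print(f"collection cycle failed: {exc}")
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_seconds)
    return cycles
```

Passing the fetch and send steps in as functions keeps the loop independent of either API, so it can be exercised with fakes before it is pointed at real systems.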