Preface
The operations and network management
market has a long history going back a decade or more. Nearly every organization and
nearly every company has dabbled in it to a lesser or greater extent. Why has no one
product or strategy emerged as the absolute best solution for operations management?
Back in the late '90s I began studying the answer to this question. The
research involved reviewing hundreds of products from many organizations, from
universities and open source / free developers to the biggest vendors in the market.
What I have come to realize is that an operations
management platform must exhibit these key attributes:
- Interoperability
The typical approach of management vendors has been to create a new API, a
new framework, a new management bus, or something along those lines and impose this on the
customer. The marketing gimmick being, "switch to our framework and you'll enjoy
management bliss!" The problem is that the switch almost always means you lose the
ability to manage some device or application you have in your environment. This is not to
say that typical management vendors have been trying to do this kind of thing
intentionally to their customers (although some might disagree), but rather, they often
have a different agenda. That is, they have a platform, operating system, or hardware they
need to support and the management model is used to push that agenda rather than to solve
the management problem for the customer. The right approach is
not to impose a management framework, API, or protocol on the customer, but to build a
product that supports the existing management frameworks that a customer has deployed,
whether free, public, private, or proprietary.
- Flexibility
The software must be flexible enough to tie into existing systems, to
provide different views to every user, to customize the display for customers, to generate
customized reports and to adaptively solve the problem of management for the customer.
Easy to use "pre-canned" products are great, until you need to do something
custom with them. The software must be as
easy to use as ready-to-go pre-canned software but must also be flexible enough that
the customer can do all of the things they need to do.
- Ease of Use
Management applications are notoriously complex to use, install, and
maintain. I remember the story of one early adopter of AutoNOC in Texas. The data center
down the street had done the "responsible" thing and spent several million
dollars on an installation of a large SNMP based framework (I'll be nice and not mention
any names). The vendor had flown in six consultants to build the network model and install
the software. The consultants did not really understand the customer's network, and in the
process of discovering it, they crashed it. The lesson here is that customers understand their networks
far better than external consultants and any workable platform must not require
consulting. It must be easy enough for the customer to deploy themselves. The customer will always build a better model of their network than a
consultant, but the software must be easy enough to allow them to do this on their own.
- Infinite Histories
Thirty seconds? Twenty-four hours? Sorry, that is not enough data to fully
understand how the network is working. Expensive and hard to maintain SQL servers are not
going to be able to keep up with the volume of data a proper network operations platform
generates. Event-centric systems are not optimized to manage the massive volume of
performance data that is required for success. Fully understanding what is happening on a
network is only possible when you can go back in time to six months ago, or a year ago and
see the real high resolution data. Proper management and network
analysis requires the ability to store a near infinite amount of high resolution data.
- Scalability
A scalable operations platform means not only that the software grows with
the size of the network, but also that it is easy to grow the network model. Centralized models lose
resolution and exponentially increase bandwidth requirements the closer to the edge that
you get. Distributed networks are severable and independently managed. Which is right? The
answer is that neither is. Real world networks mix a centralized and a distributed
management model. The software must support both types well to be the right product.
Events and root cause must come together at some point for global analysis while the model
must be fully severable to a degree. The Internet itself follows the partly centralized,
partly distributed model. The right management platform will
work well in both a distributed and a centralized management paradigm. We think of this as
a hybrid scalability model.
- Reliability
The management platform is the last and final wall ensuring network
availability. It simply must be reliable. If your applications or network goes down, the
management platform must not. Maximizing reliability must be a key component of every design
decision in the technology.
- Performance
A good management solution has to run on rails. High performance
operations management is a more sophisticated problem than even the most powerful
databases. Databases have specific data request queries and actions to respond to and that
is the only thing they do. An operations management solution has to manage all of the
devices, all of the users, perform report generation, query requests, everything, all
asynchronously and it has to do it in a fully live, real-time environment. High performance operations management must leverage
the latest in computing technologies to maximize the quality of the end user experience.
With all of these requirements on operations vendors, it
is easy to understand why the perfect solution has not emerged until today.
Our focus on balancing these criteria from the very
beginning has led to an unmatched architectural and design understanding of everything
necessary to deliver the optimal management experience for the customer.
It is this focus that has led to the latest release of
our core operations platform, AutoNOC 2.2.
Kyle Lussier
President and Architect of AutoNOC