Part of the job of a technical engineer is to design and build new infrastructure. To do that effectively, an engineer needs information at their fingertips.

As part of the job, an engineer is often required to:

  • Research the environments they are working with.
  • See and understand the connectivity of devices.
  • Find IP address information and understand how those IP addresses are being used.
  • Plan based upon the existing networking infrastructure.
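
As a rough illustration of the IP address point above, the sketch below uses Python's standard ipaddress module to list which addresses in a subnet are still free and which have been assigned more than once. The subnet, the host names and the `assigned` inventory are made-up examples; in practice a monitoring or IP address management system would supply this data.

```python
import ipaddress
from collections import defaultdict


def summarise_subnet(cidr, assigned):
    """Return (free_addresses, conflicts) for a subnet.

    `assigned` is a list of (hostname, ip_string) pairs. This is a
    hypothetical inventory export, not any particular tool's format.
    """
    network = ipaddress.ip_network(cidr)
    owners = defaultdict(list)
    for hostname, ip in assigned:
        address = ipaddress.ip_address(ip)
        if address in network:
            owners[address].append(hostname)

    # Usable host addresses that nobody has claimed yet.
    free = [host for host in network.hosts() if host not in owners]
    # Addresses claimed by more than one host.
    conflicts = {str(addr): names for addr, names in owners.items() if len(names) > 1}
    return free, conflicts


# Example data only (documentation address ranges).
assigned = [
    ("web01", "192.0.2.1"),
    ("web02", "192.0.2.2"),
    ("db01", "192.0.2.2"),       # deliberate duplicate
    ("other", "198.51.100.7"),   # outside the subnet, so ignored
]
free, conflicts = summarise_subnet("192.0.2.0/29", assigned)
print([str(a) for a in free])
print(conflicts)
```

An engineer with this view can pick a free address without guessing and can see a conflict before it causes a fault.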

A well-maintained and properly provisioned monitoring solution is critical to planning, development and fault finding within an infrastructure.

If a monitoring system begins to break down, silos of information appear. People lose faith in the information provided by the monitoring system. Alerts don’t get to the right people.

Without reliable data, people fall back on assumptions. Those assumptions can have serious ramifications for project deadlines and weaken the resilience of the infrastructure as a whole.

  • Project managers fail to plan effectively as they don’t know what is currently live in their environments.
  • Engineers cannot assign IP addresses with confidence, cause conflicts and do not understand the infrastructure properly.
  • DBAs are blind to potential issues, resulting in inefficient databases.
  • Hardware failures go undetected and cause outages.
  • Software errors and events cause outages.
  • Managers cannot create reports for capacity.
  • Faults take much longer to find, and being proactive is impossible.
  • The monitoring system database struggles due to being on a shared database server.
  • The poller is overwhelmed with polling jobs.
  • Automatic network discovery is not enabled, and people ask why devices are not showing up.
  • Thresholds are not set correctly, and nobody understands what constitutes a good or bad threshold.
  • Dependencies and relationships between devices, applications and connectivity are not set up.
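
To make the last two points more concrete, here is a minimal sketch of what agreed thresholds and device dependencies might look like in code. It is not taken from any particular monitoring product: the function names, the threshold values and the small topology are all assumptions chosen for illustration.

```python
def classify(value, warning, critical):
    """Map a metric reading to 'ok', 'warning' or 'critical'.

    Assumes higher values are worse (for example disk usage in percent).
    """
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


def root_causes(down, parents):
    """Return the down devices whose upstream parent is not also down.

    `parents` maps each device to the device it depends on (or None).
    Anything sitting behind a failed parent is treated as a symptom.
    """
    return sorted(d for d in down if parents.get(d) not in down)


# Hypothetical topology: two servers behind a switch behind a router.
parents = {
    "router1": None,
    "switch1": "router1",
    "web01": "switch1",
    "db01": "switch1",
}
down = {"switch1", "web01", "db01"}

print(classify(72, warning=80, critical=90))
print(classify(93, warning=80, critical=90))
print(root_causes(down, parents))
```

With the thresholds written down, everyone shares the same definition of good and bad. With the dependencies recorded, a failed switch produces one alert for the switch instead of a separate alert for every server behind it.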

Awareness, efficiency and resiliency will all suffer.

 

If the monitoring system is given the resources it needs, its users are trained and everything that needs to be monitored is monitored, the problems above start to go away.

Time can then be spent on automating support processes, planning new infrastructure and fault finding.

Organisations need to take monitoring seriously; otherwise, everything slows down.