In the midst of the COVID-19 pandemic, with the peak of the curve in Argentina still several weeks away, we at U&R are working side by side with our clients to help them meet the challenge of remotely managing and monitoring their IT infrastructure. The stay-at-home requirement, which applies to IT teams too, calls for new ways to manage and monitor servers, devices, and applications.
IT teams need new ways to monitor and manage servers remotely. For example, system administrators must be able to reboot, restart, and access a server if something goes wrong, without traveling to the server's physical location.
As a Nagios Partner, we are sharing this recently published article (in English), which will be useful to those who monitor their IT infrastructure with the Nagios platform as well as with ITmetro. If you have any questions, we are here to help, today more than ever, as we have been for more than 20 years.
We would also like to take this opportunity to let you know that we are preparing a webinar, Introduction to Nagios XI, which will be ready in the coming weeks and should be of great interest to the IT community.
With more than 20 years of experience, U&R specializes in consulting services and in developing software tools that help identify and resolve IT infrastructure problems before they affect critical business processes.
How to remotely monitor and manage servers and devices
Software is the most flexible method for remote management, which makes it an ideal place for IT teams to start. We always recommend using one, if not two, software-based VPN services for remote access. Here’s what we recommend for these services:
VPN Client 1:
OpenVPN is a popular client that can be easily configured to establish a connection on reboot and accept a wide variety of routing policies on a per-client basis. This method relies on a DNS/IP-based server connection (i.e., a direct tunnel). It is ideal for common remote tasks on the virtual machine.
VPN Client 2:
Establishing an OpenVPN connection (or another primary VPN) is a great first step, but these connections can be unstable and may drop. We recommend adding another VPN client for redundancy. LogMeIn’s Hamachi service is a useful secondary option. Hamachi establishes a secure link using its own IP network, and it can be installed as a service that runs at startup. Deploying this secure, cloud-based, third-party service is important, because the secondary path will likely be necessary at times.
Begin setting up the VPN service(s) with any established protocols that your organization uses, keeping any regulatory compliance restrictions in mind. Some software-based VPN clients require RSA or two-factor authentication.
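The redundancy idea above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of OpenVPN or Hamachi: the labels, the 192.0.2.x addresses, the port, and the function names are all hypothetical, and it assumes the server exposes a TCP service (such as SSH) on each VPN path.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

# (label, host, port) for each way of reaching the same server.
Endpoint = Tuple[str, str, int]


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_working_path(
    paths: Sequence[Endpoint],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> Optional[str]:
    """Try each access path in priority order; return the first label that answers."""
    for label, host, port in paths:
        if probe(host, port):
            return label
    return None  # every path is down; move on to the hardware options


# Hypothetical addresses of one server as seen over each VPN.
PATHS = [
    ("primary-vpn", "192.0.2.10", 22),
    ("secondary-vpn", "192.0.2.20", 22),
]
```

Calling first_working_path(PATHS) returns "primary-vpn" while the main tunnel is healthy and falls back to "secondary-vpn" when it is not. The probe argument is injectable so the logic can be exercised without touching the network.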
Learn more about our Nagios Administration service here
In addition to setting up VPN clients, there are a variety of other ways to remotely manage your servers and devices. Some of these options require on-site access to set up. If you can request access to your office locations, these options, paired with VPN services, are effective solutions for remote management.
Second Virtual Machine:
If you have the physical resources to do so, consider adding another virtual machine on the host to troubleshoot the mission-critical virtual machine remotely. Add all of the local tools you will need, such as SSH and FTP clients, a remote desktop manager, a VNC client, and third-party firmware utility managers. Think of this virtual machine as your on-site toolbox. Add VPN clients 1 and 2 to this toolbox virtual machine, too.
Backup Physical Server:
Add a second virtual machine host, even if it has reduced physical capacity. Keep this host idle. Spin up a toolbox virtual machine on this host. If you’re running Microsoft Hyper-V, skip this step and install VPN clients 1 and 2 right on the host’s operating system. If space, energy, budget, or scale are issues, consider instead adding a small thin workstation, such as an Intel NUC, or another simple workstation that still gives you a physical presence apart from the primary virtual machine host.
Managed PoE Switch:
If you have Power over Ethernet (PoE)-dependent devices that benefit from a port bounce, add a simple managed switch. Doing so allows you to dial into the switch and remotely disable and re-enable the port of any device that freezes, crashes, or performs abnormally. The switch also gives you another IP address that you can monitor and use in troubleshooting to determine when a fault occurs.
4G LTE Modem / Secondary WAN:
When adding a second form of physical media isn’t possible, add a fail-over WAN. Establishing a Cradlepoint or Digi Transport-based modem at the head-end of the network will offer you more information and remote abilities. These modems often have cloud-based remote management suites that allow administrators to console in and perform remote tasks. Those tasks could be as simple as pinging or accessing an SSH terminal for internal hosts or devices, which offers an additional entry method. In addition to providing a redundant WAN link, these modems also support yet another VPN client interface.
Internet of Things (IoT) Switch/Relay:
A remote power outlet, which you will often see in data centers as metered Power Distribution Units (PDU), will be necessary at some point. Similar to the managed network switch, bouncing outlets remotely can save hours and, in some cases, days of downtime. Some network-controlled power switching units offer a “watchdog” service. These watchdog services allow administrators to define an IP address for the unit to constantly ping. If that ping times out, it can power cycle an outlet (typically a modem or router). This outlet then doubles as another IP address you can ping and monitor as a performance metric to narrow down the root cause of an outage.
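The watchdog behavior described above can be sketched as a small loop. This is a minimal illustration, not the firmware of any particular power switch: ping and power_cycle are hypothetical callables you would supply, and requiring several consecutive failures before acting is an added assumption meant to avoid cycling the outlet on a single lost packet.

```python
import time
from typing import Callable, Optional


def run_watchdog(
    ping: Callable[[], bool],
    power_cycle: Callable[[], None],
    max_failures: int = 3,
    interval_s: float = 10.0,
    cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Ping a target repeatedly and power cycle an outlet when it stops answering.

    ping        -- hypothetical callable, True when the watched IP answers
    power_cycle -- hypothetical callable that bounces the outlet
    cycles      -- number of checks to run (None means run forever)
    Returns how many times the outlet was power cycled.
    """
    failures = 0
    resets = 0
    checks = 0
    while cycles is None or checks < cycles:
        checks += 1
        if ping():
            failures = 0  # any successful ping clears the counter
        else:
            failures += 1
            if failures >= max_failures:
                power_cycle()
                resets += 1
                failures = 0  # give the device a fresh start after the reset
        sleep(interval_s)
    return resets
```

The number of resets is worth recording as well: a modem that needs a power cycle several times a day points to a different root cause than one that needs it once a year.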
4G LTE Console/Relay:
A final remote management option is to have a completely separate 4G LTE-based relay or serial-command-based device as an additional connection option.
When monitoring servers and devices remotely, it’s important to not only build multiple paths to the remote system but also to keep every method documented and top-of-mind when troubleshooting. By adding all of these additional layers of accessibility and management, each layer can not only be used to solve an issue, but can provide additional insight into what may be down or having issues. For example, if you are unable to access more than one physical or virtual device, try accessing the next device in the topology that this device may rely on for connectivity. Troubleshooting with this parent/host tool set will allow you to narrow down what the issue may be, which will result in faster resolution time.
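The parent/host approach above can be expressed as a walk up the dependency chain. The sketch below is illustrative only: the device names and the is_up callable are hypothetical, and it assumes each device depends on at most one parent for connectivity.

```python
from typing import Callable, Dict, List, Optional


def path_to_root(device: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain [device, parent, grandparent, ...] up to the network edge."""
    path = [device]
    seen = {device}
    while parents.get(path[-1]) is not None:
        nxt = parents[path[-1]]
        if nxt in seen:
            raise ValueError(f"dependency loop at {nxt!r}")
        path.append(nxt)
        seen.add(nxt)
    return path


def likely_fault(
    device: str,
    parents: Dict[str, Optional[str]],
    is_up: Callable[[str], bool],
) -> Optional[str]:
    """Walk upward from an unreachable device until something answers.

    Returns the last unreachable node before the first reachable one,
    which is the most likely place to start troubleshooting, or None
    if the device itself is reachable.
    """
    fault = None
    for node in path_to_root(device, parents):
        if is_up(node):
            break
        fault = node
    return fault


# Hypothetical topology: each device maps to the device it relies on.
TOPOLOGY = {
    "camera": "poe-switch",
    "poe-switch": "router",
    "router": "modem",
    "modem": None,
}
```

With this topology, if the camera and the switch do not answer but the router does, likely_fault("camera", TOPOLOGY, is_up) returns "poe-switch", so a port bounce on the camera would be wasted effort.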
How to use Nagios XI for proactive monitoring
Once you have the tools set up to manage your devices and servers remotely, it’s essential to monitor them to identify issues proactively. You can use Nagios XI to monitor all of the methods and nodes detailed above, which gives you full visibility of your network on one platform. Use it to gather information about how your servers and devices are performing, and to receive alerts when something goes awry.
The more things you set up to monitor in Nagios XI, the quicker and easier it will be to diagnose problems that occur. Plus, the more devices that you make accessible in a dashboard-level view, the more effective and efficient your troubleshooting powers are when physical access to the office space is limited.
In addition to monitoring important components of your IT infrastructure, you can use Nagios XI to monitor the health of your physical office space. This monitoring could include setting up alerts when the temperature in the server room exceeds a certain level, placing spill indicators in break rooms to detect water leakages, or monitoring security systems for any unauthorized door entry. This gives you the ability to keep tabs on your offices without requiring an employee to go to them physically.
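As an illustration of the server room temperature alert, here is a minimal sketch of the decision logic for a custom check written in the style of a Nagios plugin. It relies on the standard plugin conventions (return codes 0 to 3 for OK, WARNING, CRITICAL, and UNKNOWN, and optional performance data after a pipe character). How the reading is obtained is not covered here: the temp_c value, the thresholds, and the function name are assumptions for the example.

```python
from typing import Optional, Tuple

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_temperature(
    temp_c: Optional[float], warn_c: float, crit_c: float
) -> Tuple[int, str]:
    """Map a temperature reading to a plugin return code and status line.

    temp_c is the reading from a hypothetical sensor (None if it could not
    be read); warn_c and crit_c are the alert thresholds in Celsius.
    """
    if temp_c is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading available"
    if temp_c > crit_c:
        code = CRITICAL
    elif temp_c > warn_c:
        code = WARNING
    else:
        code = OK
    # Text before the pipe is for people; text after it is performance data.
    message = (
        f"TEMP {LABELS[code]} - server room at {temp_c:.1f} C"
        f" | temp={temp_c:.1f};{warn_c:.1f};{crit_c:.1f}"
    )
    return code, message
```

A wrapper script would print the message and exit with the returned code, so that the monitoring server can raise the alert. The same pattern applies to leak sensors or door contacts, with the threshold comparison replaced by an open or closed state.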