A major telecommunications company operates a test laboratory where technology partners must test their solutions before they are integrated into production. The lab runs a vSphere and Cloud Director infrastructure on a mix of hardware, mainly HPE (Hewlett Packard Enterprise).
Thanks to Cloud Director, each partner has one or several tenants, and they run their applications as silos. vRealize Operations (now Aria Operations) was installed, but not regularly used or maintained.
In addition to monitoring and alerting for the infrastructure, the customer was interested in a new parameter of our present-day reality: power consumption. The lab is critical to testing the telco infrastructure, and partners consume a lot of resources but never release them afterwards. The customer had no clear visibility into resource and power utilization and, like everyone today, needed to cut power consumption drastically.
comdivision collaborated with the telco and VMware on-site teams to reorganize the operations in several steps:
First, we upgraded Operations to the latest version and installed the latest management packs for HPE One, Dell and others. Since this is a large environment with several vCenters, we transitioned to the new cloud proxy architecture, in which each proxy collects data for one vCenter and sends it, compressed, to Aria Operations. We also reconfigured the Operations cluster so that it could handle all the incoming metrics.
To start our power-saving journey, we first needed to establish the reclaimable resources, specifically the number of hosts that can be reclaimed per vSphere cluster. Of course, we did not reinvent the wheel: as part of the large VMware vExpert family, we were aware of the excellent work on cost calculation done by Brandon Gordon of VMware. After consulting with him, we installed a freely available solution called “Reclaimable Hosts Dashboard for vRealize Operations”. The package includes super metrics, views, and dashboards. For the first time, it allowed the customer to view cluster utilization and underused resources.
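To make the idea of "reclaimable hosts per cluster" concrete, here is a minimal Python sketch of how such a number could be estimated. It is not the formula used by the dashboard package; the function name, the 20% headroom, and the single spare host for failover are all assumptions chosen for illustration.

```python
import math


def reclaimable_hosts(total_hosts, host_capacity_ghz, peak_demand_ghz,
                      headroom=0.2, spare_hosts=1):
    """Estimate how many hosts a cluster could give up.

    Hypothetical model (assumption, not the dashboard's logic):
    each host may only be filled up to (1 - headroom) of its CPU
    capacity, and spare_hosts are kept in reserve for failover.
    """
    usable_per_host = host_capacity_ghz * (1 - headroom)
    # Hosts required to carry the observed peak demand.
    needed = math.ceil(peak_demand_ghz / usable_per_host) + spare_hosts
    # Never report a negative number for an overcommitted cluster.
    return max(0, total_hosts - needed)


# Example: 10 hosts of 50 GHz each, peak demand of 100 GHz.
print(reclaimable_hosts(10, 50.0, 100.0))
```

In practice the same calculation would also have to hold for memory, and the smaller of the two results would be the number to trust.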
This led to the most challenging aspect of the project! Since most hosts are HPE blades, it only makes sense to turn them off if a complete enclosure can be powered down. We needed a dashboard with an easy-to-understand visualization of the reclaimable hosts per cluster and their locations in enclosures. Thanks to the HPE One integration and object relationships, we were able to create a highly visual heat map. Thus, we had one dashboard containing all the information the customer needs to regroup and turn off resources!
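The enclosure constraint can be sketched in a few lines of Python. This is only an illustration of the reasoning behind the heat map, not the dashboard's implementation: the input layout (tuples of host, cluster, enclosure), the function name, and the greedy "fullest enclosure first" order are assumptions, and a greedy pass does not guarantee the best possible selection.

```python
from collections import Counter, defaultdict


def enclosures_to_power_down(hosts, reclaimable):
    """Pick enclosures whose blades can all be reclaimed.

    hosts:       iterable of (host_name, cluster, enclosure) tuples
    reclaimable: dict mapping cluster -> number of reclaimable hosts
    Both structures are hypothetical stand-ins for the data that the
    monitoring platform provides.
    """
    # Count how many blades each cluster has in each enclosure.
    per_enclosure = defaultdict(Counter)
    for _host, cluster, enclosure in hosts:
        per_enclosure[enclosure][cluster] += 1

    budget = dict(reclaimable)  # remaining reclaimable hosts per cluster
    selected = []
    # Greedy heuristic: try the fullest enclosures first, names break ties.
    order = sorted(per_enclosure,
                   key=lambda e: (-sum(per_enclosure[e].values()), e))
    for enclosure in order:
        need = per_enclosure[enclosure]
        # An enclosure qualifies only if every cluster can spare
        # all of the blades it has in that enclosure.
        if all(budget.get(c, 0) >= n for c, n in need.items()):
            for c, n in need.items():
                budget[c] -= n
            selected.append(enclosure)
    return selected
```

An enclosure that holds even one blade from a cluster with no reclaimable capacity left is skipped, which is exactly why seeing reclaimable hosts and enclosure locations side by side matters.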