BDECP SRE Toil Elimination Dashboards

Capacity Management
Compute Footprint by CIO Areas: Snapshot of allocated compute resources by CIO and Business Function. Supports new workload placement and rationalization of incoming capacity requests.
Provisioned Compute Capacity by ESXi Cluster: Count of active VMs on each ESXi cluster, along with total CPU and memory provisioned as a percentage of the cluster's physical limit. Shows how guest workloads are balanced across the clusters.
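The "provisioned as a percentage of physical limit" figure can be illustrated with a minimal sketch. The field names, the cluster name, and the sample numbers below are hypothetical and are not taken from the dashboard's data source. Values above 100% indicate overcommitment.

```python
from dataclasses import dataclass


@dataclass
class ClusterCapacity:
    # All fields are illustrative assumptions, not the dashboard's schema.
    name: str
    active_vms: int
    provisioned_vcpus: int
    physical_cores: int
    provisioned_mem_gb: float
    physical_mem_gb: float


def provisioned_pct(provisioned: float, physical: float) -> float:
    """Return provisioned capacity as a percentage of the physical limit."""
    if physical <= 0:
        raise ValueError("physical limit must be positive")
    return round(100.0 * provisioned / physical, 1)


def summarize(cluster: ClusterCapacity) -> dict:
    """Build one summary row per ESXi cluster."""
    return {
        "cluster": cluster.name,
        "active_vms": cluster.active_vms,
        "cpu_pct": provisioned_pct(cluster.provisioned_vcpus, cluster.physical_cores),
        "mem_pct": provisioned_pct(cluster.provisioned_mem_gb, cluster.physical_mem_gb),
    }


# Hypothetical sample: 384 vCPUs on 256 cores, 1536 GB on 2048 GB.
example = ClusterCapacity("esxi-cluster-a", 42, 384, 256, 1536.0, 2048.0)
print(summarize(example))
```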
Platform Administration
Patch Window Distribution by Node Role: Monthly patch window assignment by K8s cluster and node role. Maintains line of sight to a uniform distribution of servers across availability groups, which supports cluster and workload resiliency. Any data appearing in the "Needs Attention" column must be addressed quickly.
Individual VM Mapping to ESXi Clusters: A searchable list of all BDECP VMs with their ESXi host cluster and vCenter assignments. Helps locate VMs quickly during operational and support activities.
K8s Node Role Distribution by ESXi Cluster: Count of K8s nodes of a given role within a BDECP cluster that reside on each ESXi cluster. Helps detect and avoid a high concentration of specialized K8s nodes on a single ESXi cluster, for risk mitigation.
NFS cross-reference for PVs & Namespaces: Searchable listing of NFS PV mount paths along with cluster, Namespace & PVC. Allows rapid identification of Isilon dependencies and of the impact of maintenance and incidents on environments and tenant workloads.
ESS cross-reference for PVs & Namespaces: Searchable listing of ESS PV mount paths along with cluster, Namespace & PVC. Allows rapid identification of Isilon dependencies and of the impact of maintenance and incidents on environments and tenant workloads.
SSL Certificate Lifecycle Management: Summary view of SSL certificate count by certificate type and expiration date. Facilitates proactive management of certificate expiration.
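A summary of certificate counts by type and expiration, as described for the SSL Certificate Lifecycle Management view, could be produced along these lines. This is a rough sketch only: the bucket thresholds (30 and 90 days), the certificate type labels, and the sample dates are assumptions made for illustration and do not come from the dashboard.

```python
from collections import Counter
from datetime import date


def expiry_bucket(expires: date, today: date) -> str:
    """Classify a certificate by days remaining (thresholds are assumed)."""
    days = (expires - today).days
    if days < 0:
        return "expired"
    if days <= 30:
        return "0-30 days"
    if days <= 90:
        return "31-90 days"
    return ">90 days"


def summarize_certs(certs, today: date) -> Counter:
    """Count certificates per (type, expiry bucket) pair."""
    return Counter((cert_type, expiry_bucket(exp, today)) for cert_type, exp in certs)


# Hypothetical inventory of (certificate type, expiration date) pairs.
today = date(2024, 1, 1)
certs = [
    ("ingress", date(2024, 1, 15)),
    ("ingress", date(2024, 1, 20)),
    ("internal", date(2023, 12, 25)),
    ("internal", date(2024, 6, 1)),
]
summary = summarize_certs(certs, today)
for (cert_type, bucket), count in sorted(summary.items()):
    print(f"{cert_type:10s} {bucket:12s} {count}")
```

Sorting the soonest buckets to the top is one way to make the expiring certificates stand out for proactive renewal.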
Useful Links
BDECP ASMS: Application and Site Management System.
UCMDB Extracts: Power BI reports that help application teams find and update server-related data. Follow the UCMDB report access instructions to access the reports.
On-Call List: BDECP on-call contacts to reach in case of any platform changes.
Teams SharePoint: SharePoint site with details about the BDECP platform, office hours, and cluster information.
Azure DevOps: Contains all epics, features, and current tasks for the team.
Loadbalancer Configuration: Application Delivery Controller portal. Contains all the details of GM's GTM and LTM load balancers.
BC Report: Contains data related to application certification and expiration.
GPLD Group Administration: Contains information about AD groups, including their owners and members.