Observability Engineer
Iselin, New Jersey
Full Time
$150k - $180k
Our client is seeking a seasoned Observability Engineer to design, implement, and maintain enterprise-grade observability capabilities across their systems and services. This role plays a critical part in ensuring application reliability, performance, and operational excellence by building standardized, scalable observability frameworks on top of their OpenShift Kubernetes cluster.
The ideal candidate has deep expertise in observability platforms, strong operational skills, and the ability to partner with engineering, product, and operations teams to elevate monitoring, alerting, and telemetry maturity across the organization.
Key Responsibilities
Platform Setup & Engineering
- Design, implement, and maintain the enterprise observability platform on the client’s OpenShift Kubernetes (K8s) environment.
- Ensure all services, pipelines, and solutions are fully monitored with proper alerting, logging, tracing, and heartbeat checks.
- Build and maintain a centralized visualization layer for metrics, alerts, and KPIs.
- Enhance observability capabilities to support scale, durability, availability, and performance.
- Develop and maintain standardized observability service contracts for development and platform engineering teams.
- Integrate observability capabilities with enterprise tools, platforms, and operational systems.
- Support service engineers with observability design and assist product managers in defining meaningful observability KPIs.
- Ensure all processes and configurations comply with enterprise controls and governance.
- Build and maintain Infrastructure-as-Code (IaC) and deployment pipelines for the observability platform.
- Modernize operational processes and drive automation improvements across observability systems.
- Contribute documentation including testing procedures, training materials, operational guides, and software delivery artifacts.
- Strong understanding of setting up, configuring, and maintaining observability platforms.
- Expertise in:
  - Centralized visualization
  - Monitoring & heartbeat checks
  - Infrastructure monitoring
  - Application Performance Monitoring (APM)
  - Alerting systems
  - Logging and log aggregation
  - Distributed tracing
- Ability to standardize and enforce observability service contracts across engineering teams.
- Advanced hands-on experience with tools such as:
  - Dynatrace
  - New Relic
  - Datadog
  - ELK Stack
  - SigNoz
Additional Skills
- Kubernetes / Containers (OpenShift preferred)
- Argo CD
- Infrastructure as Code (IaC)
- General operational skills including Git, Ansible, Bash
- 7+ years of experience in engineering, DevOps, SRE, platform engineering, or observability-focused roles.
- Proven success implementing and scaling observability tools in large enterprise environments.
- Solid understanding of modern distributed systems, cloud-native architectures, and CI/CD pipelines.
- Strong communication skills and ability to collaborate across engineering, product, and operations teams.