Observability Engineer · DevOps · Platform Engineering
Observability and DevOps engineer with 10+ years of experience equipping engineering teams with production-grade telemetry, building observability platforms from scratch, and engineering custom incident response tooling. Expert in Grafana, Datadog, and OpenTelemetry, with a consistent track record of turning black-box troubleshooting into automated, proactive operations and building the purpose-built tools that make it possible.
Datadog · Grafana · OpenTelemetry · OpsGenie · Kubernetes · Argo CD · Next.js · Vercel · Okta · Slack API · Statuspage.io · Incident Management · SRE · AWS
Grafana · Grafana OnCall · Datadog · Prometheus · InfluxDB · ClickHouse · BigQuery · Databricks · SQL · AWS · Zendesk · Workspace ONE · Looker · IoT
Grafana · Zabbix · C# / .NET · SQL Express · Cloudflare · Active Directory · PHP · Google Ads · Adobe Suite · SEO
4-node k3s cluster on Proxmox. Hosts personal projects and services via Cloudflare Tunnel with zero open inbound ports. Full GitOps: Argo CD syncs from GitHub, and OpenTelemetry plus kube-state-metrics (KSM) ship metrics to both Datadog and Grafana.
k3s · Proxmox · Argo CD · Cloudflare · OpenTelemetry
Next.js 16 + React 19 portfolio with live Prometheus health checks, React Flow infra diagram, and an xterm.js interactive terminal. Deployed via GitOps: GitHub Actions → ghcr.io → Argo CD → k3s.
Next.js · React · Prometheus · Docker · Kubernetes
Custom Next.js web app built as an internal replacement for Statuspage.io after the client needed customizations the vendor couldn't provide. Integrates Okta SSO, OpsGenie, and the Slack API for fully configurable component health views and role-based stakeholder access. Deployed on Vercel.
Next.js · Vercel · Okta · OpsGenie · Slack API
Solo-built Slack bot that automates incident response coordination via the OpsGenie API. Auto-routes responders into dedicated incident channels by component, manages stakeholder communications, and posts live status updates directly to the Health Status Dashboard, eliminating manual triage and enforcing a consistent incident process.
Slack API · OpsGenie · AWS
Built a comprehensive internal monitoring platform from scratch for a 50-person law firm. Integrated Zabbix metrics, a custom keystroke logger (daily counts plus application and website tracking), a delta-based screenshot service (15s interval, skips duplicates), login/logout event tracking, idle time analysis, and server room temperature monitoring via web scraping. All data was surfaced in a Grafana dashboard, with scheduled daily PDF reports sent to management.
Grafana · Zabbix · C# · SQL Express · Windows
Designed and built a multi-component C# application to solve a recurring operational problem: clients waiting unnoticed while paralegals claimed they were never notified. Built a receptionist UI to log arrivals, a background Windows service on each paralegal's workstation that surfaced real-time alerts requiring acknowledgment, and an office manager dashboard showing live wait times and acknowledgment timestamps — all backed by a SQL Express database.
C# · .NET · SQL Express · Windows
Rebuilt the website for Gorayeb & Associates, a prominent Manhattan personal injury law firm, turning a slow, unoptimized site into a fast, well-ranked one. Migrated to a self-managed Ionos VM running PHP/HTML/JS, cutting page load time from 10+ seconds to under 2 seconds. Configured Cloudflare for DNS, image compression, and email security (DMARC, SPF, DKIM). Built conversion-focused landing pages that supported a $25K/month Google Ads account.
PHP · Cloudflare · Google Ads · SEO · Linux