Observability Engineer · DevOps · Platform Engineering
Observability and DevOps engineer with 10+ years of experience equipping engineering teams with production-grade telemetry, building observability platforms from scratch, and developing custom incident response tooling. Expert in Grafana, Datadog, and OpenTelemetry, with a consistent track record of turning black-box troubleshooting into automated, proactive operations and of building the purpose-built tools that make it possible.
Datadog · Grafana · OpenTelemetry · OpsGenie · Kubernetes · Argo CD · Next.js · Vercel · Okta · Slack API · Statuspage.io · Incident Management · SRE · AWS
Grafana · Grafana OnCall · Datadog · Prometheus · InfluxDB · ClickHouse · BigQuery · Databricks · SQL · AWS · Zendesk · Workspace ONE · Looker · IoT
Grafana · Zabbix · C# / .NET · SQL Express · Cloudflare · Active Directory · PHP · Google Ads · Adobe Suite · SEO
Custom Next.js web app built as an internal replacement for Statuspage.io after the client needed customizations the vendor couldn't provide. Integrates Okta SSO, OpsGenie, and the Slack API for fully configurable component health views and role-based stakeholder access. Deployed on Vercel.
Next.js · Vercel · Okta · OpsGenie · Slack API
Solo-built Slack bot that automates incident response coordination via the OpsGenie API. Auto-routes responders into dedicated incident channels by component, manages stakeholder communications, and posts live status updates directly to the Health Status Dashboard, eliminating manual triage and enforcing a consistent incident process.
Slack API · OpsGenie · AWS
Made the case for Grafana at an IoT workplace safety startup and built the entire observability stack solo, integrating 7 data sources: ClickHouse, Prometheus, InfluxDB, BigQuery, Databricks, an in-house API, and an inventory database. Wired Grafana OnCall to auto-generate fully contextualized Zendesk tickets, automating more than 80% of all support ticket creation.
Grafana · Grafana OnCall · ClickHouse · InfluxDB · BigQuery · Prometheus · IoT
Built a comprehensive internal monitoring platform from scratch for a 50-person law firm. Integrated Zabbix metrics, a custom keystroke logger (daily counts plus application and website tracking), a delta-based screenshot service (15-second interval, skips duplicates), login/logout event tracking, idle-time analysis, and server room temperature monitoring via web scraping, all surfaced in a Grafana dashboard with scheduled daily PDF reports to management.
Grafana · Zabbix · C# · SQL Express · Windows
Rebuilt the website for Gorayeb & Associates, a prominent Manhattan personal injury law firm, turning a slow, unoptimized site into a fast, well-ranked one. Migrated to a self-managed IONOS VM running PHP/HTML/JS, cutting page load time from 10+ seconds to under 2 seconds. Configured Cloudflare for DNS, image compression, and email security (DMARC, SPF, DKIM). Built conversion-focused landing pages that supported a $25K/month Google Ads account.
PHP · Cloudflare · Google Ads · SEO · Linux
4-node k3s cluster on Proxmox with a production-grade observability stack: Prometheus, Loki, Tempo, and Grafana. Browser RUM via Grafana Faro → Grafana Alloy → Tempo, enabling end-to-end distributed traces from browser click to server span. Exposed via Cloudflare Tunnel with zero open inbound ports. Full GitOps: Argo CD syncs from GitHub.
k3s · Proxmox · Argo CD · Cloudflare · OpenTelemetry · Tempo · Grafana Faro
Next.js 16 + React 19 portfolio with live Prometheus health checks, React Flow infra diagram, and an xterm.js interactive terminal. Deployed via GitOps: GitHub Actions → ghcr.io → Argo CD → k3s.
Next.js · React · Prometheus · Docker · Kubernetes