Anthony Ruiz · USA
I build systems that make data tell the story — marrying business intelligence with infrastructure telemetry so teams can stop reacting and start anticipating.
$ whoami
Observability Engineer · DevOps · Platform Engineering
I got into technology through curiosity and necessity — starting as a self-taught IT tech at a Manhattan law firm and growing into owning everything from network infrastructure to digital advertising to custom software. That scrappiness never left. Today I design and operate observability platforms that give engineering teams real signal in production: the kind of dashboards and alerting pipelines that mean your on-call engineer wakes up to a clear picture, not a wall of noise. If something needs building, I build it — I've replaced vendor products with purpose-built internal tools, made the case for better solutions and then delivered them solo, and integrated data sources across stacks that were never meant to talk to each other. I care about systems that are honest — metrics that reflect reality, runbooks that actually get used, and infrastructure that can explain itself. When I'm not building for clients, I'm running a self-hosted k3s homelab on Proxmox, shipping this portfolio site through a GitOps pipeline, and looking for ways to push observability deeper into every layer of the stack.
$ ls -la tools/

10+ years in production infrastructure
~80% ticket reduction at StrongArm via telemetry and Grafana OnCall
Built twice: made the case for better tooling, then delivered it solo, at two separate companies
$ cat resume.yaml
TekStream Solutions
Observability SRE embedded within the digital enablement org at a top-5 U.S. restaurant chain, enabling telemetry adoption (logs, traces, metrics) across ~30 engineering teams and hundreds of Kubernetes microservices. A primary POC for all Datadog, Grafana, OpenTelemetry, and OpsGenie troubleshooting across the sub-org. Sole engineer behind a custom Next.js Health Status Dashboard and Slack Incident Command bot — integrating OpsGenie, Okta, and the Slack API — replacing Statuspage.io with a purpose-built incident coordination platform.
StrongArm Technologies
Lead support engineer at an IoT workplace safety startup, responsible for all client troubleshooting, onboarding, and serving as the technical voice of the product on sales calls. Made the case for Grafana and single-handedly built the company's entire observability stack from scratch, integrating ClickHouse SQL, in-house APIs, and inventory databases into a unified operational picture. Wired Grafana OnCall to auto-generate Zendesk tickets with full client context, eliminating black-box troubleshooting; more than 80% of all support tickets were then created automatically.
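As a rough illustration of the alert-to-ticket step described above, here is a minimal sketch, not the production integration. The alert and client field names (`title`, `severity`, `devices`, `name`, `site`) and the `build_ticket` helper are assumptions made for this example. The output follows the general shape of a Zendesk ticket-creation body (`ticket` with `subject`, `comment.body`, and `tags`), and no network call is made.

```python
from typing import Any


def build_ticket(alert: dict[str, Any], client: dict[str, Any]) -> dict[str, Any]:
    """Merge a (hypothetical) alert and client record into a ticket payload."""
    severity = alert.get("severity", "unknown")
    devices = ", ".join(alert.get("devices", [])) or "none listed"

    # Put the client context in the ticket body so the support engineer
    # does not have to look it up separately.
    body_lines = [
        f"Alert: {alert['title']}",
        f"Severity: {severity}",
        f"Client: {client['name']} (site: {client.get('site', 'n/a')})",
        f"Devices affected: {devices}",
    ]

    return {
        "ticket": {
            "subject": f"[{client['name']}] {alert['title']}",
            "comment": {"body": "\n".join(body_lines)},
            "tags": ["auto-generated", severity],
        }
    }
```

The point of the design is that the ticket arrives already carrying who is affected and what is affected, which is what removes the black-box step from troubleshooting.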
PFR IT Consulting Co
Grew from part-time IT technician to the sole technical, web, and marketing owner of a prominent 50-person Manhattan law firm. Shortly after joining, identified $10K/month in wasted Google Ads spend and took over full management of their $25K/month account. Rebuilt the website from scratch (page load cut from 10+ seconds to under 2 seconds), administered the Windows AD network, and independently built a suite of custom C# tools and a Grafana-backed monitoring platform covering security, productivity, and system health.
$ ls -la ~/projects
project_03
Custom Next.js web app built as an internal replacement for Statuspage.io after the client needed customizations the vendor couldn't provide. Integrates Okta SSO, OpsGenie, and the Slack API for fully configurable component health views and role-based stakeholder access. Deployed on Vercel.
project_04
Solo-built Slack bot that automates incident response coordination via the OpsGenie API. It auto-routes responders into dedicated incident channels by component, manages stakeholder communications, and posts live status updates directly to the Health Status Dashboard, eliminating manual triage and enforcing a consistent incident process.
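The component-based routing can be sketched roughly as follows. This is an illustration under stated assumptions, not the bot's actual code. The routing table, the responder names, the `inc-<id>-<component>` channel naming scheme, and the `plan_incident` helper are all hypothetical. The real bot would call the Slack and OpsGenie APIs where this sketch simply returns a plan.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: component slug -> responders to invite.
ROUTES = {
    "payments": ["alice", "bob"],
    "ordering": ["carol"],
}
# Hypothetical fallback when a component has no explicit route.
DEFAULT_RESPONDERS = ["oncall-primary"]


@dataclass
class IncidentPlan:
    channel: str
    responders: list[str] = field(default_factory=list)


def plan_incident(incident_id: str, component: str) -> IncidentPlan:
    """Pick a dedicated channel name and responder list for one incident."""
    # Normalize the component name into a channel-safe slug.
    slug = component.strip().lower().replace(" ", "-")
    responders = ROUTES.get(slug, DEFAULT_RESPONDERS)
    # Copy the list so callers cannot mutate the routing table.
    return IncidentPlan(channel=f"inc-{incident_id}-{slug}", responders=list(responders))
```

For example, `plan_incident("1042", "Payments")` yields the channel `inc-1042-payments` with `alice` and `bob` as responders, while an unmapped component falls back to the default on-call responder.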
project_05
Built a comprehensive internal monitoring platform from scratch for a 50-person law firm. Integrated Zabbix metrics, a custom keystroke logger (daily counts plus application and website tracking), a delta-based screenshot service (15-second interval, skips duplicates), login/logout event tracking, idle time analysis, and server room temperature monitoring via web scraping, all surfaced in a Grafana dashboard with scheduled daily PDF reports to management.
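The "skips duplicates" behavior of the screenshot service can be illustrated with a small sketch. The original was a C# tool; this is a Python stand-in, and the `DeltaRecorder` class is a hypothetical name. It assumes exact-match comparison by SHA-256 digest against the previous frame only. The real service may well have used a different comparison, such as a pixel-difference threshold.

```python
import hashlib
from typing import Optional


class DeltaRecorder:
    """Keep a frame only when it differs from the previously offered frame."""

    def __init__(self) -> None:
        self._last_digest: Optional[str] = None
        self.saved: list[bytes] = []

    def offer(self, frame: bytes) -> bool:
        """Return True if the frame was stored, False if it was a duplicate."""
        digest = hashlib.sha256(frame).hexdigest()
        if digest == self._last_digest:
            # Screen unchanged since the last capture: skip it.
            return False
        self._last_digest = digest
        self.saved.append(frame)
        return True
```

A capture loop would call `offer()` once per interval (every 15 seconds in the description above), so an idle screen costs one stored image rather than one per tick.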
project_06
Designed and built a multi-component C# application to solve a recurring operational problem: clients waiting unnoticed while paralegals claimed they were never notified. Built a receptionist UI to log arrivals, a background Windows service on each paralegal's workstation that surfaced real-time alerts requiring acknowledgment, and an office manager dashboard showing live wait times and acknowledgment timestamps — all backed by a SQL Express database.
project_07
Rebuilt the website for Gorayeb & Associates — a prominent Manhattan personal injury law firm — from a slow, unoptimized site to a fast, well-ranked one. Migrated to a self-managed Ionos VM running PHP/HTML/JS, cutting page load from 10+ seconds to under 2 seconds. Configured Cloudflare for DNS, image compression, and email security (DMARC, SPF, DKIM). Built conversion-focused landing pages that supported a $25K/month Google Ads account.
$ grafana-cli dashboard import
End-to-end telemetry: metrics, traces, and incidents wired together · live
Nodes Ready: 4/4 (all healthy)
Pods Running: 66 (3 pending)
Restarts (1h): 12 (check logs)
Avg CPU: 3% (normal)
up metric · 30-day uptime
$ pvesh get /nodes
Proxmox homelab: all workloads on one physical host · node metrics live
$ ssh anthony@homelab
Interactive shell — try help
$ ping anthony
Open to interesting infrastructure challenges, consulting, or just talking shop about homelabs and platform engineering.