April 16, 2026 · 9 min read · Engineering / IT

Building a Staff Monitoring & Productivity Platform for a 50-Person Law Firm

From Zabbix metrics to keystroke logs to Grafana dashboards — all from scratch

How I built a comprehensive internal monitoring platform integrating Zabbix, a custom C# keystroke logger, delta-based screenshot capture, idle time analysis, a client arrival notification system, and server room telemetry — all surfaced in Grafana with automated weekly PDF reports.

TL;DR

A 50-person law firm needed visibility into staff productivity and IT health that no off-the-shelf tool could provide. I built the whole stack from scratch — metrics, telemetry agents, a client arrival notification system, dashboards, and automated reporting.

The firm handled multimillion-dollar cases, which made visibility into staff activity a critical need, both operationally and from a security standpoint. We had deployed a third-party monitoring tool that tracked websites visited, logged keystroke counts, and recorded employee screens. It was expensive and unreliable, and vendor support was no help. I had super admin privileges in a Windows Active Directory environment and gradually realized I could build something more dependable and cost-effective myself. The result is a stack I built end to end: Zabbix for metrics collection, a C# keystroke logger, a delta-based screenshot service, idle time analysis, a client arrival notification system, and Grafana dashboards, all running on a shoestring budget with no off-the-shelf dependencies beyond Zabbix and Grafana.


What the Platform Had to Do

  • Track login/logout and lunch-break events per workstation via timeclock system integration
  • Measure idle time and flag inactive sessions
  • Count daily keystrokes per user with per-application and per-website breakdown
  • Capture delta-based screenshots at 15-second intervals, skipped when no meaningful screen change occurred
  • Alert staff in real time when a client arrived at the front desk, with an escalation path to the office manager
  • Monitor server room temperature via UPS admin page scraping
  • Surface everything in Grafana with weekly PDF reports — per-paralegal breakdowns and a cross-staff comparison — emailed automatically to management

Architecture

  • Metrics backbone: Zabbix, with an agent on each workstation and custom items for custom telemetry
  • Keystroke logger: custom C# Scheduled Task, daily counts plus per-application/website breakdown
  • Screenshot service: custom C# Windows service, perceptual hash deduplication, 15s capture interval
  • Idle time: C# Generic Host via registry Run key, P/Invokes GetLastInputInfo, pushes to a Zabbix trapper item
  • Timeclock integration: direct SQL Express query to the existing timeclock DB, login/logout events surfaced in Grafana
  • Client arrival system: C# receptionist app + SQL Express event store + per-workstation polling service → modal Windows alert
  • Server room monitoring: Python web scrape of the local UPS admin page → Zabbix → Grafana alert on threshold breach
  • Visualization: Grafana, with office overview, per-user, and per-machine dashboards
  • Reporting: Linux VM cron job, Ghostscript renders SQL query results to PDF, emailed weekly with per-paralegal breakdowns and a cross-staff comparison

All telemetry flows through Zabbix or SQL Express. Grafana and Ghostscript sit at the output layer, and no data ever touches a third-party SaaS.

The Keystroke Logger

The keystroke logger is a C# application that hooks into the Windows low-level keyboard input event system to count keypresses in the active session. The goal was never to capture content, only to measure throughput and application context. Every keystroke event increments a counter for the current foreground window title, giving a per-application and per-website breakdown of daily typing activity. The counts are pushed to Zabbix via Zabbix Sender, and the daily totals are kept in SQL Express, from where they flow into the per-user Grafana dashboard.

Counts, not content

The deliberate design choice was to capture keystroke *counts* and the active application or browser tab title — never the actual key content. This was both a legal requirement and a trust decision from the outset. Staff knew monitoring was in place; capturing what they typed was never on the table.

My first rollout taught me an important lesson. I had the logger send a count to Zabbix on every single keystroke event. Even with only a few machines enrolled, I was seeing visible lag on the workstations. The logger was running on a single synchronous thread and the per-keystroke network calls were adding up fast. I re-architected it to aggregate counts in memory and flush them on a randomized interval — randomized to avoid a thundering-herd effect when 50 machines all try to phone home at the same second.
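The aggregate-then-flush pattern described above can be sketched as follows. This is a Python illustration rather than the original C# code; the class name, method names, and the 30 to 90 second flush window are assumptions made for the example.

```python
import random
from collections import Counter


class KeystrokeAggregator:
    """Counts keypresses per window title in memory; never stores key content."""

    def __init__(self, min_flush_s=30.0, max_flush_s=90.0, rng=None):
        # The 30-90 s window is an assumed value, not taken from the article.
        self.min_flush_s = min_flush_s
        self.max_flush_s = max_flush_s
        self._rng = rng or random.Random()
        self._counts = Counter()

    def record(self, window_title):
        """Called once per key-down event with the current foreground window title."""
        self._counts[window_title] += 1

    def next_flush_delay(self):
        """Random delay so that many workstations do not report in the same second."""
        return self._rng.uniform(self.min_flush_s, self.max_flush_s)

    def flush(self):
        """Return the accumulated counts and reset, ready for one batched send."""
        batch = dict(self._counts)
        self._counts.clear()
        return batch


agg = KeystrokeAggregator(rng=random.Random(1))
for title in ["Word - brief.docx", "Word - brief.docx", "Outlook"]:
    agg.record(title)
batch = agg.flush()  # one network send would carry this whole batch
```

The hook callback only touches an in-memory counter, so the cost of a keypress no longer includes a network round trip; the send happens once per flush.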

There was also a fundamental Windows constraint to work around: Services run in Session 0, which is isolated from the interactive user session. Low-level keyboard hooks and foreground window detection both require access to the interactive session, so running as a Windows Service was architecturally incompatible. Instead, the logger runs as a Windows Scheduled Task triggered at logon, placing it in the user's own session where it has full access to keyboard hooks and window focus events.

Delta-Based Screenshot Service

The screenshot service is a C# Windows service built on .NET 8's `BackgroundService` and `PeriodicTimer`. Every 15 seconds it fires a GDI32 `BitBlt` capture of the full desktop — GetDesktopWindow → GetWindowDC → CreateCompatibleBitmap → BitBlt — then immediately runs a perceptual hash comparison against the previous frame before deciding whether to keep it.

Perceptual Hash Deduplication

The hash algorithm scales every captured frame down to a fixed 1024×1024 canvas using high-quality bicubic interpolation, then computes a brightness value for every pixel using the ITU-R BT.601 luminance formula (0.30R + 0.59G + 0.11B) via unsafe pointer arithmetic over the raw LockBits scan buffer. Each pixel is thresholded to a single bit, producing a 1,048,576-element binary hash. The current hash is compared against the previous frame's hash by counting matching pixel positions. If the match count exceeds the configured threshold, the file is deleted — nothing happened on screen worth keeping.

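  • Low sensitivity: ~522,500 match threshold. A frame is skipped once roughly 50% of its bits match the previous frame, so only large screen changes are saved.
  • Medium sensitivity: ~1,045,000 (default). A frame is skipped only when about 99.7% of its bits match the previous frame.
  • High sensitivity: ~2,090,000. This exceeds the 1,048,576 bits in the hash, so under the comparison rule above the match count can never reach it and every frame is saved.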

The hash is held in memory between captures, so the first screenshot of each session is always saved regardless of threshold. In practice, the medium threshold eliminated the vast majority of idle-desktop and static-document captures while still catching every meaningful screen change — application switches, document edits, browser navigation.
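A minimal sketch of the threshold-and-compare step, written in Python over a plain list of RGB tuples rather than the original unsafe C# over a LockBits buffer. The brightness cutoff of 128 is an assumption (the cutoff is not stated here), and the function names are hypothetical.

```python
TOTAL_BITS = 1024 * 1024  # one bit per pixel of the 1024x1024 canvas


def luminance_bits(pixels, cutoff=128):
    """Threshold each (R, G, B) pixel to one bit using the 0.30/0.59/0.11 weights."""
    return [1 if 0.30 * r + 0.59 * g + 0.11 * b >= cutoff else 0
            for r, g, b in pixels]


def matching_bits(current, previous):
    """Number of positions where the two binary hashes agree."""
    return sum(1 for a, b in zip(current, previous) if a == b)


def should_discard(current, previous, threshold):
    """True when the frame is similar enough to the previous one to be dropped."""
    if previous is None:  # first frame of a session is always kept
        return False
    return matching_bits(current, previous) > threshold


# Tiny 16-pixel frames stand in for the full canvas.
black, white = (0, 0, 0), (255, 255, 255)
bits_a = luminance_bits([black] * 16)
bits_b = luminance_bits([black] * 15 + [white])      # one pixel changed
bits_c = luminance_bits([white] * 8 + [black] * 8)   # half the screen changed
```

With a threshold of 14 out of 16 bits, `bits_b` is discarded against `bits_a` (15 matches) while `bits_c` is kept (8 matches). On the full canvas, the medium threshold of 1,045,000 corresponds to 1,045,000 / 1,048,576, or about 99.7% of bits.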

Storage Layout and Ultra-Wide Support

Frames that pass the hash check are saved as JPEG thumbnails under {OutputDirectory}/Users/{WindowsUsername}/MM-dd-yy/, with filenames encoding the capture time as hmm tt.ss (e.g., 930am.05.jpeg = 9:30:05 AM). The service auto-detects the desktop's actual pixel dimensions at capture time and derives thumbnail width proportionally — outputWidth = round(ImageHeight × (screenWidth / screenHeight)) — so thumbnails are geometrically correct across standard 16:9, ultra-wide 21:9, and dual-monitor 32:9 configurations without any manual configuration.
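The path layout and the width formula can be illustrated with a short Python sketch. The thumbnail height of 360, the `/captures` directory, and the `jdoe` username are made-up values for the example, and the function names are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath


def thumbnail_width(image_height, screen_width, screen_height):
    """outputWidth = round(ImageHeight x (screenWidth / screenHeight))."""
    return round(image_height * (screen_width / screen_height))


def frame_path(output_dir, username, ts):
    """Build {OutputDirectory}/Users/{user}/MM-dd-yy/<hmm tt.ss>.jpeg for a capture time."""
    hour12 = ts.hour % 12 or 12              # 12-hour clock, no leading zero
    meridiem = "am" if ts.hour < 12 else "pm"
    name = f"{hour12}{ts:%M}{meridiem}.{ts:%S}.jpeg"
    return PurePosixPath(output_dir) / "Users" / username / ts.strftime("%m-%d-%y") / name


path = frame_path("/captures", "jdoe", datetime(2026, 4, 16, 9, 30, 5))
widths = {
    "16:9": thumbnail_width(360, 1920, 1080),
    "21:9": thumbnail_width(360, 3440, 1440),
    "32:9": thumbnail_width(360, 3840, 1080),
}
```

For a 360-pixel-high thumbnail this gives widths of 640, 860, and 1280 pixels for the three layouts, so the aspect ratio of the source desktop is preserved in each case.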

Idle Time Analysis

Idle time detection runs as a separate component — IdleTimeWatcher — a .NET 8 Generic Host process that runs in the logged-on user's session. It P/Invokes `GetLastInputInfo` from user32.dll to measure combined keyboard and mouse idle time, then pushes that value (in seconds) on a randomized 2–10 second interval. The randomized interval avoids thundering-herd load on the Zabbix server across 50 workstations firing in lockstep. I wrote a full breakdown of this component in Idle Time Network Monitoring with Zabbix.
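`GetLastInputInfo` reports the tick count (in milliseconds) of the last input event, so idle time is the difference between the current tick count and that value. The Python sketch below shows only that arithmetic and the randomized delay; it does not call the Windows API, and the function names are hypothetical. The modulo handles the 32-bit tick counter wrapping around, which is a detail assumed here rather than taken from the original component.

```python
import random

TICK_MODULUS = 2 ** 32  # GetTickCount and LASTINPUTINFO.dwTime are 32-bit millisecond counters


def idle_seconds(now_ticks_ms, last_input_ticks_ms):
    """Idle time in whole seconds, tolerant of the 32-bit tick counter wrapping."""
    return ((now_ticks_ms - last_input_ticks_ms) % TICK_MODULUS) // 1000


def next_push_delay(rng=random):
    """Random 2-10 second delay so workstations do not push in lockstep."""
    return rng.uniform(2.0, 10.0)


normal = idle_seconds(65_000, 5_000)               # 60 seconds since last input
wrapped = idle_seconds(1_000, 2 ** 32 - 4_000)     # counter wrapped between samples
```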

The process runs as a registry Run key entry (HKCU\Software\Microsoft\Windows\CurrentVersion\Run) rather than a Windows Service — GetLastInputInfo returns 0 when called from Session 0 (the service isolation context), so a service is architecturally incompatible with this approach. The Run key launches the process in the interactive user session at logon, where it has valid access to the input system.

Data flows to Zabbix via zabbix_sender.exe using a Zabbix trapper item keyed idletime. The sender identifies the host by Environment.MachineName, which maps 1:1 to the Zabbix registered host name. Two Zabbix triggers drive the attendance alerting: nodata(5m)=1 fires when no idle-time samples arrive for five minutes — interpreted as the user having logged out — and last()>1800 flags sessions that have been inactive for over 30 minutes while still connected. Both triggers feed a Zabbix Action that sends email to management with recovery messages on session resume.
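As a rough illustration, the sender invocation and the two trigger conditions can be expressed in Python. The server name `zabbix.example.internal` and host `WS-07` are placeholders, `socket.gethostname()` stands in for `Environment.MachineName`, and `attendance_state` is a hypothetical helper that only mimics what the Zabbix triggers evaluate on the server side.

```python
import socket


def zabbix_sender_args(server, value, host=None, key="idletime"):
    """Argument list for zabbix_sender: -z server, -s host, -k item key, -o value."""
    host = host or socket.gethostname()
    return ["zabbix_sender", "-z", server, "-s", host, "-k", key, "-o", str(value)]


def attendance_state(seconds_since_last_sample, last_idle_seconds):
    """Approximate the two triggers: nodata(5m)=1 and last()>1800."""
    if seconds_since_last_sample >= 300:
        return "logged_out"   # no samples for five minutes
    if last_idle_seconds > 1800:
        return "inactive"     # connected but idle for over 30 minutes
    return "active"


args = zabbix_sender_args("zabbix.example.internal", 42, host="WS-07")
```

The resulting list can be passed to `subprocess.run(args)` on a machine where `zabbix_sender` is installed.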

The same component also supports Prometheus Remote Write, pushing idle_time_seconds as a Gauge metric with job and instance labels. In the Grafana layer this time series sits alongside the keystroke count and login event streams on the per-user dashboard, giving a complete picture of a workday: when the session opened, how much typing activity occurred, and how long the workstation sat idle between bursts.

Timeclock System Integration

The firm already had a timeclock system in place for tracking when staff arrived and left. Rather than build a separate attendance tracking mechanism from scratch, I connected directly to the timeclock's SQL Express database and pulled login, logout, and lunch-break events as they were recorded. These events were surfaced in Grafana alongside the idle time and keystroke data, giving the office manager a complete picture of each workday: when someone arrived, how active they were at their desk, and when they left.

Client Arrival Notification System

The office manager had a recurring frustration: when a client arrived at the front desk and waited, staff would later claim they hadn't known the client was there. There was no reliable way to verify or disprove that. I built a system to close that gap entirely.

The receptionist side was a simple C# desktop application. When a client arrived, the receptionist selected the responsible paralegal or legal assistant from a radio button list and entered the client's name in a text field. On submit, the event was written to a SQL Express table with a timestamp.

On every enrolled workstation, a background service polled that SQL Express table every 10 seconds. When it found a new arrival event addressed to that machine's user, it immediately popped up a modal Windows dialog with the client's name and arrival time. The dialog was blocking — staff could not interact with any other application until they acknowledged it. Acknowledgment was logged back to the database with a timestamp.
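The write, poll, and acknowledge cycle can be sketched with an in-memory SQLite database standing in for SQL Express. The table name, column names, and sample values are all hypothetical; the real schema is not shown in this article.

```python
import sqlite3

SCHEMA = """
CREATE TABLE client_arrivals (
    id              INTEGER PRIMARY KEY,
    staff_user      TEXT NOT NULL,
    client_name     TEXT NOT NULL,
    arrived_at      TEXT NOT NULL,
    acknowledged_at TEXT
)
"""


def record_arrival(db, staff_user, client_name, arrived_at):
    """Receptionist side: store one arrival event and return its id."""
    cur = db.execute(
        "INSERT INTO client_arrivals (staff_user, client_name, arrived_at) VALUES (?, ?, ?)",
        (staff_user, client_name, arrived_at),
    )
    return cur.lastrowid


def poll_pending(db, staff_user):
    """Workstation side: unacknowledged arrivals addressed to this user."""
    return db.execute(
        "SELECT id, client_name, arrived_at FROM client_arrivals "
        "WHERE staff_user = ? AND acknowledged_at IS NULL ORDER BY arrived_at",
        (staff_user,),
    ).fetchall()


def acknowledge(db, arrival_id, acknowledged_at):
    """Log the moment the staff member dismissed the modal dialog."""
    db.execute(
        "UPDATE client_arrivals SET acknowledged_at = ? WHERE id = ?",
        (acknowledged_at, arrival_id),
    )


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
arrival_id = record_arrival(db, "jdoe", "A. Client", "2026-04-16T09:30:00")
pending_before = poll_pending(db, "jdoe")
acknowledge(db, arrival_id, "2026-04-16T09:31:10")
pending_after = poll_pending(db, "jdoe")
```

A workstation service would call `poll_pending` every 10 seconds and show the dialog for each returned row; filtering on `acknowledged_at IS NULL` keeps an already-acknowledged arrival from popping up again.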

The modal was blocking by design — staff could not dismiss or work around it. Acknowledgment timestamps made the audit trail irrefutable.

No more plausible deniability

Once a staff member acknowledged the notification, the timestamp was recorded. The office manager could see in real time who had been notified and when — and if a client had been waiting longer than 15 minutes without acknowledgment, an automatic alert was sent to the office manager directly.
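The 15-minute escalation rule reduces to a filter over unacknowledged arrivals. A minimal Python sketch, with a hypothetical record shape and made-up sample data:

```python
from datetime import datetime, timedelta

ESCALATION_AFTER = timedelta(minutes=15)


def overdue_arrivals(arrivals, now):
    """Arrivals still unacknowledged more than 15 minutes after the client arrived."""
    return [
        a for a in arrivals
        if a["acknowledged_at"] is None and now - a["arrived_at"] > ESCALATION_AFTER
    ]


now = datetime(2026, 4, 16, 10, 0, 0)
arrivals = [
    {"client": "Client A", "arrived_at": datetime(2026, 4, 16, 9, 40), "acknowledged_at": None},
    {"client": "Client B", "arrived_at": datetime(2026, 4, 16, 9, 50), "acknowledged_at": None},
    {"client": "Client C", "arrived_at": datetime(2026, 4, 16, 9, 30),
     "acknowledged_at": datetime(2026, 4, 16, 9, 32)},
]
overdue = [a["client"] for a in overdue_arrivals(arrivals, now)]
```

Only Client A qualifies: Client B has waited 10 minutes, and Client C was acknowledged.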

All arrival events, acknowledgment timestamps, and wait-time breaches were visible on the main monitoring dashboard in real time. This became one of the most impactful features of the entire platform — not because of its technical complexity, but because it solved a real operational problem that had been a source of friction in the office for years.

Server Room Temperature Monitoring

The server room had no dedicated environmental monitoring — just a UPS device with a built-in temperature sensor and a local admin web interface. I wrote a Python script that scraped that admin page on a regular interval, parsed the temperature reading, and pushed it to Zabbix as a custom trapper item. From there it fed into a Grafana panel with threshold-based alerts configured to email management if the temperature climbed above a safe operating range.
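The parsing step might look something like the sketch below. The HTML fragment and the regular expression are assumptions, since the UPS admin page layout is not shown here; a real page would need its own pattern. Fetching the page and pushing the value to Zabbix are left out.

```python
import html
import re

# Assumed layout: the word "Temperature", then a number, an optional degree sign, and C or F.
TEMP_PATTERN = re.compile(
    r"Temperature[^0-9\-]*(-?\d+(?:\.\d+)?)\s*\u00b0?\s*([CF])", re.IGNORECASE
)


def parse_temperature_c(page_html):
    """Return the temperature in Celsius, or None when no reading is found."""
    match = TEMP_PATTERN.search(html.unescape(page_html))
    if match is None:
        return None
    value = float(match.group(1))
    if match.group(2).upper() == "F":
        value = (value - 32) * 5 / 9  # normalize Fahrenheit readings
    return round(value, 1)


sample = "<tr><td>Internal Temperature</td><td>27.5 &deg;C</td></tr>"
reading = parse_temperature_c(sample)
```

Returning `None` on a failed parse lets the caller skip the push, so a changed page layout shows up as missing data in Zabbix rather than as a bogus temperature.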

It was a quick solution, but it worked reliably. No additional hardware, no additional budget — just wiring an existing sensor into the observability stack.

Grafana Dashboards and Automated Reports

The centerpiece of the platform was a real-time office overview dashboard I kept on a dedicated screen. Each workstation had a stats row: a keystroke counter that ticked up in real time, the current active window title, idle time status, and the latest screenshot thumbnail. At a glance I could tell if someone was actively working, idle at their desk, or had logged out entirely.

Beyond the overview, there were per-user and per-machine views that showed the full day's timeline — keystroke activity over time, application usage breakdown, and a screenshot gallery organized by hour. The office manager used the screenshot viewer to spot-check staff activity throughout the day without needing to interrupt anyone or ask for a status.

The reporting pipeline ran on a Linux VM I was already using for other projects. A cron job fired weekly, queried SQL Express for the previous week's keystroke counts, site activity, and idle time, then passed the results to Ghostscript to render a formatted PDF. The PDF was emailed directly to management — no manual step, no one needing to remember to pull data. Each report included individual breakdowns per paralegal alongside a cross-staff comparison chart so performance differences were immediately visible.
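One way such a pipeline can hand text to Ghostscript is to emit a small PostScript file and convert it with the `pdfwrite` device. The Python sketch below assumes that approach; the report layout, the function names, and the sample row are invented for illustration and do not reflect the actual report format or data.

```python
def escape_ps(text):
    """Escape backslashes and parentheses for a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def report_postscript(title, rows):
    """Render a title and one line per staff member as a single PostScript page."""
    lines = ["%!PS", "/Helvetica findfont 12 scalefont setfont"]
    body = [title] + [
        f"{name}: {keys} keystrokes, {idle_min} min idle" for name, keys, idle_min in rows
    ]
    y = 740  # start near the top of a US Letter page, 1 inch from the left edge
    for text in body:
        lines.append(f"72 {y} moveto ({escape_ps(text)}) show")
        y -= 18
    lines.append("showpage")
    return "\n".join(lines) + "\n"


def ghostscript_args(ps_path, pdf_path):
    """gs command line: -o sets the output file and implies batch mode."""
    return ["gs", "-q", "-sDEVICE=pdfwrite", "-o", pdf_path, ps_path]


ps = report_postscript("Weekly report", [("Paralegal A", 41250, 95)])  # sample row, not real data
cmd = ghostscript_args("report.ps", "report.pdf")
```

A cron entry would run the query, write `ps` to `report.ps`, run `cmd`, and attach the resulting PDF to the weekly email.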

Lessons Learned

  • Send on interval, not on event. My first keystroke deployment sent a network call on every keypress. Even a handful of machines grinding through a workday created enough synchronous traffic to cause visible lag. Aggregating in memory and flushing on a randomized interval fixed it immediately — and the randomization mattered: 50 machines firing at the same second is a different problem than 50 machines spread across a 10-second window.
  • Windows Services run in Session 0. APIs that measure interactive user activity (keyboard hooks, foreground window detection, `GetLastInputInfo`) return nothing useful from a service process. If you need to observe a user session, you need to run in that session. A Scheduled Task triggered at logon or an HKCU Run key entry, both used in this platform, is the right mechanism.
  • Existing infrastructure is leverage. I didn't need to build attendance tracking because the timeclock database already existed. I didn't need a new sensor for the server room because the UPS already had one. Most of the work was integration, not invention.
#Grafana #Zabbix #C# #SQL Express #Windows #Monitoring
Anthony Paul Ruiz