Jason M. Oliver | jmoliver.ai
I solve infrastructure problems that live between ownership boundaries.

Case Studies

Representative problem domains

These are anonymized patterns, not client disclosures. They illustrate the kind of engineering work this site is meant to communicate.

Exabyte-scale storage fabric diagnostics

The Setup

Large distributed storage environments where NAS, S3/object access, metadata activity, network fabric behavior, and customer workload patterns all influence perceived performance.

The Engineering

Correlate client symptoms with storage telemetry, fabric counters, protocol behavior, workload timing, and known topology or configuration constraints.
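
As a rough illustration of the correlation step, the sketch below pairs each client-side symptom with telemetry events that fall inside a time window around it. The `Event` type, the `correlate` function, and the 30-second default window are all hypothetical choices for this example, not a description of any specific tool or platform.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical normalized event; field names are assumptions."""
    ts: float      # seconds since epoch
    source: str    # e.g. "client", "fabric", "storage"
    detail: str


def correlate(symptoms, telemetry, window_s=30.0):
    """Return (symptom, nearby_events) pairs.

    nearby_events holds every telemetry event whose timestamp lies
    within +/- window_s seconds of the symptom's timestamp.
    """
    ordered = sorted(telemetry, key=lambda e: e.ts)
    times = [e.ts for e in ordered]
    pairs = []
    for symptom in symptoms:
        lo = bisect_left(times, symptom.ts - window_s)
        hi = bisect_right(times, symptom.ts + window_s)
        pairs.append((symptom, ordered[lo:hi]))
    return pairs
```

A symptom with an empty match list is useful evidence too: it suggests the telemetry sources consulted so far do not cover the layer where the problem lives.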

My Role

Identify the likely failure domain, collect the evidence path, explain the operational impact, and drive the investigation toward a validated conclusion.

The Outcome

Separate workload-side pressure from platform-side behavior, reduce ambiguity, and create a repeatable diagnostic path for engineering and customer-facing teams.

Compute and GPU cluster workload readiness

The Setup

AI/HPC environments where GPU utilization, data access, storage latency, fabric behavior, and orchestration decisions combine into workload throughput.

The Engineering

Validate the data path across client, compute node, network, storage service, and telemetry layers while watching for starvation, queueing, topology bottlenecks, and bad assumptions.

My Role

Separate GPU, storage, fabric, scheduler, and workload-side constraints so the team can act on the real limiter instead of the loudest symptom.

The Outcome

Identify whether the limiting factor is workload shape, storage behavior, network path, host configuration, scheduler placement, or instrumentation.

Cross-layer failure domain triage

The Setup

Ambiguous platform incidents where several teams see partial symptoms but no single layer clearly owns the failure.

The Engineering

Build a timeline, define evidence boundaries, trace the request path, compare telemetry against observable behavior, and remove unsupported explanations.
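
One way to picture the timeline and elimination steps is the minimal sketch below. It merges per-layer event lists into a single ordered timeline, then applies one simple test to a candidate explanation: did the suspected layer report anything at or before the first symptom? The `Entry` type and both function names are hypothetical, and real triage would weigh far more than timestamp ordering.

```python
from itertools import chain
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical timeline entry; field names are assumptions."""
    ts: float
    layer: str
    message: str


def build_timeline(*sources):
    """Merge any number of per-layer event lists into one list ordered by time."""
    return sorted(chain.from_iterable(sources), key=lambda e: e.ts)


def precedes_first_symptom(timeline, cause_layer, symptom_layer):
    """True if cause_layer logged an entry at or before the first symptom.

    A False result does not prove innocence, but it marks the
    explanation as unsupported by the evidence collected so far.
    """
    first = next((e for e in timeline if e.layer == symptom_layer), None)
    if first is None:
        return False
    return any(e.layer == cause_layer and e.ts <= first.ts for e in timeline)
```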

My Role

Turn fragmented signals into a shared operating picture and keep the discussion anchored to evidence rather than ownership assumptions.

The Outcome

Convert scattered evidence into a practical operating picture, then document the mitigation, validation method, and longer-term engineering follow-up.

Portable diagnostics and operational tooling

The Setup

Constrained customer or lab environments where evidence collection must be fast, low-risk, portable, and understandable by more than one engineer.

The Engineering

Use Bash, Python, PowerShell, packet capture, syslog/Splunk-friendly output, and structured logs to collect host, network, storage, and workload context.
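
A small sketch of what "structured, Splunk-friendly output" can look like in practice: one JSON object per line, built only from the Python standard library so it runs in constrained environments. The field names and the `collect_host_context` and `to_log_line` helpers are hypothetical; a real collector would add network, storage, and workload context.

```python
import json
import os
import platform
import socket
from datetime import datetime, timezone


def collect_host_context(extra=None):
    """Gather basic host facts into a flat dict (field names are assumptions)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "os_release": platform.release(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }
    if extra:
        # Caller-supplied context, e.g. a case ID or collection phase.
        record.update(extra)
    return record


def to_log_line(record):
    """Serialize one record as a single-line JSON string with stable key order."""
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(to_log_line(collect_host_context({"phase": "baseline"})))
```

Keeping each record on one line with sorted keys makes the output easy to diff, grep, and forward to a log indexer without a custom parser.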

My Role

Make evidence collection repeatable, portable, and understandable so diagnostics can survive handoff, escalation, and later review.

The Outcome

Improve time-to-evidence, reduce repeated manual effort, and make diagnostic workflows easier to hand off, review, and reuse.