Platform Reliability
Diagnose infrastructure behavior where availability, performance, and customer impact do not line up cleanly.
Recruiter View
Strong alignment: Staff/Principal platform engineering, distributed infrastructure, SRE/reliability, AI/HPC infrastructure, storage and data platforms, solutions architecture, and customer-facing technical leadership.
I bring 15+ years of enterprise infrastructure experience across distributed systems, enterprise storage, VMware, Linux, networking, automation, customer escalations, and AI/HPC infrastructure. I am strongest in roles where production behavior, customer impact, and engineering reality have to be aligned.
I am not looking to be a generic queue-based support engineer. My best fit is architecture, escalation ownership, platform reliability, infrastructure diagnostics, customer-facing technical leadership, and systems engineering where cross-layer reasoning matters.
Diagnose infrastructure behavior where availability, performance, and customer impact do not line up cleanly.
Trace problems across client, compute, network, storage, virtualization, telemetry, and workload behavior.
Turn low-level evidence into clear action for engineering, support, product, customer, and leadership teams.
Systems thinking across compute, storage, network, virtualization, observability, and automation.
Failure-domain analysis, operational evidence, repeatability, incident learning, and remediation design.
Technical explanation, customer stabilization, architecture review, and engineering-aligned communication.
Distributed storage, AI OS/platform support, NAS/S3, GPU workload behavior, customer escalations, telemetry correlation.
Enterprise storage, VMware escalation, server/storage/network interaction, validation testbeds, automation, L4 diagnostics.
Infrastructure, reliability, automation, documentation, diagnostics, and advisory work. Selected details limited by confidentiality.