Senior Customer Experience Engineer – Cloud Reliability & SLO Monitoring Specialist

Remote · USA · Full-time

About arenaflex – Pioneering the Future of Cloud Services

arenaflex is a global leader in cloud infrastructure, empowering organizations of every size to innovate, scale, and thrive. Our mission is to enable every person and every organization on the planet to achieve more through reliable, secure, and intelligent cloud solutions. With a relentless focus on customers, continuous improvement, and an inclusive culture, arenaflex has built a reputation for delivering world‑class performance, security, and support across a portfolio of cloud services that power mission‑critical workloads worldwide.

Why This Role Matters

In the fast‑moving world of cloud computing, customers place their most critical applications, data, and brand reputation on the arenaflex Cloud. When the platform meets its high standards for quality and reliability, customers succeed; when it falls short, their end‑users feel the impact. As a Senior Customer Experience Engineer, you will be the bridge between arenaflex’s engineering excellence and the real‑world expectations of our customers. Your work will directly influence how customers experience reliability, performance, and trust in the arenaflex Cloud.

Role Overview

We are seeking a seasoned, customer‑obsessed engineer who specializes in Service Level Objectives (SLOs) and Service Level Indicators (SLIs). You will join the Observability team, a high‑impact, fast‑growing group that designs, builds, and operates the monitoring and automation framework that keeps our customers’ workloads running smoothly. This role blends deep technical expertise with strong communication skills, enabling you to collaborate with customers, product teams, and service engineers to define, implement, and continuously improve SLO‑driven reliability.

Key Responsibilities

  • Customer Collaboration: Partner with customers to define realistic SLOs and SLIs that align with their business goals, risk tolerance, and compliance requirements.
  • Instrumentation & Measurement: Design and embed instrumentation in customer workloads to capture SLO‑relevant metrics, ensuring accurate, low‑overhead data collection.
  • SLO Breach Detection: Build automated detection pipelines that surface SLO violations in real time, providing actionable alerts to both customers and internal teams.
  • Remediation Automation: Develop self‑healing playbooks, troubleshooting guides, and automated remediation scripts that reduce mean‑time‑to‑resolution for SLO breaches.
  • Cross‑Team Integration: Work closely with service engineering, platform reliability, and product groups to correlate customer‑defined SLOs with arenaflex platform SLOs, creating a unified view of reliability across the stack.
  • Data‑Driven Insights: Analyze historical SLO data to identify trends, reliability risks, and opportunities for performance optimization; present findings to stakeholders with clear recommendations.
  • Proactive Customer Engagement: Conduct regular SLO health reviews, share performance dashboards, and advise customers on best practices to exceed their reliability targets.
  • Performance Optimization: Lead initiatives to improve system scalability, latency, and resource efficiency, ensuring that both arenaflex services and customer workloads consistently surpass defined SLOs.
  • Documentation & Knowledge Sharing: Create and maintain comprehensive documentation, runbooks, and knowledge‑base articles that capture SLO definitions, monitoring setups, and remediation procedures.
  • Mentorship & Culture Building: Mentor junior engineers, champion inclusive collaboration, and contribute to arenaflex’s culture of continuous learning and diversity.

Required Qualifications

  • Bachelor’s degree in Engineering, Computer Science, or a related discipline, or equivalent practical experience.
  • Minimum 4 years of experience designing, implementing, debugging, and launching commercial software products or web services.
  • At least 3 years of hands‑on experience in Site Reliability Engineering (SRE) or Customer Reliability Engineering (CRE) within a cloud environment (arenaflex, AWS, or GCP).
  • Demonstrated expertise in creating, managing, and evolving SLOs and SLIs for cloud‑based customers.
  • 2+ years of experience in a client‑facing role, with a track record of building trusted relationships and delivering technical guidance.
  • Strong programming skills in at least one of the following: Python, Go, Java, or PowerShell, with the ability to write clean, maintainable code for instrumentation and automation.
  • Deep understanding of observability stacks (e.g., Prometheus, Grafana, OpenTelemetry), alerting systems, and incident response workflows.
  • Excellent communication skills—both written and verbal—and the ability to translate complex technical concepts into clear, actionable recommendations for non‑technical stakeholders.
  • Ability to meet arenaflex, customer, and/or government security screening requirements, including background checks.

Preferred Qualifications

  • Master’s degree in Engineering, Computer Science, or a related field, combined with 6+ years of relevant industry experience.
  • 8+ years of software development or reliability engineering experience, preferably within large‑scale cloud platforms.
  • Additional 2+ years of direct customer‑facing experience, especially in enterprise or regulated environments.
  • Experience building large‑scale automation frameworks, CI/CD pipelines, or infrastructure‑as‑code solutions (e.g., Terraform, ARM templates).
  • Familiarity with compliance standards such as ISO 27001, SOC 2, or GDPR, and the ability to embed compliance considerations into SLO design.
  • Proven track record of publishing technical articles, blog posts, or conference talks on reliability, observability, or cloud engineering.

Core Skills & Competencies

  • Analytical Mindset: Ability to dissect complex data sets, spot patterns, and drive data‑backed decisions.
  • Customer Obsession: A relentless focus on delivering value to customers, anticipating their needs, and exceeding expectations.
  • Collaboration: Comfortable working across distributed teams, fostering inclusive dialogue, and aligning diverse perspectives toward a common goal.
  • Problem‑Solving: Resourceful in troubleshooting ambiguous issues, designing creative solutions, and iterating quickly.
  • Automation First: Passion for building repeatable, automated processes that reduce manual toil and improve reliability.
  • Growth Mindset: Eagerness to learn new technologies, share knowledge, and continuously improve both personal and team performance.

Career Growth & Learning Opportunities

arenaflex invests heavily in the professional development of its engineers. In this role, you will have access to:

  • Mentorship programs with senior leaders in reliability, product, and engineering.
  • Sponsored certifications (e.g., Certified Kubernetes Administrator, Certified SRE Professional).
  • Internal technical conferences, hackathons, and innovation days that encourage experimentation.
  • Opportunities to transition into senior technical leadership, product management, or specialized reliability architect tracks.
  • Cross‑functional projects that expose you to the broader arenaflex ecosystem, including AI, data analytics, and security.

Culture, Values, and Inclusion at arenaflex

Our culture is built on four pillars: Customer Obsession, Measure What Matters, No Dead‑Ends, and Whatever It Takes. We celebrate diversity, encourage authentic expression, and provide flexible work arrangements that empower each team member to bring their best self to work. Whether you thrive in a collaborative office setting, a remote environment, or a hybrid model, arenaflex supports the work style that maximizes your productivity and well‑being.

We also champion a growth mindset: you’ll be encouraged to experiment, fail fast, and iterate. Our inclusive policies ensure that every voice—regardless of background, identity, or experience—contributes to shaping the future of cloud reliability.

Compensation, Perks, and Benefits

arenaflex offers a competitive total rewards package that includes:

  • Base salary ranging from $112,000 to $218,400 (adjusted for location and experience).
  • Annual performance bonuses and equity grants that align your success with the company’s growth.
  • Comprehensive health, dental, and vision coverage for you and your dependents.
  • Generous paid time off, parental leave, and flexible holiday policies.
  • Retirement savings plans with company matching contributions.
  • Professional development stipend, tuition reimbursement, and access to a vast library of learning resources.
  • Wellness programs, on‑site fitness centers, and virtual wellness benefits.
  • Employee resource groups (ERGs) focused on diversity, inclusion, and community outreach.

How to Apply

If you are ready to make a tangible impact on the reliability of mission‑critical cloud workloads and thrive in a dynamic, inclusive environment, we want to hear from you. Submit your application through the arenaflex careers portal, and include a resume that highlights your SLO/SLI experience, customer‑facing achievements, and any relevant automation projects.

Apply Now – Join arenaflex!

Closing Statement

At arenaflex, your expertise will help turn our customers into lifelong fans. By designing robust SLO monitoring solutions, you will empower organizations to deliver reliable, high‑performing services to their end‑users, reinforcing arenaflex’s reputation as the most trusted cloud provider. Join us, bring your passion for reliability, and help shape the future of cloud experiences.
