RemoteAtlas

Discover curated remote jobs and work from anywhere. Updated daily with roles from top companies worldwide.

© 2026 RemoteAtlas. All rights reserved.

Senior Site Reliability Engineer

Playonsports

Remote · Full-time · Posted about 2 months ago
Software Development · DevOps & Infrastructure

Summary

Playonsports is hiring a Senior Site Reliability Engineer to join its Software Development team and help strengthen the reliability, performance, and scalability of its systems. Key skills: CI/CD.

About the role

Playon is looking for an experienced Senior Site Reliability Engineer to help us strengthen the reliability, performance, and scalability of our systems. This role sits at the intersection of software engineering and operations — focused on building the tools, automation, and visibility that enable our teams to deliver resilient software at scale.
 
You’ll work closely with application engineers, DevOps, and QA teams to evolve our infrastructure, CI/CD pipelines, observability frameworks, and reliability practices. This is a hands-on engineering role with a strong emphasis on automation, performance analysis, and continuous improvement.
 
The Outcomes You’ll Deliver:
 
In the first few months, you'll focus on building a clear understanding of our systems and establishing the foundation for stronger observability across our platforms. As you settle in, your scope will grow to include broader reliability and performance initiatives.
 
• Assess and improve visibility: Work with engineering teams to review our current dashboards, metrics, and logs, identify the biggest gaps, and make targeted improvements that help us better understand system health.
• Tighten monitoring and alerting: Refine alerts and dashboards for the most critical services so we can catch issues earlier and respond faster.
• Build observability into delivery: Add instrumentation and telemetry into existing build and deploy processes to make reliability checks part of our normal release workflow.
• Clarify what "reliable" means: Help define initial SLIs and SLOs for a few core user flows, aligning the team on what good performance and availability look like.
• Streamline incident response: Partner with the Event Commander/on-call rotation to improve how we communicate, coordinate, and follow up during incidents.
• Reduce manual effort: Automate routine checks and monitoring tasks to free up engineers for more impactful work.

Over time, you'll take on a larger role shaping how we measure, monitor, and improve reliability across all services: setting standards, mentoring others, and helping engineering teams make data-driven decisions about performance and stability.

In this role, you can expect to

  • Contribute to system observability by implementing and improving metrics, alerting, and dashboards for better insight and faster recovery.
  • Develop automation, tooling, and monitoring solutions to support high service availability.
  • Partner with application and quality engineering teams to implement best practices in reliability, release automation, and testing.
  • Drive operational excellence through proactive incident prevention, blameless postmortems, and capacity planning.
  • Participate in on-call rotations to support critical services and ensure rapid response to incidents.

To thrive in this role, you have

  • Solid experience in Python, especially for automation, tooling, and data-driven operational tasks.
  • Proficiency in at least one of Java, C++, or Go.
  • Strong understanding of Linux systems, cloud infrastructure (AWS, GCP, or Azure), and modern deployment practices (Docker, Kubernetes, Terraform).
  • Experience with CI/CD pipelines, version control, and automated testing frameworks.
  • Experience with observability tools (e.g., Prometheus, Grafana, ELK, Datadog) and log/metric analysis for diagnosing issues.
  • Proven experience facilitating and documenting Critical User Journeys and translating them into actionable SLAs/SLOs for automation.
  • Demonstrated ability to collaborate with cross-functional teams and communicate clearly in high-impact situations.
  • A problem-solving mindset that treats reliability as a shared responsibility across engineering.
  • Familiarity with AI-augmented development tools (Claude, Codex) as part of a modern engineering workflow.

Nice to Have

  • Experience writing or maintaining end-to-end or integration tests for distributed systems.
  • Background in performance testing, capacity planning, or chaos engineering.
  • Contributions to internal developer tooling or reliability-focused frameworks.
  • Exposure to security, compliance, or change management processes in production environments.
  • Relevant certifications.