
Senior Software Engineer, SRE

  • Full Time
  • US (Bay Area, California Preferred)

Rafay Systems delivers a SaaS-first, enterprise-grade Kubernetes Operations Platform that enables companies to deploy and operate modern infrastructure and applications across data centers, public cloud, and edge environments. The platform manages the full lifecycle of your Kubernetes infrastructure and modern applications in a single, easy-to-use, integrated platform. It has been built from the ground up for enterprise-class automation, security, governance, visibility, and interoperability, and is combined with expert services and support. We work hard, are inspired by passion for our product, and are always challenging ourselves to reach further and achieve more.

Job Description

We are looking for a seasoned Senior Software Engineer, SRE who can make significant contributions to on-call support, incident triage and recovery, platform deployment, and performance monitoring for our multi-tenant SaaS Kubernetes Operations Platform in a multi-cloud environment. Rafay is at the forefront of Kubernetes technology, and we offer unique opportunities to develop new technology and to be part of a team that encourages positive change through outside-of-the-box thinking. We hold high expectations for ourselves and challenge team members to continually seek improvement. Rafay offers opportunities to work in a collaborative environment that rewards creative thinking and provides opportunities to advance professional careers in advanced technology development. As the first of our kind, we are truly in a class of our own.

Responsibilities

  • Implement monitoring solutions to understand emerging issues and overall application performance
  • Conduct consistent and thorough analysis of current systems, work to reduce the number of existing problems, and suggest new solutions to upgrade and refine those systems
  • Provide support across a broad range of areas including monitoring, processes & tools, architecture, and Root Cause Analysis

Skills and Qualifications

  • Minimum of 4 years of sysadmin experience handling large-scale distributed system software deployments in cloud and on-premises environments.
  • Strong cloud management foundation.
  • Proficiency in Unix shells, Python, and Go programming.
  • Outstanding teammate who can collaborate and influence in a multifaceted environment.
  • Excellent interpersonal and written communication skills.
  • Excellent debugging and troubleshooting skills.
  • Ability to define standard operating procedures for supported platform features.
  • BS degree in Computer Science or a related technical field involving coding, or equivalent practical experience.
