Site Reliability Engineer

 

Description:

The Developer Productivity Reliability Engineering (DPRE) team enables the PayPal developer experiences that power PayPal’s business. Working with our partner teams, we help deliver scalable, exceptionally efficient, and highly reliable developer experience services, platforms, and infrastructure in a hybrid cloud environment. We are on a journey to transform the developer experience by incorporating our world-class site reliability practices into our developer platforms. Thousands of engineers at PayPal commit code daily, and every commit triggers a CI/CD process that evaluates it against the high-quality standards set by PayPal. We are talking about enabling daily automated releases at PayPal scale.

Job Description:

Meet our team:

As an SRE team, we are constantly learning while taking pride in our systems knowledge, technology breadth, and intellectual curiosity. We are driven to empower PayPal developers in the best way possible and treat developer issues with the utmost seriousness, just like any production incident.


Your way to impact:

As a DevOps engineer on the team, you will work with highly talented engineers to automate developer incident detection, triage, and mitigation.
In this role on the SRE team, you will use your skills and experience as a software engineer to influence a self-serve platform vision that caters to developers’ requests, and to drive execution of the key capabilities of PayPal’s developer experience platform.
You will be part of a team that is customer focused and values Inclusion, Diversity, Growth Mindset, Innovation, Delivering Value, Career Growth, and Work/Life Balance.


Your day to day:

Continuously monitor the performance, health, and availability of developer platforms using observability tools like Datadog. 
Respond to alerts triggered by monitoring systems and triage/resolve incidents within SLA. 
Perform post-incident investigations to identify and fix root causes. 
Participate in a shared on-call schedule to ensure 24/7 system reliability. 
Troubleshoot and resolve issues related to developer platforms during on-call hours.
Automate incident detection, triage & mitigation on developer platforms.
Set up monitoring systems to identify potential issues and trigger effective alerts for timely resolution.
Build automated event correlation on developer platforms.
Automate infrastructure provisioning, configuration management, and scaling using tools like Ansible and Puppet.
Review code changes to ensure quality, security, and adherence to DevOps best practices.
Work closely with development teams, sysadmins, and stakeholders to ensure smooth collaboration and communication.
Continuously evaluate and integrate new tools to enhance monitoring, automation, and system management capabilities.
Participate fully in all scrum team activities.


What you need to bring:

BS/MS degree in a technology-related field (e.g., Engineering, Computer Science) is desired.
8+ years of proven experience.
Strong incident triage and debugging skills.
Experienced with DevOps processes and any related technologies across a CI/CD pipeline.
Hands-on experience with the Harness CI/CD platform is highly desired.
Proficient in scripting languages like Python.
Proficient in OpenTelemetry and observability solutions like Datadog, Splunk SignalFx, Kibana, and Grafana.
Familiarity with Docker and Kubernetes.
Experience maintaining cloud infrastructure and highly available environments.
Experience in automating processes and apps using a CI/CD approach.
Proficient in using configuration-as-code and infrastructure-as-code tools such as Ansible, Puppet, Chef, and Terraform.
Experience with SQL (MySQL) and NoSQL (Elasticsearch, MongoDB, Cassandra) databases.
Self-starter who is data driven, with excellent communication skills to collaborate effectively with cross-functional teams.
Strong problem-solving skills.
Demonstrated experience in building distributed systems with high scalability and availability.
We know the confidence gap and imposter syndrome can get in the way of meeting spectacular candidates. Please don’t hesitate to apply.
 

Organization: PayPal
Industry: Engineering Jobs
Occupational Category: Site Reliability Engineer
Job Location: Austin, USA
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Experienced Professional
Experience: 8 Years
Posted at: 2025-01-07 7:09 pm
Expires on: 2025-02-21