Site Reliability Engineer (SRE) Job at PayPay, Remote

  • PayPay
  • Remote

Job Description

About PayPay

PayPay is a FinTech company that has grown to over 69M users (as of May 2025) since its launch in 2018. Our team is hugely diverse, with members from over 50 different countries.

OUR VISION IS UNLIMITED_

We dare to believe that we do not need a clear vision to create a future beyond our imagination. PayPay will always stay true to our roots and realize a vision (future) that no one else can imagine by constantly taking risks and challenging ourselves. With this mindset, you will be presented with new and exciting opportunities on a daily basis and have the opportunity to grow and reach new dimensions that you could never have imagined. We are looking for people who can embrace this challenge, refresh the product at breakneck speed and promote PayPay with professionalism and passion.
※ Please note that you cannot apply or be selected in parallel with PayPay Corporation, PayPay Card Corporation and PayPay Securities Corporation.

Job Description

At PayPay, we’re constantly working on improving our systems and processes to support PayPay’s exponential growth. As SREs at PayPay, we strive to ensure high availability and top-level performance so that our users enjoy flawless, reliable service that exceeds expectations.
Considering PayPay’s growth, we are looking for experienced SREs who can deliver insights into system bottlenecks and ensure system reliability and scalability, while increasing the number of services that our company offers.
We are looking for individuals who can bring informed and unique viewpoints, enjoy collaborating with a cross-functional team and are actively pushing boundaries to develop reliable and scalable solutions and positive user experiences.

Key Responsibilities

  • Analyze current technologies used in the company and develop monitoring and notification tools to improve observability and visibility.
  • Ensure system stability by pre-emptively verifying failure scenarios and implement solutions to reduce MTTR
  • Develop solutions to improve system performance with a focus on high availability, scalability and resilience
  • Integrate telemetry and alerting platforms to track and improve reliability of systems
  • Implement industry best practices for system development, configuration management and system deployment
  • Ensure seamless flow of information between teams by documenting knowledge gained
  • Be up to date on modern technologies and trends to advocate for inclusion within products if they add value
  • Participate in incident management including troubleshooting production issues, driving root cause analysis (RCA) and actively sharing lessons learned to improve system reliability and internal knowledge.

Qualifications

  • Experience troubleshooting and tuning high-performance microservice architectures running on Kubernetes and AWS in highly available production environments.
  • 5+ years of experience in software development in Python, Java, Go, etc., with strong fundamentals in data structures, algorithms, problem solving and complexity analysis.
    *The selection process includes a coding challenge.
  • Curious and proactive in finding and addressing performance bottlenecks and problem areas in scalability and resilience.
  • Experience with observability tools and gathering data.
  • Knowledge of databases such as RDS, NoSQL, distributed TiDB, etc.
  • Excellent communication skills and a collaborative, get-things-done attitude.
  • Enjoy taking up a challenge and driving it to conclusion.

Preferred Qualifications

  • Container image management and optimization.
  • Experience in large distributed system architecture and capacity planning.
  • Understanding of IaC and automation tools such as Terraform, CloudFormation, etc.
  • Background in SRE/DevOps concepts and implementation.
  • Experience in managing monitoring tools like CloudWatch, VictoriaMetrics, Prometheus and reporting with Snowflake and Sigma.
  • In-depth knowledge of web technologies such as CloudFront, Nginx, etc.
  • Experience in designing, implementing or maintaining disaster recovery strategies and multi-region architecture to ensure high availability, resilience, and business continuity across critical systems.
  • Language ability in Japanese is a plus.

PayPay 5 senses

  • Please refer to PayPay 5 senses to learn what we value at work.

Working Conditions 

Employment Status

  • Full Time

Office Location

  • Hybrid Workstyle (flexible working style including Remote and office)
    ※There are no fixed rules regarding office attendance in Product group; it depends on each individual's discretion.

Work Hours

  • Super Flex Time (No Core Time)
  • In principle, 9:00am-5:45pm (actual working hours: 7h45m + 1h break)

Holidays

  • Every Sat/Sun/National holidays (In Japan)/New Year's break/Company-designated Special days

Paid leave

  • Annual leave (up to 14 days in the first year, granted proportionally according to the month of employment. Can be used from the date of hire)
  • Personal leave (5 days each year, granted proportionally according to the month of employment)
    *PayPay's own special paid leave system, which can be used to attend to illnesses, injuries, hospital visits, etc., of the employee, family members, pets, etc.

Salary

  • Annual salary paid in 12 installments (monthly)
  • Based on skills, experience, and abilities
  • Reviewed once a year
  • Special Incentive once a year (based on company performance and individual contribution and evaluation)
  • Late overtime allowance

※Payroll payment can be changed to digital salary payment “PayPay Paycheck” for an amount set by you

Benefits

  • Social Insurance (health insurance, employee pension, employment insurance and compensation insurance)
  • 401K
  • Translation/Interpretation support
  • Visa sponsorship + Relocation support

Job Tags

Remote job, Full time, Work at office, Visa sponsorship, Relocation package, Flexible hours
