Site Reliability Engineer (SRE) Job at Openkyber, Georgia

  • Openkyber
  • Georgia

Job Description

Job Summary: We are seeking an experienced Site Reliability Engineer (SRE) to join the Applied AI and Data Science program. This role focuses on deploying, monitoring, and optimizing cloud-based applications and infrastructure to ensure high availability and performance. The ideal candidate will have strong expertise in AWS, containerized microservices, infrastructure automation, and monitoring tools.

Key Responsibilities:
  • Release Management: Build and deploy application, service, and infrastructure releases; validate system integrity post-deployment; document release notes.
  • Production Support: Maintain 99.999% availability of critical systems; monitor infrastructure and applications; perform root cause analysis for outages; respond to incidents.
  • Monitoring & Alerting: Implement monitoring policies; build dashboards; track system efficiency and resource consumption; alert stakeholders to SLA deviations.
  • Optimization: Manage resource scaling; optimize system performance and resource utilization.
  • Team Collaboration: Assist with user support; coordinate with onshore/offshore teams; develop bug fixes; become an expert in system architecture and deployment pipelines.

Required Qualifications:
  • 6+ years of DevOps or SRE experience in large, complex environments.
  • Strong background in software development (OOP) and ability to read/debug code.
  • Expertise in AWS services (EKS, S3, DocumentDB) and Terraform for Infrastructure as Code.
  • Experience with Kubernetes, containerized microservices, and cloud deployments.
  • Proficiency with GitLab or similar CI/CD tools for pipeline management.
  • Hands-on experience with monitoring tools such as Datadog or Splunk.
  • Bachelor's degree in a related field or equivalent experience.

Preferred Qualifications:
  • Familiarity with Python, Node.js, React, TypeScript, and GraphQL.
  • Exposure to relational (SQL) and NoSQL databases.
  • Experience with Docker, Redis, and ORM frameworks.
  • Knowledge of experimentation, statistical testing, and data analysis.
  • Master's degree in a related field is a plus.

Education: Bachelor's Degree
