Site Reliability Engineer (SRE) Job at Openkyber, California

  • Openkyber
  • California

Job Description

Overview:

Dataflix is seeking a highly experienced Senior or Lead Platform Engineer/Site Reliability Engineer (SRE)/Hadoop Admin to manage and enhance our petabyte-scale, on-premises data platform. This platform is built using the open-source Hadoop ecosystem. The ideal candidate brings deep technical expertise, a strong understanding of distributed systems, and extensive experience operating and optimizing large-scale data infrastructure.

Responsibilities:
  • Own and operate the end-to-end infrastructure of a large-scale, on-premises Hadoop-based data platform, ensuring high availability and reliability.
  • Design, implement, and maintain core platform components, including Hadoop, Hive, Spark, NiFi, Iceberg, ELK, OpenSearch, and Ambari.
  • Automate infrastructure management, monitoring, and deployments using CI/CD pipelines (GitLab) and scripting.
  • Implement and enforce security controls, access management, and compliance standards.
  • Perform system upgrades, patching, performance tuning, and troubleshooting across platform components.
  • Optimize observability and telemetry using tools like Prometheus, Grafana, and OpenTelemetry for real-time performance monitoring and alerting.
  • Proactively monitor system health, resolve incidents, and conduct root-cause analyses to prevent recurrence.
  • Collaborate with data engineering, analytics, and infrastructure teams to align platform capabilities with evolving needs.

Requirements:
  • 10+ years of experience in Platform Engineering, Site Reliability Engineering, or similar roles, with proven success managing large-scale, distributed Hadoop infrastructure.
  • Deep expertise in the Hadoop ecosystem, including HDFS, YARN, Hive, Spark, NiFi, Ambari, and Iceberg.
  • Strong Linux system administration skills (CentOS/Rocky preferred), including system tuning, performance optimization, and troubleshooting.
  • Proficiency in containerization and orchestration using Docker and Kubernetes.
  • Solid experience with automation and Infrastructure as Code, leveraging tools like GitLab CI/CD and scripting in Python and Bash.
  • Practical knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry) and understanding of system health, alerting, and telemetry.
  • Familiarity with networking concepts, security protocols, and data compliance requirements.
  • Experience managing petabyte-scale data platforms and implementing disaster recovery strategies.
  • Understanding of data governance, metadata management, and operational best practices.
