DevOps SRE
Remote
Full-time
Job Description
Primary Responsibilities:
• Own deployments to cloud infrastructure.
• Develop and set up Datadog monitors and tracking for microservices and micro-frontend applications.
• Create custom metrics to track page and API performance.
Required Skillset:
• Proven experience (10 years) working as an SRE, with a specific focus on Microsoft Azure cloud services and Oracle Cloud Infrastructure (OCI).
• Deep understanding of cloud services, including Docker and the Kubernetes service APIs and tooling in Azure and OCI.
• Proficiency in scripting and programming languages (e.g., PowerShell, Python) for automation, infrastructure management, and tool development.
• Experience with scalable networking technologies, including Linux, software-defined networking, network virtualization, open protocols, application acceleration, load balancers, DNS, and virtual private networks, and with their application in PaaS and IaaS technologies.
• Strong incident management skills, with a data-driven and analytical approach to diagnosing complex issues.
• Familiarity with Infrastructure as Code (IaC) tools (e.g., Terraform, ARM templates) and configuration management tools.
• Excellent problem-solving skills, attention to detail, and a proactive attitude towards addressing operational challenges.
• Effective communication and collaboration skills, with the ability to work across teams and influence technical decisions.
• Experience with CI/CD pipelines and version control systems (e.g., Git).
• Ability to develop and implement a comprehensive monitoring and analytics solution using Datadog for a cloud-based microservices architecture.
• Ability to develop dashboards using modern monitoring tools (e.g., Dynatrace, AppDynamics, Splunk).
• Ability to analyze monitoring data to identify trends and root causes of incidents, leading to continuous improvement of system health.
• A strong understanding of DevOps principles and automation practices.