DevOps | 4.0 - 6 yrs | Bangalore
Skills: AWS, Azure, Terraform, Ansible, Kubernetes, Docker, Jenkins, CircleCI, and 5 more
About the Role
We are seeking a Senior DevOps Engineer to join our team in Bengaluru, India. This is a full-time position with a hybrid work model, allowing for remote flexibility. The ideal candidate will have 4-6 years of hands-on DevOps experience and a passion for building scalable AI infrastructure.
Responsibilities
- Architect and build infrastructure that addresses real business problems using the best technology available.
- Collaborate directly with engineering teams to create platforms that resolve deployment and scaling challenges.
- Engage in hands-on work with cloud infrastructure, CI/CD pipelines, AI/ML deployment platforms, and monitoring systems.
- Make informed trade-offs between perfect infrastructure and rapid shipping, prioritizing reliability.
- Work closely with developers, AI engineers, and business stakeholders to foster collaboration.
- Explore new DevOps and AI infrastructure patterns that could lead to innovative breakthroughs.
- Assist in team building by identifying and recruiting other talented infrastructure professionals.
Requirements
Core DevOps Excellence
- 4-6 years of hands-on experience managing production systems, preferably with exposure to AI/ML systems.
- Deep knowledge of cloud infrastructure, with experience designing and managing AWS or Azure environments.
- Proficiency in Infrastructure as Code tools such as Terraform, Ansible, AWS CDK, or Azure Bicep.
- Expertise in containerization technologies, particularly Kubernetes and Docker in production settings.
- Experience in CI/CD pipeline architecture using tools like Jenkins, CircleCI, or GitHub Actions.
- Strong scripting skills in languages such as Python or Bash for automation.
- Hands-on experience with monitoring and observability tools like Prometheus, Grafana, ELK Stack, or New Relic.
- Understanding of infrastructure security best practices.
AI Infrastructure & Modern DevOps
- Familiarity with AI/MLOps and experience in deploying and scaling AI/ML systems.
- Experience building enterprise-grade infrastructure that meets real business demands.
- Awareness of cost optimization strategies for efficient cloud infrastructure.
- Knowledge of SRE principles and experience working within tight SLOs in customer-centric environments.
Business & Collaboration Skills
- Ability to collaborate across diverse teams, technologies, and business functions.
- Empathy for customer needs and understanding of how infrastructure decisions impact user experience and business outcomes.
- Excellent communication skills, capable of explaining technical concepts to non-technical audiences.
- Appreciation for how technology infrastructure drives business success.
Culture & Mindset
- Self-starter mentality, thriving in ambiguous environments without detailed specifications.
- Platform thinking with a focus on building reusable, scalable infrastructure solutions.
- Commitment to delivering high-quality results with speed and low defect rates.
- Continuous learner who stays updated on DevOps and AI infrastructure trends.
This Role is for You If
- You are a platform-minded engineer who enjoys building infrastructure that enables teams to ship faster.
- You are excited about AI infrastructure challenges, including deploying models and managing data pipelines.
- You want to take cutting-edge AI systems into reliable production infrastructure.
- You are interested in the business impact of infrastructure decisions, not just the technical aspects.
- You thrive in dynamic environments where resources are limited and problems are undefined.
This Role is NOT for You If
- You prefer a strictly defined role with clear boundaries between DevOps, SRE, and platform engineering.
- You require detailed specifications before starting to build infrastructure solutions.
- You need extensive guidance and structure to be productive.
- You are looking for a maintenance-focused position rather than one that involves building new systems.
- You prefer to specialize deeply in one area of infrastructure rather than being versatile across the stack.
- You are uncomfortable with rapid changes in direction and technology choices.
Additional Details
- Work Mode: Hybrid, mostly remote with flexibility when required.
- Travel: Nil to minimal.
- Skills Focus: AWS, Azure, Terraform, Ansible, Kubernetes, Docker, Jenkins, CircleCI, Python, TypeScript, Prometheus, Grafana, ELK Stack, New Relic.