IT
Cloud Operations Engineer
Description
Requirements :
- 4+ years of hands-on experience working with Amazon Web Services (primary) and exposure to Microsoft Azure cloud environments.
- Strong experience implementing Infrastructure as Code (IaC) using AWS CloudFormation, AWS CDK, or Terraform.
- Experience building and maintaining CI/CD pipelines using tools such as Bitbucket, Jenkins, and AWS-native deployment services.
- Hands-on experience with observability and monitoring platforms including Grafana, Prometheus, and cloud-native logging and monitoring solutions.
- Strong understanding of cloud-native architecture, including microservices, containerized workloads, and event-driven systems.
- Experience in database administration and performance optimization for PostgreSQL, MySQL, and Microsoft SQL Server, including query tuning, configuration tuning, index optimization, and database health monitoring.
- Familiarity with database maintenance and reliability practices, such as vacuum operations, dead tuple cleanup, bloat management, replication, and performance tuning.
- Familiarity with high availability and disaster recovery strategies for cloud infrastructure and databases.
- Knowledge of cost optimization and FinOps practices in multi-cloud environments.
- Experience with log aggregation and centralized logging systems such as Elasticsearch, Logstash, and Kibana.
Nice-to-Have Skills :
- Experience working in the Media & Entertainment industry, particularly with media streaming platforms and broadcast workflows.
- Familiarity with serverless and event-driven architectures using services such as AWS Lambda and Amazon EventBridge.
- Experience with containerization and orchestration technologies such as Docker and Kubernetes.
- Familiarity with security and compliance frameworks including IAM governance, security monitoring, vulnerability management, and cloud security best practices.
Soft Skills
- Proactive, detail-oriented troubleshooting in live production environments
- Clear communication, especially during incidents and in RCA documentation
- Strong self-discipline and consistency during solo and on-call hours
- High empathy and cultural sensitivity when working globally across teams and time zones
- Eagerness to experiment, learn, and improve with evolving DevOps trends
- Cultural fit with DevOps values: openness, automation, collaboration, and ownership
Responsibilities :
- Integrate modern technologies such as serverless and event-driven architectures.
- Design and operate scalable multi-cloud infrastructure across Amazon Web Services and Microsoft Azure to support high-availability production systems.
- Build and maintain CI/CD pipelines using Bitbucket to enable automated, reliable software delivery.
- Implement Infrastructure as Code (IaC) with Terraform, AWS CloudFormation, and AWS CDK to standardize infrastructure provisioning and environment management.
- Establish observability and monitoring frameworks using Grafana, Prometheus, and Amazon CloudWatch to ensure system reliability, performance visibility, and proactive incident detection.
- Manage SLA and performance dashboards to track availability, system health, and cloud costs.
- Automate deployment strategies including blue/green releases, automated rollbacks, and patch management to reduce downtime and deployment risks.
- Enforce cloud security and networking best practices, including IAM governance, VPC architecture, WAF protection, encryption, and audit logging.
- Perform database administration and performance optimization across PostgreSQL, MySQL, and Microsoft SQL Server, including query tuning, configuration tuning, index optimization, and proactive database health monitoring.
- Maintain database reliability through routine table maintenance (vacuum operations, dead tuple cleanup, bloat management) and by managing replication and high availability.
- Troubleshoot and resolve production issues across distributed systems, collaborating with development teams to improve deployment maturity and system resilience.
Offers :
- Opportunity to take on a decisive role within the company
- Educational environment and collaboration with top professionals
- Health insurance coverage and free gym access