Site Reliability Engineer
About The Position
We are seeking a skilled and experienced DevOps/SRE (Site Reliability Engineer) to join our dynamic SRE Platform team. As an SRE, you will play a crucial role in the design, development, and implementation of our infrastructure and deployment processes.
Your primary focus will be on maintaining and improving our systems' reliability, scalability, and performance, along with building back-office tools and developing the infrastructure of Aqua's SaaS services.
If you are a motivated and talented individual with a passion for building reliable and scalable infrastructure, we would love to hear from you. Join our team and contribute to the success of our organization by ensuring the availability and performance of our systems through effective DevOps and SRE practices.
Responsibilities
- Collaborate with development teams to design and implement scalable, reliable, and efficient systems and processes.
- Manage and enhance our AWS infrastructure using Terraform and Step Functions, ensuring high availability, security, and scalability.
- Develop and maintain back-office applications using Retool.
- Automate infrastructure provisioning, configuration management, and application deployment processes.
- Monitor and troubleshoot system and application performance using monitoring tools such as Datadog.
- Implement and maintain alerting systems and response procedures for timely issue resolution.
- Continuously improve the reliability, performance, and security of our systems through optimization and automation.
- Stay up to date with the latest trends and best practices in the DevOps and SRE domains and apply them to enhance our infrastructure.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- Proven experience as a DevOps engineer or SRE, or in a similar role.
- Strong knowledge of and experience with the Go programming language.
- Previous experience as a back-end developer is an advantage.
- Proficiency in infrastructure-as-code tools, particularly Terraform.
- Expertise in managing and optimizing AWS services, including Lambda, Step Functions, and EKS.
- Experience with containerization technologies, such as Docker and Kubernetes.
- Solid understanding of CI/CD concepts.
- Familiarity with monitoring and observability tools like Datadog, Prometheus, or Grafana.
- Strong problem-solving skills and the ability to analyze and resolve complex technical issues.
- Excellent communication and collaboration skills, with the ability to work effectively in a team-oriented environment.
- Ability to adapt quickly to new technologies and methodologies.
Nice to Have
- Experience with other cloud platforms, such as Azure or Google Cloud Platform (GCP).
- Knowledge of scripting languages like Python or Bash.
- Understanding of security best practices and experience with securing cloud-based applications and infrastructure.
- Familiarity with log management and analysis tools like the ELK stack or Datadog.
Aqua is a unicorn in the exciting and fast-growing cloud-native security space, leading the digital transformation of the world’s largest organizations using containers, Kubernetes, and serverless.
Every Aquarian makes an impact. Our highly collaborative culture sparks bonds among Aquarians, empowering us to boldly drive change. We thrive by constantly being challenged to deliver the best, most innovative technological solutions to our customers.
Join our team of cloud-native experts for a world of global opportunities and continuous career development!