We are building one of the best teams in Latin America, focused on improving the lives of millions of people underserved by traditional banks. At Kueski, we are evolving our technology stack to support millions of users, leading in the fintech space with innovative technological solutions. The Engineering Development Team at Kueski designs, implements, and maintains the entire codebase used by both external and internal clients.
As a Site Reliability Engineer, you will be an integral part of our Platform team, enabling exciting new features. The ideal candidate has extensive Site Reliability knowledge and experience, and has previously managed large-scale, modern cloud infrastructure in a fast-paced, agile environment.
Strong background in Linux/Unix administration and scripting
Extensive experience managing and configuring public cloud providers, specifically AWS Cloud
Experience using advanced Bash
Experience using Ansible
Experience with monitoring and analytics using Grafana or similar
Experience with configuring and maintaining Jenkins and Jenkins Pipeline
Knowledge of best-practice security and networking techniques for public-facing systems
Strong experience with MongoDB, PostgreSQL, and related database technologies
Experience configuring, managing, and scaling Elasticsearch or similar
Knowledge of best practices and IT operations in an always-up, always-available, mission-critical service
Experience writing production-ready code in Python and/or Ruby
Solid understanding of backup/restore best practices
Excellent troubleshooting skills
Instrument code, and build tools and dashboards to help visualize and understand real-time system health, usage, and performance metrics.
Troubleshoot and resolve issues in our development, test, and production environments.
Work with the platform team to identify and fix software/system performance bottlenecks and stability issues.
Configure applications focused on fault tolerance.
Automate the day-to-day sysadmin workload.
Be the hero by saving the world every day.
Fix scalability problems in very large and constantly growing production systems.
Oversee the infrastructure and service health monitoring process to enable proactive issue mitigation and expedited issue resolution.
Design, manage, and maintain internal tools to support engineering, operations, research, and/or support processes.
Contribute to overall system scalability to ensure Kueski's ability to deliver high-availability, low-latency services.
Understand, implement, and automate security controls, governance processes, and compliance validation.
Stay up to date on relevant technologies, plug into user groups, and understand trends and opportunities to ensure we are using the best possible techniques and tools.