Key Responsibilities:
- Monitor & Support Production Environments: Ensure high availability, performance, and operational stability of critical production systems.
- Incident Investigation & Resolution: Troubleshoot and resolve production incidents, analyse root causes, and implement preventative measures to stop them recurring.
- Collaboration for Deployment & Configuration: Work closely with development and infrastructure teams to deploy, configure, and optimise applications in production environments.
- Monitoring & Alerting Solutions: Design and implement monitoring and alerting systems that detect potential issues early enough for proactive remediation.
- Automation of Processes: Automate deployment and configuration tasks to enhance operational efficiency and minimise manual errors.
- Change Management: Create, document, and manage change requests for production deployments, ensuring strict adherence to established change management processes.
- Production Deployments: Execute deployments following established procedures and best practices to ensure smooth transitions.
- Application Performance & Scalability: Collaborate with development teams to ensure applications are optimised for performance and scalability in production environments.
- On-Call Support: Participate in on-call rotations to provide 24/7 production support as necessary.
Core Technology Skills:
- DevOps/Production Support Experience: Extensive experience maintaining, supporting, and troubleshooting production systems.
- Linux/Unix Administration & Scripting: Proficiency in Linux/Unix system administration and scripting (Bash, Python) with a focus on automation and system configuration management.
- Database Management: Strong knowledge of both SQL and NoSQL databases, particularly Oracle and Cosmos DB.
- CI/CD Expertise: In-depth understanding of Continuous Integration/Continuous Deployment (CI/CD) pipelines and associated tools.
- Azure DevOps (ADO): Experience using ADO to manage CI/CD pipelines, version control, and release management.
- Monitoring & Alerting Tools: Familiarity with tools like Dynatrace and Application Insights for real-time performance monitoring and alerting.
- Configuration Management Tools: Experience with tools such as Chef for automating configuration management is a plus.
Additional Preferred Skills:
- Performance Profiling & Optimisation: Experience with performance profiling and tuning techniques that improve system efficiency.