Soumya J.K.
Senior Site Reliability Engineer · AWS · Observability
Summary
Senior Site Reliability Engineer with 8+ years of experience in cloud infrastructure, observability, and CI/CD. Hands-on expertise in AWS, Terraform, and Elastic, with a focus on automation, reliability, and performance. Proven success in maintaining high-availability systems and improving platform efficiency. Strong collaborator across global teams in the UK, EU, and APAC regions.
Core skills
AWS (IAM, ECS, EC2, S3, CloudWatch, FIS)
Terraform
Elastic Cloud
Observability
Splunk
New Relic
Zabbix
Jenkins
Ansible
Git
Python
Shell
Linux
Jira
Postmortems
SLA tracking
Agile (Scrum)
Experience
Trainline
— Senior Site Reliability Engineer
· London, UK · Mar 2024 – Present
- Manage long-term observability data lifecycle in Elastic Cloud.
- Drive chaos engineering initiatives using AWS Fault Injection Simulator.
- Partner with UK, EU, and offshore teams for issue triage and feature rollout.
- Collaborate with cross-functional teams on POCs exploring MCP servers to support AI adoption and experimentation.
Trainline
— Site Reliability Engineer
· London, UK · Jul 2022 – Mar 2024
- Led migration from in-house Elasticsearch to Elastic Cloud, improving stability and cost visibility.
- Integrated AWS cost usage data into the observability stack to enable proactive cost optimisation.
- Partnered with finance and platform teams across geographies to align resource usage.
- Worked to establish SLIs/SLOs for measuring service reliability and performance and improving observability.
Trainline
— Senior Platform Operations Engineer
· London, UK · Mar 2020 – Sep 2021
- Designed secure AWS IAM architecture with SSO and temporary access elevation.
- Collaborated across business units to deploy IAM policies in multiple regions.
Trainline
— Platform Operations Engineer
· London, UK · Jan 2019 – Apr 2020
- Championed Terraform adoption, introducing reusable modules for dev teams.
- Conducted internal workshops for global teams to standardise IaC practices.
Wonga
— Senior Production Support Engineer
· London, UK · Jan 2018 – Dec 2018
- Managed observability stack (Splunk, Zabbix, New Relic) across hybrid AWS/data centre environments.
- Migrated CI/CD tools (Jenkins, Artifactory, Gerrit) to AWS, reducing on-prem maintenance.
Wonga
— Production Support Engineer
· London, UK · Feb 2017 – Dec 2017
- Ensured 24×7 platform reliability as part of a global production support team.
- Led postmortem reviews and coordinated with vendors and offshore teams for rapid resolution.
Infosys (Client: Goldman Sachs)
— Senior Systems Engineer
· Bangalore, India · Feb 2014 – Feb 2017
- Supported reconciliation systems processing millions of trades daily.
- Developed automation scripts that improved incident resolution and operational efficiency.
- Led follow-the-sun support rotations between India, UK, and US teams.
Certifications
- AWS Solutions Architect – Associate
- AWS SysOps Administrator – Associate
- AWS Developer – Associate
- ITIL Foundation v3 (AXELOS)
Education
B.Tech in Information Technology
· University College of Engineering, Trivandrum · 2009 – 2013
Contact
Best way to reach me: