Executive Profile
Technical Operations & Operational Resilience Leader with 25 years' experience across global SaaS and Tier-1 financial services. Proven record leading major incident command, governing enterprise change processes, and strengthening platform reliability across large-scale Linux estates in 24/7 trading and cloud environments. Experienced in aligning Engineering, SRE, Security, and executive stakeholders during high-impact incidents while driving long-term systemic improvements. Combines deep technical credibility with operational governance leadership. RHCSA certified.
Executive Capabilities
Operational Resilience & Incident Leadership
- Major incident command (customer-facing & internal)
- Crisis bridge leadership & executive communication
- Blameless post-incident review governance
- Service restoration within SLA environments
Change & Risk Governance
- Chairing Change Advisory Board (CAB)
- Enterprise change risk oversight
- Production risk mitigation & rollback governance
- Continuous improvement of ITIL-aligned processes
Platform & Infrastructure Oversight
- Large-scale Linux estates
- High-availability & clustered environments
- DNS & core services architecture
- Disaster Recovery coordination & validation
Organisational Leadership
- Cross-functional coordination (Engineering, SRE, Security)
- Customer-facing incident communication
- Mentorship & capability development
- Product ownership for internal operational tooling
Career Highlights
Box · United Kingdom
Senior operational leader at the intersection of Engineering, SRE, Security, and Executive Leadership - responsible for incident command, change governance, and operational maturity across production environments.
- Leads enterprise-level response to critical production incidents affecting customers and internal systems
- Acts as executive liaison during major incidents, providing structured updates and impact analysis
- Chairs the Change Advisory Board, governing risk for all production changes
- Drives accountability through structured post-incident reviews and remediation tracking
- Acts as Product Owner for internal operational tooling; mentors engineers into senior roles
ServiceNow · United Kingdom
- Directed cross-functional response to major security incidents across a global cloud platform
- Improved detection workflows through alert automation; delivered executive-level reporting and trend analysis
- Represented Security Operations in customer audits; served as Linux & infrastructure SME
ServiceNow · United Kingdom
- Selected to lead architectural redesign and replacement of internal and public DNS infrastructure
- Delivered phased rollout with rollback safeguards; migrated public-facing DNS with no major outages
ServiceNow · United Kingdom
- Orchestrated multi-team resolution of high-impact production incidents; reduced escalation risk through early SRE intervention
Nomura (via Southnet Ltd)
- Operational responsibility for 4,500+ Linux and Solaris servers in a 24/7 trading environment
- Delivered large-scale datacentre and storage migrations with minimal downtime; coordinated DR testing across business units
Earlier Career (2000–2011) - Senior Solaris Consultant and Infrastructure Architect roles across financial services and enterprise environments, specialising in high-availability clustering, Oracle RAC deployments, DR architecture, and large-scale infrastructure automation. Full project portfolio available on request.
Certification
🎓 RHCSA - Red Hat Certified System Administrator
Profile
25 years' experience in IT, spanning investment banking (Reuters & Nomura), global SaaS operations, and security engineering. Comfortable owning the crisis bridge at 3am and presenting impact analysis to the board the same morning. Deep hands-on Linux and infrastructure expertise, combined with a strong operational governance record. RHCSA certified.
Technical Skills
| Certification | RHCSA (170-040-877) |
| Linux | RHEL (to 9.1), Fedora, CentOS |
| Virtualisation | VMware ESXi, vSphere, vCenter, VirtualBox |
| Solaris | Solaris 8–11 installation & administration |
| Clustering | Oracle Solaris Cluster to 4, Veritas Cluster (VCS) to 6, ClusterLabs |
| Storage | VxVM, VxFS, ODS/SVM, QFS, ZFS, LVM, Veritas NetBackup |
| Networking | TCP/IP, DNS (BIND, gdnsd, PowerDNS), DHCP, NIS, NFS, Samba, LDAP, Firewall, VPN |
| Security | SSH, VPN, AIDE, sudo, RBAC, Kerberos |
| Monitoring | BMC Patrol, Quest Foglight, Nagios, Splunk, syslog-ng |
| Automation | Puppet, Bash/KSH scripting |
| Service Desk | ServiceNow, BMC Remedy 7–8, Jira, Bugzilla |
| Microsoft | Windows Desktop/Server, Active Directory, Remote Access/VPN |
Professional Experience
Box · United Kingdom
- Provides leadership during critical incidents affecting internal and customer-facing systems - coordinating engineering response, keeping stakeholders (including customers) updated, and driving post-incident follow-through on process, procedure, and documentation improvements
- Oversees the Change Management process and chairs the Change Advisory Board, which holds overall ownership of the change lifecycle
- Provides guidance and continuous improvement of the Incident Management process
- Mentors team members and cross-functional engineers to support career progression
- Acts as Product Owner for several internal tools, providing direction across their development lifecycle
ServiceNow · United Kingdom
- Coordinated technical response on security and major incidents; resolved escalations from other teams
- Investigated security incidents and determined appropriate resolution paths
- Reviewed existing alerts to automate information gathering, reducing analyst workload
- Updated alert creation process to encourage better inter-team communication
- Provided automated reporting around incident statistics for leadership visibility
- Represented the Security Operations team in customer audits
- Acted as Subject Matter Expert for Linux and network infrastructure-related incidents
- Updated and enhanced documentation; mentored junior team members
ServiceNow · United Kingdom
Seconded to the Systems Engineering team to replace the entire DNS infrastructure (internal and public-facing).
- Designed a test plan to simulate expected production load on the new service
- Iterated through architecture options to determine optimal fit; validated designs in lab environments
- Created engineering documentation detailing the new services, upgrade paths, and rollback procedures
- Created and executed a crawl/walk/run rollout plan; oversaw validation testing across the internal environment
- Replaced public-facing DNS servers with full validation testing, including edge cases - no major outages
- Created support documentation enabling another team to provide first-line support post-handover
ServiceNow · United Kingdom
- Acted as incident and crisis manager, orchestrating multi-team effort to resolve time-critical situations
- Handled escalations from the SRE team to resolve issues before they became crises
- Followed up on post-incident reviews and drove organisation-wide change to prevent recurrence
- Provided technical leadership for the SRE team, responsible for the availability and performance of ServiceNow's cloud platform
SamKnows (via Southnet Ltd) · United Kingdom
- Maintained 500+ servers globally; liaised with global partners and datacentres for upgrades, additions, and decommissions
- Implemented Puppet across the global estate; integrated Nagios with Puppet for reliable monitoring deployment
- Set up LDAP/Kerberos centralised authentication; enhanced reliability with DNS-based failover using gdnsd
- Used RRDtool for performance analysis; documented all enhancements and changes
Nomura (via Southnet Ltd) · Investment Banking
- BAU and on-call responsibility for 4,500+ Linux and Solaris servers in the EU, a large multi-site VMware estate, several VCS clusters, and Kerberos/NIS/DNS services
- Migrated services to new datacentres; upgraded servers to new global build standards with minimal downtime
- Migrated servers to upgraded storage - often with no reboots required
- Coordinated and conducted DR tests for business units
- Assisted Engineering in specification, architecture, and deployment of multiple system integration initiatives
- Maintained team and procedural documentation; mentored interns and graduate new joiners
Monitise Group Ltd (via Southnet Ltd)
- Designed and implemented a new backup solution using Symantec NetBackup with LTO5 drives for speed and native encryption
- Migrated standalone systems to a new corporate cluster to increase reliability; delivered knowledge transfer to the sysadmin team
Rank Interactive / Blue Square (via Southnet Ltd)
- Created a custom JET wrapper to rebuild datacentres including separation of PCI and non-PCI environments, with automated LDOM and zone installs
- Implemented RBAC policy; migrated from NIS to LDAP; installed and configured AIDE for system auditing
EDS (via Southnet Ltd)
- Implemented a custom JET wrapper to build a new production environment across multiple datacentres, including Sun Cluster and zones
- Created a JET wrapper to build RAC (10.2.0.4) on Sun Cluster in under 4 hours - from initial boot to a running database
- Set up two Oracle RAC clusters across two M8000s on the primary site and one RAC cluster on the DR M8000
- Delivered training and documentation to enable the team to use all tools created during the engagement
UCAS (via Southnet Ltd)
- Migrated high-traffic websites to a highly available configuration on new servers while changing the underlying infrastructure
- Prepared kit for DR; ran DR tests; configured NetBackup for a newly purchased VTL setup
Capgemini (via Southnet Ltd)
- Reviewed the existing production environment for security and operational issues
- Built a Pre-Production ITSM (BMC Remedy) environment mirroring production; reconfigured production as a cluster for end-user resilience
Earlier Roles (2000–2008) - details available on request:
Senior Unix Systems Administrator & Infrastructure Architect - Kwari Ltd (Jun 2007–Apr 2008)
Senior Unix Systems Administrator - Cable & Wireless / Bulldog (Mar 2006–Jun 2007)
Unix Specialist - Nomura via Morse (Jul 2004–Mar 2006)
Associate Site Engineer - Reuters & Nomura via Sun Microsystems (May 2002–Jul 2004)
Support Hub Engineer - Sun Microsystems Inc. (Dec 2000–May 2002)
Certification
🎓 RHCSA - Red Hat Certified System Administrator (170-040-877)
Interests
Home networking & technology, reading, swimming, DIY.