Time Zone: EST / North America
Full-time / Long-term Contract
Role Overview
• Seeking a Site Reliability Engineer (SRE) with strong experience in OpenStack and private cloud environments
• Role focuses on production support, troubleshooting, and platform reliability
• Requires hands-on expertise in Linux, networking, and storage
• Involves close collaboration with engineering teams and customer interaction
Key Responsibilities
• Troubleshoot complex issues in OpenStack and Linux environments
• Manage and support OpenStack services including Nova, Neutron, Cinder, and Keystone
• Perform root cause analysis (RCA) and drive long-term fixes
• Participate in incident management and on-call rotations
• Monitor system performance, availability, and reliability
• Collaborate with engineering teams on fixes and improvements
• Communicate effectively with customers via calls and written channels
• Perform system optimization and performance tuning
Must-Have Skills
Linux, Networking, and Storage Fundamentals
• Strong understanding of Linux internals and system performance
• Experience with kernel tuning and troubleshooting
• Hands-on experience with filesystems and disk management
• Knowledge of disk partitioning and system-level troubleshooting
• Experience with LVM and SCSI multipath
• Basic understanding of Ceph
• Ability to troubleshoot I/O and performance issues
• Knowledge of DHCP, DNS, VLANs, and network bonding
• Understanding of basic routing concepts
OpenStack Operations and Troubleshooting
• Hands-on experience with OpenStack services such as Nova, Neutron, Cinder, and Keystone
• Experience managing production environments
• Strong troubleshooting and debugging skills
• Ability to handle customer-facing technical issues
• Experience performing root cause analysis
Good-to-Have Skills
• Basic understanding of Kubernetes concepts
• Experience with monitoring tools like Prometheus and Grafana
• Knowledge of metrics, logging, and alerting systems
• Basic scripting skills in Python or Go
• Exposure to automation and observability practices
Soft Skills
• Strong problem-solving and analytical thinking
• Ability to work in high-pressure production environments
• Clear and effective communication skills
• Proactive mindset toward issue prevention
• Comfortable working in remote, distributed teams