Freedom Financial Network
Principal Site Reliability Engineer
at Freedom Financial Network
This is an exceptionally exciting opportunity to support our development and datacenter operations teams. The ideal individual has 5 to 7 years of experience in datacenter multi-role support (network, security, storage, virtualization, and server platforms), experience with VMware administration, *nix (particularly Red Hat/CentOS) administration, monitoring software such as Prometheus and Nagios, and the ability to script in PHP, Python, PowerShell, Bash, or similar languages.
To be successful, we need to scale, and you will have significant input into what our plan will be. You will also learn and try out new technologies that are just being adopted. Your role is critical in ensuring that our infrastructure can support our fast-paced growth on the customer side and that we can scale our development processes as our team grows.
All areas of responsibility listed below are essential to the satisfactory performance of this position by any/all incumbents, with reasonable accommodation, if necessary. Any non-essential functions are assumed to be other related duties as assigned.
KNOWLEDGE, SKILLS AND ABILITIES:
- Analyze our current development infrastructure/operations, and update and improve upon them
- Get hands-on with our code deployment. Investigate and set up new processes. Specific things that we are interested in for the first few months are Puppet/Foreman/Jenkins implementation and JBoss fine-tuning.
- As tasks are completed, you will start contributing to backend development (Python/PHP/PowerShell)
- Your performance will be measured by improved deployment process/times, a reduction in server alarms, and more efficient operation. The goal is for our app traffic to scale without an increase in error reports, while maintaining or improving our error rate, response times, and uptime.
- Proactively support 24x7x365 production and disaster recovery datacenter operations.
- Minimum 5 years’ experience supporting mission-critical production websites
- You've monitored and managed scalable cloud and on-prem clusters that handle tremendous load
- You’re service-oriented, and enjoy working with engineers to make the software development process as painless as possible
- You’ve delivered on-prem and cloud platforms, IT services for Google Compute or Azure, VMware-based clouds, and/or hybrid infrastructures
- You have experience setting up, configuring and migrating to public cloud platforms including the use of migration tools
- You have experience operating DevOps/infrastructure automation
- You have strong Python/PHP experience, since we would like you to contribute to backend development as well
- You’ve carried a cell phone/pager, and know what to do when things go wrong (don't worry, we carry them too, and work hard to make sure they never go off!)
- You have a strong IaaS/PaaS background and a willingness to build a best-in-class environment
- Exceptional technical skills in this area, including:
  - Red Hat/CentOS administration
  - GCP networking experience
  - GCP deployment automation
  - Open source automation (Ansible/Jenkins/Puppet/Chef)
  - Platform-as-a-service support
Bonus points for:
- Intellectual curiosity that motivates you to keep on top of technical trends
- Self-motivated and able to work independently and/or in small teams
- Fearless in the face of massive technical challenges
- You’ve got experience with continuous integration, log collection and analysis, builds, or performance monitoring
- You’ve managed server and app deployment, and follow secure deployment practices