Principal Software Reliability Engineer (SRE)
Location: Dearborn, Michigan US
Job Number: 10636
External Description:
At Ford Motor Company, we believe freedom of movement drives human progress. We also believe in providing you with the freedom to define and realize your dreams. With our incredible plans for the future of mobility, we have a wide variety of opportunities for you to accelerate your career potential as you help us define tomorrow's transportation.
The people of Ford Motor Credit Company have a 60-year commitment to helping put people behind the wheels of great Ford and Lincoln vehicles. By partnering with dealerships, we provide financing, personalized service and professional expertise to five thousand dealers and more than four million customers in over 100 countries around the world. If you're customer-focused, driven and seeking the opportunity to experience exciting challenges and growth, look no further.
We are looking for a Principal Software Reliability Engineer (SRE) to join our Ford Credit Command Center team. The Command Center's mission is to help maintain a stable production environment through effective change, incident, and problem management. We need a strong communicator and problem solver to help us transform as we embark on our SRE journey, so we can improve the reliability of our software and serve our customers more effectively. If you are a team player with a passion for problem solving and experience with SRE, this may be the role for you.
What you'll be able to do
As a Principal SRE, you will support our transformation. You will:
- Play a key role in leading the Command Center through our SRE transformation journey
- Focus on the reliability and maintainability of existing and new systems
- Engineer creative solutions to problems, balancing reliability against feature velocity while defining appropriate levels of service quality
- Serve as a liaison between Dev and Ops teams to ensure reliability is built into our software platforms
- Assist in design of SRE standards for new application onboarding and monitoring of existing applications and infrastructure
- Design processes to effectively define SLIs, SLOs, SLAs, and error budgets
- Help identify and eliminate toil by process redesign and automation
- Ensure the right tools are in place to assess availability, latency, performance, efficiency, monitoring capabilities, emergency response actions, and capacity planning
- Optimize health monitors of applications and infrastructure using Dynatrace and Splunk
- Research and apply industry/Ford standards to maintain and improve processes
- Assist Command Center operations and eliminate toil by automating manual tasks
- Lead critical incident bridges to facilitate the containment of outages that impact operations
- Facilitate blameless post-mortems for major incidents
- Effectively capture and clearly communicate information related to major incidents
The minimum requirements we seek:
- Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering or related field or a combination of education and equivalent experience.
- 3+ years of relevant work experience in DevOps
- 2+ years of experience in Java or Python
- 1+ years of experience in GCP
The preferred requirements we seek:
- Master's degree in Computer Science, Computer Engineering, Electrical Engineering or related field or a combination of education and equivalent experience.
- Expertise in designing, analyzing, and troubleshooting distributed systems.
- Ability to debug, optimize code, and automate routine tasks
- Familiarity with cybersecurity tools, processes, and controls
- Prior Rally/PDO experience and familiarity with ITIL ITSM processes
- Experience with Google Cloud Administration
- Experience with BigQuery
- Effective oral and written communication skills with various audiences
- Motivated self-starter able to work independently in a fast-paced environment
- Proven ability to develop strong working relationships, influence and motivate people
- Strong analytical skills with a logical mindset and problem-solving approach
- Knowledge of RDBMS, cloud technologies (preferably GCP), and automation tools, plus programming experience
- Experience configuring monitoring and alerting tools, preferably Dynatrace and Splunk
- Understanding of operating systems (including Unix/Linux), network protocols, and databases
As part of the Ford family, you'll enjoy excellent compensation and a comprehensive benefits package that includes generous PTO, retirement, savings and stock investment plans, incentive compensation and much more. You'll also experience exciting opportunities for professional and personal growth and recognition!
Candidates for positions with Ford Motor Company must be legally authorized to work in the United States on a permanent basis. Verification of employment eligibility will be required at the time of hire.
Visa sponsorship is not available for this position.
We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status or protected veteran status.
For information on Ford's salary and benefits, please visit:
https://corporate.ford.com/content/dam/corporate/us/en-us/documents/careers/2022-benefits-and-comp-GSR-sal-plan-2.pdf
Join our team as we build tomorrow! We believe in putting people first, working together, and facing challenges head-on, because we're Built Ford Tough. We're one team striving to make people's lives better while crafting value, delivering excellence and ultimately going for the win.
At Ford, the health and safety of our employees is our top priority. Vaccination has been proven to play a critical role in combatting COVID-19. As a result, Ford has made the decision to require U.S. salaried employees to be fully vaccinated against COVID-19, unless employees require an accommodation for religious or medical reasons. Being fully vaccinated means that an individual is at least two weeks past their final dose of an authorized COVID-19 vaccine regimen. As a condition of employment, newly hired employees will be required to provide proof of their COVID-19 vaccination or an approved medical or religious exemption.
Job Number: 62630BR