Senior Engineering Manager - Machine Learning Platform in Jersey City, NJ at honor foundations

Date Posted: 12/5/2024

Job Description

What We Do

At Goldman Sachs, our Engineers don’t just make things – we make things possible.  Change the world by connecting people and capital with ideas.  Solve the most challenging and pressing engineering problems for our clients.  Join our engineering teams that build massively scalable software and systems, architect low latency infrastructure solutions, proactively guard against cyber threats, and leverage machine learning alongside financial engineering to continuously turn data into action.  Create new businesses, transform finance, and explore a world of opportunity at the speed of markets.

Engineering, which is comprised of our Technology Division and global strategists’ groups, is at the critical center of our business, and our dynamic environment requires innovative strategic thinking and immediate, real solutions.  Want to push the limit of digital possibilities?  Start here.

Who We Look For

We are seeking a highly skilled and visionary engineering leader to join our Artificial Intelligence Platforms organization and lead the team responsible for both the Python programming language and our containerized Artificial Intelligence/Machine Learning (AI/ML) Runtime environments across the firm. As the head of Python, you will be responsible not only for establishing a vibrant and collaborative Python community across the firm, but also for providing the tooling necessary to ensure that our thousands of Python users can achieve their strategic objectives in a frictionless manner. As the head of AI/ML Runtimes, you will be responsible for building the base environments that enable the delivery of the most cutting-edge AI/ML models across the firm, using the latest GPU hardware and frameworks across multiple cloud providers.

Key Responsibilities:

  • Establish and deliver on a strategic vision for the future of Python & AI/ML Runtimes across Goldman Sachs (GS)
  • Lead, mentor and manage a team of Python & ML Ops engineers
  • Foster a culture of innovation, collaboration and continuous improvement within the team
  • Collaborate with business customers, owners of open-source frameworks, and leading AI/ML hardware & software providers to design and implement containerized runtime environments that enable efficient development and deployment for ML/AI models, firmwide
  • Leverage MLOps and CI/CD best practices to implement fully automated build and deploy processes
  • Utilize your proficiency in Unix-based systems to ensure AI/ML software/frameworks function as intended
  • Create example notebooks that demonstrate how to effectively leverage the AI/ML software provided within the runtime environment
  • Evangelize MLOps best practices and the intricacies of the various AI/ML software/frameworks
  • Remain up to date with the latest advancements in Python & AI/ML frameworks and related technologies, and drive adoption of those that align with strategic objectives

Basic Qualifications:

  • 6+ years of experience in Python programming for Machine Learning and/or application development
  • 6+ years of experience building and maintaining containerized runtime environments for Data Science and Machine Learning (e.g. PyTorch, TensorFlow)
  • 4+ years of experience managing engineering teams
  • 4+ years of experience with Unix-based systems
  • 4+ years of experience building automated CI/CD pipelines for containers
  • 4+ years of experience building and maintaining containerized runtime environments supporting GPUs (e.g. CUDA)

Preferred Qualifications:

  • Proven experience in leading and managing high-performing engineering teams
  • Strong understanding of Python frameworks, packages and tools
  • Prior experience in building containerized runtime environments and frameworks such as TensorRT, ONNX, MPI and DeepSpeed 
  • Experience with infrastructure-as-code tools, such as Terraform or CloudFormation
  • Experience with Kubernetes and other container orchestration platforms
  • Experience running containers in the public cloud (e.g. AWS, GCP)
  • Strong problem-solving skills and the ability to work effectively in a fast-paced and collaborative environment
  • Excellent communication skills and the ability to articulate complex technical concepts to both technical and non-technical stakeholders

Salary Range

The expected base salary for this Jersey City, New Jersey, United States-based position is $150,000-$250,000. In addition, you may be eligible for a discretionary bonus if you are an active employee as of fiscal year-end.

Benefits

Goldman Sachs is committed to providing our people with valuable and competitive benefits and wellness offerings, as it is a core part of providing a strong overall employee experience. A summary of these offerings, which are generally available to active, non-temporary, full-time and part-time US employees who work at least 20 hours per week, can be found here.
