Machine Learning Engineer, Performance Optimization
Databricks
On-site
San Francisco, CA, United States
Full-time
$192,000 - $260,000
About Databricks
Databricks is the data and AI company. More than 10,000 organizations worldwide — including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow.
About the Role
Explore and analyze performance bottlenecks in ML training and inference
Design, implement, and benchmark libraries and methods to overcome these bottlenecks
Build tools for performance profiling, analysis, and estimation for ML training and inference
Balance the tradeoff between performance and usability for our customers
Support our community through documentation, talks, tutorials, and collaborations
Collaborate with external researchers and leading AI companies on various efficiency methods
Qualifications
Hands-on experience with the internals of deep learning frameworks (e.g. PyTorch, TensorFlow) and deep learning models
Experience with high-performance linear algebra libraries such as cuDNN, CUTLASS, Eigen, and MKL
General experience with the training and deployment of ML models
Experience with compiler technologies relevant to machine learning
Experience with distributed systems development or distributed ML workloads
Hands-on experience writing CUDA code and knowledge of GPU internals (Preferred)
Publications in top-tier ML or systems conferences such as MLSys, ICML, ICLR, KDD, NeurIPS (Preferred)
We value candidates who are curious about all parts of the company's success and are willing to learn new technologies along the way.

