Compensation: $220K–$300K + Equity
Department: Applied Research
Location: Remote (US-based) | Full-Time
We’re seeking a Senior Machine Learning Engineer to help optimize the performance of state-of-the-art foundation models across a diverse range of hardware environments. If you're passionate about performance tuning, systems-level thinking, and scaling ML workloads beyond NVIDIA/CUDA constraints, this is your chance to shape the frontier of AI infrastructure.
Design and maintain abstractions that scale model performance efficiently across heterogeneous hardware platforms—not just CUDA/NVIDIA.
Profile and optimize memory usage, latency, and throughput in PyTorch; build or integrate low-level solutions (e.g., Triton kernels) as needed.
Benchmark our model and system performance to guide product decisions around cost, throughput, and deployment tradeoffs.
Collaborate with hardware and systems partners to uncover bottlenecks and push for performance improvements in future iterations.
Work hand-in-hand with research and engineering teams to ensure systems are planned and built with efficiency in mind from the start.
Deep experience profiling and optimizing PyTorch code for performance (memory, latency, throughput).
Familiarity with tools like torch.compile, PyTorch/XLA (torch_xla), the PyTorch profiler, and memory or trace viewers.
Experience building performance-portable abstractions and optimizing ML pipelines for a variety of hardware/software stacks.
Strong understanding of transformer models and modern attention mechanisms.
Hands-on work with parallel inference strategies (tensor parallelism, pipeline parallelism, etc.).
Proficiency with Triton or CUDA, especially writing custom kernels and fusions for hot code paths.
Experience writing high-performance parallel C++, particularly in a machine learning context (e.g., data loading, inference).
Previous work building efficient ML demos or inference environments (Gradio, Docker, etc.).
Experience deploying models on non-NVIDIA hardware platforms.
You’ll be building the technical backbone that allows cutting-edge multimodal AI models to run smoothly and efficiently across the world. Your work will directly influence how our models scale and how accessible they are in terms of cost, performance, and reach.
Base Salary: $220,000 – $300,000 / year (based on experience & location)
Equity: Generous stock options
Benefits: Full health coverage, flexible PTO, home office support, and more
Join a lean, expert team building next-gen AI from the ground up. If you thrive at the intersection of ML, systems, and performance—and love solving deep efficiency challenges—we want to hear from you.