Deep Learning Performance Architect - Shanghai/Hsinchu
Are you obsessed with performance? Do you like to work at the intersection of hardware and software? Do you live and breathe deep learning? NVIDIA is seeking world-class programmers and performance architects who love to squeeze every cycle of performance out of deep learning code. In this role, you will write code that ships in our deep learning libraries and help guide the direction of our future GPU architectures. This position offers the opportunity to have real impact in a fast-moving, technology-focused company.
What you'll be doing:
Develop state-of-the-art, performance-critical code to accelerate deep learning on NVIDIA's platforms
Develop innovative HW, DSP, GPU and system architectures to extend the state of the art in deep learning performance and efficiency
Analyze and prototype key deep learning and data analytics algorithms and applications
Understand and analyze the interplay of hardware and software architectures on future algorithms and applications
Collaborate across the company to guide the direction of machine learning, working with software, research and product teams
What we need to see:
MS or PhD in a relevant discipline (CS, EE, Math)
Track record of optimizing code for performance on CPUs or GPUs, including assembly or SIMD programming
Strong mathematical foundation in machine learning and deep learning
Experience working with deep learning frameworks such as Caffe, TensorFlow and Torch
Strong programming skills in C, C++, Perl, or Python
Familiarity with GPU computing (CUDA, OpenCL, OpenACC) and HPC (MPI, OpenMP)