Job Description
1.Provide guidance, analysis, and technical leadership to ensure that AI workloads achieve scalable performance across multiple chips;
2.Design, implement, and optimize the collective communication libraries and related runtimes;
3.Drive integration efforts with various R&D teams to ensure end-to-end results and smooth deployment of customer workloads;
4.Promote open-source development and contribute to open-source projects related to multi-chip communications.
Job Requirements
1.BS, MS, or PhD degree in Computer Science, Electrical Engineering, or a related field;
2.Project experience in software development, with a focus on multi-chip communications and full-stack optimization of AI workloads;
3.Strong knowledge of communication libraries such as NCCL and UCX. Good understanding of large-scale network behavior and the effect of distributed computing workloads on the network;
4.Understanding of the performance and programmability of specific architectures, such as x86, ARM, CUDA GPUs, and RISC-V CPUs. Good knowledge of network protocols;
5.Strong programming skills in C++, with experience in Python and other scripting languages;
6.Familiarity with AI frameworks such as TensorFlow and PyTorch is preferred;
7.Strong problem-solving skills and the ability to analyze and optimize complex code;
8.Excellent communication skills and the ability to work effectively in a team environment;
9.Demonstrated commitment to open-source development and contributions to open-source projects.