[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) #2788