LEE
Outrageously large neural networks - the sparsely-gated mixture-of-experts layer [2017]
Jun 19, 2024
#paper-review
Background