python train.py --data_dir data --model_dir checkpoints/400m --cfg baseline_355m --seq_len 1024 --batch_size 2 --grad_accum 16 --lr 3e-4 --warmup_steps 2000 --max ...
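As a rough sanity check on the flags above, the effective batch per optimizer step can be worked out from `--batch_size`, `--grad_accum`, and `--seq_len`. This is a minimal sketch under assumptions the command does not confirm: that `--grad_accum` counts micro-batches accumulated before each optimizer step, that `--batch_size` is per device, and that training runs on a single device (`num_devices` is a hypothetical name, not a flag of `train.py`).

```python
# Values copied from the visible part of the command.
batch_size = 2    # --batch_size (assumed to be per device)
grad_accum = 16   # --grad_accum (assumed: micro-batches per optimizer step)
seq_len = 1024    # --seq_len

# Assumption: device count is not given in the command; 1 is a placeholder.
num_devices = 1

# Sequences and tokens contributing to a single optimizer step.
sequences_per_step = batch_size * grad_accum * num_devices
tokens_per_step = sequences_per_step * seq_len

print(f"sequences per optimizer step: {sequences_per_step}")
print(f"tokens per optimizer step: {tokens_per_step}")
```

If the run actually uses several devices, scale `num_devices` accordingly; the rest of the arithmetic is unchanged.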