Spaces: nanotron/ultrascale-playbook

Resources

Questions?

#84 opened 8 months ago by

More resources

#73 opened 8 months ago by

fix typo & a QUESTION about 16-bit training

#121 opened 4 days ago by

TP self attention figure

#120 opened 23 days ago by

typos

#119 opened 2 months ago by

Fix: Link Error on Page 17

#117 opened 3 months ago by

Potential Link Error on Page 17 of the Ultra-Scale Playbook

#116 opened 3 months ago by

Fix typo in memory footprint variable name

#115 opened 3 months ago by

sharing results on trained networks

#114 opened 4 months ago by

TP Question

#113 opened 4 months ago by

Question on the "Summarizing it all" figure

#111 opened 5 months ago by

How to understand the graph "Tensor parallelism with column linear + row Linear"

#109 opened 6 months ago by

Incorrect link in Data Parallelism?

#108 opened 6 months ago by

Thoughts on adding Hybrid Sharded Data Parallel to the guide

#107 opened 6 months ago by

Typo in Sequence Parallelism TO -> TP

#106 opened 6 months ago by

Wrong section title for FSDP?

#105 opened 7 months ago by

A mistake? Weights/grads/optimizer states memory for mixed precision

#104 opened 8 months ago by

Questions about pipeline parallelism

#103 opened 8 months ago by

Widget does not take TP into account for Parameter / Gradient / Optimizer State Sharding

#98 opened 8 months ago by

Am I misunderstanding Zero-1 and Zero-2?

#94 opened 8 months ago by

Fix description of Zero-1

#93 opened 8 months ago by

Few Errors

#86 opened 8 months ago by

How can the following figure be obtained, and is there a way to tag the name of each tensor during profiling?

#83 opened 8 months ago by

Thanks for sharing. Was looking for similar research to get to know about compute (AI+GPU)

#79 opened 8 months ago by

Make it easier to import into reader applications

#77 opened 8 months ago by