Optimizer
----------------------------------------------------
The ``.optimization`` module provides:

- an optimizer with fixed weight decay that can be used to fine-tune models,
- several schedules in the form of schedule objects that inherit from ``_LRSchedule``, and
- a gradient accumulation class to accumulate the gradients of multiple batches.
``AdamW``
~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AdamW
:members:
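
A minimal usage sketch (the toy model, learning rate, and weight decay value are illustrative assumptions, not defaults):

.. code-block:: python

    import torch
    from transformers import AdamW

    # Toy model standing in for a pretrained transformer (illustrative only).
    model = torch.nn.Linear(10, 2)

    # AdamW applies decoupled weight decay: the decay acts directly on the
    # weights instead of being folded into the gradient as an L2 penalty.
    optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)

    loss = torch.nn.functional.cross_entropy(
        model(torch.randn(8, 10)), torch.randint(0, 2, (8,))
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()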
``AdamWeightDecay``
~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.AdamWeightDecay
:members:
.. autofunction:: transformers.create_optimizer
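
``create_optimizer`` builds an ``AdamWeightDecay`` instance together with a warmup-then-decay learning rate schedule for TensorFlow training. A minimal sketch (the step counts are illustrative; depending on the library version, ``create_optimizer`` may return the optimizer alone or an ``(optimizer, schedule)`` tuple):

.. code-block:: python

    from transformers import create_optimizer

    # Warm up for the first 100 steps, then decay over the remaining 900.
    # Depending on the transformers version, this returns the optimizer alone
    # or an (optimizer, lr_schedule) tuple -- check your installed version.
    optimizer = create_optimizer(
        init_lr=5e-5,
        num_train_steps=1000,
        num_warmup_steps=100,
    )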
Schedules
----------------------------------------------------
Learning Rate Schedules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autofunction:: transformers.get_constant_schedule
.. autofunction:: transformers.get_constant_schedule_with_warmup
.. image:: /imgs/warmup_constant_schedule.png
:target: /imgs/warmup_constant_schedule.png
:alt:
.. autofunction:: transformers.get_cosine_schedule_with_warmup
.. image:: /imgs/warmup_cosine_schedule.png
:target: /imgs/warmup_cosine_schedule.png
:alt:
.. autofunction:: transformers.get_cosine_with_hard_restarts_schedule_with_warmup
.. image:: /imgs/warmup_cosine_hard_restarts_schedule.png
:target: /imgs/warmup_cosine_hard_restarts_schedule.png
:alt:
.. autofunction:: transformers.get_linear_schedule_with_warmup
.. image:: /imgs/warmup_linear_schedule.png
:target: /imgs/warmup_linear_schedule.png
:alt:
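
Each of these functions returns a PyTorch scheduler that should be stepped once per optimizer update. A minimal sketch pairing ``AdamW`` with ``get_linear_schedule_with_warmup`` (the toy model, step counts, and batch shapes are illustrative assumptions):

.. code-block:: python

    import torch
    from transformers import AdamW, get_linear_schedule_with_warmup

    # Toy model standing in for a pretrained transformer (illustrative only).
    model = torch.nn.Linear(10, 2)
    optimizer = AdamW(model.parameters(), lr=5e-5)

    # Ramp the learning rate up over 100 steps, then decay it linearly to
    # zero by step 1000.
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=100, num_training_steps=1000
    )

    for step in range(1000):
        loss = torch.nn.functional.cross_entropy(
            model(torch.randn(8, 10)), torch.randint(0, 2, (8,))
        )
        loss.backward()
        optimizer.step()
        scheduler.step()  # advance the schedule once per optimizer step
        optimizer.zero_grad()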
``WarmUp``
~~~~~~~~~~~~~~~~
.. autoclass:: transformers.WarmUp
:members:
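
``WarmUp`` is a ``tf.keras`` learning rate schedule that ramps the rate up from zero before handing off to a decay schedule. A minimal sketch (the decay schedule and step counts are illustrative assumptions):

.. code-block:: python

    import tensorflow as tf
    from transformers import WarmUp

    # After warmup, decay the rate linearly to zero (illustrative choice).
    decay = tf.keras.optimizers.schedules.PolynomialDecay(
        initial_learning_rate=5e-5, decay_steps=900, end_learning_rate=0.0
    )

    # Ramp from 0 to 5e-5 over the first 100 steps, then follow ``decay``.
    lr_schedule = WarmUp(
        initial_learning_rate=5e-5,
        decay_schedule_fn=decay,
        warmup_steps=100,
    )

    optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)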
Gradient Strategies
----------------------------------------------------
``GradientAccumulator``
~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.GradientAccumulator
    :members:
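
A minimal sketch of gradient accumulation in a custom TensorFlow training loop (the toy model, data, and accumulation count are illustrative assumptions):

.. code-block:: python

    import tensorflow as tf
    from transformers import GradientAccumulator

    # Toy model and optimizer standing in for a real setup (illustrative only).
    model = tf.keras.layers.Dense(2)
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    accumulator = GradientAccumulator()
    accum_steps = 4  # apply gradients once every 4 micro-batches

    for step in range(8):
        x = tf.random.normal((8, 10))
        y = tf.random.uniform((8,), maxval=2, dtype=tf.int32)
        with tf.GradientTape() as tape:
            logits = model(x)
            loss = tf.reduce_mean(
                tf.keras.losses.sparse_categorical_crossentropy(
                    y, logits, from_logits=True
                )
            )
        grads = tape.gradient(loss, model.trainable_variables)
        accumulator(grads)  # add this micro-batch's gradients to the sum
        if (step + 1) % accum_steps == 0:
            optimizer.apply_gradients(
                zip(accumulator.gradients, model.trainable_variables)
            )
            accumulator.reset()  # clear the sum before the next group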