Converting TensorFlow Checkpoints
================================================

A command-line interface is provided to convert original BERT/GPT/GPT-2/Transformer-XL/XLNet/XLM checkpoints into models that can be loaded using the ``from_pretrained`` methods of the library.

.. note::
    Since version 2.3.0 the conversion script is part of the transformers CLI (**transformers-cli**),
    available in any transformers >= 2.3.0 installation.

    The documentation below reflects the **transformers-cli convert** command format.

BERT
^^^^

You can convert any TensorFlow checkpoint for BERT (in particular `the pre-trained models released by Google <https://github.com/google-research/bert#pre-trained-models>`_\ ) into a PyTorch save file by using the `convert_tf_checkpoint_to_pytorch.py <https://github.com/huggingface/transformers/blob/master/transformers/convert_tf_checkpoint_to_pytorch.py>`_ script.

This CLI takes as input a TensorFlow checkpoint (three files starting with ``bert_model.ckpt``\ ) and the associated configuration file (\ ``bert_config.json``\ ), creates a PyTorch model for this configuration, loads the weights from the TensorFlow checkpoint into the PyTorch model and saves the resulting model in a standard PyTorch save file that can be imported using ``torch.load()`` (see examples in `run_bert_extract_features.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_extract_features.py>`_\ , `run_bert_classifier.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_classifier.py>`_ and `run_bert_squad.py <https://github.com/huggingface/pytorch-pretrained-BERT/tree/master/examples/run_bert_squad.py>`_\ ).

You only need to run this conversion script **once** to get a PyTorch model. You can then disregard the TensorFlow checkpoint (the three files starting with ``bert_model.ckpt``\ ) but be sure to keep the configuration file (\ ``bert_config.json``\ ) and the vocabulary file (\ ``vocab.txt``\ ) as these are needed for the PyTorch model too.
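
The resulting ``pytorch_model.bin`` holds the model's state dict, so it can be loaded back with plain PyTorch and the library's configuration classes. Here is a minimal sketch; the paths are placeholders, and the model class to instantiate (``BertForPreTraining`` below) depends on which heads your checkpoint contains:

.. code-block:: python

   import torch

   from transformers import BertConfig, BertForPreTraining

   # The configuration file kept from the conversion step.
   config = BertConfig.from_json_file("/path/to/bert/uncased_L-12_H-768_A-12/bert_config.json")

   # torch.load() returns the state dict saved by the conversion CLI.
   state_dict = torch.load("/path/to/bert/uncased_L-12_H-768_A-12/pytorch_model.bin")

   # Build a PyTorch model for this configuration and load the converted weights.
   model = BertForPreTraining(config)
   model.load_state_dict(state_dict)
   model.eval()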

To run this specific conversion script you will need to have TensorFlow and PyTorch installed (\ ``pip install tensorflow``\ ). The rest of the repository only requires PyTorch.

Here is an example of the conversion process for a pre-trained ``BERT-Base Uncased`` model:

.. code-block:: shell

   export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12

   transformers-cli convert --model_type bert \
     --tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
     --config $BERT_BASE_DIR/bert_config.json \
     --pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin

You can download Google's pre-trained models for the conversion `here <https://github.com/google-research/bert#pre-trained-models>`__.
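
Alternatively, the converted checkpoint can be used through the ``from_pretrained`` methods mentioned above. A minimal sketch, assuming the converted directory (the path below is hypothetical) uses the file names the library looks for locally (``config.json``, a copy of ``bert_config.json``, next to ``pytorch_model.bin`` and ``vocab.txt``):

.. code-block:: python

   from transformers import BertModel, BertTokenizer

   # Directory containing pytorch_model.bin, config.json (copied from
   # bert_config.json) and vocab.txt from the original checkpoint.
   model = BertModel.from_pretrained("/path/to/converted/bert")
   tokenizer = BertTokenizer.from_pretrained("/path/to/converted/bert")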

OpenAI GPT
^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT model, assuming that your NumPy checkpoint is saved in the same format as the OpenAI pretrained model (see `here <https://github.com/openai/finetune-transformer-lm>`__\ ):

.. code-block:: shell

   export OPENAI_GPT_CHECKPOINT_FOLDER_PATH=/path/to/openai/pretrained/numpy/weights

   transformers-cli convert --model_type gpt \
     --tf_checkpoint $OPENAI_GPT_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT_CONFIG] \
     [--finetuning_task_name OPENAI_GPT_FINETUNED_TASK]


OpenAI GPT-2
^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained OpenAI GPT-2 model (see `here <https://github.com/openai/gpt-2>`__\ ):

.. code-block:: shell

   export OPENAI_GPT2_CHECKPOINT_PATH=/path/to/gpt2/pretrained/weights

   transformers-cli convert --model_type gpt2 \
     --tf_checkpoint $OPENAI_GPT2_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config OPENAI_GPT2_CONFIG] \
     [--finetuning_task_name OPENAI_GPT2_FINETUNED_TASK]

Transformer-XL
^^^^^^^^^^^^^^

Here is an example of the conversion process for a pre-trained Transformer-XL model (see `here <https://github.com/kimiyoung/transformer-xl/tree/master/tf#obtain-and-evaluate-pretrained-sota-models>`__\ ):

.. code-block:: shell

   export TRANSFO_XL_CHECKPOINT_FOLDER_PATH=/path/to/transfo/xl/checkpoint

   transformers-cli convert --model_type transfo_xl \
     --tf_checkpoint $TRANSFO_XL_CHECKPOINT_FOLDER_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config TRANSFO_XL_CONFIG] \
     [--finetuning_task_name TRANSFO_XL_FINETUNED_TASK]


XLNet
^^^^^

Here is an example of the conversion process for a pre-trained XLNet model:

.. code-block:: shell

   export XLNET_CHECKPOINT_PATH=/path/to/xlnet/checkpoint
   export XLNET_CONFIG_PATH=/path/to/xlnet/config

   transformers-cli convert --model_type xlnet \
     --tf_checkpoint $XLNET_CHECKPOINT_PATH \
     --config $XLNET_CONFIG_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--finetuning_task_name XLNET_FINETUNED_TASK]


XLM
^^^

Here is an example of the conversion process for a pre-trained XLM model:

.. code-block:: shell

   export XLM_CHECKPOINT_PATH=/path/to/xlm/checkpoint

   transformers-cli convert --model_type xlm \
     --tf_checkpoint $XLM_CHECKPOINT_PATH \
     --pytorch_dump_output $PYTORCH_DUMP_OUTPUT \
     [--config XLM_CONFIG] \
     [--finetuning_task_name XLM_FINETUNED_TASK]