D0k-tor commited on
Commit
9f60554
·
1 Parent(s): dff288f

Update app.py

Browse files
Files changed (1) hide show
  1. app.py +2 -1
app.py CHANGED
@@ -36,11 +36,12 @@ with gr.Blocks() as demo:
36
  """
37
  <div style="text-align: center; max-width: 1200px; margin: 20px auto;">
38
  <h1 style="font-weight: 900; font-size: 3rem; margin: 0rem">
39
- ViT Image-to-Text with LORA
40
  </h1>
41
  <h2 style="text-align: left; font-weight: 450; font-size: 1rem; margin-top: 2rem; margin-bottom: 1.5rem">
42
  In the field of large language models, the challenge of fine-tuning has long perplexed researchers. Microsoft, however, has unveiled an innovative solution called <b>Low-Rank Adaptation (LoRA)</b>. With the emergence of behemoth models like GPT-3 boasting billions of parameters, the cost of fine-tuning them for specific tasks or domains has become exorbitant.
43
  <br>
 
44
  LoRA offers a groundbreaking approach by freezing the weights of pre-trained models and introducing trainable layers known as <b>rank-decomposition matrices in each transformer block</b>. This ingenious technique significantly reduces the number of trainable parameters and minimizes GPU memory requirements, as gradients no longer need to be computed for the majority of model weights.
45
  <br>
46
  <br>
 
36
  """
37
  <div style="text-align: center; max-width: 1200px; margin: 20px auto;">
38
  <h1 style="font-weight: 900; font-size: 3rem; margin: 0rem">
39
+ 📸 ViT Image-to-Text with LORA 📝
40
  </h1>
41
  <h2 style="text-align: left; font-weight: 450; font-size: 1rem; margin-top: 2rem; margin-bottom: 1.5rem">
42
  In the field of large language models, the challenge of fine-tuning has long perplexed researchers. Microsoft, however, has unveiled an innovative solution called <b>Low-Rank Adaptation (LoRA)</b>. With the emergence of behemoth models like GPT-3 boasting billions of parameters, the cost of fine-tuning them for specific tasks or domains has become exorbitant.
43
  <br>
44
+ <br>
45
  LoRA offers a groundbreaking approach by freezing the weights of pre-trained models and introducing trainable layers known as <b>rank-decomposition matrices in each transformer block</b>. This ingenious technique significantly reduces the number of trainable parameters and minimizes GPU memory requirements, as gradients no longer need to be computed for the majority of model weights.
46
  <br>
47
  <br>