Update README.md

README.md CHANGED
@@ -12,7 +12,7 @@ language:
<!-- Provide a quick summary of what the model is/does. -->

A reproduced LLaVA LVLM based on the Llama-3-8B LLM backbone. Not an official implementation.
-Please follow my reproduced implementation [LLaVA-Llama-3](https://github.com/Victorwz/LLaVA-Llama-3/) for more details on fine-tuning the LLaVA model with Llama-3 as the foundation LLM.
+Please follow my reproduced implementation [LLaVA-Video-Llama-3](https://github.com/Victorwz/LLaVA-Video-Llama-3/) for more details on fine-tuning the LLaVA model with Llama-3 as the foundation LLM.

## Updates
- [5/14/2024] The codebase has been upgraded to llava-next (llava-v1.6). Now it supports the latest llama-3, phi-3, mistral-v0.1-7b models.

@@ -24,7 +24,7 @@ Follows the LLaVA-1.5 pre-train and supervised fine-tuning pipeline. You do not need

Please first install llava via
```
-pip install git+https://github.com/Victorwz/LLaVA-Llama-3.git
+pip install git+https://github.com/Victorwz/LLaVA-Video-Llama-3.git
```

You can load the model and perform inference as follows:

@@ -76,7 +76,7 @@ In the background, there are two cars parked on the street, one on the left side
```

# Fine-Tune LLaVA-Llama-3 on Your Visual Instruction Data
-Please refer to
+Please refer to our [LLaVA-Video-Llama-3](https://github.com/Victorwz/LLaVA-Video-Llama-3) git repo for fine-tuning data preparation and scripts. The data loading function and the fastchat conversation template are changed due to a different tokenizer.

## Benchmark Results
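
Note: the inference snippet that follows "You can load the model and perform inference as follows:" (README lines 31-75) is unchanged by this commit and therefore not shown in the diff. For orientation only, here is a minimal sketch of that step, assuming the fork keeps the upstream LLaVA quick-start helpers (`llava.mm_utils.get_model_name_from_path`, `llava.eval.run_llava.eval_model`) and using a placeholder checkpoint id; the snippet actually shipped in the README may differ.

```python
# Minimal sketch, not the README's actual snippet. Assumes the fork keeps the
# upstream LLaVA helpers; the checkpoint id below is a placeholder.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "weizhiwang/LLaVA-Llama-3-8B"  # placeholder checkpoint id

# eval_model wraps model loading, image preprocessing, and generation.
args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": "Describe the image in detail.",
    "conv_mode": None,  # the Llama-3 tokenizer may need an explicit conversation template
    "image_file": "https://llava-vl.github.io/static/web/view.jpg",
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```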