# Instructions for downloading data from the sft-data repository
First, log in to the Hugging Face Hub so that you can access the data:
```py
from huggingface_hub import login
login()
```
Then you can download a single zip file containing all of the SFT data folders:
```py
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
```
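`hf_hub_download` returns the local path of the downloaded file, so the archive still has to be unpacked. A minimal sketch of that step, using only the standard library; the helper names `extract_archive` and `download_and_extract` and the `data` destination folder are illustrative assumptions, not part of this repository:

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract a zip archive into dest_dir and return its top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Zip member names always use "/" as the separator.
        return sorted({name.split("/")[0] for name in archive.namelist()})


def download_and_extract(dest_dir="data"):
    """Download sft-data.zip and unpack it (hypothetical convenience wrapper)."""
    # Imported here so extract_archive can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    zip_path = hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
    return extract_archive(zip_path, dest_dir)
```

Calling `download_and_extract()` after logging in should leave the unpacked folders under `data/`.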
The `sft-data.zip` archive has the following structure:
```
sft-data
│   README.md                       This README file.
└───alf                             Folder for ALFWORLD.
│   │   alfworld.json               The JSON file for ALFWORLD.
│   └───alf_data_folder             Folder for the ALFWORLD environment.
│       │   alf_image_id_0          Folder 0 for ALFWORLD image data.
│       │   alf_image_id_1          Folder 1 for ALFWORLD image data.
│       │   alf_image_id_2          Folder 2 for ALFWORLD image data.
│       │   alf_image_id_3          Folder 3 for ALFWORLD image data.
│       │   alf_image_id_4          Folder 4 for ALFWORLD image data.
└───blackjack                       Folder for the blackjack environment in `gym_cards`.
│   │   blackjack_data_folder       Folder for blackjack image data.
│   │   blackjack.json              The JSON file for blackjack.
└───ezpoints                        Folder for the ezpoints environment in `gym_cards`.
│   │   ezpoints_data_folder        Folder for ezpoints image data.
│   │   ezpoints.json               The JSON file for ezpoints.
└───points24                        Folder for the points24 environment in `gym_cards`.
│   │   points24_data_folder        Folder for points24 image data.
│   │   points24.json               The JSON file for points24.
└───numberline                      Folder for the numberline environment in `gym_cards`.
    │   numberline_data_folder      Folder for numberline image data.
    │   numberline.json             The JSON file for numberline.
```
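After unpacking, it can be useful to confirm that the folder matches the tree above. The sketch below is one way to do that; the `EXPECTED` mapping is transcribed from the tree, while the helper name `missing_entries` is an illustrative assumption:

```python
from pathlib import Path

# Entries expected inside each environment folder, transcribed from the tree above.
EXPECTED = {
    "alf": ["alfworld.json", "alf_data_folder"],
    "blackjack": ["blackjack.json", "blackjack_data_folder"],
    "ezpoints": ["ezpoints.json", "ezpoints_data_folder"],
    "points24": ["points24.json", "points24_data_folder"],
    "numberline": ["numberline.json", "numberline_data_folder"],
}


def missing_entries(root):
    """Return the expected entries (as 'env/name') that are absent under root."""
    root = Path(root)
    missing = []
    for env, names in EXPECTED.items():
        for name in names:
            if not (root / env / name).exists():
                missing.append(f"{env}/{name}")
    return missing
```

An empty list from `missing_entries("data/sft-data")` would indicate that every expected JSON file and image folder is present; the path shown is only an example and depends on where the archive was extracted.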
Alternatively, you can download the files for any one of the five environments. For example, the following code downloads the blackjack data:
```py
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.zip")  # zip archive of the image data folder
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="blackjack.json")  # JSON file
```
For ALFWORLD, note that the zip file for the image data folder is `alf_data_folder.zip`.
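The per-environment file names can be captured in a small helper. This is a sketch under stated assumptions: it assumes that ezpoints, points24, and numberline follow the same `<name>.zip` / `<name>.json` pattern as blackjack, and that the ALFWORLD JSON file is named `alfworld.json` as in the directory tree. The function names `env_filenames` and `download_env` are hypothetical, and the repository ID is left for the caller to supply.

```python
def env_filenames(env):
    """Return (zip_name, json_name) for one environment folder name."""
    if env == "alf":
        # ALFWORLD is the exception: its image archive is alf_data_folder.zip.
        # The JSON name is assumed from the directory tree.
        return "alf_data_folder.zip", "alfworld.json"
    if env in ("blackjack", "ezpoints", "points24", "numberline"):
        # Assumed to follow the blackjack naming pattern.
        return f"{env}.zip", f"{env}.json"
    raise ValueError(f"unknown environment: {env!r}")


def download_env(env, repo_id):
    """Download the image archive and JSON file for env; return their local paths."""
    # Imported here so env_filenames can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    zip_name, json_name = env_filenames(env)
    zip_path = hf_hub_download(repo_id=repo_id, filename=zip_name)
    json_path = hf_hub_download(repo_id=repo_id, filename=json_name)
    return zip_path, json_path
```

For example, `download_env("alf", repo_id)` would request `alf_data_folder.zip` and `alfworld.json` from the repository passed in `repo_id`.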