Instructions for downloading data from the sft-data repository.
First, log in so that you can access the Hugging Face data:
from huggingface_hub import login
login()
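For scripts or remote machines where an interactive prompt is inconvenient, the login step can be sketched as below. This is a minimal sketch, assuming the access token is stored in the `HF_TOKEN` environment variable; `get_hf_token` and `login_to_hub` are hypothetical helper names, not part of this repository.

```python
import os


def get_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def login_to_hub():
    """Log in with the HF_TOKEN value if present, else fall back to the prompt."""
    # Imported here so the token helper works even without huggingface_hub installed.
    from huggingface_hub import login

    token = get_hf_token()
    if token is None:
        login()  # interactive prompt, same as the snippet above
    else:
        login(token=token)
```

Calling `login_to_hub()` once at the start of a session is enough; the download calls that follow reuse the stored credentials.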
Then you can download a single zip file that contains all of the SFT data folders:
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/sft-data", filename="sft-data.zip")
Note that `sft-data.zip` has the following structure:
sft-data
│   README.md                      This README file.
├───alf                            Folder for ALFWORLD.
│   │   alfworld.json              The JSON file for ALFWORLD.
│   └───alf_data_folder            Folder for the ALFWORLD environment.
│       │   alf_image_id_0         Folder 0 for ALFWORLD image data.
│       │   alf_image_id_1         Folder 1 for ALFWORLD image data.
│       │   alf_image_id_2         Folder 2 for ALFWORLD image data.
│       │   alf_image_id_3         Folder 3 for ALFWORLD image data.
│       │   alf_image_id_4         Folder 4 for ALFWORLD image data.
├───blackjack                      Folder for the blackjack environment in `gym_cards`.
│       blackjack_data_folder      Folder for blackjack image data.
│       blackjack.json             The JSON file for blackjack.
├───ezpoints                       Folder for the ezpoints environment in `gym_cards`.
│       ezpoints_data_folder       Folder for ezpoints image data.
│       ezpoints.json              The JSON file for ezpoints.
├───points24                       Folder for the points24 environment in `gym_cards`.
│       points24_data_folder       Folder for points24 image data.
│       points24.json              The JSON file for points24.
└───numberline                     Folder for the numberline environment in `gym_cards`.
        numberline_data_folder     Folder for numberline image data.
        numberline.json            The JSON file for numberline.
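Once the archive is downloaded, it can be unpacked and checked against the structure listed above. The following is a minimal sketch using only the standard library. The relative paths in `EXPECTED` are copied from the tree, but whether the archive really nests everything under a top-level `sft-data` folder is an assumption, and `extract_archive` and `missing_entries` are hypothetical helper names.

```python
import zipfile
from pathlib import Path

# Paths relative to the extracted sft-data folder, taken from the tree above.
# The exact nesting inside the archive is an assumption.
EXPECTED = ["README.md", "alf/alfworld.json", "alf/alf_data_folder"] + [
    f"{env}/{name}"
    for env in ("blackjack", "ezpoints", "points24", "numberline")
    for name in (f"{env}.json", f"{env}_data_folder")
]


def extract_archive(zip_path, dest_dir):
    """Unpack zip_path into dest_dir (created if needed) and return dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    return dest


def missing_entries(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

A typical use would pass the path returned by `hf_hub_download` to `extract_archive`, then call `missing_entries` on the extracted `sft-data` folder; an empty list means every file and folder from the tree was found.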
Alternatively, you can download the files for any one of the five environments. For example, the following code downloads the blackjack data:
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="LEVI-Project/llava-data", filename="blackjack.zip")  # zip archive of the image data folder
hf_hub_download(repo_id="LEVI-Project/llava-data", filename="blackjack.json")  # JSON file
For ALFWORLD, note that the zip file for the image data folder is named `alf_data_folder.zip`.
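The per-environment naming can be wrapped in a small helper so the ALFWORLD exception is handled in one place. This is a sketch under stated assumptions: only the blackjack file names and the ALFWORLD zip name are given above, so the `<env>.zip` / `<env>.json` pattern for ezpoints, points24, and numberline, and the use of `alfworld.json` as the ALFWORLD JSON file name, are guesses based on the folder listing. `env_filenames` and `download_env` are hypothetical names, and `repo_id` should be copied from the download snippet above.

```python
ENVIRONMENTS = ("alf", "blackjack", "ezpoints", "points24", "numberline")


def env_filenames(env):
    """Return (zip_name, json_name) for one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    if env == "alf":
        # The zip name is stated above; the JSON name is assumed from the tree.
        return "alf_data_folder.zip", "alfworld.json"
    # Assumed pattern, confirmed above only for blackjack.
    return f"{env}.zip", f"{env}.json"


def download_env(env, repo_id):
    """Download both files for env and return their local paths."""
    # Imported here so env_filenames works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=repo_id, filename=name)
        for name in env_filenames(env)
    ]
```

For example, `download_env("blackjack", repo_id)` issues the same two `hf_hub_download` calls as the blackjack snippet. If the files live in a dataset-type repository, `hf_hub_download` may also need `repo_type="dataset"`.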