Sadjad Alikhani committed
Commit be1a7b3
1 Parent(s): e8c0c9d

Update README.md

Files changed (1)
  1. README.md +94 -138
README.md CHANGED
@@ -35,14 +35,14 @@ Once you have Conda or Mamba installed, follow these steps to create a new envir

 #### **Step 1: Create a new environment**

- You can create a new environment called `lwm_env` (or any other name) with Python 3.9 or any required version:

 ```bash
 # If you're using Conda:
- conda create -n lwm_env python=3.9

 # If you're using Mamba:
- mamba create -n lwm_env python=3.9
 ```

 #### **Step 2: Activate the environment**
@@ -56,82 +56,76 @@ conda activate lwm_env

 ---

- ### 3. **Clone the Repository**

- After setting up the environment, clone the Hugging Face repository to your local machine using the following Python code:

- ```python
- import subprocess
- import os
- import sys
- import importlib.util
- import torch

- # Hugging Face public repository URL
- repo_url = "https://huggingface.co/sadjadalikhani/LWM"

- # Directory where the repo will be cloned
- clone_dir = "./LWM"

- # Step 1: Clone the repository if it hasn't been cloned already
- if not os.path.exists(clone_dir):
-     print(f"Cloning repository from {repo_url} into {clone_dir}...")
-     result = subprocess.run(["git", "clone", repo_url, clone_dir], capture_output=True, text=True)
-
-     if result.returncode != 0:
-         print(f"Error cloning repository: {result.stderr}")
-         sys.exit(1)
-     print(f"Repository cloned successfully into {clone_dir}")
- else:
-     print(f"Repository already cloned into {clone_dir}")
-
- # Step 2: Add the cloned directory to Python path
- sys.path.append(clone_dir)
-
- # Step 3: Import necessary functions
- def import_functions_from_file(module_name, file_path):
-     try:
-         spec = importlib.util.spec_from_file_location(module_name, file_path)
-         module = importlib.util.module_from_spec(spec)
-         spec.loader.exec_module(module)
-
-         for function_name in dir(module):
-             if callable(getattr(module, function_name)) and not function_name.startswith("__"):
-                 globals()[function_name] = getattr(module, function_name)
-         return module
-     except FileNotFoundError:
-         print(f"Error: {file_path} not found!")
-         sys.exit(1)
-
- # Step 4: Import functions from the repository
- import_functions_from_file("lwm_model", os.path.join(clone_dir, "lwm_model.py"))
- import_functions_from_file("inference", os.path.join(clone_dir, "inference.py"))
- import_functions_from_file("load_data", os.path.join(clone_dir, "load_data.py"))
- import_functions_from_file("input_preprocess", os.path.join(clone_dir, "input_preprocess.py"))
- print("All required functions imported successfully.")
 ```

 ---

- ### 4. **Install Required Packages**

- Install the necessary packages inside your new environment.

- ```bash
- # If you're using Conda:
- conda install pytorch torchvision torchaudio -c pytorch
- pip install -r requirements.txt

- # If you're using Mamba:
- mamba install pytorch torchvision torchaudio -c pytorch
- pip install -r requirements.txt
 ```

- This will install **PyTorch**, **Torchvision**, and other required dependencies from the `requirements.txt` file in the cloned repository.
-
 ---

- ### 5. **Load the DeepMIMO Dataset**

 Before proceeding with tokenization and data processing, the **DeepMIMO** dataset—or any dataset generated using the operational settings outlined below—must first be loaded. The table below provides a list of available datasets and their respective links for further details:

@@ -155,100 +149,62 @@ The operational settings below were used in generating the datasets for both the
 - **Antennas at UEs**: 1
 - **Subcarriers**: 32
 - **Paths**: 20
-
- #### **Load Data Code**:
- Select and load specific datasets by adjusting the `dataset_idxs`. In the example below, we select the first two datasets.
-
 ```python
- # Step 5: Load the DeepMIMO dataset
- print("Loading the DeepMIMO dataset...")
-
- # Load the DeepMIMO dataset
- deepmimo_data = load_DeepMIMO_data()
-
- # Select datasets to load
- dataset_idxs = torch.arange(2)  # Adjust the number of datasets as needed
- print("DeepMIMO dataset loaded successfully.")
 ```

 ---

- ### 6. **Tokenize the DeepMIMO Dataset**
-
- After loading the data, tokenize the selected **DeepMIMO** datasets. This step prepares the data for the model to process.
-
- #### **Tokenization Code**:
-
 ```python
- # Step 6: Tokenize the dataset
- print("Tokenizing the DeepMIMO dataset...")
-
- # Tokenize the loaded datasets
- preprocessed_chs = tokenizer(deepmimo_data, dataset_idxs, gen_raw=True)
- print("Dataset tokenized successfully.")
 ```

 ---

- ### 7. **Load the LWM Model**

- Once the dataset is tokenized, load the pre-trained **LWM** model using the following code:

- ```python
- # Step 7: Load the LWM model (with flexibility for the device)
 device = 'cuda' if torch.cuda.is_available() else 'cpu'
 print(f"Loading the LWM model on {device}...")
- model = LWM.from_pretrained(device=device)
- ```
-
- ---
-
- ### 8. **LWM Inference**
-
- Once the dataset is tokenized and the model is loaded, generate either **raw channels** or the **inferred LWM embeddings** by choosing the input type.
-
- ```python
- # Step 8: Generate the dataset for inference
- input_type = ['cls_emb', 'channel_emb', 'raw'][1]  # Modify input type as needed
- dataset = dataset_gen(preprocessed_chs, input_type, model)
 ```

- You can choose between:
- - `cls_emb`: LWM CLS token embeddings
- - `channel_emb`: LWM channel embeddings
- - `raw`: Raw wireless channel data
-
 ---

- ### 9. **Post-processing for Downstream Task**
-
- #### **Use the Dataset in Downstream Tasks**
-
- Finally, use the generated dataset for your downstream tasks, such as classification, prediction, or analysis.
-
 ```python
- # Step 9: Print results
- print(f"Dataset generated with shape: {dataset.shape}")
- print("Inference completed successfully.")
- ```
-
- ---
-
- ## 📋 **Requirements**
-
- - **Python 3.x**
- - **PyTorch**
- - **Git**

 ---
-
- ### Summary of Steps:
-
- 1. **Install Conda/Mamba**: Install a package manager for environment management.
- 2. **Create Environment**: Use Conda or Mamba to create a new environment.
- 3. **Clone the Repository**: Download the project files from Hugging Face.
- 4. **Install Packages**: Install PyTorch and other dependencies.
- 5. **Load and Tokenize Data**: Load the DeepMIMO dataset and prepare it for the model.
- 6. **Load Model and Perform Inference**: Use the LWM model for generating embeddings or raw channels.

 #### **Step 1: Create a new environment**

+ You can create a new environment called `lwm_env` (or any other name) with Python 3.12 or any required version:

 ```bash
 # If you're using Conda:
+ conda create -n lwm_env python=3.12

 # If you're using Mamba:
+ mamba create -n lwm_env python=3.12
 ```

 #### **Step 2: Activate the environment**
 

 ---

+ #### **Step 3: Install Required Packages**

+ Install the necessary packages inside your new environment.

+ ```bash
+ # If you're using Conda:
+ conda install pytorch torchvision torchaudio -c pytorch
+ pip install -r requirements.txt

+ # If you're using Mamba:
+ mamba install pytorch torchvision torchaudio -c pytorch
+ pip install -r requirements.txt
+ ```
+ ---
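
A quick, optional way to confirm the environment is ready, as a minimal sketch using only standard PyTorch calls:

```python
# Optional check: confirm PyTorch imports and report whether a GPU is visible.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```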

+ ### 2. **Required Functions to Clone Datasets**

+ ```python
+ import subprocess
+ import os

+ # Function to clone a specific dataset scenario folder
+ def clone_dataset_scenario(scenario_name, repo_url, model_repo_dir="./LWM", scenarios_dir="scenarios"):
+     # Create the scenarios directory if it doesn't exist
+     scenarios_path = os.path.join(model_repo_dir, scenarios_dir)
+     if not os.path.exists(scenarios_path):
+         os.makedirs(scenarios_path)
+
+     scenario_path = os.path.join(scenarios_path, scenario_name)
+
+     # Initialize sparse checkout for the dataset repository
+     if not os.path.exists(os.path.join(scenarios_path, ".git")):
+         print(f"Initializing sparse checkout in {scenarios_path}...")
+         subprocess.run(["git", "clone", "--sparse", repo_url, "."], cwd=scenarios_path, check=True)
+         subprocess.run(["git", "sparse-checkout", "init", "--cone"], cwd=scenarios_path, check=True)
+         subprocess.run(["git", "lfs", "install"], cwd=scenarios_path, check=True)  # Install Git LFS if needed
+
+     # Add the requested scenario folder to sparse checkout
+     print(f"Adding {scenario_name} to sparse checkout...")
+     subprocess.run(["git", "sparse-checkout", "add", scenario_name], cwd=scenarios_path, check=True)
+
+     # Pull large files if needed (using Git LFS)
+     subprocess.run(["git", "lfs", "pull"], cwd=scenarios_path, check=True)
+
+     print(f"Successfully cloned {scenario_name} into {scenarios_path}.")
+
+ # Function to clone multiple dataset scenarios
+ def clone_dataset_scenarios(selected_scenario_names, dataset_repo_url, model_repo_dir):
+     for scenario_name in selected_scenario_names:
+         clone_dataset_scenario(scenario_name, dataset_repo_url, model_repo_dir)
 ```
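
As a usage sketch, a single scenario folder can also be fetched directly with the helper above; the repository URL and scenario name here are the same ones used in the dataset-cloning step later on:

```python
# Example: pull just one scenario folder via sparse checkout
# (see the batched clone_dataset_scenarios call in the dataset step below).
dataset_repo_url = "https://huggingface.co/datasets/sadjadalikhani/lwm"
clone_dataset_scenario("city_18_denver", dataset_repo_url, model_repo_dir="./LWM")
```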

 ---

+ ### 3. **Clone the Model**

+ ```python
+ # Step 1: Clone the model repository (if not already cloned)
+ model_repo_url = "https://huggingface.co/sadjadalikhani/lwm"
+ model_repo_dir = "./LWM"

+ if not os.path.exists(model_repo_dir):
+     print(f"Cloning model repository from {model_repo_url}...")
+     subprocess.run(["git", "clone", model_repo_url, model_repo_dir], check=True)
 ```

 ---

+ ### 4. **Clone the Desired Datasets**

 Before proceeding with tokenization and data processing, the **DeepMIMO** dataset—or any dataset generated using the operational settings outlined below—must first be loaded. The table below provides a list of available datasets and their respective links for further details:
 
 
 - **Antennas at UEs**: 1
 - **Subcarriers**: 32
 - **Paths**: 20
+
 ```python
+ import numpy as np  # needed for the scenario name/index arrays below
+
+ # Step 2: Clone specific dataset scenario folder(s) inside the "scenarios" folder
+ dataset_repo_url = "https://huggingface.co/datasets/sadjadalikhani/lwm"  # Base URL for the dataset repo
+ scenario_names = np.array(["city_18_denver",
+                            "city_15_indianapolis",
+                            "city_19_oklahoma",
+                            "city_12_fortworth",
+                            "city_11_santaclara",
+                            "city_7_sandiego"])
+ scenario_idxs = np.array([3])  # Select the scenario(s) to download by index
+ selected_scenario_names = scenario_names[scenario_idxs]
+
+ # Clone the requested scenario folders
+ clone_dataset_scenarios(selected_scenario_names, dataset_repo_url, model_repo_dir)
 ```
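
To confirm the selected scenario folders landed where expected, a small optional check such as the following can help; it only assumes the `scenarios` directory layout created by the helper functions above and the `model_repo_dir` variable from the model-cloning step:

```python
import os

# List the scenario folders now present under ./LWM/scenarios
scenarios_path = os.path.join(model_repo_dir, "scenarios")
print("Available scenarios:", sorted(os.listdir(scenarios_path)))
```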

 ---

+ ### 5. **Change the Working Directory to the LWM Folder**

 ```python
+ if os.path.exists(model_repo_dir):
+     os.chdir(model_repo_dir)
+     print(f"Changed working directory to {os.getcwd()}")
+ else:
+     print(f"Directory {model_repo_dir} does not exist. Please check that the repository was cloned properly.")
 ```
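
If changing the working directory is inconvenient, an equivalent alternative (a sketch, not the route this README prescribes) is to put the cloned folder on the import path instead, much like the earlier revision of these instructions did:

```python
import os
import sys

# Alternative to os.chdir: make the repo's modules importable from anywhere
sys.path.append(os.path.abspath(model_repo_dir))
```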

 ---

+ ### 6. **Tokenize and Load the Model**
+ ```python
+ from input_preprocess import tokenizer
+ from lwm_model import lwm
+ import torch

+ preprocessed_chs = tokenizer(selected_scenario_names=selected_scenario_names,
+                              manual_data=None,
+                              gen_raw=True)

 device = 'cuda' if torch.cuda.is_available() else 'cpu'
 print(f"Loading the LWM model on {device}...")
+ model = lwm.from_pretrained(device=device)
 ```
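
Once the model is loaded, a quick look at its size is a useful sanity check; this sketch assumes `lwm` is a standard `torch.nn.Module`:

```python
# Count the parameters of the loaded LWM model (assumes a torch.nn.Module)
num_params = sum(p.numel() for p in model.parameters())
print(f"LWM loaded on {device} with {num_params:,} parameters")
```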

 ---

+ ### 7. **Perform Inference**

 ```python
+ from inference import lwm_inference, create_raw_dataset
+ input_types = ['cls_emb', 'channel_emb', 'raw']
+ selected_input_type = input_types[0]
+ if selected_input_type in ['cls_emb', 'channel_emb']:
+     dataset = lwm_inference(preprocessed_chs, selected_input_type, model, device)
+ else:
+     dataset = create_raw_dataset(preprocessed_chs, device)
 ```

 ---
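
Before feeding the result into a downstream task such as classification, prediction, or analysis, a quick shape check confirms the inference output; this mirrors the final step of the earlier revision of this README and assumes `dataset` exposes a `.shape` attribute:

```python
# Inspect the generated dataset before using it downstream
print(f"Dataset generated with shape: {dataset.shape}")
print("Inference completed successfully.")
```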