Sadjad Alikhani committed on
Commit c92c0a2
1 Parent(s): 491d623

Update README.md

Files changed (1):
  1. README.md +9 -1
README.md CHANGED
@@ -171,7 +171,13 @@ else:
 
 ### 8. **Tokenize and Load the Model**
 
-Now, tokenize the dataset and load the pre-trained LWM model.
+Before we dive into tokenizing the dataset and loading the model, let's understand how the tokenization process is adapted to the wireless communication context. In this case, **tokenization** refers to segmenting each wireless channel into patches, similar to how Vision Transformers (ViTs) work with images. Each wireless channel is structured as a \(32 \times 32\) matrix, where rows represent antennas and columns represent subcarriers.
+
+The tokenization process involves **dividing the channel matrix into patches**, with each patch containing information from 16 consecutive subcarriers. These patches are then **embedded** into a 64-dimensional space, providing the Transformer with richer context for each patch. **Positional encodings** are added to preserve the structural relationships within the channel, ensuring the Transformer captures both spatial and frequency dependencies.
+
+If you choose to apply **Masked Channel Modeling (MCM)** during inference (by setting `gen_raw=False`), LWM will mask certain patches, as it did during pre-training. For standard inference, however, masking isn't necessary unless you want to test LWM's resilience to noisy inputs.
+
+Now, let's tokenize the dataset and load the pre-trained LWM model.
 
 ```python
 from input_preprocess import tokenizer
@@ -189,6 +195,8 @@ print(f"Loading the LWM model on {device}...")
 model = lwm.from_pretrained(device=device)
 ```
 
+With this setup, you're ready to pass your tokenized wireless channels through the pre-trained model, extracting rich, context-aware embeddings ready for use in downstream tasks.
+
 ---
 
 ### 9. **Perform Inference**
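The patch tokenization the added README text describes can be sketched in a few lines of NumPy. This is a minimal illustration only, assuming row-wise patches of 16 consecutive subcarriers and a random linear projection standing in for LWM's learned patch embedding; the actual logic lives in `input_preprocess.tokenizer` and may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy wireless channel: 32 antennas (rows) x 32 subcarriers (columns).
channel = rng.standard_normal((32, 32))

# Tokenize: split each antenna row into patches of 16 consecutive
# subcarriers, giving 32 * (32 // 16) = 64 patches of length 16.
patch_len = 16
patches = channel.reshape(32, 32 // patch_len, patch_len).reshape(-1, patch_len)

# Embed each 16-dim patch into a 64-dim space with a random projection
# (a stand-in for LWM's learned embedding, for illustration only).
embed_dim = 64
W = rng.standard_normal((patch_len, embed_dim)) / np.sqrt(patch_len)
tokens = patches @ W

# Add sinusoidal positional encodings so the Transformer can recover
# each patch's position within the channel matrix.
positions = np.arange(tokens.shape[0])[:, None]
dims = np.arange(embed_dim)[None, :]
pos_enc = np.where(
    dims % 2 == 0,
    np.sin(positions / 10000 ** (dims / embed_dim)),
    np.cos(positions / 10000 ** ((dims - 1) / embed_dim)),
)
tokens = tokens + pos_enc

print(tokens.shape)  # (64, 64): 64 patch tokens, each a 64-dim embedding
```

The resulting sequence of 64 position-aware patch embeddings is the kind of input a Transformer encoder like LWM consumes, analogous to ViT image patches.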