Sadjad Alikhani committed
Commit: c92c0a2
Parent(s): 491d623
Update README.md

README.md CHANGED
@@ -171,7 +171,13 @@ else:

### 8. **Tokenize and Load the Model**

-
+Before we dive into tokenizing the dataset and loading the model, let's understand how the tokenization process is adapted to the wireless communication context. Here, **tokenization** refers to segmenting each wireless channel into patches, similar to how Vision Transformers (ViTs) work with images. Each wireless channel is structured as a \(32 \times 32\) matrix, where rows represent antennas and columns represent subcarriers.
+
+The tokenization process involves **dividing the channel matrix into patches**, with each patch containing information from 16 consecutive subcarriers. These patches are then **embedded** into a 64-dimensional space, giving the Transformer richer context for each patch. **Positional encodings** are added in this process to preserve the structural relationships within the channel, ensuring the Transformer captures both spatial and frequency dependencies.
+
+If you choose to apply **Masked Channel Modeling (MCM)** during inference (by setting `gen_raw=False`), LWM will mask certain patches, as it did during pre-training. However, for standard inference, masking isn't necessary unless you want to test LWM's resilience to noisy inputs.
+
+Now, let's tokenize the dataset and load the pre-trained LWM model.

```python
from input_preprocess import tokenizer

@@ -189,6 +195,8 @@ print(f"Loading the LWM model on {device}...")
model = lwm.from_pretrained(device=device)
```

+With this setup, you're ready to pass your tokenized wireless channels through the pre-trained model and extract rich, context-aware embeddings for use in downstream tasks.
+
---

### 9. **Perform Inference**
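The paragraphs added in this commit describe a concrete pipeline: a \(32 \times 32\) channel matrix, patches of 16 consecutive subcarriers, 64-dimensional patch embeddings, positional encodings, and optional MCM masking. Below is a minimal sketch of that pipeline for readers who want to see the shapes involved. It is not the actual `input_preprocess.tokenizer` implementation: the per-antenna patch layout, the randomly initialized linear embedding, the sinusoidal positional encoding, and the `mask_ratio` parameter are all illustrative assumptions; only the channel shape, patch size, and embedding dimension come from the text above.

```python
# Illustrative sketch only -- NOT the actual input_preprocess.tokenizer.
# Assumptions: patches are taken per antenna row, the embedding is a randomly
# initialized nn.Linear, positional encodings are sinusoidal, and masking
# simply zeroes patches (the real model may use a learned mask token).
import math
import torch

N_ANT, N_SUB = 32, 32   # channel matrix: rows = antennas, cols = subcarriers
PATCH_LEN = 16          # each patch spans 16 consecutive subcarriers
EMBED_DIM = 64          # patches are embedded into a 64-dimensional space


def tokenize_channel(channel: torch.Tensor, gen_raw: bool = True,
                     mask_ratio: float = 0.15) -> torch.Tensor:
    """Turn one (32, 32) channel into embedded, position-encoded patch tokens."""
    # 1. Patchify: split each antenna row into groups of 16 subcarriers,
    #    giving 32 * (32 / 16) = 64 patches of length 16 under our assumptions.
    patches = channel.reshape(N_ANT * N_SUB // PATCH_LEN, PATCH_LEN)

    # 2. Embed each 16-value patch into a 64-dimensional space. The real model
    #    learns this projection; here it is random, purely for illustration.
    embed = torch.nn.Linear(PATCH_LEN, EMBED_DIM)
    tokens = embed(patches)                       # (64 patches, 64 dims)

    # 3. Add positional encodings so the Transformer can recover where each
    #    patch sits in the antenna/subcarrier grid.
    pos = torch.arange(tokens.size(0)).unsqueeze(1)   # (64, 1)
    div = torch.exp(torch.arange(0, EMBED_DIM, 2) * (-math.log(10000.0) / EMBED_DIM))
    pe = torch.zeros_like(tokens)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    tokens = tokens + pe

    # 4. Optional MCM-style masking, mirroring pre-training when gen_raw=False.
    if not gen_raw:
        masked = torch.rand(tokens.size(0)) < mask_ratio
        tokens[masked] = 0.0                      # hide the masked patches
    return tokens


# One random channel realization, with MCM-style masking enabled.
tokens = tokenize_channel(torch.randn(N_ANT, N_SUB), gen_raw=False)
print(tokens.shape)  # torch.Size([64, 64])
```

In the repository itself these steps are handled internally: `tokenizer` from `input_preprocess` produces the patch tokens, and passing them through the model returned by `lwm.from_pretrained(device=device)` yields the context-aware embeddings mentioned above.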