LWM: Large Wireless Model

This repository contains the implementation of LWM (Large Wireless Model), a pre-trained model for processing and extracting features from wireless communication datasets, specifically DeepMIMO. The instructions below show how to load DeepMIMO data, load the LWM model and its weights, tokenize DeepMIMO scenario data, and generate raw channels, LWM CLS embeddings, or LWM channel embeddings.

How to Use

LWM Inference

Clone the Repository

Clone the Hugging Face repository to your local machine and import its functions using the following code:

import subprocess
import os
import sys
import importlib.util
import torch

# Hugging Face public repository URL
repo_url = "https://huggingface.co/sadjadalikhani/LWM"

# Directory where the repo will be cloned
clone_dir = "./LWM"

# Step 1: Clone the repository if it hasn't been cloned already
if not os.path.exists(clone_dir):
    print(f"Cloning repository from {repo_url} into {clone_dir}...")
    result = subprocess.run(["git", "clone", repo_url, clone_dir], capture_output=True, text=True)

    if result.returncode != 0:
        print(f"Error cloning repository: {result.stderr}")
        sys.exit(1)  # Exit on failure
    print(f"Repository cloned successfully into {clone_dir}")
else:
    print(f"Repository already cloned into {clone_dir}")

# Step 2: Add the cloned directory to Python path
sys.path.append(clone_dir)

# Step 3: Dynamic module import and function exposure
def import_functions_from_file(module_name, file_path):
    try:
        spec = importlib.util.spec_from_file_location(module_name, file_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Extract functions from the module and make them globally accessible
        for function_name in dir(module):
            if callable(getattr(module, function_name)) and not function_name.startswith("__"):
                globals()[function_name] = getattr(module, function_name)

        return module
    except FileNotFoundError:
        print(f"Error: {file_path} not found!")
        sys.exit(1)

# Step 4: Import necessary functions
import_functions_from_file("lwm_model", os.path.join(clone_dir, "lwm_model.py"))
import_functions_from_file("inference", os.path.join(clone_dir, "inference.py"))
import_functions_from_file("load_data", os.path.join(clone_dir, "load_data.py"))
import_functions_from_file("input_preprocess", os.path.join(clone_dir, "input_preprocess.py"))
print("All required functions imported successfully.")
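The dynamic import in Step 3 copies every public callable from a file into the caller's global namespace. The sketch below illustrates the same mechanism in isolation. It writes a throwaway module to a temporary directory (the module name `demo_mod` and the function `greet` are made up for illustration) and, as a deliberate variation, returns the callables in a dictionary rather than writing to `globals()`, which makes name collisions easier to spot.

```python
import importlib.util
import os
import tempfile


def load_callables(module_name, file_path):
    """Load a Python file by path and return its public callables as a dict."""
    if not os.path.isfile(file_path):
        raise FileNotFoundError(file_path)
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Same filter as Step 3: keep callables whose names do not start with "__".
    return {
        name: getattr(module, name)
        for name in dir(module)
        if callable(getattr(module, name)) and not name.startswith("__")
    }


# Hypothetical stand-in for a repository file such as lwm_model.py.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_mod.py")
    with open(demo_path, "w") as fh:
        fh.write("def greet(name):\n    return 'hello ' + name\n")
    exposed = load_callables("demo_mod", demo_path)

print(sorted(exposed))
print(exposed["greet"]("LWM"))
```

Returning a dictionary keeps the imported names out of the global namespace; if you prefer the behavior shown in Step 3, `globals().update(exposed)` would reproduce it.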

Load the LWM Model

After cloning the repository, you can load the LWM model with the following code:

# Step 5: Load the LWM model (with flexibility for the device)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Loading the LWM model on {device}...")
model = LWM.from_pretrained(device=device)

Load the DeepMIMO Dataset

Load the DeepMIMO dataset with this code:

# Step 6: Load dataset (direct call, no module prefix)
print("Loading DeepMIMO dataset...")
deepmimo_data = load_DeepMIMO_data()

Tokenize the DeepMIMO Dataset

After loading the dataset, you can tokenize it for specific scenarios. The table below lists the available scenarios, their cities, and the corresponding DeepMIMO pages:

Scenario	City	Link to DeepMIMO Page
Scenario 0	Denver	DeepMIMO City Scenario 18
Scenario 1	Indianapolis	DeepMIMO City Scenario 15
Scenario 2	Oklahoma	DeepMIMO City Scenario 19
Scenario 3	Fort Worth	DeepMIMO City Scenario 12
Scenario 4	Santa Clara	DeepMIMO City Scenario 11
Scenario 5	San Diego	DeepMIMO City Scenario 7

Operational Settings:

Antennas at BS: 32
Antennas at UEs: 1
Subcarriers: 32
Paths: 20
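
As a quick sanity check on these settings, the size of one channel sample follows from the antenna and subcarrier counts alone, assuming a sample holds one complex coefficient per (BS antenna, UE antenna, subcarrier) triple. The sketch below works through that arithmetic. The `patch_len` value is purely an illustrative assumption about how a sample might be cut into fixed-length tokens, not a confirmed LWM setting.

```python
# Operational settings from the list above.
N_BS_ANTENNAS = 32
N_UE_ANTENNAS = 1
N_SUBCARRIERS = 32

# ASSUMPTION: one complex coefficient per (BS antenna, UE antenna, subcarrier).
complex_entries = N_BS_ANTENNAS * N_UE_ANTENNAS * N_SUBCARRIERS
# Storing real and imaginary parts separately doubles the number of values.
real_values = 2 * complex_entries


def num_patches(total_values, patch_len):
    """Number of fixed-length patches needed to cover total_values (ceiling division)."""
    return -(-total_values // patch_len)


patch_len = 16  # ASSUMPTION: illustrative only, not taken from the repository
print(complex_entries, real_values, num_patches(real_values, patch_len))
```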

Tokenization Code:

You can change which scenarios are tokenized by editing scenario_idxs. In the example below, scenarios 0 and 1 are selected.

# Step 7: Tokenize the dataset
scenario_idxs = torch.arange(2)  # tensor([0, 1]): selects scenarios 0 and 1
print("Tokenizing the dataset...")
preprocessed_chs = tokenizer(deepmimo_data, scenario_idxs, gen_raw=True)

Use the scenario_idxs variable to select specific scenarios from the DeepMIMO dataset.
The dataset will be tokenized according to the chosen scenarios and preprocessing configurations.
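
To pick scenarios by city rather than by position, a small lookup built from the scenario table can help. This is only a sketch: `SCENARIO_CITIES` and `scenario_indices` are hypothetical names that are not part of the repository. The result is a plain list that would still need to be wrapped (for example with `torch.tensor(...)`) before being passed as `scenario_idxs`, and whether `tokenizer` accepts non-contiguous indices is an assumption you should verify.

```python
# Index -> city mapping copied from the scenario table.
SCENARIO_CITIES = {
    0: "Denver",
    1: "Indianapolis",
    2: "Oklahoma",
    3: "Fort Worth",
    4: "Santa Clara",
    5: "San Diego",
}


def scenario_indices(cities):
    """Return the sorted scenario indices for the given city names (case-insensitive)."""
    by_name = {name.lower(): idx for idx, name in SCENARIO_CITIES.items()}
    unknown = [c for c in cities if c.lower() not in by_name]
    if unknown:
        raise ValueError(f"Unknown scenario cities: {unknown}")
    return sorted(by_name[c.lower()] for c in cities)


selected = scenario_indices(["San Diego", "denver"])
print(selected)  # [0, 5]
```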


LWM Inference

Choose the type of data to generate from the tokenized dataset: cls_emb (the LWM CLS embedding), channel_emb (the LWM channel embeddings), or raw (the raw channels):

# Step 8: Generate the dataset for inference (direct call, no module prefix)
input_type = ['cls_emb', 'channel_emb', 'raw'][1]  # Index 1 selects 'channel_emb'; change the index to pick another type
dataset = dataset_gen(preprocessed_chs, input_type, model)
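
Indexing into a list to choose the input type is compact but easy to misread. A slightly more explicit alternative is sketched below. `pick_input_type` is a hypothetical helper, not a function from the repository, and the descriptions only restate the three options named in this section.

```python
# The three options named in this section, with short descriptions.
INPUT_TYPES = {
    "cls_emb": "CLS embedding inferred by LWM",
    "channel_emb": "channel embeddings inferred by LWM",
    "raw": "raw channels",
}


def pick_input_type(name):
    """Validate an input type name and return it unchanged."""
    if name not in INPUT_TYPES:
        raise ValueError(
            f"input_type must be one of {sorted(INPUT_TYPES)}, got {name!r}"
        )
    return name


input_type = pick_input_type("channel_emb")
print(input_type, "->", INPUT_TYPES[input_type])
```

The validated string can then be passed to `dataset_gen` exactly as in Step 8.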

Post-processing for Downstream Task

Use the Dataset in Downstream Tasks

Finally, you can use the generated raw channels and their inferred LWM embeddings in your downstream tasks:
```
# Step 9: Print results
print(f"Dataset generated with shape: {dataset.shape}")
print("Inference completed successfully.")
```
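
A common first step in a downstream task is to split the samples into training and validation sets. The sketch below does this on indices only, using the standard library so that it runs without the model. `split_indices`, the 80/20 ratio, and the sample count of 1000 are illustrative assumptions. With real data, and assuming the first dimension of `dataset` indexes samples, you would pass `dataset.shape[0]` as the sample count and index `dataset` with the returned lists.

```python
import random


def split_indices(n_samples, val_fraction=0.2, seed=0):
    """Shuffle range(n_samples) reproducibly and split it into train/val index lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed -> same split every run
    n_val = int(round(n_samples * val_fraction))
    return indices[n_val:], indices[:n_val]


# ASSUMPTION: 1000 is a placeholder for dataset.shape[0].
train_idx, val_idx = split_indices(1000)
print(len(train_idx), len(val_idx))
```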

Requirements

Python 3.x
PyTorch
Git
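
Before running the steps above, it can be useful to confirm that these requirements are present. The following is a minimal sketch using only the standard library; `check_requirements` is a hypothetical helper, and it checks only that `torch` is importable and that a `git` executable is on the PATH, not that any particular version is installed.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Return a dict mapping each requirement name to True (found) or False."""
    return {
        "python3": sys.version_info.major == 3,
        "torch": importlib.util.find_spec("torch") is not None,
        "git": shutil.which("git") is not None,
    }


status = check_requirements()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```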