Sadjad Alikhani

📡 LWM: Large Wireless Model

🚀 Click here to try the Interactive Demo!

Welcome to the LWM (Large Wireless Model) repository! This project hosts a pre-trained model designed to process and extract features from wireless communication datasets, specifically the DeepMIMO dataset. Follow the instructions below to set up your environment, install the required packages, clone the repository, load the data, and perform inference with LWM.


🛠 How to Use

1. Install Conda or Mamba (via Miniforge)

First, you need to have a package manager like Conda or Mamba (a faster alternative) installed to manage your Python environments and packages.

Option A: Install Conda

If you prefer to use Conda, you can download and install Anaconda or Miniconda.

  • Anaconda includes a full scientific package suite, but it is larger in size. Download it here.
  • Miniconda is a lightweight version that only includes Conda and Python. Download it here.

Option B: Install Mamba (via Miniforge)

Mamba is a much faster alternative to Conda. You can install Mamba by installing Miniforge.

  • Miniforge is a smaller, community-based installer for Conda that includes Mamba. Download it here.

After installation, you can use conda for environment management.


2. Create a New Environment

Once you have Conda installed (https://conda.io/projects/conda/en/latest/user-guide/install/index.html), follow these steps to create a new environment and install the necessary packages.

Step 1: Create a new environment

You can create a new environment called lwm_env:

conda create -n lwm_env

Step 2: Activate the environment

Activate the environment you just created:

conda activate lwm_env
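
To confirm from inside Python that the interpreter is running in the environment created above, a minimal sketch follows. It relies on the CONDA_DEFAULT_ENV variable that `conda activate` exports; the helper names are illustrative and not part of LWM.

```python
import os


def active_conda_env():
    """Return the name of the active Conda environment, or None if none is active."""
    # `conda activate` exports CONDA_DEFAULT_ENV with the environment name.
    return os.environ.get("CONDA_DEFAULT_ENV")


def in_expected_env(expected="lwm_env"):
    """True when the active environment matches the one created for LWM."""
    return active_conda_env() == expected


print(f"Active Conda environment: {active_conda_env()}")
```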

Step 3: Install Required Packages

Install the necessary packages inside your new environment.

Install CUDA-enabled PyTorch

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Note: If you have trouble installing the CUDA-enabled PyTorch, make sure the CUDA version is compatible with your system. Problems can also arise if you have run several different install commands in the same environment; in that case, try a fresh environment.
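
One way to check whether the installation worked is to ask PyTorch what it can see. The sketch below is illustrative (the `cuda_report` helper is not part of LWM) and deliberately tolerates PyTorch being absent so it can be run at any point during setup.

```python
import importlib.util


def cuda_report():
    """Summarize whether PyTorch is importable and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "cuda_build": None}
    import torch

    return {
        "torch_installed": True,
        # True only if a usable CUDA device and driver are present.
        "cuda_available": torch.cuda.is_available(),
        # CUDA version the binary was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
    }


print(cuda_report())
```

If `cuda_build` is None, a CPU-only build was installed; if it is set but `cuda_available` is False, the driver or GPU is the more likely culprit.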

Install other required packages from conda-forge

conda install python numpy pandas matplotlib tqdm -c conda-forge

Install DeepMIMOv3 with pip

pip install DeepMIMOv3
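
After the three install commands, a quick check that every package is importable can save time later. This is a sketch only: the module list mirrors the packages named above, and the import name `DeepMIMOv3` is assumed to match the pip package name.

```python
import importlib.util

# Import names for the packages installed above.
# "DeepMIMOv3" is assumed to be importable under the same name as the pip package.
REQUIRED_MODULES = ["torch", "numpy", "pandas", "matplotlib", "tqdm", "DeepMIMOv3"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("All packages found." if not missing else f"Missing packages: {missing}")
```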

3. Required Functions to Clone Datasets

The following functions will help you clone specific dataset scenarios:

import subprocess
import os

# Function to clone a specific dataset scenario folder
def clone_dataset_scenario(scenario_name, repo_url, model_repo_dir="./LWM", scenarios_dir="scenarios"):
    # Create the scenarios directory if it doesn't exist
    scenarios_path = os.path.join(model_repo_dir, scenarios_dir)
    if not os.path.exists(scenarios_path):
        os.makedirs(scenarios_path)

    scenario_path = os.path.join(scenarios_path, scenario_name)

    # Initialize sparse checkout for the dataset repository
    if not os.path.exists(os.path.join(scenarios_path, ".git")):
        print(f"Initializing sparse checkout in {scenarios_path}...")
        subprocess.run(["git", "clone", "--sparse", repo_url, "."], cwd=scenarios_path, check=True)
        subprocess.run(["git", "sparse-checkout", "init", "--cone"], cwd=scenarios_path, check=True)
        subprocess.run(["git", "lfs", "install"], cwd=scenarios_path, check=True)  # Set up Git LFS hooks (Git LFS itself must already be installed)

    # Add the requested scenario folder to sparse checkout
    print(f"Adding {scenario_name} to sparse checkout...")
    subprocess.run(["git", "sparse-checkout", "add", scenario_name], cwd=scenarios_path, check=True)
    
    # Pull large files if needed (using Git LFS)
    subprocess.run(["git", "lfs", "pull"], cwd=scenarios_path, check=True)

    print(f"Successfully cloned {scenario_name} into {scenarios_path}.")

# Function to clone multiple dataset scenarios
def clone_dataset_scenarios(selected_scenario_names, dataset_repo_url, model_repo_dir):
    for scenario_name in selected_scenario_names:
        clone_dataset_scenario(scenario_name, dataset_repo_url, model_repo_dir)
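
The functions above shell out to `git` and Git LFS, so both must be on the PATH; the `git sparse-checkout` command also needs Git 2.25 or newer. A minimal pre-flight check, with an illustrative helper name, might look like this:

```python
import shutil


def missing_tools(tools=("git", "git-lfs")):
    """Return the command-line tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print(f"Install these tools before cloning: {missing}")
else:
    print("git and git-lfs are available.")
```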

4. Clone the Model

Next, you need to clone the LWM model from its Git repository. This will download all the necessary files to your local system.


# Step 1: Clone the model repository (if not already cloned)
model_repo_url = "https://huggingface.co/sadjadalikhani/lwm"
model_repo_dir = "./LWM"

if not os.path.exists(model_repo_dir):
    print(f"Cloning model repository from {model_repo_url}...")
    subprocess.run(["git", "clone", model_repo_url, model_repo_dir], check=True)
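
The later steps import from `input_preprocess`, `lwm_model`, and `inference`, so it can help to confirm that the clone contains them. The sketch below assumes these modules are plain `.py` files at the top level of the cloned folder, which this README does not state explicitly.

```python
import os

# Module files imported in the tokenization and inference steps.
# Assumption: they sit at the top level of the cloned repository.
EXPECTED_FILES = ("input_preprocess.py", "lwm_model.py", "inference.py")


def missing_repo_files(repo_dir="./LWM", expected=EXPECTED_FILES):
    """Return the expected files that are absent from `repo_dir`."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(repo_dir, name))]


print(f"Missing files: {missing_repo_files()}")
```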

5. Clone the Desired Datasets

Before proceeding with tokenization and data processing, the DeepMIMO dataset (or any dataset generated using the operational settings outlined below) must first be loaded. The table below provides a list of available datasets and their respective links for further details:

📊 Dataset Overview

| 📊 Dataset | 🏙️ City | 👥 Number of Users | 🔗 DeepMIMO Page |
|---|---|---|---|
| Dataset 0 | 🌆 Denver | 1354 | DeepMIMO City Scenario 18 |
| Dataset 1 | 🏙️ Indianapolis | 3248 | DeepMIMO City Scenario 15 |
| Dataset 2 | 🌇 Oklahoma | 3455 | DeepMIMO City Scenario 19 |
| Dataset 3 | 🌆 Fort Worth | 1902 | DeepMIMO City Scenario 12 |
| Dataset 4 | 🌉 Santa Clara | 2689 | DeepMIMO City Scenario 11 |
| Dataset 5 | 🌅 San Diego | 2192 | DeepMIMO City Scenario 7 |

It is important to note that these six datasets were not used during the pre-training of the LWM model, so the high-quality embeddings produced reflect LWM's robust generalization capabilities rather than overfitting.

Operational Settings:

  • Antennas at BS: 32
  • Antennas at UEs: 1
  • Subcarriers: 32
  • Paths: 20
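
When generating your own data, the settings above can be kept in one place and used to sanity-check array sizes. The sketch below is illustrative: the class is not part of LWM, and the axis order (users, UE antennas, BS antennas, subcarriers) is an assumption that should be checked against the arrays you actually generate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalSettings:
    """Operational settings listed above, gathered in one object."""
    bs_antennas: int = 32
    ue_antennas: int = 1
    subcarriers: int = 32
    paths: int = 20

    def channel_shape(self, n_users):
        # Assumed layout: (users, UE antennas, BS antennas, subcarriers).
        return (n_users, self.ue_antennas, self.bs_antennas, self.subcarriers)

    def coefficients_per_user(self):
        # Complex channel coefficients per user: 1 * 32 * 32 = 1024.
        return self.ue_antennas * self.bs_antennas * self.subcarriers


settings = OperationalSettings()
print(settings.channel_shape(1354))      # Denver has 1354 users
print(settings.coefficients_per_user())
```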
import numpy as np

# Step 2: Clone specific dataset scenario folder(s) inside the "scenarios" folder
dataset_repo_url = "https://huggingface.co/datasets/sadjadalikhani/lwm"  # Base URL for dataset repo
scenario_names = np.array(["city_18_denver",
                           "city_15_indianapolis",
                           "city_19_oklahoma",
                           "city_12_fortworth",
                           "city_11_santaclara",
                           "city_7_sandiego"]
                          )

# Choose the desired scenario or scenarios (select several to combine them into a larger, more diverse dataset).
# Valid indices are 0 to 5, matching Dataset 0 to Dataset 5 in the table above.
scenario_idxs = np.array([0, 1, 2, 3, 4, 5])
selected_scenario_names = scenario_names[scenario_idxs]

# Clone the requested scenario folders. The dataset repository is cloned only once;
# later calls just add scenarios to the sparse checkout and pull their LFS files.
clone_dataset_scenarios(selected_scenario_names, dataset_repo_url, model_repo_dir)
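
Only six scenarios are available, so an index outside 0 to 5 makes the NumPy selection fail with a generic IndexError. A small hypothetical helper (not part of LWM) can validate the indices first and report which ones are wrong:

```python
def select_scenarios(names, idxs):
    """Return the scenario names at `idxs`, rejecting out-of-range indices."""
    names = list(names)
    bad = [int(i) for i in idxs if not 0 <= i < len(names)]
    if bad:
        raise IndexError(
            f"Scenario indices {bad} are out of range; valid range is 0 to {len(names) - 1}."
        )
    return [names[int(i)] for i in idxs]


example_names = ["city_18_denver", "city_15_indianapolis", "city_19_oklahoma"]
print(select_scenarios(example_names, [0, 2]))
```

The function accepts plain lists as well as NumPy arrays, so it can be applied to `scenario_names` and `scenario_idxs` directly.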

6. Change the working directory to the LWM folder

if os.path.exists(model_repo_dir):
    os.chdir(model_repo_dir)
    print(f"Changed working directory to {os.getcwd()}")
else:
    print(f"Directory {model_repo_dir} does not exist. Please check if the repository is cloned properly.")

7. Tokenize and Load the Model

from input_preprocess import tokenizer
from lwm_model import lwm
import torch

preprocessed_chs = tokenizer(selected_scenario_names=selected_scenario_names, 
                             manual_data=None,
                             gen_raw=True)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Loading the LWM model on {device}...")
model = lwm.from_pretrained(device=device)

8. Perform Inference

from inference import lwm_inference, create_raw_dataset
input_types = ['cls_emb', 'channel_emb', 'raw']
selected_input_type = input_types[0]
if selected_input_type in ['cls_emb', 'channel_emb']:
    dataset = lwm_inference(preprocessed_chs, selected_input_type, model, device)
else:
    dataset = create_raw_dataset(preprocessed_chs, device)
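
The branching above can be wrapped in a small function that also rejects misspelled input types. This is a sketch, not part of LWM: `build_dataset` is a hypothetical name, and the two builders are passed in as callables so the example runs without the model.

```python
VALID_INPUT_TYPES = ("cls_emb", "channel_emb", "raw")


def build_dataset(preprocessed, input_type, embed_fn, raw_fn):
    """Dispatch to the embedding or raw-data builder based on `input_type`."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(
            f"Unknown input type {input_type!r}; expected one of {VALID_INPUT_TYPES}."
        )
    if input_type == "raw":
        return raw_fn(preprocessed)
    # Both embedding types go through the same builder.
    return embed_fn(preprocessed, input_type)


# Stand-in builders for demonstration only.
demo = build_dataset([1, 2, 3], "raw",
                     embed_fn=lambda chs, kind: (kind, chs),
                     raw_fn=lambda chs: ("raw", chs))
print(demo)
```

With the real functions, `embed_fn` could be `lambda chs, kind: lwm_inference(chs, kind, model, device)` and `raw_fn` could be `lambda chs: create_raw_dataset(chs, device)`.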

9. Explore the Interactive Demo

If you'd like to explore LWM interactively, check out the demo hosted on Hugging Face Spaces:
Try the Interactive Demo!


Now you're ready to dive into the world of Large Wireless Model (LWM), process wireless communication datasets, and extract high-quality embeddings to fuel your research or application!