sadhaklal's picture
added information about the auxiliary head
75354a2 verified
|
raw
history blame
4.55 kB
---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
pipeline_tag: tabular-regression
library_name: pytorch
datasets:
- gvlassis/california_housing
metrics:
- rmse
---
# wide-and-deep-net-california-housing-v3
A wide & deep neural network trained on the California Housing dataset. It is a PyTorch adaptation of the TensorFlow model in Chapter 10 of Aurelien Geron's book 'Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow'.
![](https://raw.githubusercontent.com/sambitmukherjee/handson-ml3-pytorch/main/chapter10/Figure_10-15.png)
The model takes eight features: `'MedInc'`, `'HouseAge'`, `'AveRooms'`, `'AveBedrms'`, `'Population'`, `'AveOccup'`, `'Latitude'` and `'Longitude'`. It predicts `'MedHouseVal'`.
The first five features (`'MedInc'`, `'HouseAge'`, `'AveRooms'`, `'AveBedrms'` and `'Population'`) flow through the wide path.
The last six features (`'AveRooms'`, `'AveBedrms'`, `'Population'`, `'AveOccup'`, `'Latitude'` and `'Longitude'`) flow through the deep path.
Note: The features `'AveRooms'`, `'AveBedrms'` and `'Population'` flow through both the wide path and the deep path.
The model also has an auxiliary head. The main head and the auxiliary head output the same thing (`'MedHouseVal'`). As mentioned in the book, this is a regularization technique, to try and ensure that the "underlying part of the network" (i.e., the deep path) learns something useful on its own, without relying on the rest of the network.
Code: https://github.com/sambitmukherjee/handson-ml3-pytorch/blob/main/chapter10/wide_and_deep_net_california_housing_v3.ipynb
Experiment tracking: https://wandb.ai/sadhaklal/wide-and-deep-net-california-housing
## Usage
```
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing(as_frame=True)
from sklearn.model_selection import train_test_split
X_train_full, X_test, y_train_full, y_test = train_test_split(housing['data'], housing['target'], test_size=0.25, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full, test_size=0.25, random_state=42)
X_means, X_stds = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - X_means) / X_stds
X_valid = (X_valid - X_means) / X_stds
X_test = (X_test - X_means) / X_stds
import torch
device = torch.device("cpu")
from dataclasses import dataclass
from typing import Optional
@dataclass
class WideAndDeepNetOutput:
main_output: torch.Tensor
aux_output: torch.Tensor
main_loss: Optional[torch.Tensor] = None
aux_loss: Optional[torch.Tensor] = None
loss: Optional[torch.Tensor] = None
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin
class WideAndDeepNet(nn.Module, PyTorchModelHubMixin):
def __init__(self):
super().__init__()
self.hidden1 = nn.Linear(6, 30)
self.hidden2 = nn.Linear(30, 30)
self.main_head = nn.Linear(35, 1)
self.aux_head = nn.Linear(30, 1)
self.main_loss_fn = nn.MSELoss(reduction='sum')
self.aux_loss_fn = nn.MSELoss(reduction='sum')
def forward(self, input_wide, input_deep, label=None):
act = torch.relu(self.hidden1(input_deep))
act = torch.relu(self.hidden2(act))
concat = torch.cat([input_wide, act], dim=1)
main_output = self.main_head(concat)
aux_output = self.aux_head(act)
if label is not None:
main_loss = self.main_loss_fn(main_output.squeeze(), label)
aux_loss = self.aux_loss_fn(aux_output.squeeze(), label)
loss = 0.9 * main_loss + 0.1 * aux_loss
return WideAndDeepNetOutput(main_output, aux_output, main_loss, aux_loss, loss)
else:
return WideAndDeepNetOutput(main_output, aux_output)
model = WideAndDeepNet.from_pretrained("sadhaklal/wide-and-deep-net-california-housing-v3")
model.to(device)
model.eval()
# Let's predict on 3 unseen examples from the test set:
print(f"Ground truth housing prices: {y_test.values[:3]}")
new = {
'input_wide': torch.tensor(X_test.values[:3, :5], dtype=torch.float32),
'input_deep': torch.tensor(X_test.values[:3, 2:], dtype=torch.float32)
}
new = {k: v.to(device) for k, v in new.items()}
with torch.no_grad():
output = model(**new)
print(f"Predicted housing prices: {output.main_output.squeeze()}")
```
## Metric
RMSE on the test set: 0.574
---
This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.