metadata
license: apache-2.0
datasets:
- scikit-learn/iris
metrics:
- accuracy
library_name: pytorch
pipeline_tag: tabular-classification
logistic-regression-iris
A logistic regression model trained on the Iris dataset.
It takes two inputs: 'PetalLengthCm' and 'PetalWidthCm'. It predicts whether the species is 'Iris-setosa'.
It is a PyTorch adaptation of the scikit-learn model in Chapter 10 of Aurelien Geron's book 'Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow'.
Experiment tracking: https://wandb.ai/sadhaklal/logistic-regression-iris
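The computation behind the description above can be sketched in a few lines of plain Python. This is an illustrative sketch only: `predict_setosa_proba` is a hypothetical helper, the weights and bias are made-up values (not the trained parameters), and the inputs are assumed to be already standardized, as in the Usage section.

```python
import math

def predict_setosa_proba(petal_length, petal_width, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of the two inputs."""
    z = weights[0] * petal_length + weights[1] * petal_width + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters, chosen only for illustration (not the trained weights).
weights, bias = (-2.0, -2.0), 0.0
p = predict_setosa_proba(0.0, 0.0, weights, bias)
print(p)  # 0.5: a logit of zero sits exactly on the decision boundary
```

A probability above 0.5 (equivalently, a positive logit) is read as 'Iris-setosa'.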
Usage
!pip install -q datasets
from datasets import load_dataset
iris = load_dataset("scikit-learn/iris")
iris.set_format("pandas")
iris_df = iris['train'][:]
X = iris_df[['PetalLengthCm', 'PetalWidthCm']]
y = (iris_df['Species'] == "Iris-setosa").astype(int)
class_names = ["Not Iris-setosa", "Iris-setosa"]
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X.values, y.values, test_size=0.3, stratify=y, random_state=42)
# Standardization statistics are computed on the training split only.
X_means, X_stds = X_train.mean(axis=0), X_train.std(axis=0)
import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin
device = torch.device("cpu")
class LinearModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 1)

    def forward(self, x):
        out = self.fc(x)
        return out
model = LinearModel.from_pretrained("sadhaklal/logistic-regression-iris")
model.to(device)
# Inference on new data:
import numpy as np
X_new = np.array([[2.0, 0.5], [3.0, 1.0]]) # Contains data on 2 new flowers.
X_new = ((X_new - X_means) / X_stds) # Normalize.
X_new = torch.from_numpy(X_new).float()
model.eval()
X_new = X_new.to(device)
with torch.no_grad():
    logits = model(X_new)
proba = torch.sigmoid(logits.squeeze())
preds = (proba > 0.5).long()
print(f"Predicted classes: {preds}")
print(f"Predicted probabilities of being Iris-setosa: {proba}")
Metric
As shown above, the validation set contains 30% of the examples (selected at random in a stratified fashion).
Accuracy on the validation set: 1.0
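One way such an accuracy figure could be computed is sketched below. This is not the author's evaluation code: `binary_accuracy` is a hypothetical helper, and the toy logits and labels are made-up values standing in for model outputs.

```python
import numpy as np

def binary_accuracy(logits, labels):
    """Fraction of examples whose thresholded prediction matches the label.

    sigmoid(z) > 0.5 is equivalent to z > 0, so the logits are
    thresholded at zero directly.
    """
    logits = np.asarray(logits, dtype=float).reshape(-1)
    labels = np.asarray(labels).reshape(-1)
    preds = (logits > 0).astype(int)
    return float((preds == labels).mean())

# Made-up logits and labels, for illustration only.
toy_logits = [2.3, -1.7, 0.4, -0.2]
toy_labels = [1, 0, 1, 1]
print(binary_accuracy(toy_logits, toy_labels))  # 0.75 (3 of 4 correct)
```

To apply it to the validation split from the Usage section, standardize X_val with X_means and X_stds, run the model under torch.no_grad(), and pass the squeezed logits (converted with .numpy()) together with y_val.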