This model generates CadQuery Python code from images of 3D CAD objects. It pairs a Vision Transformer (ViT) encoder with a CodeGPT decoder in a vision-encoder-decoder architecture.
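For a rough sense of what the decoder attends to: the image processor checkpoint used in the usage code, google/vit-base-patch16-224, implies 224x224 inputs split into 16x16 patches. The sketch below works out the resulting encoder sequence length. It assumes the fine-tuned encoder keeps that geometry, which this card does not state, and the helper name is illustrative only.

```python
def vit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Number of encoder tokens for a square image: one per patch plus [CLS]."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1  # +1 for the [CLS] token


# Assumed geometry: 224 / 16 = 14 patches per side, 14 * 14 + 1 = 197 tokens
print(vit_sequence_length())
```

Under that assumption, each generated code token cross-attends over 197 encoder states per image.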
Final training metrics:
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer
from PIL import Image
import torch
# Load the model, image processor, and tokenizer
model = VisionEncoderDecoderModel.from_pretrained("Thehunter99/vit-codegpt-cadcoder")
# ViTImageProcessor replaces the deprecated ViTFeatureExtractor
feature_extractor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
tokenizer = AutoTokenizer.from_pretrained("microsoft/CodeGPT-small-py")
# Load and process image
image = Image.open("path/to/your/cad_image.png").convert("RGB")  # ensure 3 channels (PNG may be RGBA or grayscale)
pixel_values = feature_extractor(images=image, return_tensors="pt").pixel_values
# Generate CAD code
with torch.no_grad():
    generated_ids = model.generate(
        pixel_values,
        max_length=256,
        num_beams=4,
        early_stopping=True,
        pad_token_id=tokenizer.eos_token_id,
    )
generated_code = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_code)
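Because generation stops at max_length tokens, the decoded text can be cut off mid-statement, so it may be worth checking the output before using it. The following is a minimal sketch using only the standard library. It assumes outputs follow the shape of the example in this card (a cadquery import and a variable named result). The function name and both checks are illustrative, not part of the model's API.

```python
import ast


def check_generated_code(code: str, target: str = "result") -> list:
    """Return a list of problems found in generated code (empty if none)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        # Truncated or malformed output usually fails here
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []

    # Assumption: the final shape is stored in a variable named `target`
    assigned = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }
    if target not in assigned:
        problems.append(f"no assignment to '{target}'")

    # Assumption: the code imports cadquery, as in the card's example
    imports_cadquery = any(
        (
            isinstance(node, ast.Import)
            and any(a.name.split(".")[0] == "cadquery" for a in node.names)
        )
        or (
            isinstance(node, ast.ImportFrom)
            and (node.module or "").split(".")[0] == "cadquery"
        )
        for node in ast.walk(tree)
    )
    if not imports_cadquery:
        problems.append("cadquery is never imported")

    return problems


sample = 'import cadquery as cq\nresult = cq.Workplane("XY").box(10, 10, 10)\n'
print(check_generated_code(sample))  # []
```

This only inspects the syntax tree and never executes the code, so it is safe to run on unreviewed output.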
Input: Image of a 3D cube
Output:
import cadquery as cq
# Create a simple cube
result = cq.Workplane("XY").box(10, 10, 10)
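The example output stores the solid in a variable named result. One possible way to turn generated code into an object you can export is to execute it in a fresh namespace and read that variable back, as sketched below. The helper is hypothetical and assumes the result naming convention holds for other outputs. Note that exec runs arbitrary Python, so only use it on code you have inspected or inside a sandbox.

```python
def run_generated_code(code: str, target: str = "result"):
    """Execute generated code in a fresh namespace and return `target`.

    WARNING: exec runs arbitrary Python. Review the code first or sandbox it.
    """
    namespace = {"__name__": "__generated__"}
    exec(compile(code, "<generated>", "exec"), namespace)
    if target not in namespace:
        raise KeyError(f"generated code did not define '{target}'")
    return namespace[target]


# Stand-in snippet so the sketch runs without cadquery installed
print(run_generated_code("result = 2 + 3"))  # 5

# With cadquery installed, the returned Workplane could then be exported:
#   import cadquery as cq
#   shape = run_generated_code(generated_code)
#   cq.exporters.export(shape, "cube.step")
```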
The model was trained on the CADCODER/GenCAD-Code dataset, which pairs images of 3D CAD objects with their corresponding CadQuery Python code.
If you use this model, please cite:
@misc{vit-codegpt-cadcoder,
title={VIT-CodeGPT CAD Code Generator},
author={Your Name},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/Thehunter99/vit-codegpt-cadcoder}
}