---
license: mit
tags:
  - 3d
pipeline_tag: image-to-3d
---
<b>Real3D</b>


**Model Details**:

**Model Description**:

We use the model architecture provided by [TripoSR](https://github.com/VAST-AI-Research/TripoSR), which is a Transformer model for 2D-to-3D mapping built on [LRM](https://arxiv.org/abs/2311.04400).

We scale it further to in-the-wild image collections by enabling unsupervised self-training and automatic data curation.

* Developed by: [Hanwen Jiang](https://hwjiang1510.github.io/)
* License: MIT
* Hardware: We train Real3D on 1 node (8 GPUs) with an equivalent batch size of 80 for 5-6 days.
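
For readers adapting the setup to different hardware, the arithmetic behind an equivalent batch size can be sketched as below. This is only an illustration: the card does not state the per-GPU batch size or whether gradient accumulation is used, so the helper names `effective_batch_size` and `per_gpu_options` and every split they list are assumptions, not the authors' configuration.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step across all GPUs."""
    return per_gpu_batch * num_gpus * grad_accum_steps


def per_gpu_options(target: int, num_gpus: int) -> list[tuple[int, int]]:
    """List (per_gpu_batch, grad_accum_steps) pairs that reach `target` exactly."""
    if target % num_gpus != 0:
        return []  # the target cannot be split evenly across the GPUs
    per_gpu_total = target // num_gpus
    return [
        (batch, per_gpu_total // batch)
        for batch in range(1, per_gpu_total + 1)
        if per_gpu_total % batch == 0
    ]


# Hypothetical splits for the 8-GPU, batch-size-80 figure quoted above.
for batch, accum in per_gpu_options(80, 8):
    print(f"per-GPU batch {batch} x 8 GPUs x {accum} accumulation steps "
          f"= {effective_batch_size(batch, 8, accum)}")
```

Which of these splits, if any, matches the actual training run is not specified in this card; the linked training code is the place to check.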

**Model Sources**:
* Paper: https://arxiv.org/abs/2406.08479
* Project: https://hwjiang1510.github.io/Real3D/
* Code for training and evaluation: https://github.com/hwjiang1510/Real3D

**Training Data**:
Real3D is jointly trained on synthetic data (Objaverse) and in-the-wild image collections. The former prevents training divergence, while the latter introduces new knowledge from a broader distribution of real images. We use Objaverse renderings from [Zero-1-to-3](https://github.com/cvlab-columbia/zero123) and [GObjaverse](https://aigc3d.github.io/gobjaverse/). The in-the-wild images come from [ImageNet](https://www.image-net.org/), [OpenImages](https://storage.googleapis.com/openimages/web/index.html), and other sources.

**Misuse, Malicious Use, and Out-of-Scope Use**:
The model should not be used to intentionally create or disseminate 3D models that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.