---
tags:
- text-to-image
- text-to-video
- lora
- diffusers
- template:diffusion-lora
widget:
- text: '-'
  output:
    url: images/ComfyUI_00789_.png
- text: '-'
  output:
    url: images/ComfyUI_00796_.png
- text: '-'
  output:
    url: images/ComfyUI_00793_.png
- text: >-
    an anime illustration of kitsune, girl, blue eyes, braided hair,
    multicoloured hair, brown hair, pink hair, brown fox ears, brown fox tail,
    fantasy school uniform, open shoulders, masterpiece, best quality, with
    professional photography composition, dynamic lighting, well-balanced color
    and contrast, clear separation of subject and background, detailed, and
    storytelling.
  output:
    url: images/ComfyUI_00784_.png
base_model: tencent/HunyuanVideo
instance_prompt: an anime illustration of
license: mit
---
# Hunyuan Video Lora - AnimeStills

<Gallery />

**EXPERIMENTAL:** The model generates noisy, low-resolution, illustration-like images. Its natural-language prompt understanding and composition make it useful for guiding more refined models such as SDXL, but take the results with a grain of salt if you plan to use it directly. Results may also have an 'old-time anime' look because of the dataset used.


An experimental model that uses HunyuanVideo as an image generator. It outputs images at 768 resolution.

In a typical HunyuanVideo workflow, set 'frame' to **1** and add this lora to get an anime illustration-like output.


## Trigger words

You should use `an anime illustration of` to trigger the image generation.


## Resolutions

Use one of the following resolutions for the best results:
```
(768, 768)
(672, 864), (864, 672)
(608, 960), (960, 608)
(544, 1088), (1088, 544)
```
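
If your target size is not in the list, one option is to snap it to the listed resolution with the closest aspect ratio. The snippet below is a minimal sketch of that idea: `nearest_bucket` is a hypothetical helper, not part of any HunyuanVideo tooling, and reading each tuple as (width, height) is an assumption.

```python
import math

# Resolutions copied from the list above.
# Reading each tuple as (width, height) is an assumption; the card does not say.
BUCKETS = [
    (768, 768),
    (672, 864), (864, 672),
    (608, 960), (960, 608),
    (544, 1088), (1088, 544),
]


def nearest_bucket(width, height):
    """Return the listed resolution whose aspect ratio is closest to width/height.

    Ratios are compared in log space so that, for example, 2:1 and 1:2
    are treated as equally far from 1:1.
    """
    target = math.log(width / height)
    return min(BUCKETS, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


print(nearest_bucket(1024, 1024))  # square request
print(nearest_bucket(3, 4))        # portrait request
print(nearest_bucket(1000, 500))   # wide landscape request
```

All of the listed resolutions have roughly the same pixel count as 768 x 768, so snapping changes only the shape of the image, not its overall size.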

## Training


![image/png](https://cdn-uploads.huggingface.co/production/uploads/636982a164aad59d4d42714b/oMsfEYPYbWLyK6mihIkWm.png)

The model was trained on a tag-balanced dataset of 2k best pixiv illustrations, at a resolution of 768, for 856 epochs (214 epochs * 4 repeats per epoch).
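
The epoch arithmetic above can be checked with a short sketch. The dataset size is given only as "2k", so treating it as exactly 2,000 images is an assumption, and the resulting image count is a rough estimate.

```python
# Figures taken from the paragraph above.
epochs = 214            # training epochs
repeats_per_epoch = 4   # dataset repeats within each epoch
dataset_size = 2_000    # "2k" illustrations; exactly 2,000 is an assumption

# Each epoch walks the dataset `repeats_per_epoch` times.
effective_passes = epochs * repeats_per_epoch
approx_images_seen = effective_passes * dataset_size

print(effective_passes)     # 856
print(approx_images_seen)   # 1712000
```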


Training took about 3 days on an 8 x H100 cluster. The loss was still consistently decreasing when training ended, so further training could be beneficial.




## Download model

Weights for this model are available in Safetensors format.

[Download](/trojblue/HunyuanVideo-lora-AnimeStills/tree/main/epoch214) them in the Files & versions tab.


## Limitations

Outputs may be deformed, fail to follow the prompt, turn realistic, or be NSFW, due to the limited size of the dataset and the limitations of LoRA models.