Fhrozen/vctk_hifigan.full_band.v2

	---
	tags:
	- espnet
	- audio
	- audio-to-audio
	- vocoder
	language:
	- en
	datasets:
	- vctk
	license: cc-by-4.0
	inference: false
	---

	## Vocoder model - HiFi-GAN - English

	Reference toolkit: https://github.com/kan-bayashi/ParallelWaveGAN

	No support is provided for this model.

	### Details

	```
	sampling_rate: 44100 # Sampling rate.
	fft_size: 2048 # FFT size.
	hop_size: 512 # Hop size.
	win_length: 2048 # Window length.
	# If set to null, it will be the same as fft_size.
	window: "hann" # Window function.
	num_mels: 80 # Number of mel basis.
	fmin: 0 # Minimum freq in mel basis calculation.
	fmax: 22050 # Maximum frequency in mel basis calculation.
	generator_type: HiFiGANGenerator
	generator_params:
	    in_channels: 80 # Number of input channels.
	    out_channels: 1 # Number of output channels.
	    channels: 512 # Number of initial channels.
	    kernel_size: 7 # Kernel size of initial and final conv layers.
	    upsample_scales: [8, 8, 2, 2, 2] # Upsampling scales.
	    upsample_kernel_sizes: [16, 16, 4, 4, 4] # Kernel size for upsampling layers.
	```
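
The values above can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual HiFi-GAN arrangement, in which the generator expands each mel frame into as many waveform samples as the product of `upsample_scales`, so that product should equal `hop_size`. The helper names (`total_upsampling`, `frames_to_samples`, `frames_per_second`) are hypothetical and are not part of ParallelWaveGAN or ESPnet.

```python
from math import prod

# Values copied from the configuration above.
SAMPLING_RATE = 44100
HOP_SIZE = 512
FMAX = 22050
UPSAMPLE_SCALES = [8, 8, 2, 2, 2]
UPSAMPLE_KERNEL_SIZES = [16, 16, 4, 4, 4]


def total_upsampling(scales):
    """Waveform samples produced per input mel frame (product of the scales)."""
    return prod(scales)


def frames_to_samples(num_frames, scales=UPSAMPLE_SCALES):
    """Hypothetical helper: expected output length for num_frames mel frames."""
    return num_frames * total_upsampling(scales)


def frames_per_second(sampling_rate=SAMPLING_RATE, hop_size=HOP_SIZE):
    """Mel frame rate implied by the sampling rate and hop size."""
    return sampling_rate / hop_size


# The upsampling factor should match the analysis hop size: 8*8*2*2*2 = 512.
assert total_upsampling(UPSAMPLE_SCALES) == HOP_SIZE
# fmax sits at the Nyquist frequency, hence "full band".
assert FMAX == SAMPLING_RATE // 2
# Observation about this config only: each kernel is twice its scale.
assert all(k == 2 * s for k, s in zip(UPSAMPLE_KERNEL_SIZES, UPSAMPLE_SCALES))

print(frames_to_samples(100))         # 51200
print(round(frames_per_second(), 2))  # 86.13
```

With these settings, 100 mel frames correspond to 51200 samples, or roughly 1.16 seconds of audio at 44.1 kHz. Mel features fed to this vocoder need to be extracted with the same `fft_size`, `hop_size`, `num_mels`, `fmin`, and `fmax`.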