Paper Reviews Read My Own Way / Speech & Signal (2 posts)

X-vector X-Vectors: Robust DNN Embeddings for Speaker Recognition Published 2018 https://ieeexplore.ieee.org/abstract/document/8461375?casa_token=9dMIoIumcvEAAAAA:XJa_Z3ezdJ7T_IFejJxePVUN4uxgGMOKWjSPVMwhzDvyBd-nhts-sfa1SXb7V5dt1_z44PsnGa8 Introduction An x-vector is a fixed-length embedding designed for learning the speaker recognition task with a DNN. By using reverberation and noise augmentation to aid that training, it achieves higher .. than the baseline in speaker recognition .. 2022. 8. 10.

AutoVC AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss https://arxiv.org/abs/1905.05879 Code: https://github.com/auspicious3000/autovc Abstract A 2019 paper on zero-shot voice conversion. Put simply, the technique takes an utterance recorded in speaker A's voice (timbre) and converts it into the voice (timbre) of a different speaker B. The following demo page makes this easier to grasp: https://auspicious3000.github.io/autovc-demo/ Techniques of this kind are referred to as style transfer, and similar approaches include GAN, CVAE, and other ge.. 2022. 7. 15.