Developing TTS and ASR for Lule and North Sámi languages

1 UiT The Arctic University of Norway
2 National Library of Norway
SIGUL 2023
2nd Annual Meeting of the Special Interest Group on Under-resourced Languages
A Satellite Workshop of Interspeech 2023

Abstract

Recent innovations in speech technology have made high-quality TTS and ASR available even for extremely low-resource languages. This paper presents an updated work-in-progress report on an open-source speech technology project for two indigenous Sámi languages that are minority languages in Norway, Sweden and Finland. At this stage, we have designed and collected text and speech corpora for training the first neural text-to-speech (TTS) system for Lule Sámi. We will update the previous North Sámi TTS by collecting additional materials and training a new model using state-of-the-art technologies. We also describe our first experiments in developing automatic speech recognition (ASR) for North Sámi and discuss the next steps in our project.

TTS Examples

Text | Language | Gender | TTS
Dohko bohtet olbmot ja duddjojit buot lágan dujiid ja mii guossohit gáfe, ságastallat dujiid birra ja deaivvadit nuppiiguin. | North Sámi | Male |
Raportta mielde ulbmilin lea maid geahpedit boazodoalu ja eará eanageavaheami ruossalasvuođaid sihke sihkkarastit guohtoneatnamiid ceavzilis anu. | North Sámi | Female |
Divvun, sáme duollatjállemvædtsak, le dal ásaduvvam stuoves årnigin Tråmså universitiehtan. | Lule Sámi | Male |

Poster

BibTeX


        @inproceedings{hiovain-asikainen-de-la-rosa-2023-tts-asr-sami,
          title = "Developing {TTS} and {ASR} for {L}ule and {N}orth {S}{\'a}mi languages",
          author = "Hiovain-Asikainen, Katri  and
            {De la Rosa}, Javier",
          booktitle = "Proceedings of the 2nd Annual Meeting of the Special Interest Group on Under-resourced Languages",
          month = aug,
          year = "2023",
          address = "Dublin, Ireland",
          publisher = "Interspeech 2023",
          url = "...",
          pages = "...",
      }