Instantly translate audio into another language
You can try the demo.
How to use
It’s open source, so you can use it from code.
The demo covers only four languages, but the code supports all of the languages below (three-letter code, then language name).
"afr": "Afrikaans",
"amh": "Amharic",
"arb": "Modern Standard Arabic",
"ary": "Moroccan Arabic",
"arz": "Egyptian Arabic",
"asm": "Assamese",
"ast": "Asturian",
"azj": "North Azerbaijani",
"bel": "Belarusian",
"ben": "Bengali",
"bos": "Bosnian",
"bul": "Bulgarian",
"cat": "Catalan",
"ceb": "Cebuano",
"ces": "Czech",
"ckb": "Central Kurdish",
"cmn": "Mandarin Chinese",
"cym": "Welsh",
"dan": "Danish",
"deu": "German",
"ell": "Greek",
"eng": "English",
"est": "Estonian",
"eus": "Basque",
"fin": "Finnish",
"fra": "French",
"gaz": "West Central Oromo",
"gle": "Irish",
"glg": "Galician",
"guj": "Gujarati",
"heb": "Hebrew",
"hin": "Hindi",
"hrv": "Croatian",
"hun": "Hungarian",
"hye": "Armenian",
"ibo": "Igbo",
"ind": "Indonesian",
"isl": "Icelandic",
"ita": "Italian",
"jav": "Javanese",
"jpn": "Japanese",
"kam": "Kamba",
"kan": "Kannada",
"kat": "Georgian",
"kaz": "Kazakh",
"kea": "Kabuverdianu",
"khk": "Halh Mongolian",
"khm": "Khmer",
"kir": "Kyrgyz",
"kor": "Korean",
"lao": "Lao",
"lit": "Lithuanian",
"ltz": "Luxembourgish",
"lug": "Ganda",
"luo": "Luo",
"lvs": "Standard Latvian",
"mai": "Maithili",
"mal": "Malayalam",
"mar": "Marathi",
"mkd": "Macedonian",
"mlt": "Maltese",
"mni": "Meitei",
"mya": "Burmese",
"nld": "Dutch",
"nno": "Norwegian Nynorsk",
"nob": "Norwegian Bokm\u00e5l",
"npi": "Nepali",
"nya": "Nyanja",
"oci": "Occitan",
"ory": "Odia",
"pan": "Punjabi",
"pbt": "Southern Pashto",
"pes": "Western Persian",
"pol": "Polish",
"por": "Portuguese",
"ron": "Romanian",
"rus": "Russian",
"slk": "Slovak",
"slv": "Slovenian",
"sna": "Shona",
"snd": "Sindhi",
"som": "Somali",
"spa": "Spanish",
"srp": "Serbian",
"swe": "Swedish",
"swh": "Swahili",
"tam": "Tamil",
"tel": "Telugu",
"tgk": "Tajik",
"tgl": "Tagalog",
"tha": "Thai",
"tur": "Turkish",
"ukr": "Ukrainian",
"urd": "Urdu",
"uzn": "Northern Uzbek",
"vie": "Vietnamese",
"xho": "Xhosa",
"yor": "Yoruba",
"yue": "Cantonese",
"zlm": "Colloquial Malay",
"zsm": "Standard Malay",
"zul": "Zulu",
Install
git clone https://github.com/facebookresearch/seamless_communication.git
cd seamless_communication
pip install .
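To confirm that the install put the command-line tools on your PATH, you can look them up from Python. A minimal sketch, assuming the two commands used in this article are the ones you need; `missing_commands` is a hypothetical helper name of my own.

```python
import shutil


def missing_commands(names=("m4t_predict", "expressivity_predict")):
    """Return the command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_commands()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All commands are available.")
```

If something is reported missing, the usual cause is running `pip install .` in a different Python environment from the one your shell is using.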
Execution
Read aloud with preset voices
m4t_predict your_audio.m4a --task s2st --tgt_lang eng --output_path result_audio.mp3
In the video, the English voice you hear first is the translation result (my video editing didn’t go well).
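If you want the same audio translated into several languages, the command above can be wrapped in a short Python script. This is a sketch under assumptions: it simply calls the `m4t_predict` command with the same flags as above, while the helper names, the `translated` output folder, and the `.wav` file naming are my own choices.

```python
import subprocess
from pathlib import Path


def build_s2st_command(audio_path, tgt_lang, output_path):
    """Build the same m4t_predict call as above for one target language."""
    return [
        "m4t_predict", str(audio_path),
        "--task", "s2st",
        "--tgt_lang", tgt_lang,
        "--output_path", str(output_path),
    ]


def translate_to_many(audio_path, tgt_langs, out_dir="translated", dry_run=False):
    """Run m4t_predict once per target language and return the commands used.

    With dry_run=True the commands are only built, not executed.
    """
    out_dir = Path(out_dir)
    commands = []
    for lang in tgt_langs:
        # Hypothetical naming scheme: <input name>_<lang>.wav
        output_path = out_dir / f"{Path(audio_path).stem}_{lang}.wav"
        cmd = build_s2st_command(audio_path, lang, output_path)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)  # stops at the first failure
    return commands


# Print the commands without running them.
for cmd in translate_to_many("your_audio.m4a", ["eng", "jpn"], dry_run=True):
    print(" ".join(cmd))
```

Each call starts a new process and reloads the model, so this is convenient rather than fast.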
Read out in your own voice
expressivity_predict <path_to_input_audio> --tgt_lang <tgt_lang> --model_name seamless_expressivity --vocoder_name vocoder_pretssel --output_path <path_to_save_audio>
However, downloading the model for the own-voice version requires registering with details such as an email address and getting approved, so I have not been able to try it yet. There is an application form in the repository.
🐣
I’m a freelance engineer.
Work consultation
Please feel free to contact me with a brief description of what you want to develop.
rockyshikoku@gmail.com
I am creating applications using machine learning and AR technology.
I share information related to machine learning and AR.