1,796 Hours - German Speech Data by Mobile Phone
German speech data recorded on mobile phones, 1,796 hours in total, from 3,442 native German speakers. The recording scripts were designed by linguistic experts and cover generic, interactive, on-board, home and other categories. The transcripts have been manually proofread for high accuracy. The data can be used for automatic speech recognition, machine translation, and voiceprint recognition.
52,483 Entries - Shanghai Dialect Pronunciation Dictionary
The dictionary contains more than 50,000 entries. All words and pronunciations were produced by Shanghai dialect linguists, with accurate pronunciations, and the entries include 410 international phonemes and 74 Shanghai phonemes. The Shanghai dialect pinyin uses five tones: yin ping, yin qu, yang qu, yin ru, and yang ru. The dictionary can be used in the research and development of Shanghai dialect identification technology.
98 Hours - Taiwan Mandarin Speech Data by Mobile Phone_Reading
The data was collected from 204 Taiwan residents, with 450 sentences recorded per speaker. The recorded content is varied, including economy, entertainment, news, spoken language, numbers, letters, etc., and covers general scenes and human-computer interaction scenes. The text was transcribed manually to ensure high accuracy. The recording devices are mainstream Android phones and iPhones.
388 Hours - Spanish Speaking English Speech Data by Mobile Phone
A total of 891 native Spanish speakers took part in the recording, speaking English with an authentic accent. The recording script was designed by linguists and covers a wide range of topics, including generic, interactive, on-board and home. The text has been manually proofread for high accuracy. The data is suited to mainstream Android and Apple phones. The data set can be applied to automatic speech recognition and machine translation scenarios.
CUSTOMIZED COLLECTION & ANNOTATION SERVICES
1,000,000+ crowdsourced contributors to perform complex and professional projects