INTERSPEECH 2022, We are coming

From：MKT Date：2022-09-16

INTERSPEECH is the world’s largest and most comprehensive conference on the science and technology of spoken language processing.

INTERSPEECH conferences emphasize interdisciplinary approaches addressing all aspects of speech science and technology, ranging from basic theories to advanced applications. [The 23rd INTERSPEECH Conference] (https://interspeech2022.org/) from September 18 to 22, 2022 at Songdo ConvensiA, in Incheon, Korea, under the theme “Human and Humanizing Speech Technology”.

As the leading data solution provider in AI industry, we are happy we will be one of the sponsors of the INTERSPEECH Conference for 2022 and we are welcoming you to the world of interspeech 2022.

---

To celebrate the INTERSPEECH Conference, we are planning to give away total 9 free dataset which worthy 40,000 USD to 400 participants of the conference.

Our give away dataset are picked from 5250 hours audio database, across English, Spanish, Korean, Hindi, Japanese, Italian, German, Vietnamese, and French. All speakers who participated in the recording and conducted face-to-face communication in a natural way. various topics are discussed. Transcription is included.

----

Here’s List of the given away dataset:

1. 20 Hours - Italian Conversational Speech Data by Mobile Phone

About 700 speakers participated in the recording, and conducted face-to-face communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

Full Dataset: https://www.datatang.ai/datasets/1178

2. 20 Hours - Japanese Conversation Speech by Mobile Phone

About 1000 speakers participated in the recording, and conducted face-to-face communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

Full Dataset: https://www.datatang.ai/datasets/1166

3. 20 Hours - Hindi Conversational Speech Data by Mobile Phone

About 1,000 speakers participated in the recording, and conducted face-to-face communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

Full Dataset: https://www.datatang.ai/datasets/1156

4. 20 Hours - Spanish Conversational Speech Data by Mobile Phone

https://www.datatang.ai/datasets/1147

5. 20 Hours - French Conversational Speech Data by Mobile Phone

https://www.datatang.ai/datasets/1146

6. 20 Hours – Vietnamese Conversational Speech Data by Mobile Phone

About 750 speakers participated in the recording, and conducted face-to-face communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

https://www.datatang.ai/datasets/1122

7. 20 Hours – German Conversational Speech Data by Mobile Phone

About 750 speakers participated in the recording, and conducted communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

https://www.datatang.ai/datasets/1121

8. 20 Hours - Korean Conversational Speech Data by Mobile Phone

About 700 Korean speakers participated in the recording, and conducted face-to-face communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

https://www.datatang.ai/datasets/1103

9. 20 Hours – American English Natural Dialogue Speech Data

2000 speakers participated in the recording and conducted face-to-face communication in a natural way. They had free discussion on a number of given topics, with a wide range of fields; the voice was natural and fluent, in line with the actual dialogue scene. Text is transferred manually, with high accuracy.

https://www.datatang.ai/datasets/1004

------

If you are looking for full version of the dataset, don’t hesitate to contact us [email protected]

To be the first one to get the dataset, please scan the code below and fill out the form below and we will send out the data on September 18 through registered email.

How to Build a Conversational AI Models? In the View of Training Data

From the perspective of the current data industry, most speech recognition data is based on reading training data. Reading speech data can solve relatively simple human-machine interaction application scenarios such as mobile phone voice assistants, in-vehicle voice assistants, smart speakers, and smart home appliances.

Build Minority Language ASR Models With High-Quality Reading Speech Data

In the past ten years, driven by deep learning, speech recognition technology and applications have achieved rapid development. Related products and services equipped with speech recognition technology, such as voice search, voice input method, smart speakers, smart TVs, smart wearables, intelligent customer service, robots, etc. have been widely used in all aspects of our lives.

INTERSPEECH 2022, We are coming

Previous

How to Build a Conversational AI Models? In the View of Training Data

Next

Build Minority Language ASR Models With High-Quality Reading Speech Data

INTERSPEECH 2022, We are coming

Recent

Datatang is going to attend Interspeech 2023

Empowering Multilingual Speech Recognition in the Automotive Industry

AI-Driven Marketing

Previous

How to Build a Conversational AI Models? In the View of Training Data

Next

Build Minority Language ASR Models With High-Quality Reading Speech Data