Deliver Speech Data

  • Eng

  • Chi

  • Dut

  • Fra

  • Ger

  • Ita

  • Jpn

  • Kor

  • Lav

  • Nor

  • Por

  • Rus

  • Spa

  • Tha

  • Vie

E-mail Address
Birth Year
<County / State>
<City / Town>
Recording Date    
Recording Country
Recording City
Recording Country/City other ---
Recording place
Sampling rate
Quantization bits
Introduced by (person)/Belonging to (company)
Delivery Media
Delivery Date


Before the recording:
Thank you for coming today. We would like to explain today's task to you.

The purpose of today's work is research and development for AI. You have probably heard the term AI: a computer with human-like intelligence. There are many types of AI. Well-known recent examples are AI that can drive a car and AI that can play chess. Autonomous driving is already possible, and AI has already beaten human professionals at chess. On the other hand, one field of AI is still incomplete and is expected to remain difficult: communication AI. AI that recognizes and translates words and short phrases has been achieved to some extent, but AI that can hold a completely natural dialogue with humans has not yet been created, and doing so is said to be extremely difficult. Despite that difficulty, we are working on AI research and development in this field.

As you may know, today's AI can "recognize" things in a field by learning from a large amount of data in that field. This process is called deep learning. For example, if a computer learns from hundreds of millions of pictures of cats and dogs with different appearances, it can tell whether a picture shows a cat or a dog even when it sees that picture for the first time. It is also said that computers can estimate the age of a person they see for the first time more accurately than humans can. Therefore, when developing communication AI as well, it is necessary to let the computer learn from "a large amount of data".

For communication AI, that large amount of data is speech data, that is, a large amount of voice data. There are many different languages around the world. Even within the same language, accents differ by dialect, region, and age group. Furthermore, even if the same person utters the same word 10 times on the spot, the voice will be slightly different each time. Therefore, it is necessary to have many people around the world speak a great deal and to record them. This data is needed in large quantities, and we have recorded the voice data of hundreds of thousands of people around the world over the last 30-40 years. Still, communication AI has not been completed; it is difficult to develop AI and robots that can talk spontaneously. Today, we will record your voice, as one of the millions of people around the world, as a data sample for the research and development of AI in this difficult field (speech recognition).

Your voice recorded here today will be used only internally, for computer learning, at Timehill Inc. and related companies, institutes, and universities. Your voice data will not be made public.

Do you agree that your voice and speech recorded here today will be used for the purposes described above?







Copyright(c) 2016 Timehill Inc. All Rights Reserved.