An Overview of Speech-To-Text Conversion

Kartik Aggarwal, Naveen

Computational Intelligence and Machine Learning. 2023 October; 4(2): 13-21. Published online October 2023.

doi.org/10.36647/CIML/04.02.A003

Abstract: As a result of developments in science and technology, automatic speech-to-text (STT) conversion systems have become available. Such a system converts spoken words into text that can be read visually. People with hearing difficulties can use this technology to communicate in other ways, for example by reading what is said in a voice conversation or by following spoken directions visually. There are instances when seeing something is more effective than listening to it, particularly in long-distance communication, and speech-to-text conversion is crucial in such situations. One of the most fascinating developments of the twenty-first century is the advent of machine learning. It has evolved from its roots in neurology studies conducted in the 1940s into a form of human-made artificial intelligence. Neural networks, collections of interconnected computational units, are the basis of much of modern machine learning. When combined with optimization techniques, these networks mimic the behaviour of neurons in the human brain and allow a computer to learn from experience. Here we explore one of many potential uses for such structures: the analysis of vocal performance. In particular, we dissect voice recognition systems to examine their inner workings.

Keywords: Automated speech-to-text, Machine learning, Natural Language Processing, Neural Networks.