The technology industry continues to be a hotbed of innovation. Activity is driven by increasing demand for natural, human-like speech synthesis in applications such as virtual assistants, accessibility tools, and audio content production, and by the growing importance of deep learning, neural networks, and speech synthesis techniques, which improve speech quality, expressiveness, and customization options for users. In the last three years alone, over 3.6 million patents have been filed and granted in the technology industry, according to GlobalData’s report on Innovation in Artificial Intelligence: Text to speech systems.

However, not all innovations are equal, nor do they follow a constant upward trend. Instead, their evolution takes the form of an S-shaped curve that reflects their typical lifecycle: early emergence, then accelerating adoption, and finally stabilisation and maturity.

Identifying where a particular innovation is on this journey, especially if it is in the emerging or accelerating stage, is essential for understanding its current level of adoption and its likely future trajectory and impact.
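The S-curve idea can be illustrated with a logistic function. The sketch below is only an illustration: it assumes a logistic shape, and the function names and the 10% and 90% stage thresholds are hypothetical choices, not GlobalData's innovation intensity model.

```python
import math


def s_curve(t, midpoint=0.0, rate=1.0, ceiling=1.0):
    """Adoption level at time t on an assumed logistic (S-shaped) curve."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def stage(level, ceiling=1.0):
    """Label a lifecycle stage from the share of the ceiling reached.

    The 10% and 90% cut-offs are illustrative assumptions.
    """
    share = level / ceiling
    if share < 0.10:
        return "emerging"
    if share < 0.90:
        return "accelerating"
    return "maturing"


if __name__ == "__main__":
    # Sample the curve before, at, and after its midpoint.
    for t in (-5, 0, 5):
        level = s_curve(t)
        print(t, round(level, 3), stage(level))
```

With these assumed thresholds, a point well before the midpoint is labelled emerging, the midpoint itself falls in the accelerating band, and a point well after it is labelled maturing.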

300+ innovations will shape the technology industry

According to GlobalData’s Technology Foresights, which plots the S-curve for the technology industry using innovation intensity models built on over 2.5 million patents, there are 300+ innovation areas that will shape the future of the industry.

Within the emerging innovation stage, finite element simulation, ML-enabled blockchain networks, and generative adversarial networks (GANs) are disruptive technologies that are in the early stages of application and should be tracked closely. Demand forecasting applications, intelligent embedded systems, and deep reinforcement learning are some of the accelerating innovation areas, where adoption has been steadily increasing. Among maturing innovation areas are wearable physiological monitors and smart lighting, which are now well established in the industry.

Innovation S-curve for artificial intelligence in the technology industry

Text to speech systems is a key innovation area in artificial intelligence

Text-to-speech systems, also referred to as speech synthesis or text-to-voice systems, are computer-based tools that transform written text into spoken words. These systems aim to replicate human speech patterns and are utilized in diverse applications such as assisting individuals with disabilities, creating audio versions of written content, and delivering audio notifications for medical equipment.

GlobalData’s analysis also uncovers the companies at the forefront of each innovation area and assesses the potential reach and impact of their patenting activity across different applications and geographies. According to GlobalData, there are 60+ companies, spanning technology vendors, established technology companies, and up-and-coming start-ups, engaged in the development and application of text-to-speech systems.

Key players in text to speech systems – a disruptive innovation in the technology industry

‘Application diversity’ measures the number of different applications identified for each relevant patent and broadly splits companies into either ‘niche’ or ‘diversified’ innovators.

‘Geographic reach’ refers to the number of different countries each relevant patent is registered in and reflects the breadth of geographic application intended, ranging from ‘global’ to ‘local’.
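As a rough illustration of these two measures, the sketch below counts distinct applications and countries per patent and averages them for each company. The record layout, the company names, and the averaging step are all assumptions made for the example; the article does not say how GlobalData aggregates per-patent counts or where it draws the niche/diversified and local/global boundaries.

```python
from statistics import mean

# Hypothetical patent records; companies, applications, and countries are made up.
PATENTS = [
    {"company": "ExampleCo", "applications": {"voice assistants", "audiobooks"},
     "countries": {"US", "JP", "DE"}},
    {"company": "ExampleCo", "applications": {"voice assistants"},
     "countries": {"US"}},
    {"company": "SampleSoft", "applications": {"navigation", "gaming", "e-learning"},
     "countries": {"US", "CN"}},
]


def company_scores(patents):
    """Average per-patent counts of distinct applications and countries by company."""
    by_company = {}
    for patent in patents:
        by_company.setdefault(patent["company"], []).append(patent)

    scores = {}
    for company, items in by_company.items():
        scores[company] = {
            # Number of different applications identified per patent, averaged.
            "application_diversity": mean(len(p["applications"]) for p in items),
            # Number of different countries each patent is registered in, averaged.
            "geographic_reach": mean(len(p["countries"]) for p in items),
        }
    return scores


if __name__ == "__main__":
    for company, score in company_scores(PATENTS).items():
        print(company, score)
```

A higher application diversity value would point toward a 'diversified' innovator and a higher geographic reach toward 'global' coverage, under whatever cut-offs an analyst chooses.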

Patent volumes related to text to speech systems

Company | Total patents (2010 - 2022)
Microsoft | 180
Toshiba | 124
Sony Group | 96
Alphabet | 88
Samsung Group | 81
International Business Machines (IBM) | 79
Dolby Laboratories | 74
LG | 62
Baidu | 58
Amazon.com | 40
Apple | 39
Yamaha | 38
Intel | 37
NEC | 36
Tencent | 31
Panasonic | 26
Legend | 26
Neosapience | 23
Casio Computer | 23
Xperi | 23
Qualcomm | 22
AT&T | 22
Cerence | 19
Ping An Insurance | 17
Koninklijke Philips | 17
Honda Motor | 16
Gainwell Technologies | 15
Toyota Motor | 14
Sharp | 14
Telefonica | 14
Robert Bosch Stiftung | 13
Modulate | 12
Huawei Investment & Holding | 11
ZTE | 11
Capital One Financial | 10
Nokia | 10
Meta Platforms | 9
Nippon Telegraph and Telephone | 9
Telefonaktiebolaget LM Ericsson | 9
Interactive Intelligence Group | 9
Motorola Solutions | 8
Porsche Automobil | 8
STATS | 8
Mitsubishi Electric | 8
InCube Labs | 8
BlackBerry | 8
Talkdesk | 7
ROBLOX | 7
Sohu.com | 7
Zya | 7

Source: GlobalData Patent Analytics

Microsoft is a leading patent filer in text-to-speech systems. One of the company’s patents focuses on multi-voice font interpolation that enables the creation of computer-generated speech with diverse speaker characteristics and prosody by combining existing fonts. A prediction model in the interpolation engine estimates parameters influencing speaker characteristics and prosody based on the phoneme sequence from the text. The engine generates additional parameter values through weighted interpolation, allowing modification of voice fonts to alter speech style and emotion while preserving the original voice's fundamental qualities. This technology facilitates transplanting speaker characteristics and prosody across voice fonts or generating entirely new attributes for existing voice fonts.  
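The weighted interpolation step can be sketched in a few lines. This is a minimal illustration of blending parameter values from existing voice fonts, not the patented method: the parameter names (`pitch_hz`, `duration_ms`), the font values, and the function `interpolate_fonts` are hypothetical, and the prediction model that would estimate such parameters from a phoneme sequence is left out.

```python
def interpolate_fonts(fonts, weights):
    """Blend parameter dictionaries from several voice fonts by normalised weights."""
    if len(fonts) != len(weights) or not fonts:
        raise ValueError("need one weight per font and at least one font")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Every font is assumed to expose the same parameter names.
    return {
        name: sum(w * font[name] for font, w in zip(fonts, weights)) / total
        for name in fonts[0]
    }


# Hypothetical per-phoneme parameters for two existing voice fonts.
calm = {"pitch_hz": 110.0, "duration_ms": 90.0}
excited = {"pitch_hz": 150.0, "duration_ms": 70.0}

if __name__ == "__main__":
    # Mostly calm, with a quarter of the excited style mixed in.
    print(interpolate_fonts([calm, excited], [0.75, 0.25]))
```

Shifting the weights toward one font moves the blended prosody toward that font's style, which is the intuition behind altering speech style and emotion by interpolation.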

Other prominent patent filers in the space include Toshiba and Sony Group.  

By geographic reach, Dolby Laboratories leads the pack, followed by Interactive Intelligence Group and 24/7 Customer. In terms of application diversity, Zya holds the top position, followed by Casio Computer and ROBLOX.  

Text-to-speech systems play a crucial role in enhancing accessibility by providing audio representation of written content, benefiting individuals with visual impairments or reading difficulties. Additionally, these systems offer a wide range of applications in areas such as voice assistants, language learning, entertainment, and automated customer service, improving user experiences and enabling efficient information consumption.

To further understand how artificial intelligence is disrupting the technology industry, access GlobalData’s latest thematic research report on Artificial Intelligence (AI) – Thematic Intelligence.


GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData’s Patent Analytics tracks patent filings and grants from official offices around the world. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.