On his final earnings call as chief executive of gene sequencing company Illumina, Francis deSouza did his best to stay positive.
A contentious $8bn takeover of cancer screening business Grail had prompted a campaign by activist investor Carl Icahn, fights with competition authorities on both sides of the Atlantic, and criticism from Grail’s founding directors.
DeSouza told analysts the drama was only affecting “a very small part of the company”.
But each time he was asked about Grail, there were shifts in his speech rate, pitch and volume, according to Speech Craft Analytics, which uses artificial intelligence to analyse audio recordings. There was also an increase in filler words like “um” and “ah” and even an audible gulp.
The combination "betrays signs of anxiety and tension specifically when addressing this sensitive issue", according to David Pope, chief data scientist at Speech Craft Analytics.
DeSouza resigned less than two months later.
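For a flavour of what such tools measure, here is a minimal sketch in Python using the open-source librosa library. The file name and the choice of features are illustrative assumptions, not a description of Speech Craft Analytics' actual pipeline.

```python
# Sketch: extracting pitch, volume and a speech-rate proxy from a recording.
# Assumes a local WAV file of an earnings call (hypothetical path).
import librosa
import numpy as np

audio, sr = librosa.load("earnings_call.wav", sr=16000)

# Fundamental frequency (pitch) via probabilistic YIN; NaN where unvoiced
f0, voiced_flag, voiced_probs = librosa.pyin(
    audio, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C5"), sr=sr
)
pitch_mean = np.nanmean(f0)   # average pitch of voiced frames, in Hz
pitch_std = np.nanstd(f0)     # pitch variability, one crude tension proxy

# Volume via root-mean-square energy per frame
volume_mean = float(librosa.feature.rms(y=audio)[0].mean())

# Rough speech-rate proxy: share of frames classified as voiced
voiced_share = float(np.mean(voiced_flag))

print(f"pitch {pitch_mean:.1f} Hz (sd {pitch_std:.1f}), "
      f"volume {volume_mean:.4f}, voiced share {voiced_share:.1%}")
```

Shifts in measures like these between an executive's answers on different topics are the raw material for the kind of analysis Pope describes.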
The idea that audio recordings could provide tips on executives’ true emotions has caught the attention of some of the world’s largest investors.
Many funds already use algorithms to trawl through transcripts of earnings calls and company presentations to glean signals from executives’ choice of words — a field known as natural language processing, or NLP. Now they are trying to find further messages in the way those words are spoken.
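As a rough illustration of the transcript side, the sketch below scores a few invented sentences with FinBERT, a publicly available finance-tuned sentiment model, run through Hugging Face's transformers pipeline; production quant systems are considerably more elaborate.

```python
# Sketch: sentence-level sentiment on earnings-call language.
# FinBERT labels each sentence positive, negative or neutral.
from transformers import pipeline

classifier = pipeline("text-classification", model="ProsusAI/finbert")

sentences = [  # invented examples, not from any real call
    "We delivered record revenue and expanded our margins this quarter.",
    "The regulatory dispute is affecting only a very small part of the company.",
]

for sentence, result in zip(sentences, classifier(sentences)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {sentence}")
```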
“The idea is that audio captures more than just what is in text,” said Mike Chen, head of alternative alpha research at Robeco, the asset manager. “Even if you have a sophisticated semantic machine, it only captures semantics.”
Hesitation and filler words tend to be left out of transcripts, and AI can also pick up some “microtremors” that are imperceptible to the human ear.
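The filler-word side is simple enough to show in a few lines. The toy function below counts hesitation markers in a raw, unedited transcript, the kind of detail a polished transcript would hide; the snippet of speech is invented.

```python
# Toy illustration: hesitation markers per 100 words in verbatim speech.
import re

FILLERS = {"um", "uh", "ah", "er"}

def filler_rate(verbatim_text: str) -> float:
    """Fillers per 100 words in a raw, unedited transcript."""
    words = re.findall(r"[a-z']+", verbatim_text.lower())
    hits = sum(1 for word in words if word in FILLERS)
    return 100 * hits / max(len(words), 1)

raw = "Well, um, Grail is, ah, only a, uh, very small part of the company."
print(f"{filler_rate(raw):.1f} fillers per 100 words")
```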
Robeco, whose more than $80bn in algorithmically driven funds makes it one of the world’s largest quantitative investors, began adding audio signals picked up through AI into its strategies earlier this year. Chen said the signals had added to returns, and that he expected more investors to follow suit.
The use of audio represents a new level in the game of cat and mouse between fund managers and executives.
“We found tremendous value from transcripts,” said Yin Luo, head of quantitative research at Wolfe Research. “The problem that has created for us and many others is that overall sentiment is becoming more and more positive . . . [because] company management knows their messages are being analysed.”
Multiple research papers have found that presentations have become increasingly positive since the emergence of NLP, as companies adjust their language to game the algorithms.
A paper co-written by Luo earlier this year found that combining traditional NLP with audio analysis was an effective way to differentiate between companies as their filings become increasingly “standardised”.
Although costs have come down, the approach can still be relatively expensive. Robeco spent three years building new technology infrastructure before it even began work on incorporating audio analysis.
Chen spent years trying to use audio before joining Robeco, but found the technology was not advanced enough. And while the insights available have improved, there are still limitations.
To avoid jumping to conclusions based on different personalities — some executives may just be more naturally effusive than others — the most reliable analysis comes from comparing different speeches by the same individual over time. But that can make it harder to judge the performance of a new leader — arguably a time when insight would be particularly useful.
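In practice, that within-speaker comparison amounts to scoring each new call against an executive's own baseline. A minimal sketch, with invented pitch-variability numbers for a hypothetical chief executive:

```python
# Sketch: z-scoring a new call against the same speaker's history.
import numpy as np

# Pitch variability (std dev of f0, in Hz) from eight previous calls
history = np.array([18.2, 17.5, 19.0, 18.8, 17.9, 18.4, 19.2, 18.1])
latest = 24.6  # the new call to be judged

# How unusual is the latest call relative to this speaker's own norm?
z = (latest - history.mean()) / history.std(ddof=1)
print(f"latest call sits {z:+.1f} standard deviations from baseline")
```

A newly appointed chief executive has no such history, which is exactly the gap described above.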
“A limitation even in NLP is that a CEO change messes up the overall sentiment [analysis],” said one executive at a company that provides NLP analysis. “This disruption effect has got to be stronger with voice.”
Developers must also avoid adding their own biases into algorithms based on audio, where differences such as gender, class or race can be more obvious than in text.
“We are very careful in making sure the conscious biases that we’re aware of don’t make it in, but there could still be subconscious ones,” said Chen. “Having a large and diverse research team at Robeco helps.”
Algorithms can give misleading results if they try to analyse someone speaking in a non-native language, and an interpretation that works in one language may not work in another.
Just as companies have adapted to text analysis, Pope predicted investor relations teams would start coaching executives to control voice tone and other behaviour that transcripts miss. Voice analysis can be fooled by trained actors who stay convincingly in character, but executives may find that harder to replicate.
“Very few of us are good at modulating our voice,” he said. “It’s much easier for us to choose our words carefully. We’ve learned to do this since we were very young to avoid getting in trouble.”