The Future of NLG

July 30, 2020

Natural Language Generation (closely related to Natural Language Processing) is a field of AI which has been thrust to the forefront of the tech world in recent years by so-called ‘Virtual Assistants’ such as Apple’s ‘Siri’, Microsoft’s ‘Cortana’, and Amazon’s ‘Alexa’. This heightened interest in the field is a sure sign of the maturation of the hardware and software that drive it. To understand where NLG and NLP are heading in the future, it’s useful to know a little about their past.

We all remember early chatbots such as A.L.I.C.E., Dr. Sbaitso, and perhaps even the 1966 chatbot Eliza, but what is so special about this new generation of chatbots? All early chatbots worked in essentially the same way. ‘Eliza’ worked by matching user input against a set of trigger phrases and returning the canned response associated with each one; up until the mid-2000s this was as much as any chatbot was capable of. Of course, some were more holistically designed and contained cleverer routes of conversation than others, but the underlying mechanics remained essentially the same. In the early 2000s, bots such as SmarterChild used improved data-access algorithms, allowing for a greater scope of conversation topics, responses, and triggers, but this still does not fully reflect the NLG and NLP that we use today.
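To make that pattern-matching mechanism concrete, here is a minimal, hypothetical sketch in Python; the patterns and responses are made up for illustration, not taken from Eliza’s actual script, but the trigger-and-respond loop is the same idea.

```python
import re

# Illustrative rule table: each regex trigger maps to a canned response template.
# Eliza's real script was far larger, but the mechanism was essentially this.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bbecause\b", re.IGNORECASE), "Is that the real reason?"),
]
FALLBACK = "Please tell me more."

def respond(user_input: str) -> str:
    """Return the response for the first matching trigger, or a fallback."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            return template.format(*match.groups())
    return FALLBACK

print(respond("I am tired of work"))   # Why do you say you are tired of work?
print(respond("The weather is nice"))  # Please tell me more.
```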

Chatbots, and NLG along with them, have been undergoing a revolution over the last decade, driven by the widespread adoption of these technologies amongst consumers in the form of Virtual Assistant software. On the surface, the virtual assistant has many of the same properties as a chatbot. The key focus of these assistants, however, is utility. The first major difference is their capacity for Speech Recognition.

Major advances in Speech Recognition have coincided with the proliferation of Machine Learning techniques. A battle amongst the three main virtual assistant tech giants (Microsoft, Apple, and Amazon) has pushed Speech Recognition to a level of accuracy approaching 100%. Considering the hurdles Speech Recognition faces, this is a truly remarkable feat. To name a few: distinguishing the user’s voice from background noise, processing the wide variety of accents and dialects (not to mention the hundreds of world languages), and picking out the correct homonyms and homophones. It was only through the power of machine learning that such a high degree of accuracy became achievable.

Machine Learning was defined by Arthur Samuel in 1959 as the “field of study that gives computers the ability to learn without being explicitly programmed.” More recently (1998), Tom Mitchell described it in more practical terms as a system which can “improve on Task T with respect to performance metric P based on experience E.” An early example is the categorization of spam email, where T = categorizing emails as spam or legitimate, P = the percentage of emails correctly classified, and E = a database of emails. This does, of course, also require human-labelled examples to learn from. Machine learning, just like human learning, is not a process that happens in a vacuum; training is required. This training can be very time- and resource-intensive, and some very clever methods have been developed to create the vast databases and libraries of data needed to teach these systems.
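To make the T/P/E framing concrete, here is a minimal sketch of a spam classifier, assuming scikit-learn is available; the toy emails and labels below are illustrative stand-ins for a real labelled database.

```python
# A minimal sketch of Mitchell's T/P/E framing applied to spam filtering.
# Assumes scikit-learn is installed; the toy emails stand in for E, and a real
# system would train on a far larger human-labelled corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# E: experience, a (tiny, illustrative) database of human-labelled emails.
emails = [
    "win a free prize now", "limited offer, claim your reward",
    "meeting moved to 3pm", "please review the attached report",
]
labels = ["spam", "spam", "legitimate", "legitimate"]

# T: the task, classifying emails as spam or legitimate.
vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(emails), labels)

# P: the performance metric, the percentage of emails correctly classified.
test_emails = ["claim your free prize", "the report is attached for review"]
test_labels = ["spam", "legitimate"]
predictions = model.predict(vectorizer.transform(test_emails))
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(f"P = {accuracy:.0%} of test emails correctly classified")
```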

Almost everyone with access to the internet has been, whether they know it or not, training machine learning systems for years through online CAPTCHA verification. To protect against malicious bots, validation is required to differentiate between human and computer. Simple tasks, such as reading distorted words, were initially sufficient for this. First used purely for security, CAPTCHA took on a new role when reCAPTCHA was deployed by Google in 2009, being used to digitize the entirety of Google Books by 2011. Since then it has gone on to compile vast data banks to be used for image recognition, using images such as road signs and shop fronts. It is this kind of training that NLG requires to create content which does not rely on simple templating with interchangeable words/phrases; all that is required are the right data banks to train on.
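For contrast, “simple templating with interchangeable words/phrases” amounts to something like the following sketch; the templates and data fields are hypothetical and purely for illustration.

```python
import random

# Hypothetical slot-filling templates for a market-summary sentence. Real
# template-based NLG systems use far larger template banks with grammar rules,
# but the basic mechanic of interchangeable words/phrases is the same.
TEMPLATES = [
    "{ticker} {verb} {change}% to close at {price}.",
    "Shares of {ticker} closed at {price}, {direction} {change}% on the day.",
]

def render(data: dict) -> str:
    """Fill a randomly chosen template with values from structured data."""
    return random.choice(TEMPLATES).format(**data)

# Illustrative structured input, e.g. one row of end-of-day market data.
print(render({
    "ticker": "ACME", "price": 41.20, "change": 2.3,
    "verb": "rose", "direction": "up",
}))
```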

End-to-end Natural Language Generation uses machine learning and neural networks to create NLG outputs with almost no human interference in the process. This approach has had some degree of success, although only in projects of fairly limited scope. Creating more complex automated NLG systems using neural networks and machine learning can be problematic: one only needs to look at the latest generation of neural-network and deep-learning based chatbots to see the problems which can occur, such as incorrect word associations, repetitive loops, and syntactical errors.
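The repetitive-loop failure mode, in particular, is easy to reproduce even without a neural network. The toy sketch below is an assumption-laden stand-in for a real learned generator: a bigram model over a tiny made-up corpus, decoded greedily, showing how always choosing the locally most probable next word can trap the output in a cycle.

```python
from collections import Counter, defaultdict

# Toy stand-in for a learned generator: bigram counts over a tiny made-up
# corpus, decoded greedily. Real end-to-end systems use neural networks, but
# the failure mode (always picking the locally most likely next word) is what
# produces the repetitive loops described above.
corpus = "the market rose and the market fell and the market rose again".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def generate(start: str, length: int = 12) -> str:
    """Greedily extend `start` one word at a time using bigram counts."""
    words = [start]
    for _ in range(length):
        counts = bigrams.get(words[-1])
        if not counts:
            break
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))
# On this corpus the output cycles: "the market rose and the market rose and ..."
```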

In the business world, NLG is fast becoming a workhorse of content generation. Forbes declared NLG to be one of 2017’s hottest trends, while Kristian Hammond, Narrative Science’s co-founder, estimates that 90% of news could be algorithmically generated by the mid-2020s, much of it without human intervention. The applications are as innumerable as they are invaluable. The ability to harness the vast quantity of structured and unstructured data produced each day (a huge 25 million terabytes) has the potential to change the face of everything from media to finance to education.

Looking to the future, we at Agrud hope to combine traditional NLG methodology (carefully constructed templates and rigorous human testing) with machine-learning-based NLG to create a platform which is as flexible as it is robust. As well as providing broad market analysis for a mass audience, Agrud is striving to create content tailored to individuals. The possibilities of using NLG to tailor information and news to small audiences have been explored in other sectors; in the financial world, however, the potential of NLG has not yet been fully realised.