Over 1,000 researchers, developers, data scientists, entrepreneurs, managers and over 50 talks – that was the AI for Human Language conference on March 5th, 2020 in Tel Aviv.
With the aim of discovering the latest trends, collecting ideas and having interesting discussions with other NLP enthusiasts, our data scientists Jonathan and Simon attended the AI for Human Language conference in Israel.
Speakers from well-known AI research giants such as Intel, Google and IBM, from smaller, specialized "grown-up" start-ups and from academic research offered a varied program. Talks covered both the background of new MS Office features and the pitfalls of the fiercely competitive translation market, which is increasingly based on machine learning and NLP models. These topics were presented in great technical detail and complemented by academic presentations on the latest insights into the "Sesame Street" generation of deep learning-based language models (ELMo, BERT).
Progress around conversational systems or chatbots
Next steps for the NLP community with deep learning language models
Interest and potential of NLP applications in the health sector
Conversational systems interact with customers in customer service, answer questions or enable self-service via natural language as a 'universal' interface. These systems present NLP researchers with interesting challenges, open up new ways of customer interaction for companies and are well received by customers. Accordingly, interest in improving all aspects of conversational systems is high.
By now, projects like RASA and services like the MS Bot Framework significantly simplify the development of bots with basic capabilities. In particular, they speed up work on the trainable machine learning components for language understanding. In this area, service providers offer domain-specific pre-training, e.g. to optimize intent recognition. At the same time, these providers are experimenting with techniques for augmenting the – always scarce – training data (see below).
However, it remains a great challenge to design chatbots flexibly enough that they can help their human users in situations that are rare or deviate from the anticipated course.
Another major topic of discussion was the success of deep learning language models, especially the transformer-based ones like BERT. After all, they achieve astonishing results in the most diverse areas from NLU to NLG, text classification, question answering and especially in numerous NLP competitions and benchmarks.
Besides efforts to better understand the mechanisms behind the models and their limitations, the NLP community is also working on integrating transformer models with 'classical' knowledge databases, ontologies and/or rule-based systems. The aim is not only to improve the results of the models, but also to extend these classical data sources with the help of the learned models. The merging of these two approaches even has the potential to become the next development milestone in NL/AI systems.
In terms of application areas, the health sector was the focus. On the one hand, there is great potential here; on the other hand, there are specific obstacles for NLP-supported solutions. Such solutions offer, for example, the chance to simplify administrative processes and reduce costs. In addition, they can make the relevant information from the often unstructured or only partially structured files, guidelines or knowledge databases more easily accessible to medical staff and support them in the selection of treatment strategies.
Yet it is precisely hospitals and medical practices that are sparsely digitised. The domain is highly complex and sensitive, data is unstructured, strictly confidential and peppered with terminology. In addition, less training data is available than in other domains. At the same time, medical professionals working with patients tend to be more sceptical of automated systems than average and are easily put off by model errors.
The solution may be to involve doctors and experts even more in the development processes. On the one hand, this increases trust in and acceptance of new procedures; on the other hand, their expertise can and must be incorporated more strongly alongside the existing training data, e.g. for plausibility checks or as 'emergency brakes'.
LAMBADA deals with data augmentation in the context of conversational systems and creates a small paradox: improving intent classification by (actually much more sophisticated) text generation.
If only a few labeled statements per intent are available for training an NLU model – a common problem in day-to-day development – improvements are still possible. To achieve this, new statements are generated with a specially fine-tuned GPT-2 model and, after some heuristic filtering steps, added to the training set for the NLU model. When training intent classifiers with few utterances (
To make conversational systems more user-friendly, Domain Exploration extends a bot's capabilities so that it can deliver information that the user has not explicitly asked for. For this purpose, the system learns what information fits the previous conversation and could be of interest to the user.
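The generate-filter-add loop described above can be sketched roughly as follows. This is only an illustration under strong assumptions: `toy_generate` is a hypothetical stand-in for the fine-tuned GPT-2 model, the keyword-count classifier is a toy replacement for a real NLU model, and the intent names, utterances and the `min_conf` threshold are invented for the example.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (utterance, intent)

def keyword_scores(train: List[Example]) -> Dict[str, Counter]:
    """Count words per intent; a toy stand-in for a trained intent classifier."""
    scores: Dict[str, Counter] = {}
    for text, intent in train:
        scores.setdefault(intent, Counter()).update(text.lower().split())
    return scores

def predict(scores: Dict[str, Counter], text: str) -> Tuple[str, float]:
    """Return the best-matching intent and a crude confidence in [0, 1]."""
    words = text.lower().split()
    totals = {intent: sum(c[w] for w in words) for intent, c in scores.items()}
    best = max(totals, key=totals.get)
    total = sum(totals.values())
    return best, (totals[best] / total if total else 0.0)

def augment(train: List[Example],
            generate: Callable[[str, int], List[str]],
            per_intent: int = 3,
            min_conf: float = 0.6) -> List[Example]:
    """Generate candidates per intent, keep only those the classifier
    (trained on the original data) confidently assigns to that intent."""
    scores = keyword_scores(train)
    augmented = list(train)
    seen = {text for text, _ in train}
    for intent in sorted({i for _, i in train}):
        for candidate in generate(intent, per_intent):
            predicted, conf = predict(scores, candidate)
            if candidate not in seen and predicted == intent and conf >= min_conf:
                augmented.append((candidate, intent))
                seen.add(candidate)
    return augmented

def toy_generate(intent: str, n: int) -> List[str]:
    """Hypothetical generator; a real setup would sample from a language model."""
    canned = {
        "check_balance": ["show my account balance", "what is my balance today", "book a flight"],
        "book_flight": ["book a flight to berlin", "i want to book a flight", "show my balance"],
    }
    return canned[intent][:n]

train = [
    ("what is my balance", "check_balance"),
    ("show balance of my account", "check_balance"),
    ("book a flight", "book_flight"),
    ("i need a flight to paris", "book_flight"),
]
augmented = augment(train, toy_generate)
```

In this toy run, the off-topic candidates are rejected by the filter, while the on-topic ones are added to the training set. The filtering step is what keeps noisy generations from degrading the classifier.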
One solution to the challenges of NLP in healthcare described above is hybrid algorithms. Ensemble models, i.e. combinations of sub-models, can, for example, handle classes with little training data using rule-based or knowledge-based methods and, for classes with enough training data, use the newer deep learning methods, which usually generalize better in that setting. This allows greater control over the entire model and uses the existing knowledge in a targeted way to strengthen the weak points of the models.
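A minimal sketch of such a hybrid routing scheme might look like this. All names here are assumptions made for illustration: the class labels, the keyword rule, the `min_examples` threshold and the stub `model` function are hypothetical and do not come from any system presented at the conference.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

def make_hybrid(train_labels: List[str],
                rules: Dict[str, Callable[[str], bool]],
                model: Callable[[str], str],
                min_examples: int = 50) -> Callable[[str], Tuple[str, str]]:
    """Build a classifier that uses hand-written rules for classes with
    little training data and falls back to a learned model otherwise."""
    counts = Counter(train_labels)

    def classify(text: str) -> Tuple[str, str]:
        for label, rule in rules.items():
            # Rules only take over for sparsely represented classes.
            if counts[label] < min_examples and rule(text):
                return label, "rule"
        return model(text), "model"

    return classify

# Hypothetical setup: one frequent class, one rare class with an expert rule.
train_labels = ["routine_followup"] * 200 + ["rare_allergy"] * 3
rules = {"rare_allergy": lambda t: "anaphylaxis" in t.lower()}

def model(text: str) -> str:
    """Stub for a trained deep learning classifier."""
    return "routine_followup"

classify = make_hybrid(train_labels, rules, model)
rare = classify("Patient reports anaphylaxis after penicillin")
common = classify("Scheduled check-up, no complaints")
```

Returning the source of each decision ("rule" or "model") alongside the label is one way to keep the combined system auditable, which matters in a domain where experts need to trust the output.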
Too much trust in deep learning language models can also be misleading: If test data can be explained by simple heuristics, even complex neural architectures will only learn simple heuristics and not meaningful, generalizing inference rules.
A simple heuristic would be that every sentence that is contained as a complete, contiguous word sequence in a true training sentence is itself true. The following case shows where this fails:
It is wrong to conclude "The actor danced." from "The doctor near the actor danced." Yet this is exactly the error a standard BERT model makes when trained on the natural language inference dataset MNLI. It shows how important it is not to accept deep learning models as a black box, but to investigate and explain how they work.
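The heuristic itself is easy to write down, which makes the failure concrete. The following sketch is not how BERT works internally; it only implements the shortcut rule described above, with simple whitespace tokenization as an assumption.

```python
from typing import List

def tokens(sentence: str) -> List[str]:
    """Lowercase, drop a trailing period and split on whitespace."""
    return sentence.lower().rstrip(".").split()

def heuristic_entails(premise: str, hypothesis: str) -> bool:
    """Shortcut rule: predict 'entailed' whenever the hypothesis appears
    as a contiguous word sequence inside the premise."""
    p, h = tokens(premise), tokens(hypothesis)
    return any(p[i:i + len(h)] == h for i in range(len(p) - len(h) + 1))

premise = "The doctor near the actor danced."
wrong_call = heuristic_entails(premise, "The actor danced.")
no_match = heuristic_entails(premise, "The actor sang.")
```

Here `wrong_call` is `True` although the premise says that the doctor danced, not the actor. A model that has only picked up this surface pattern will make the same mistake.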
In many areas, NLP systems are already good enough to simplify frequent everyday challenges. Examples of well understood use cases are machine translation or text classification. But solutions are usually very specifically tailored to a use case, and there is still a long way to go before truly generalized, learning systems can be developed.
The integration of externally processed knowledge, in the form of knowledge databases, ontologies or hand-crafted rules, seems to be the next stage of development. Steadforce is looking forward to the new possibilities and challenges.