Virtual assistant at NEULA – results of R&D work
More than two years have passed since the start of the R&D project carried out by JT Weston with EU funding. Our research team, operating mainly in the capital of Lower Silesia, used this time very effectively and reached the planned milestones.
The aim of the project was to develop an intelligent virtual assistant module that makes it possible to use unstructured messages (written and verbal) to support processes embedded in the NEULA platform. The work was divided into three stages:
- Recognition of messages introduced through different channels,
- Completing forms in NEULA,
- Filling missing content.
We present a summary of the main results of this work.
Recognition of messages introduced through various channels
The first year of the project was devoted mainly to developing a mechanism for loading data from various communication channels and processing them into a uniform record. The goal was to automatically assign data to the appropriate processes.
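To illustrate what "processing into a uniform record" can look like, here is a minimal Python sketch. The `UnifiedMessage` record, the adapter functions and the payload keys (`from`, `subject`, `body`, `user`, `message`, `caller`, `transcript`) are all hypothetical names chosen for this example; they are not taken from the NEULA platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedMessage:
    """Hypothetical uniform record shared by all channels."""
    channel: str  # e.g. "email", "chatbot", "voicebot"
    sender: str
    text: str


def from_email(raw: dict) -> UnifiedMessage:
    # Assumed email payload: subject and body are joined into one text.
    text = f"{raw.get('subject', '')}\n{raw.get('body', '')}".strip()
    return UnifiedMessage("email", raw.get("from", ""), text)


def from_chat(raw: dict) -> UnifiedMessage:
    return UnifiedMessage("chatbot", raw.get("user", ""), raw.get("message", "").strip())


def from_voice(raw: dict) -> UnifiedMessage:
    # Assumes speech has already been transcribed to text upstream.
    return UnifiedMessage("voicebot", raw.get("caller", ""), raw.get("transcript", "").strip())


ADAPTERS = {"email": from_email, "chatbot": from_chat, "voicebot": from_voice}


def normalize(channel: str, raw: dict) -> UnifiedMessage:
    """Convert a channel-specific payload into the uniform record."""
    try:
        adapter = ADAPTERS[channel]
    except KeyError:
        raise ValueError(f"unsupported channel: {channel}") from None
    return adapter(raw)
```

With one adapter per channel, everything downstream (classification, extraction) only has to deal with a single record type.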
First, we analyzed the quality of solutions available on the market for extracting text from photos and PDF files (i.e. OCR, Optical Character Recognition). The analysis showed that the algorithms used are very accurate when extracting data from files containing only text. However, their accuracy drops significantly when the text is combined with graphics or tables. In addition, the presence of handwriting (such as signatures or notes) or low-quality text (such as stamps) further degrades the effectiveness of OCR.
To improve the quality of OCR, our team designed an algorithm for extracting tables from PDF files. It works by building a document graph: fragments of the content returned by the OCR algorithm are placed in the nodes of the graph and connected to each other according to their relative position and distance. The assignment of OCR text to PDF content reached 90.91% accuracy.
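The document-graph idea can be sketched roughly as follows. This is not the project's algorithm, only a simplified illustration under our own assumptions: each OCR fragment is reduced to its centre point, fragments closer than a hypothetical `max_dist` threshold are connected, connected components are treated as candidate tables, and rows are formed by a hypothetical vertical tolerance `row_tol`.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Fragment:
    """One piece of OCR output, reduced to its text and centre point (assumed)."""
    text: str
    x: float
    y: float


def build_graph(fragments, max_dist):
    """Connect every pair of fragments whose centres lie within max_dist."""
    edges = {i: set() for i in range(len(fragments))}
    for i, a in enumerate(fragments):
        for j in range(i + 1, len(fragments)):
            b = fragments[j]
            if hypot(a.x - b.x, a.y - b.y) <= max_dist:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def connected_components(edges):
    """Return groups of mutually reachable fragment indices."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in edges[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(sorted(comp))
    return components


def component_rows(fragments, component, row_tol):
    """Split one component into rows (similar y), each sorted left to right."""
    rows = []
    for i in sorted(component, key=lambda k: fragments[k].y):
        if rows and abs(fragments[rows[-1][0]].y - fragments[i].y) <= row_tol:
            rows[-1].append(i)
        else:
            rows.append([i])
    return [[fragments[i].text for i in sorted(r, key=lambda k: fragments[k].x)]
            for r in rows]
```

A real implementation would work with full bounding boxes and more careful edge rules; the sketch only shows how position and distance can turn loose OCR fragments into a table-like structure.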
Next, we carried out research on text classification in order to assign incoming messages (written or voice) to the appropriate processes. The developed algorithm achieved an accuracy of 92%.
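The source does not say which classification method was used. As a stand-in, the sketch below shows one common baseline for routing messages to processes: a multinomial naive Bayes classifier with Laplace smoothing, written with the Python standard library only. The class name `NaiveBayesRouter` and process labels such as `leave_request` are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesRouter:
    """Baseline text classifier; a stand-in, not the project's algorithm."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        """Return the label with the highest log-probability, or None if untrained."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Trained on a handful of labelled messages per process, such a router returns the process label whose vocabulary best matches the incoming text.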
Completing forms in NEULA
The next stage of the project was devoted to developing a mechanism for information extraction and automatic completion of process forms on the NEULA platform. The R&D team developed an algorithm that extracts the mandatory fields from forms used in processes. The algorithm was then extended with a mechanism for retrieving additional information about form fields, such as the type of each field, the range of acceptable values and descriptions. The resulting algorithm achieved 100% accuracy on all tested processes.
Further research focused on creating a mechanism for extracting data from various communication channels. A method of communication was developed between NEULA and the chatbot, the voicebot, content contained in PDF files, and emails. The complexity and variety of these channels required an independent solution for each of them. Reading accuracy reached 91% for the chatbot, 80% for the voicebot, 88% for PDF documents and 97% for email.
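As an illustration of extracting mandatory fields together with their metadata, consider the sketch below. The form definition format (a dictionary with a `fields` list and keys such as `required`, `min`, `max`) is an assumption made for this example and does not describe how NEULA actually stores forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldInfo:
    """Metadata kept for one mandatory form field (hypothetical structure)."""
    name: str
    type: str
    description: str = ""
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def mandatory_fields(form_definition: dict) -> list:
    """Collect fields marked as required, with type, range and description."""
    result = []
    for raw in form_definition.get("fields", []):
        if not raw.get("required", False):
            continue
        result.append(FieldInfo(
            name=raw["name"],
            type=raw.get("type", "string"),
            description=raw.get("description", ""),
            minimum=raw.get("min"),
            maximum=raw.get("max"),
        ))
    return result


def is_valid(field: FieldInfo, value) -> bool:
    """Check an extracted value against the field's type and allowed range."""
    if field.type == "number":
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if field.minimum is not None and value < field.minimum:
            return False
        if field.maximum is not None and value > field.maximum:
            return False
        return True
    if field.type == "string":
        return isinstance(value, str) and value.strip() != ""
    return True  # other types are not checked in this sketch
```

Knowing the type and acceptable range of each field makes it possible to reject an implausible value read from an email or a voice transcript before it reaches the form.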
Filling missing content
As part of the last stage of the research, work was carried out on a mechanism that determines the information required for a process in NEULA to proceed, analyzes the available data and asks the user to supply whatever is missing. Six processes implemented on the NEULA platform were used. For each process, a predictive model based on machine learning and artificial intelligence algorithms was developed to predict form field values. In addition, an algorithm for determining the course of the process was built. It predicts the most likely course of even very complex processes.
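Two of the ideas above can be sketched in a few lines of Python. The first part finds required fields that are still empty and turns them into questions for the user. The second part is a deliberately simple stand-in for course prediction: it counts step-to-step transitions in past process runs and greedily follows the most frequent one. The models actually built in the project are not described in detail here, and all function names and step names below are hypothetical.

```python
from collections import Counter, defaultdict


def missing_fields(required, collected):
    """Return required field names that have no usable value yet."""
    missing = []
    for name in required:
        value = collected.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


def build_prompts(missing, descriptions):
    """Turn missing field names into questions, using descriptions if known."""
    return [f"Please provide: {descriptions.get(name, name)}" for name in missing]


def fit_transitions(histories):
    """Count how often each step was followed by each next step."""
    counts = defaultdict(Counter)
    for path in histories:
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return counts


def greedy_path(counts, start, max_steps=20):
    """Follow the most frequent transition from each step (greedy heuristic)."""
    path = [start]
    while path[-1] in counts and len(path) <= max_steps:
        # Sorting first makes ties resolve alphabetically and deterministically.
        nxt, _ = max(sorted(counts[path[-1]].items()), key=lambda kv: kv[1])
        if nxt in path:  # stop instead of looping forever
            break
        path.append(nxt)
    return path
```

The greedy walk is only a heuristic: it picks the locally most frequent next step and does not guarantee the globally most probable path through a complex process.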
What’s next?
Work is currently underway to integrate the developed solutions with the NEULA platform. Once it is complete, the new solution will be ready for its first implementations with existing and new users of the NEULA platform. They will be the first to use the technology built at JT Weston, which is unique at least on a European scale.
The project is co-financed by the European Union under Measure 1.1 "R&D projects of enterprises", Sub-measure 1.1.1 "Industrial research and development work carried out by enterprises" (the so-called Fast Track) of the Smart Growth Operational Program for 2014-2020.