An Enterprise Solution for Using Text Mining and Predictive Analytics

A Reflective Assessment of the Use of Text Analytics in an Enterprise IT Environment

Jun 20, 2019

Figure 1: Content Management System (QueryCare 2019)

Introduction

Within any large enterprise organisation, there is an IT department which services the rest of the company with any computer-related or technological enquiries. However, the process of requesting assistance from the IT department, by way of telephone enquiry or online ticket submission, can be somewhat convoluted, confusing, and even frustrating. One way to improve this is to have a search module, whereby the user can type a few words, and the system will return a list of knowledge articles for the user’s consideration. This can be incredibly useful, and may mean the user does not need to submit an enquiry to the IT department at all. The details behind this program need to be optimised for speed and accuracy, because ultimately the goal is to increase the efficiency and effectiveness of each employee within the organisation.

Some Background

Working as a white-collar employee in a large company, there are inevitably times when I need to ask IT for something. This may mean I’ll need to call the IT Service Desk for something like installing new software, password issues, or a system outage of some sort. These types of requests understandably require human intervention and thus need to be actioned over the phone. However, there are also times when I need to contact the IT team for low-priority issues, like ‘how-to’ enquiries, requests to set up new email distribution lists, or to see which system deployments are scheduled and what issues they are resolving. For these types of requests, the business requires the submission of a service-request ticket which will be actioned within 3–7 business days. This is a typical scenario, and is in line with the service-level agreements and prioritisation matrices of any enterprise IT department.

However, when my enquiry is fairly simple, one which may only take 10 minutes to fix (for example: ‘how can I set up a new printer?’ or ‘how do I run a particular report?’), and I am then told that I need to wait 3–7 business days, this can get frustrating. Moreover, the process of submitting the ticket via the online ticketing system can be convoluted, confusing, and equally frustrating. Think of the last time you researched how to do something on the internet. When have you needed to submit a ticket to Google and wait 3–7 business days for a response? The answer is that this has never happened. Therefore, I ask myself, how can this be better in my workplace?

Figure 2: Key Phrase Extraction and Analysis (Azure 2019)

The Solution

One solution that I believe will provide a lot of benefit to the business would be to streamline the integration between the IT ‘Ticket Submission’ page and the Business ‘Knowledge Bank’ repositories. Specifically, to implement a method of predictive analytics based on text input, in a similar way to how Google predicts pages based on only a few typed words. The text-mining algorithm behind this prediction should take each word typed into a search bar, compare those words against each page (whether it be an intranet page, internal wiki page, work instruction, or internal blog page), and return to the user a list of articles ordered by how useful each is likely to be for the user’s enquiry. Effectively, it directs the user to the most appropriate place to answer their question, before they even put the question to the greater IT team. However, if the user’s question cannot be answered from the listed articles, then they would still be able to submit a Ticket via the usual means.
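To make the idea concrete, the ranking step described above can be sketched with a simple TF-IDF score. This is only an illustration under stated assumptions: TF-IDF is one possible scoring method and is not named in this article, and the function names (`tokenize`, `rank_articles`) and the three sample articles are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_articles(query, articles):
    """Return (title, score) pairs, best match first, using a TF-IDF score.

    `articles` is a hypothetical mapping of article title -> body text.
    Articles that share no words with the query are left out.
    """
    doc_tokens = {title: tokenize(body) for title, body in articles.items()}
    n_docs = len(doc_tokens)

    # Document frequency: how many articles contain each term.
    df = Counter()
    for tokens in doc_tokens.values():
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = {}
    for title, tokens in doc_tokens.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term in tf:
                # Smoothed inverse document frequency: rarer terms weigh more.
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
                score += (tf[term] / len(tokens)) * idf
        scores[title] = score

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(title, score) for title, score in ranked if score > 0]


# Hypothetical knowledge articles, for illustration only.
articles = {
    "Set up a new printer": "How to add and set up a network printer on your laptop",
    "Reset your password": "Steps to reset a forgotten password for your account",
    "Run the sales report": "How to run the monthly sales report in the reporting tool",
}

results = rank_articles("set up printer", articles)
```

If `results` comes back empty, the page would simply show the usual Ticket form, which keeps the fallback described above intact.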

The Specifics

In order to have this solution implemented to its maximum efficiency, there are three prerequisites which need to be considered:

Sufficient number of documents and articles: A document prediction algorithm is useless unless there is a substantial amount of material for it to query. Therefore, there need to be enough how-to articles written, along with standardised work instructions, standard operating procedures, and supporting documents which can be analysed and returned to the user upon request.
Optimised, indexed, and networked repositories: Each of these articles needs to reside in a networked location, accessible by everyone within the business. Furthermore, the repositories need to be optimised and indexed in such a way as to speed up the query and return the results to the user quickly. It is pointless if the user needs to wait 30 seconds for the query to run; it should be nearly instantaneous. If Google is able to assess millions of pages in milliseconds, then it is entirely possible for an internal algorithm to analyse thousands of pages within seconds.
Robust algorithm enhanced for speed and accuracy: The chosen algorithm needs to be robust enough to assess term importance across all of the networked repositories and return the most relevant documents for the user. Further business understanding is needed to determine whether to choose a Clustering/Hierarchical algorithm, a Latent Dirichlet Allocation algorithm, or a Convolutional Neural Network. However, suffice it to say, unless the algorithm is quick and accurate, the users will not want to use it.
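As a rough sketch of what ‘indexed’ could mean in the second prerequisite, an inverted index maps each word to the articles that contain it, so a query only touches the articles that share at least one word with it rather than scanning every page. The inverted index is an assumption chosen for illustration, and the helper names (`build_index`, `candidates`) and sample articles are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(articles):
    """Map each term to the set of article titles that contain it.

    Built once, ahead of time, so that no article text is read at query time.
    """
    index = defaultdict(set)
    for title, body in articles.items():
        for term in tokenize(title + " " + body):
            index[term].add(title)
    return index


def candidates(query, index):
    """Return titles sharing at least one term with the query, most matches first."""
    hits = defaultdict(int)
    for term in set(tokenize(query)):
        for title in index.get(term, ()):
            hits[title] += 1
    # Sort by number of matching terms (descending), then by title for stable output.
    return sorted(hits, key=lambda title: (-hits[title], title))


# Hypothetical knowledge articles, for illustration only.
articles = {
    "Set up a new printer": "Add a network printer to your laptop",
    "Reset your password": "Steps to reset a forgotten password",
    "Request a new distribution list": "Submit a request for a new email distribution list",
}

index = build_index(articles)
shortlist = candidates("new printer", index)
```

The shortlist could then be passed to whichever ranking algorithm is eventually chosen, so that the more expensive scoring only runs over a handful of articles.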

Ultimately, the proposed solution should be set up in such a way as to empower the user to find the answer for themselves, if the enquiry is simple enough. However, if the user is not able to find their answer in the documents, then the option to submit an enquiry should still be available.

Figure 3: Enterprise Content Management (WebTown 2019)

The Difference

So, why is this solution different to a chat bot? Essentially, the solution is not set up to have a conversation with the user. The assumption is made that the user does not want to ask four or five questions in order to find the answer they are looking for, and that they do not have the spare time to have a discussion with a robot. The Solution is geared in such a way as to predict and return knowledge articles to the user based on the importance and relevance of a small number of words submitted to the search bar. Obviously, one or two words will only give a certain level of accuracy; however, as the user types more words, the model will be more accurate in predicting the most appropriate article. Nevertheless, the user should not need to type more than ten words in order to find the ideal article for their enquiry.
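The way the suggestions narrow as more words are typed can be shown with a small sketch. It uses plain word overlap against article titles as a stand-in for the real model, which this article leaves open; the `suggest` function, its `min_overlap` parameter, and the sample titles are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def suggest(partial_query, titles, min_overlap=1):
    """Return the titles sharing the most words with what has been typed so far.

    An empty list means nothing matched well enough, in which case the page
    would fall back to the normal ticket form.
    """
    query_terms = tokenize(partial_query)
    scored = [(len(query_terms & tokenize(title)), title) for title in titles]
    best = max((score for score, _ in scored), default=0)
    if best < min_overlap:
        return []
    return sorted(title for score, title in scored if score == best)


# Hypothetical article titles, for illustration only.
titles = [
    "Set up a new printer",
    "Set up a new email distribution list",
    "Set up a shared mailbox",
]

after_two_words = suggest("set up", titles)            # all three titles tie
after_three_words = suggest("set up printer", titles)  # narrowed to one title
```

With only ‘set up’ typed, every title is an equally good guess; adding ‘printer’ is enough to single out one article, without any back-and-forth conversation.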

Conclusion

Sometimes I need to submit an enquiry to the IT department for a very simple request (for example: ‘how do I set up a new printer?’), and for requests such as this it is not unusual to be told that the request will be addressed within 3–7 business days. This is the case even if the request can be resolved within 10 minutes. This process can be improved by leveraging the existing knowledge within the business (in the form of knowledge articles, SOPs, instructions, and blogs), and directing employees like myself to these repositories before they submit an enquiry to IT. To do this, there need to be enough of these pages available on the network, they need to be optimised and indexed, and the search algorithm needs to be both quick and accurate in predicting the most appropriate article for the user. If delivered, this solution would make my life a lot less frustrating and would greatly reduce the number of tickets for the IT team, creating a win-win for everyone.

References

Azure 2019, Text Analysis, Image, viewed 2 June 2019, <https://azure.microsoft.com/en-in/services/cognitive-services/text-analytics/>.

QueryCare 2019, Content Management Systems, Image, viewed 2 June 2019, <http://querycare.com/content-management-system/>.

WebTown 2019, eZ Platform Enterprise CMS, Image, viewed 2 June 2019, <https://www.webtown-group.com/ez-platform-enterprise-cms>.