Text Analysis of Applicants for Personality Classification Using Multinomial Naïve Bayes and Decision Tree

In talent recruitment, companies seek applicants who fit their corporate culture from among many candidates. To do so, companies carry out psychological tests to assess how well applicants’ personalities fit their corporate culture. However, psychological testing is costly and requires much effort. Thus, an automated system that classifies personalities from text is necessary to assist companies in the recruitment process and to reduce the effort needed. This research is based on the personality traits associated with the corporate culture of Telkom Indonesia. The data used is text data that has been labeled, pre-processed, and feature selected. The clean text data is used to create classification models using multinomial Naïve Bayes and Decision Tree. Six models are built, one per method for each of three work cultures. The Decision Tree achieves accuracies of 33%, 66%, and 80%, while multinomial Naïve Bayes achieves accuracies of 83%, 50%, and 60%, which is the better performance overall.


I. INTRODUCTION
The recruitment process is essential for a company because the quality of its employees influences the company's overall performance. Therefore, companies are very selective in finding promising applicants [1]. However, with many applicants in every recruitment process, companies have difficulty finding employees who meet their criteria. Moreover, the recruitment process requires enormous resources in terms of process, cost, and human resources [2]. An applicant's personality can be an essential factor for companies in determining whether the applicant can work well. According to N. R. Ngatirin, Z. Zainol, and T. L. C. Yoong [3], personality represents a combination of features and qualities that build individual characteristics. Personality traits can be used to understand human behavior in many respects, including how people work with their environment. Psychological testing is one way to determine applicants' personalities, but it requires a lot of time and cost. Therefore, we need a system that can classify applicants' personalities while reducing the time and cost required. Personality classification is done by classifying applicant text using text mining methods. The data used for personality classification is interview verbatim, which has been converted from interview recordings into text.
Multinomial Naïve Bayes and Decision Tree are the classification methods used in this study. Multinomial Naïve Bayes is used because it treats each feature independently, which allows it to perform well in practice. It also predicts quickly, which reduces the time spent classifying the personality and acceptance of applicants. Text mining yields more than a hundred features whose relationship with the label must be calculated, so Multinomial Naïve Bayes is suitable for this research. Decision Tree is used because it is easy to apply, its classification process is easily understood, and its learning speed is quite fast. Decision Tree is also suitable for this research because it produces categorical output; moreover, it lets us calculate or see the most relevant words.
Research conducted by Rintaspon Bhannarai and Chartchai Doungsa-ard states that personality tests can be performed to predict which people are suitable for an agile methodology, based on the theory of the Big Five personality traits. With k-nearest neighbor as the classification technique, the study obtained an accuracy of 65.71% with its best k being 1 [4].
Research conducted by Harshal Chaudhari, Nalini Yadhav, and Yash Shukla predicts what types of jobs are suitable for each individual based on their resumes. The classification is done using the Naïve Bayes Classifier by calculating the likelihood of candidates getting the tasks based on the tokens calculated from the resumes. Tokens are calculated from academic grades, hobbies, professional experiences, projects, publications, awards, and so on. The expected result of that study is a model that can help companies select candidates and employ competent applicants in the right position [5].
To classify personalities using text data, a personality theory is needed as the basis for the classification. For example, the Big Five Personality Traits theory categorizes people into five different characteristics [6]: Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. In this study, however, the personality traits that support the applicant's personality classification process are Conscientiousness and Agreeableness. The dataset is text data that must first go through a cleansing process before it can be used for the classification. The result of the classification is whether the applicant is suitable or not for the company based on personality theory. Several limitations arise in this study. The main one is the difficulty of collecting the required dataset, because the data needed are the results of interviews between applicants and companies. The code of ethics of psychology forbids anyone other than the parties involved from learning the contents of an interview. Many procedures must be followed to obtain interview data from a company, whereas creating interview data manually requires a significant amount of money to pay volunteers to be interviewed and psychologists to identify the volunteers' personalities. As a result, the data collected is limited.
The objective of this research is to classify applicants' personalities based on existing text data. The steps taken are processing the text data that will be used to classify the personality. The features used in the classification process are a collection of words that have been filtered beforehand. Then, the accuracies of the two methods are calculated to find out which method is better to apply for the classification and whether the system built is good enough to be implemented or not.

A. Dataset
The dataset used in this study consists of text obtained from interviews that were recorded as audio and then converted into text through speech-to-text. The dataset contains nine assessments that are independent of each other; the assessments come from the company's work culture and the associated personality traits, which are Conscientiousness and Agreeableness. The three work-culture assessment criteria used are integrity, enthusiast, and speed. Each assessment is labeled with two classes: 1 declares that the applicant does not fit the company, and 2 states that the applicant is suitable for the company. Only three assessments are used in this study because only these three have a balanced proportion of class 1 and class 2. The dataset contains 53 records in total, split into 90% training data and 10% test data.
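A minimal sketch of such a 90/10 hold-out split is shown below. The function name `split_records`, the random seed, and the rule of rounding the test-set size to the nearest record are illustrative assumptions; the study does not state how the split was drawn.

```python
import random

def split_records(records, test_fraction=0.10, seed=0):
    """Shuffle the records and hold out a fraction of them for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Assumption: the test-set size is rounded to the nearest whole record.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# 53 placeholder record IDs stand in for the interview transcripts.
train, test = split_records(range(53))
print(len(train), len(test))  # 48 5
```

With only 53 records, the test set holds about five records under this rounding rule, so each test record shifts the measured accuracy by roughly 20 percentage points.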
The scoring process, that is, how an applicant's personality is classified, is defined by criteria for each work culture used in this study. Based on these criteria, each applicant is scored as suitable or not suitable for the company. Using the numbers explained before, 1 is for "not suitable" and 2 is for "suitable". The scoring is done by an expert in psychology. Table 1 gives the scores that represent the "suitable" and "not suitable" labels. Table 2 and Table 3 give the criteria used to score the applicant's personality as suitable or not suitable for the company.

Suitable
Applicants effectively articulate a passion for sincerity at work, and a desire to do the best for themselves and the company.
Applicants effectively articulate agile attitudes and proactively provide the best for the company.

Not Suitable
Applicants cannot articulate, or articulate less effectively, honest attitudes, positive behavior, and professional ethics that can positively affect the work environment.
Applicants cannot articulate, or articulate less effectively, a passion for sincerity at work and a desire to do the best for themselves and the company.
Applicants cannot articulate, or articulate less effectively, an agile and proactive attitude to provide the best for the company.

Table 4 is an example of the dataset that classifies an applicant's personality with one of the work cultures: integrity. Table 5 is an example of the dataset that classifies an applicant's personality with one of the work cultures: enthusiast. Table 6 is an example of the dataset that classifies an applicant's personality with one of the work cultures: speed. Table 4, Table 5, and Table 6 contain a lot of informal Indonesian speech. This is because the recorded audio comes from spontaneous interviews, resulting in colloquial words said directly by the interviewee. Whether the applicant speaks formally, semi-formally, or informally depends on the atmosphere or situation created by the interviewer.

B. Proposed Method

a) Personality Traits
According to B. Y. Pratama and R. Sarno [7], personality is a combination of individual characteristics and behavior in dealing with various situations. Personality also influences interactions with other people and the environment. It can be used as an assessment for employee recruitment, career consulting, relationships, and health. Attitudes and behavior can be explained by personality traits. Personality traits are beneficial for understanding psychological differences and similarities between individuals and for identifying human nature [8]. They are strongly related to the personal level, the interpersonal level, life, and work decisions [9]. Personality traits affect leadership [10], ways of learning as well as work performance [11], and academic ability and motivation [12]. So, when the company knows an applicant's personality, it can place the applicant in the position that best suits that personality. By placing applicants in suitable positions, the company can increase its overall performance. Psychology offers personality tests to find out the personality traits of individuals [13]. An example of a personality test is the Big Five Inventory (BFI) [14].
The Big Five Personality Traits, or the Five-Factor Model, is a personality theory that categorizes people into five different characteristics [6]. The five factors are Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Each factor has its own features. Openness to Experience is characterized by an active imagination, sensitivity to feelings, and intellectual curiosity. Conscientiousness is characterized by being neat, efficient, disciplined, and conscientious; people with these traits tend to be hard-working and reliable. Extraversion is characterized by ease in socializing with other people and the environment; people with these characteristics tend to enjoy spending time with many people by partying, doing community activities, and joining public demonstrations. They also tend to be better at working in groups. Agreeableness is characterized by friendliness, sympathy, cooperativeness, and care; people with these characteristics tend to prioritize shared interests over their own. Neuroticism is characterized by being anxious, worried, afraid, grumpy, frustrated, jealous, guilty, depressed, and lonely. People with these characteristics tend to be moody, respond in exaggerated ways to typical situations, and consider a small problem to be a huge one.
Jurnal Infotel Vol. 12

b) Text Mining
Text mining is a method of extracting information from unstructured text data [15]. Knowledge is usually gained from text by determining specific patterns. Before the patterns are determined, preprocessing stages reduce the text to only the data related to the information sought. These stages include tokenization, normalization, and weighting of each word.
According to D. L. Olson and D. Delen [16], several techniques are used in text mining: information retrieval, information extraction, text categorization, text summarization, and text clustering. Text categorization is used here to collect the text, process it, and analyze it to determine its topic. In this study, the personality of each applicant is determined from the applicant's text data.
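The tokenization and normalization stages named in this subsection can be sketched as follows. This is only an illustration under stated assumptions: the `preprocess` function and the four-word `STOPWORDS` set are hypothetical, the sample sentence is invented, and stemming is left out because it would need an Indonesian stemming library.

```python
import string

# Hypothetical mini stopword list; the study's actual list is not reproduced here.
STOPWORDS = {"yang", "dan", "di", "itu"}

def preprocess(text, stopwords=STOPWORDS):
    """Case-fold, strip punctuation, tokenize on whitespace, drop stopwords."""
    lowered = text.lower()
    cleaned = lowered.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in stopwords]

tokens = preprocess("Saya bekerja di perusahaan itu, dan saya jujur.")
print(tokens)  # ['saya', 'bekerja', 'perusahaan', 'saya', 'jujur']
```

The weighting stage would then operate on token lists like the one printed here.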

c) Multinomial Naïve Bayes
Naïve Bayes is a classification method based on probability and statistics. For each candidate class, it calculates the probability of the feature vector, which holds the object's information, under the condition that the class is true [16]. Multinomial Naïve Bayes is a modified form of Naïve Bayes. Using the same probabilistic approach, Multinomial Naïve Bayes is designed for text documents and works with word frequencies [17]. By calculating the likelihood of each word's appearance in the text data, the category of the text can be determined. The formula used to calculate Multinomial Naïve Bayes is as follows:
$$P(w_i \mid c) = \frac{N_{ci} + \alpha}{N_c + \alpha\,|V|}$$
Where:
$P(w_i \mid c)$ : The probability of word $w_i$ appearing in class c
$N_{ci}$ : The number of times word $w_i$ appears in class c
$N_c$ : The total count of words that have value in class c
$|V|$ : The number of distinct words (features)
$\alpha$ : Smoothing parameter

d) Decision Tree
Decision Tree is a classification method that builds a flowchart-like tree structure in which each internal node represents a feature or attribute used as a classification criterion and each leaf node represents a classification result [18]. The feature at each branch node is determined using the Gini index, which indicates which features or attributes most influence the classification process. The formula for calculating the value of a feature based on the Gini index is as follows:
$$Gini = 1 - \sum_{i=1}^{n} p_i^2$$
Where:
$Gini$ : The Gini value of a feature
$n$ : Number of classes in the attribute
$p_i$ : The percentage of class $i$ that appears in the attribute

e) Term Weighting TF-IDF
TF-IDF is a technique in NLP (Natural Language Processing) used to extract essential keywords from a document. The TF-IDF algorithm has two working steps:

• Term Frequency
Calculates the frequency of a word based on the number of times the word appears in the text; the more often the word appears, the higher the value of the word.

• Inverse Document Frequency
Measures how unique a word is across the texts by separating the words that appear in only one text from the words that appear in every text. The value of TF-IDF is calculated as follows:
$$W_{t,d} = tf_{t,d} \times \log\frac{N}{df_t}$$
Where:
$W_{t,d}$ : The value of the word t in document d
$tf_{t,d}$ : The appearance of the word t in document d
$N$ : Number of documents
$df_t$ : The number of documents containing the word t
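To make the term weighting and the Multinomial Naïve Bayes estimate concrete, the sketch below computes TF-IDF weights and a smoothed word-likelihood classifier on a toy corpus. It is an illustration under assumptions, not the study's implementation: the function names, the natural logarithm, the smoothing value `alpha=1.0`, the English toy tokens, and the choice to train the classifier on raw counts rather than TF-IDF weights are all hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each term by tf(t, d) * log(N / df(t)) for tokenized docs."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n_docs / df[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]

def train_mnb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods for each class."""
    vocab = sorted({term for doc in docs for term in doc})
    model = {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(term for d in class_docs for term in d)
        total = sum(counts.values())
        model[c] = {
            "log_prior": math.log(len(class_docs) / len(docs)),
            # (count + alpha) / (total + alpha * |V|), stored as a log.
            "log_lik": {t: math.log((counts[t] + alpha)
                                    / (total + alpha * len(vocab)))
                        for t in vocab},
        }
    return model

def predict_mnb(model, doc):
    """Return the class with the highest posterior log score."""
    def score(c):
        params = model[c]
        # Terms outside the training vocabulary are ignored.
        return params["log_prior"] + sum(
            params["log_lik"][t] for t in doc if t in params["log_lik"])
    return max(model, key=score)

# Toy tokens; labels follow the paper's coding (1 = not suitable, 2 = suitable).
docs = [["honest", "team", "work"], ["late", "excuse"], ["honest", "fast", "work"]]
labels = [2, 1, 2]

weights = tf_idf(docs)
model = train_mnb(docs, labels)
print(predict_mnb(model, ["honest", "work"]))  # 2
```

A word that occurs in a single document, such as "team", receives the largest inverse document frequency, while the smoothing term keeps unseen class-word pairs from producing a zero probability.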

C. System Development
In this research, several steps are needed to build a classification model: preprocessing the data to clean the dataset for the training and validation phases, training the classification model, and testing the model in the validation phase.

a) Preprocessing
Because the dataset is not yet well-structured for classification, it is necessary to cleanse the data by removing punctuation and affixes and reducing every word in the text to its basic word. The process consists of case-folding, punctuation removal, stopword removal, stemming, and tokenization. After the text is processed, feature selection is performed on the preprocessed text data using TF-IDF to weight the words in the text data. Every word that passes the feature selection is then used to build the system model using Multinomial Naïve Bayes and Decision Tree. Before that stage, the dataset is divided into training data and test data with a ratio of 90% for training and 10% for testing.
The training data is used to build a classification model using Multinomial Naïve Bayes and Decision Tree. The learning process in the Decision Tree is done by calculating the suitability of each feature in the dataset using the Gini index. After each feature gets its Gini value, the feature whose split gives the largest reduction in Gini impurity is used as the root node. Features with smaller reductions act as branches from the root node, and the classes in each assessment category act as the leaf nodes. Each preprocessed text passes through the nodes of the Decision Tree, starting from the root, through the branch nodes, to a leaf node. Once the text reaches a leaf node, its class is determined. Each assessment has its own classification model because the assessments are not connected with each other, so three classification models are built in this study for each method.
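The root-node choice described above can be sketched as follows, assuming the usual CART convention in which the root is the feature whose split reduces Gini impurity the most. The helper names, the binary word-present or word-absent split, and the toy rows are assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

def split_impurity(rows, labels, feature):
    """Weighted Gini impurity after splitting on the presence of one word."""
    present = [y for row, y in zip(rows, labels) if feature in row]
    absent = [y for row, y in zip(rows, labels) if feature not in row]
    n = len(labels)
    return sum(len(part) / n * gini(part) for part in (present, absent) if part)

def best_root(rows, labels, features):
    """Pick the word whose split reduces Gini impurity the most."""
    parent = gini(labels)
    return max(features, key=lambda f: parent - split_impurity(rows, labels, f))

# Toy data: each row is the set of words in one answer (1 = not suitable, 2 = suitable).
rows = [{"honest", "work"}, {"honest", "fast"}, {"late", "work"}, {"late"}]
labels = [2, 2, 1, 1]
print(best_root(rows, labels, ["honest", "work", "fast"]))  # honest
```

In this toy case the word "honest" separates the two classes perfectly, so its split leaves zero impurity and it becomes the root.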
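
c) Validation
After the system has been built, it must be validated to show whether it produces satisfactory results. The results are evaluated on the existing test data. The parameters used to validate the system are accuracy, recall, precision, and F1-measure, calculated as follows:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
$$F_1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
These parameters show how well the system performs. They are computed from the confusion matrix in Table 7, where:
• True Positive (TP) is a value when the predicted data shows a suitable personality, and the actual data shows a suitable personality
• True Negative (TN) is a value when the predicted data shows a not suitable personality, and the actual data shows a not suitable personality
• False Positive (FP) is a value when the predicted data shows a suitable personality, and the actual data shows a not suitable personality
• False Negative (FN) is a value when the predicted data shows a not suitable personality, and the actual data shows a suitable personality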
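As a small worked example of these evaluation parameters, the sketch below computes them from confusion-matrix counts, treating "suitable" as the positive class. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study's tables.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for a six-record test set (not the study's results).
m = classification_metrics(tp=4, tn=1, fp=1, fn=0)
print({k: round(v, 2) for k, v in m.items()})
# {'accuracy': 0.83, 'precision': 0.8, 'recall': 1.0, 'f1': 0.89}
```

The guards against zero denominators matter on test sets this small, where a model may predict no positive records at all.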

III. RESULT
After the classification models are built, validation is performed. The following are the accuracy results of the classification models built with Multinomial Naïve Bayes and Decision Tree. Table 8 compares the performance of each classification model built using Multinomial Naïve Bayes and Decision Tree. Validation also includes calculating the confusion matrix of each model, along with the recall, precision, and F1 score, as shown in Table 9. Table 9 is the confusion matrix result for the three models built with the two methods; from the confusion matrix we can calculate the recall, precision, and F1 score of each model to see how well it performs. Tables 10 and 11 are the validation results of the models that implement the multinomial Naïve Bayes method and the Decision Tree method. These tables give the precision, recall, and F1 score for each label under both methods. To compare how well the two methods predict the positive class, we can use the F1 score. According to the tables, multinomial Naïve Bayes has F1 scores of 0.89, 0.67, and 0.75 for integrity, enthusiast, and speed, respectively, while Decision Tree has 0.33, 0.75, and 0.80. Figure 2 illustrates the comparison of the accuracy of both methods as a graph. We can see that two of the Decision Tree models score higher than their multinomial Naïve Bayes counterparts.

IV. DISCUSSION
Based on the test results shown previously, Multinomial Naïve Bayes provides better overall performance than the Decision Tree across the three models, although the results are not optimal. In terms of accuracy and F1 score, two of the Decision Tree models outperform their multinomial Naïve Bayes counterparts. In terms of stability, however, multinomial Naïve Bayes performs better: all of its models reach an accuracy of 50% or higher, whereas one Decision Tree model reaches only 33%. Many factors can affect the performance of the classification models. One of them is the dataset itself, because the data collected to build and test the models is interview data recorded as audio and converted into text using speech-to-text. The dataset contains many records whose sentences cannot be understood because of the words they contain. This is most likely due to speech-to-text errors in converting the interviewee's words into text. Here are a few words or sentences that the researchers rated as wrong: Table 12 shows the sentences considered to be mistakenly converted. Such errors can affect the performance of the classification model. Some words have the same meaning, but speech-to-text converts them incorrectly. The system then treats the word as a new feature, which reduces the performance of the classification model. As in the table, the word "perusahaan (company)" is reduced to its basic word "usaha (business)". Because of a conversion error, however, there is a word that became "perusahaansaya (my company)", whose basic word remains "perusahaansaya (my company)". As a result, the "perusahaan (company)" inside "perusahaansaya (my company)" is valued as a new feature rather than as the same word as the earlier "perusahaan (company)".
In addition to the malformed sentences and words described above, the classification model's performance is also influenced by the stopword list. In building the personality classification models, the stopword libraries and lists used are taken from the literature. Rather than being removed from the documents, some of the stopwords in the literature are kept because they provide broader information for personality classification.
Another factor is the comparison of label classes among the recordings in each assessment. The comparison of label classes for each assessment shows that only three assessments have reasonably balanced label classes, namely Integrity, Enthusiast, and Speed. This can affect the performance of the classification model, because the system can classify accurately only if it has enough data for each label class. In addition to the imbalance between label classes, the number of existing records is inadequate. The test may perform well on several assessments partly because each assessment has only two classes. If only one class has enough training data to be predicted well, any record that is not classified into that class automatically falls into the other class. With three classes, however, the performance of the system would decrease.

V. CONCLUSION
Based on the research conducted on classifying applicants' personalities using Multinomial Naïve Bayes and Decision Tree, it can be concluded that Multinomial Naïve Bayes provides better performance than the Decision Tree. Even so, these results cannot be considered good: averaged over the three models of each method, multinomial Naïve Bayes produces an accuracy of 64.3% and the Decision Tree produces an accuracy of 59.67%. This is due to the insufficient data available for building the classification models. The preprocessing stage is also an essential factor in improving the performance of the model. Therefore, even though the proposed classification model can classify applicants' personalities based on their suitability with the work culture, future related research is deemed to need a special stopword list for building personality classification models so that the results can be more accurate. In addition, more data needs to be collected, and the dataset should have a balanced comparison of label classes.
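The two averages quoted in the conclusion follow from the per-model accuracies reported in the abstract. The symbols $\bar{A}_{\mathrm{MNB}}$ and $\bar{A}_{\mathrm{DT}}$ are notation introduced here only for this check.

```latex
\bar{A}_{\mathrm{MNB}} = \frac{83 + 50 + 60}{3} = \frac{193}{3} \approx 64.3\%,
\qquad
\bar{A}_{\mathrm{DT}} = \frac{33 + 66 + 80}{3} = \frac{179}{3} \approx 59.67\%
```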