Application of a Multi-Layer Perceptron in Preoperative Screening for Orthognathic Surgery
Abstract
Objectives
Orthognathic surgery is used to treat moderate to severe occlusal discrepancies. Examinations and measurements for preoperative screening are essential procedures. A careful analysis is needed to decide whether cases require orthognathic surgery. This study developed screening software using a multi-layer perceptron to determine whether orthognathic surgery is required.
Methods
In total, 538 digital lateral cephalometric radiographs were retrospectively collected from a hospital data system. The input data consisted of seven cephalometric variables. All cephalograms were analyzed by the Detectron2 detection and segmentation algorithms. A keypoint region-based convolutional neural network (R-CNN) was used for object detection, and an artificial neural network (ANN) was used for classification. This novel neural network decision support system was created and validated using Keras software. The output data are shown as a number from 0 to 1, with cases requiring orthognathic surgery being indicated by a number approaching 1.
Results
The screening software demonstrated a diagnostic agreement of 96.3% with specialists regarding the requirement for orthognathic surgery. A confusion matrix showed that only 2 out of 54 cases were misdiagnosed (accuracy = 0.963, sensitivity = 1, precision = 0.93, F-value = 0.963, area under the curve = 0.96).
Conclusions
Orthognathic surgery screening with a keypoint R-CNN for object detection and an ANN for classification showed 96.3% diagnostic agreement in this study.
I. Introduction
Orthognathic surgery is used to reposition the jaw in patients with moderate to severe occlusal discrepancies and dentofacial deformities. This field of surgery uses dental, medical, and surgical concepts—most notably, orthodontics incorporated with oral and maxillofacial surgery—to treat facial deformities that have progressed beyond the treatment that can be provided by dental movement alone. Accurate pre-operative screening is highly important given patients’ need to be aware of the invasiveness of orthognathic surgery and the requirement for financial planning.
Prior to orthognathic surgery, examinations and measurements from dental casts, intra- and extra-oral photographs, panoramic and cephalometric radiographs, and patients’ perceptions are precisely evaluated by orthodontists [1,2]. Notably, an intricate interpretation of cephalometric tracings is part of the screening process, the definitive diagnosis, treatment planning, and clinical decision-making. Nonetheless, the interpretation of cephalometric tracings is time-consuming and requires expertise in orthodontics and maxillofacial surgery; otherwise, the interpretation of the measurements may be subject to error.
To overcome the limitations present at this stage, developments in computer technology have led to the digitization of cephalometric analysis. Moreover, artificial intelligence (AI) expert systems with deep learning have been shown to be useful in orthodontics [3]. AI contains many subfields, including machine learning, which digests large quantities of data to perform specific tasks and learns without explicit programming, and deep learning, which self-learns a specific task with increasingly greater accuracy using many layers of processing units. Common applications of deep learning include image and speech recognition [4]. The amalgamation of AI-aided digitization with neural network systems shows strong potential for the development of an automated decision support system for orthognathic surgery screening. Such a system would be highly beneficial for potential patients and convenient for referral dentists and dental specialists.
Deep learning algorithms analyzing very large datasets of digitized cephalometry have been developed to classify skeletal and dental discrepancies [5]. Artificial neural network (ANN) applications have also been created that take variables measured from cephalograms as inputs, learn weights and biases from them, and then use this information to determine whether surgery is required. Moreover, the keypoint region-based convolutional neural network (R-CNN) showed better accuracy than earlier neural networks for object detection [6]. Therefore, this study applied AI to cephalometric analysis to develop a standardized decision-making system for orthognathic diagnosis. The specific objective of this study was to develop and validate a multi-layer perceptron (MLP) for preoperative screening of orthognathic surgery using a keypoint R-CNN for object detection.
II. Methods
Prior to collecting the cephalometric radiographs and commencing the research, the ethical standards and research procedures were approved by the Institutional Review Board of the Faculty of Dentistry/Faculty of Pharmacy, Mahidol University (COA.No. MU-DT/PY-IRB 2021/032.2903). The flow chart of the entire process is shown in Figure 1.
The data were prepared from patients who visited the Department of Oral and Maxillofacial Radiology, Faculty of Dentistry, Mahidol University, from 2012 to 2021. The seven cephalometric parameters listed in Table 1 were measured in all subjects for the classification [7]. Several of these parameters are referenced to the functional occlusal plane, defined here as the line connecting the mesiobuccal cusp of the mandibular first premolar (L4) and the mesiobuccal cusp of the mandibular first molar (L6). The seven measurements were: U1 to PP (angle between the axis of the maxillary incisor and the palatal plane); L1 to MP (angle between the axis of the mandibular incisor and the mandibular plane); overjet (anterior-posterior overlap of the maxillary incisor over the mandibular incisor); U1 root tip to A-point (distance between the root apex of the maxillary incisor and the A-point, referenced to the functional occlusal plane); L1 root tip to B-point (distance between the root apex of the mandibular incisor and the B-point, referenced to the functional occlusal plane); ANB (angle between the A-point, nasion, and B-point); and Wits (distance between the perpendicular projections of the A-point and the B-point onto the functional occlusal plane).
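As an illustration of how two of these measurements follow from landmark coordinates, the sketch below computes an unsigned ANB angle and a Wits value. It is a minimal example under stated assumptions, not the study's implementation: the function names and all coordinates are hypothetical, the coordinates are assumed to be already calibrated to millimetres, and the clinical sign convention for ANB (positive when the A-point lies ahead of the B-point) is not applied.

```python
import math

def angle_at(vertex, p, q):
    """Unsigned angle in degrees at `vertex` between the rays vertex->p and vertex->q."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))

def wits(a_point, b_point, l4_cusp, l6_cusp):
    """Signed distance between the perpendicular projections of A and B
    onto the functional occlusal plane (the line from L6 to L4).
    Positive when A projects further toward L4 (anteriorly) than B."""
    dx, dy = l4_cusp[0] - l6_cusp[0], l4_cusp[1] - l6_cusp[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length  # unit vector along the occlusal plane
    t_a = (a_point[0] - l6_cusp[0]) * ux + (a_point[1] - l6_cusp[1]) * uy
    t_b = (b_point[0] - l6_cusp[0]) * ux + (b_point[1] - l6_cusp[1]) * uy
    return t_a - t_b

# Hypothetical landmark coordinates (x, y) in mm; not taken from the study data.
nasion = (0.0, 0.0)
a_point = (0.0, 50.0)
b_point = (-5.0, 90.0)
l6_cusp = (-30.0, 70.0)
l4_cusp = (-10.0, 70.0)

anb_deg = angle_at(nasion, a_point, b_point)
wits_mm = wits(a_point, b_point, l4_cusp, l6_cusp)
print(f"ANB (unsigned) = {anb_deg:.2f} degrees, Wits = {wits_mm:.2f} mm")
```

In practice, pixel coordinates from the detector would first need to be converted to millimetres using the image resolution before distance measurements such as Wits are interpreted.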
The samples used in this retrospective study consisted of 538 digital lateral cephalograms from Thai patients who were 20–40 years old. Patients who had an unerupted permanent tooth or a missing tooth, a recognizable craniofacial abnormality or deformity, or previous orthodontic, plastic, or other maxillofacial surgical procedures were excluded. The cases included skeletal classes I, II, and III (Table 2). Two orthodontists (SM, ST), each with more than 10 years of experience, decided on the treatment plans. Of the 538 samples, non-surgical orthodontic treatment was chosen for 256, while 282 were deemed to require orthognathic surgery.
All 538 images were manually annotated by the first author (NC) using Vision Marker II, a web application created by Digital Storemesh, and the annotations were validated by an experienced orthodontist (SM). Next, Detectron2 [8], a unified library of object detection algorithms from Facebook AI Research (FAIR) that includes keypoint R-CNN models [9,10], was used. Detectron2 was run in Google Colaboratory, which enabled coding and execution in Python, to locate and label 13 anatomical landmarks (the U1 incisal tip, U1 root tip, ANS, PNS, L1 incisal tip, L1 root tip, A-point, B-point, nasion, gonion, menton, mesiobuccal cusp of L4, and mesiobuccal cusp of L6) (Figure 2), allowing the seven measurements to be performed.
A loss function was then used to evaluate the performance of the landmark detection model. A loss approaching 0 indicates that the model is well trained (Figure 3). The root mean square error (RMSE) and percentage of detected joints (PDJ) [11] were also used to assess the performance of the model. The RMSE was 8.54 pixels, which was considered acceptable given the image size (1020 × 1024 pixels at a vertical and horizontal resolution of 96 dpi). The PDJ at a threshold of 0.05 was 0.97, which was satisfactory as a measure of the distance between the predictions and the ground truth (Figure 4).
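The two landmark metrics can be sketched as follows. This is a minimal illustration, not the study's code. It assumes that the RMSE is taken over the Euclidean distances between predicted and ground-truth landmarks, and that the PDJ reference length is the image diagonal. The paper does not state which reference length was used (PDJ was originally defined relative to torso diameter), and the coordinates below are hypothetical.

```python
import math

def landmark_errors(pred, truth):
    """Euclidean distance in pixels between each predicted and true landmark."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]

def rmse(pred, truth):
    """Root mean square of the per-landmark Euclidean errors."""
    errs = landmark_errors(pred, truth)
    return math.sqrt(sum(e * e for e in errs) / len(errs))

def pdj(pred, truth, reference_length, threshold=0.05):
    """Fraction of landmarks whose error is within threshold * reference_length."""
    errs = landmark_errors(pred, truth)
    limit = threshold * reference_length
    return sum(e <= limit for e in errs) / len(errs)

# Hypothetical predictions and ground truth (pixels); not study data.
truth = [(100.0, 100.0), (200.0, 200.0), (300.0, 300.0)]
pred = [(100.0, 100.0), (203.0, 204.0), (300.0, 390.0)]

# Assumed reference length: diagonal of a 1020 x 1024 pixel image.
diagonal = math.hypot(1020, 1024)

example_rmse = rmse(pred, truth)
example_pdj = pdj(pred, truth, diagonal, threshold=0.05)
print(f"RMSE = {example_rmse:.2f} px, PDJ@0.05 = {example_pdj:.2f}")
```

Under this assumed normalization, a threshold of 0.05 corresponds to roughly 72 pixels on a 1020 × 1024 image, so the reported RMSE of 8.54 pixels would sit well inside the tolerance.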
Each case was randomly assigned to either the training or testing dataset, with 484 radiographs allocated to the training set used to create the prediction model and the remaining 54 radiographs allocated to the test set. The test set was used to evaluate the performance of the model, and the training set was also used as the validation set. To avoid overfitting, iterative learning was stopped at the lowest error point for the validation set. Min-max normalization was used to transform the input data to the range of 0–1. The applied machine learning model consisted of a four-layer neural network, including one input layer, two hidden layers (64 nodes in the first hidden layer and 24 nodes in the second hidden layer), and one output layer (epochs = 2,000, batch size = 32) (Figure 5). The learning rate was set at 0.01. Activation functions were used to improve the learning of the deep neural network. The rectified linear unit (ReLU) function [12] was applied to the hidden layers and a sigmoid function was used for the output layer. Keras [13], a neural network library running in Python, was used to code the neural network models in Google Colaboratory. Backpropagation was performed in Python to adjust the weights.
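The preprocessing and network shape described above can be sketched in plain Python. This is an illustrative forward pass only, written without Keras so that it is self-contained. The weights are random and untrained, the measurement rows are invented, and all function names are hypothetical, so the resulting score has no clinical meaning.

```python
import math
import random

def min_max_fit(rows):
    """Per-feature minimum and maximum, learned from the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def min_max_transform(row, lo, hi):
    """Scale each feature to the range 0-1 using the fitted minimum and maximum."""
    return [(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Random small weights and zero biases (a stand-in for trained parameters)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation):
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """7 inputs -> 64 (ReLU) -> 24 (ReLU) -> 1 (sigmoid)."""
    h = dense(x, layers[0], relu)
    h = dense(h, layers[1], relu)
    return dense(h, layers[2], sigmoid)[0]

# Invented rows: U1-PP, L1-MP, overjet, U1 root-A, L1 root-B, ANB, Wits.
train_rows = [
    [110.0, 95.0, 3.0, 8.0, 7.0, 2.0, 0.0],
    [125.0, 85.0, -4.0, 10.0, 5.0, -3.0, -9.0],
    [118.0, 100.0, 8.0, 6.0, 9.0, 7.0, 6.0],
]
lo, hi = min_max_fit(train_rows)
scaled = min_max_transform(train_rows[1], lo, hi)

rng = random.Random(0)
sizes = [7, 64, 24, 1]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

score = forward(scaled, layers)
print(f"trainable parameters: {n_params}, untrained output: {score:.3f}")
```

With seven inputs, this architecture has 512 + 1,560 + 25 = 2,097 trainable parameters, which is small relative to the 484 training cases and may help explain why early stopping on the validation error was used.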
Finally, the output data were shown as numbers ranging from 0 to 1. Orthognathic surgery was suggested if the output approached a value of 1, whereas non-surgical treatment would be more appropriate if the output approached a value of 0. The data files can be freely and openly accessed on the Open Science Framework at https://osf.io/bcd4h/?view_only=8ff663525fc6468cb61109b7fb6abca6.
III. Results
The model showed 96.3% diagnostic agreement for the classification of whether the patient required orthognathic surgery. A graph of the training accuracy and validation accuracy of the neural network model showed a plot increasing to the point of stability, as shown in Figure 6A. An accuracy approaching 1 indicates that the model is well trained. Meanwhile, a graph of the training loss and validation loss of the neural network model showed a plot decreasing to the point of stability, as shown in Figure 6B. A loss approaching 0 likewise indicates that the model is well trained. Moreover, a receiver operating characteristic (ROC) curve showed the classification performance of the model (Figure 7). The curve rose steeply from the origin toward the point (0, 1), exhibiting a high true-positive rate and a low false-positive rate, which indicated that this was a good classification model. The area under the ROC curve was 0.96, showing that the model discriminated well between the two classes. Furthermore, a confusion matrix showed the cases of misdiagnosis (Figure 8). Only two out of 54 cases were misdiagnosed. One was skeletal class II and the other was skeletal class III. As a result, the accuracy of the model was 0.963, showing a high rate of correct predictions. The sensitivity of the model was 1, indicating that all the positive cases were labeled as positive. The precision of the model was 0.93, showing that a high proportion of the cases predicted as positive were truly positive. The F-value was 0.963, showing that both the precision and sensitivity were high.
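The reported metrics can be reproduced from confusion-matrix counts as sketched below. The cell counts are an assumption: the text gives only 54 test cases, two errors, and a sensitivity of 1 (which implies both errors were false positives). The split of 26 true positives and 26 true negatives is inferred because it reproduces the reported precision and F-value; it is not read from Figure 8.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity (recall), precision, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }

# Assumed counts consistent with the reported figures (not taken from Figure 8).
metrics = classification_metrics(tp=26, fp=2, fn=0, tn=26)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the F-value is the harmonic mean of precision and sensitivity, a sensitivity of 1 means the F-value is limited only by the two false positives.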
IV. Discussion
AI has been used in the fields of healthcare and medicine for several years, and applications seem to be developing at a breakneck speed [14]. Machine learning methods are usually used to perform either prediction or classification [15]. It is widely recognized that orthodontics has gained more precision in terms of structuring and improving its practices from computerization than any other dental discipline [16].
This study describes the creation and validation of a decision-making model based on a keypoint R-CNN for object detection and an ANN for classification. The procedure commenced with the orthodontist's validation of manually localized anatomical landmarks, which were then used with the keypoint R-CNN for object localization. More specifically, the keypoint R-CNN is well suited to object localization in two-dimensional images because the main task of keypoint detection is to detect categorical boundary points. Moreover, the keypoint R-CNN may show particular attentiveness to the model boundary [17]. Meanwhile, ANNs are the foundation of deep learning algorithms, a subset of machine learning. ANNs are used to handle difficult problems in a wide variety of applications, including prediction. Hence, this study applied an ANN for prediction. ANNs employ hidden layers to improve prediction accuracy [18]. ReLU was applied as the activation function in the hidden layers to mitigate the vanishing gradient problem and to provide a degree of sparsity in the neural network [19].
Previous studies have analyzed the success rate of neural network-based decision support systems for orthognathic surgery. A decision support system with a deep convolutional neural network for image classification showed 95.4% to 96.4% rates of diagnostic agreement regarding orthognathic surgery between the actual diagnosis and the diagnosis made by the AI model [20]. Another study employed an ANN for image classification, and the model achieved a diagnostic agreement rate of 96% [21]. In the present study, we created and validated a keypoint R-CNN for the detection and deep learning-based classification of lateral cephalometric images, which showed a diagnostic agreement rate of 96.3%. Therefore, this model, using a keypoint R-CNN and ANN, could be beneficial for determining whether orthognathic surgery is required.
The precision of deep learning depends strongly on the amount of training data. There is still room to further improve the diagnostic agreement of our model by including additional cephalometric radiographs in the training set. Moreover, standardization of images could also improve the diagnostic agreement. When additional radiographs are analyzed, CNN training algorithms can enhance the weighting parameters in each layer of the architecture. Although this study evaluated more than 500 cephalograms with accurate anatomical landmark localization, we still recommend increasing the number of cephalograms in the training data in future research. With further datasets and training, the model could also be used for a variety of screening and diagnostic purposes [22].
The limitations of this study include the absence of input data on crowding, skeletal asymmetry, soft tissue profiles, and airway spaces. In addition, the patients’ perceptions, complaints, and expectations are important. Disparities in orthognathic surgery decisions could also be meaningful. For example, the decision-making could be affected by the patient’s and clinician’s preferences, airway space size, or the clinician’s experience [3]. While preferences are very hard to standardize, the patient’s subjective needs should be addressed, and agreement between the surgeon, orthodontist, and patient is essential [23].
The keypoint R-CNN also showed very strong performance in detecting the features of facial parts. Therefore, it is necessary to evaluate AI and human perceptions of other significant anatomical features, and we suggest this as a topic for future research. Furthermore, the lateral cephalogram is only a two-dimensional image in which the bilateral structures overlap, whereas three-dimensional cone-beam computed tomography (CBCT) can overcome this drawback. CBCT images enable a more exact identification of cephalometric landmarks and avoid the superimposition of bilateral landmarks seen in cephalometry [24]. Moreover, merging frontal profiles and CBCT could provide more information on the relationship between soft tissue and the facial skeletal structure.
In summary, by combining a neural network model with information on clinical decision-making, a supplemental tool for orthognathic screening using digital lateral cephalometry images was created. The model showed a diagnostic agreement rate of 96.3%. Increasing the size of the training data set, evaluating additional important data (especially patients’ perceptions), and using three-dimensional CBCT would further improve this AI-aided approach to orthognathic screening.
Acknowledgments
This study was supported by Mahidol University. The technical advice and support from Assistant Professor Wasit Limprasert, College of Interdisciplinary Studies, Thammasat University, Thailand, was truly appreciated. We are also grateful for the technical assistance from Digital Storemesh Co. Ltd., and we truly appreciate the help that Dr. Sasipa Thiradilok gave to the team.
Notes
Conflict of Interest
No potential conflict of interest relevant to this article was reported.