Accuracy of Mobile Device–Compatible 3D Scanners for Facial Digitization: Systematic Review and Meta-Analysis

Background The accurate assessment and acquisition of facial anatomical information significantly contributes to enhancing the reliability of treatments in dental and medical fields, and has applications in fields such as craniomaxillofacial surgery, orthodontics, prosthodontics, orthopedics, and forensic medicine. Mobile device–compatible 3D facial scanners have been reported to be an effective tool for clinical use, but the accuracy of digital facial impressions obtained with the scanners has not been explored.
Objective We aimed to review comparisons of the accuracy of mobile device–compatible face scanners for facial digitization with that of systems for professional 3D facial scanning.
Methods Individual search strategies were employed in PubMed (MEDLINE), Scopus, ScienceDirect, and Cochrane Library databases to search for articles published up to May 27, 2020. Peer-reviewed journal articles evaluating the accuracy of 3D facial models generated by mobile device–compatible face scanners were included. Cohen d effect size estimates and confidence intervals of standardized mean difference (SMD) data sets were used for meta-analysis.
Results By automatic database searching, 3942 articles were identified, of which 11 articles were considered eligible for narrative review, with 6 studies included in the meta-analysis. Overall, the accuracy of face models obtained using mobile device–compatible face scanners was significantly lower than that of face models obtained using professional 3D facial scanners (SMD 3.96 mm, 95% CI 2.81-5.10 mm; z=6.78; P<.001). The difference between face scanning when performed on inanimate facial models was significantly higher (SMD 10.53 mm, 95% CI 6.29-14.77 mm) than that when performed on living participants (SMD 2.58 mm, 95% CI 1.70-3.47 mm, P<.001, df=12.94).
Conclusions Overall, mobile device–compatible face scanners did not perform as well as professional scanning systems in 3D facial acquisition, but the deviations were within the clinically acceptable range of <1.5 mm. Significant differences between results when 3D facial scans were performed on inanimate facial objects and when performed on the faces of living participants were found; thus, caution should be exercised when interpreting results from studies conducted on inanimate objects.


Introduction
Oral and facial rehabilitation involves comprehensive diagnosis and treatment planning [1,2]. Facial morphology assessment is vital for the diagnosis of maxillofacial anomalies, surgery, fabrication of prostheses, and postoperative evaluation [2,3]. Esthetics and prognosis of treatment outcomes can be improved through simulation performed on the 3D facial models of patients [4]. The conventional method for generating facial models of patients is physical facial impression, in which a replica of the face is fabricated using elastomeric materials and a gypsum cast [5,6]. However, the method is uncomfortable for patients because their face is covered with materials during the impression-taking process [6]. In addition, the dimensional accuracy of the physical facial impression model is affected by several factors, including the viscosity of the impression materials, setting time, storage conditions, and time interval from material mixing to stone pouring of the casts [7,8]. Furthermore, the human face is made up of complex anatomical structures with complicated skin textures and colors, which makes realistic replication of the face challenging.
Modern digital technologies have revolutionized the facial impression method by enabling 3D facial morphology to be captured using noncontact optical facial scanning devices [9,10]. Digital impression does not require conventional laboratory work or the use of impression materials, thus reducing the discomfort and chair time of the patients. Compared with facial stone casts, wherein only direct anthropometric measurements of the faces can be performed for facial analyses, virtually reconstructed models of the face can be utilized for multidisciplinary purposes [11-13]. Facial landmarks can easily be extracted from a digital facial model, and the digitized data format enables image merging and advanced dimensional analyses, such as surface-to-surface distance measurements and volume misfit evaluations, using analytical computer software [3,14-17]. In addition, digital facial scanning provides an efficient basis for dental education and facial recognition [18-20].
Stationary facial scanning systems based on stereophotogrammetry technology were first introduced in dentistry [21]. However, because of the encumbrance and high cost of this technology, handheld scanning systems using laser or structured-light technology were developed [21-23]. Although most professional handheld scanners are considered acceptable in terms of their scan image quality, they are expensive and often require considerable training time to learn their complex scanning protocols [3,24,25]. Alternatively, 3D sensor cameras based on structured-light technology have been developed for smartphone and tablet devices [15,26-28]. An advantage of using mobile devices for face scanning is their user-friendly operation; this reduces the training time for users [15,29]. Apps can be developed and customized for specific purposes by using open source scripts and software coding [15,29]. Moreover, when an external attachment-type 3D sensor camera is used, the position of the camera is controllable in the mobile-device system [27,29].
Facial scanning using a mobile device 3D sensor camera has attracted considerable interest in recent years because it is highly portable and cost-effective and because of the popularity of mobile devices [29]. Smartphone- and tablet-compatible 3D facial scanners have been reported to be an effective tool for clinical use in prosthodontic treatment [27,30-33]. However, the accuracy of the digital facial impression obtained with mobile device-compatible face scanners has not been explored. The purpose of this systematic review and meta-analysis was to investigate the accuracy of mobile device-compatible face scanners for facial digitization.

Study Design
This study was designed based on PRISMA guidelines (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) [34]. This review was not preregistered on PROSPERO. Accuracy was defined as the dimensional discrepancy between the digital facial impression made by a mobile device-compatible face scanning camera and a reference image data set. The PICO (population, intervention, comparison, and outcomes) question was as follows: Are digital facial impressions (population) obtained with mobile device-compatible 3D facial scanning cameras (intervention) equivalent to those of professional handheld face scanners (comparison) in terms of accuracy (outcomes)?

Inclusion and Exclusion Criteria
Inclusion and exclusion criteria were set based on the study design, objectives, interventions, and measurement results. The search was limited to articles published in English only. The inclusion criteria for meta-analysis were low risk of bias, low concern for applicability, and relevant numeric data for pool-weighted estimation using the Cohen d statistical method. Accordingly, randomized and nonrandomized controlled trials, cohort studies, case-control studies, and cross-sectional studies that were performed with human participants and on inanimate objects, reporting quantitative assessments of digital facial models obtained with 3D facial scanners and mobile device-compatible 3D facial scan cameras were included in this review. Conversely, conference papers, case reports, case letters, epidemiologic studies, and author or editorial opinion articles were excluded. Original studies that used only 2D images or did not include mobile device-compatible 3D facial scanners were not reviewed, and studies in which the accuracy could not be quantitatively determined were not considered for analysis.

Data Collection
Two reviewers (H-NM and D-HL) independently participated in collecting, screening, and selecting the potential studies based on the information provided by the titles and abstracts. The full texts of relevant articles were assessed and reviewed by both reviewers. The papers that satisfied all the inclusion criteria were considered eligible for review. The following information was collected from full-text papers and recorded on an electronic spreadsheet (Office Excel, Microsoft Inc): authors, year of publication, study purpose, participant information (sample size, mean age, age range, and gender proportion), scanning methods (scanning device, capture technology, working condition, and scanning process), reference standard for validation (direct anthropometry or another 3D scanning device), types of measurement performed (linear distances or surface-to-surface deviation), number of measurements (number of landmarks, measurement times, and raters), measurement results (mean, estimation errors, and types of statistical analysis), and major conclusions. Articles with missing data or unreliable data were excluded from the meta-analysis. The agreement (κ) between the 2 reviewers was calculated. In case of disagreement, a discussion between the 2 reviewers was conducted to resolve the issues.
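The interrater agreement statistic mentioned above can be illustrated with a minimal sketch of Cohen's kappa for two reviewers. The function name and the screening decisions below are hypothetical and are not data from this review; the review does not state which software was used to compute κ.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical screening decisions for 10 articles (illustration only).
reviewer_1 = ["include", "include", "exclude", "exclude", "exclude",
              "include", "exclude", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "include", "exclude", "exclude", "exclude",
              "exclude", "exclude", "exclude", "exclude", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is lower than the simple proportion of matching decisions.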

Quality Assessment and Meta-Analysis
The risk of bias and concern for applicability based on 4 bias domains (patient selection, index test, reference standard, and flow and timing) were assessed by the 2 reviewers using the Quality Assessment Tool for Diagnostic Accuracy Studies-2 (QUADAS-2) [35].
A random- or fixed-effects model was used to analyze the standardized mean difference (SMD) between the experimental and reference data sets to investigate the effect size estimate and the confidence intervals of SMDs using Cohen d [36]. Heterogeneity was evaluated using the Cochran Q test and the Higgins I² statistic [37], where a higher I² value indicated stronger heterogeneity. When the Q test indicated high heterogeneity across studies (P<.05) or I²>50%, the random-effects model was selected, and subgroup analysis was performed [38]. The subgroups were defined based on whether living participants or inanimate objects were investigated.
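The following is a minimal sketch of the quantities named above: Cohen d for one study, and inverse-variance pooling with Cochran Q, I², and a random-effects estimate. It assumes the DerSimonian-Laird estimator for the between-study variance and a common large-sample approximation for the variance of d; the review does not state which estimator was configured in the meta package, and the function names are hypothetical.

```python
import math


def cohen_d(mean_test, sd_test, n_test, mean_ref, sd_ref, n_ref):
    """Return Cohen d and an approximate sampling variance for one study."""
    pooled_sd = math.sqrt(
        ((n_test - 1) * sd_test ** 2 + (n_ref - 1) * sd_ref ** 2)
        / (n_test + n_ref - 2)
    )
    d = (mean_test - mean_ref) / pooled_sd
    # Large-sample approximation of the variance of d.
    var_d = (n_test + n_ref) / (n_test * n_ref) + d ** 2 / (2 * (n_test + n_ref))
    return d, var_d


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling (assumed estimator) with Q and I^2."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci_95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "z": pooled / se,
        "Q": q,
        "I2_percent": i2,
        "tau2": tau2,
    }
```

Because tau² is added to every study's variance, large studies lose some of their weight advantage, so a random-effects model weights studies more evenly than a fixed-effect model when heterogeneity is present.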
Publication bias was assessed using the Egger linear regression test and visually inspected using funnel plots. Meta-analyses were performed using the meta package for R software (version 3.6.0, R Foundation for Statistical Computing); the significance level was set at .05. The robvis package (version 0.3.0) was used to visualize the risk-of-bias assessment results [39].
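As an illustration of the Egger test, the sketch below regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error) and reports the intercept, whose departure from zero suggests funnel plot asymmetry. This is the classic formulation written with plain least squares; the function name and the input values are hypothetical, and a P value would additionally require a t distribution with k-2 degrees of freedom, which is omitted here.

```python
import math


def egger_regression(effects, standard_errors):
    """Return (intercept, slope, t statistic of the intercept)."""
    k = len(effects)
    if k < 3 or k != len(standard_errors):
        raise ValueError("need at least three studies with matching standard errors")
    precision = [1.0 / se for se in standard_errors]
    snd = [y / se for y, se in zip(effects, standard_errors)]  # standardized effects
    x_mean = sum(precision) / k
    y_mean = sum(snd) / k
    sxx = sum((x - x_mean) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(precision, snd))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance and standard error of the intercept (ordinary least squares).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(precision, snd))
    residual_var = sse / (k - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / k + x_mean ** 2 / sxx))
    if se_intercept > 0:
        t_stat = intercept / se_intercept
    else:
        t_stat = math.copysign(math.inf, intercept)
    return intercept, slope, t_stat


# Hypothetical effect sizes and standard errors (illustration only).
intercept, slope, t_stat = egger_regression(
    [3.0, 2.5, 2.0, 2.25], [1.0, 0.5, 1.0 / 3.0, 0.25]
)
print(f"intercept = {intercept:.2f}, t = {t_stat:.2f} on 2 degrees of freedom")
```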

Search Results
The search resulted in a total of 3942 articles, which were reduced to 3726 articles after removing 216 duplicates. In the title screening process, 3674 articles that were outside the scope of this review were excluded, thereby leaving 52 articles for abstract screening. After the exclusion of 24 articles with irrelevant abstracts, the full texts of 28 articles were read and assessed, and 11 articles were considered eligible for this review. Of these, 6 articles were included in the global meta-analysis, 4 articles were included in the living person face subgroup analysis, and 3 articles in the inanimate face subgroup analysis. The results of the searching and screening process are summarized in Figure 1. There was substantial interrater agreement (κ=0.90).

Quality Assessment and Applicability Concerns
The quality assessment results from the Quality Assessment Tool for Diagnostic Accuracy Studies-2 showed that among the 11 studies included, one study [40] had a high risk of bias, and another study [41] had a high concern for applicability (Figure 2). There were 2 studies [41,42] showing some risk of bias, and there were 2 studies [40,42] for which there were some concerns for applicability. The patient selection and index test had a higher risk of bias than those of other domains in some studies because of unclear statements regarding the methods employed for random sampling [28,43] or the small number of participants included [5,15]. For applicability, the major concerns arose in the index test domain because several studies did not describe the scanning procedures in detail or did not provide sufficient information about the scanning devices [27,28,40,41].

Study Characteristics
Extracted data were organized according to the characteristics of the studies (Table 1). The characteristics of the mobile device-compatible face scanners that were investigated are summarized in Multimedia Appendix 1.
[Table 1, excerpt for study [29]: Artec Eva (Artec Group) provided significantly more accurate results than those of the Sense (P<.001) and the iSense devices (P<.001). The Sense was more accurate than the iSense scanner; however, the difference was not significant (P=.12).]
Among the 11 studies included, 6 were conducted on adult volunteers or patients [27,29,41,42,44,45] with a mean age of 35.50 years (SD 8.50; range 24-59). The number of participants in these studies ranged from 8 to 34, with 2 to 15 male and 4 to 15 female participants. The other 5 studies were conducted using inanimate objects such as impression casts of the face [5,15,43] or mannequin heads [28,44], and 1 study [40] was conducted on human cadaver heads. Stereophotogrammetry [29,41,42,44], computed tomography [5,15,43], and high-resolution structured-light handheld scanning [40,45] were used as the reference measurements for comparison, and 2 studies [27,28] used manual interlandmark distance as the reference measurement.

RMSE
For the evaluation, most studies [5,15,27,29,40-42,44,45] measured the global surface-to-surface deviation between the reference and test images by calculating the root-mean-square error (RMSE) of the superimposed 3D images using analytical computer software, with a higher RMSE value indicating a higher surface deviation; however, 3 studies [28,41,43] compared the distances between facial landmarks on a digitized face with those between respective landmarks on a physical model obtained using the manual measurement method. Among them, 1 study [41] evaluated both the global surface-to-surface deviation and interlandmark linear distances, and the deviation was assessed along the x-axis (horizontal length), y-axis (vertical length), and z-axis (depth) in another study [28].
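The RMSE measure described above can be sketched as follows, assuming that the test and reference scans are already superimposed in the same coordinate system and are represented as simple point lists. The nearest-point search is a brute-force simplification and the function names are hypothetical; the analytical software used in the included studies typically measures distances to a reference mesh surface instead.

```python
import math


def nearest_point_deviations(test_points, reference_points):
    """Distance (mm) from each test point to its closest reference point."""
    if not test_points or not reference_points:
        raise ValueError("both point sets must be non-empty")
    return [
        min(math.dist(p, q) for q in reference_points)
        for p in test_points
    ]


def rmse(deviations_mm):
    """Root-mean-square error of a list of deviations."""
    if not deviations_mm:
        raise ValueError("at least one deviation is required")
    return math.sqrt(sum(d * d for d in deviations_mm) / len(deviations_mm))


# Hypothetical superimposed point sets in millimetres (illustration only).
test_scan = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_scan = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
surface_rmse = rmse(nearest_point_deviations(test_scan, reference_scan))
print(f"RMSE = {surface_rmse:.3f} mm")
```

Squaring the deviations before averaging means that positive and negative deviations do not cancel and that large local errors, for example around the ears or nostrils, contribute disproportionately to the result.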

Meta-Analysis
The global analysis revealed heterogeneity (I²=91%, P<.001). Random-effects models were selected for both global and subgroup meta-analyses based on the heterogeneity among the studies. In general, the accuracy of facial models obtained with mobile device-compatible face scanners was significantly lower than that of facial models obtained using professional face scanners (SMD 3.96 mm, 95% CI 2.81-5.10 mm, z=6.78, P<.001; Figure 3). Results from the subgroup analysis revealed a significant difference between the subgroups (Figure 4). The difference between the mobile device-compatible and professional face scanners was significantly higher for the face scans of inanimate facial objects (SMD 10.53 mm, 95% CI 6.29-14.77 mm) than for those of living participants (SMD 2.58 mm, 95% CI 1.70-3.47 mm, P<.001, df=12.94).

Principal Findings
We aimed to investigate the accuracy of mobile device-compatible face scanners in facial digitization. Mean discrepancy values of the digitized face obtained using mobile device-compatible 3D facial scanners ranged from 0.34 to 1.40 mm in articles included in this systematic review. The meta-analysis revealed that mobile device-compatible 3D facial scanners were less accurate than their professional 3D counterparts. The reliability of a digital face scanner can be classified into 4 categories: highly reliable (deviation <1.0 mm), reliable (deviation 1.0 mm-1.5 mm), moderately reliable (deviation 1.5 mm-2.0 mm), and unreliable (deviation >2.0 mm) [46]. For clinical application, deviations <1.5 mm were considered acceptable [3,47,48]. Based on these classifications, mobile device-compatible 3D facial scanners were considered acceptable for clinical use even though their accuracies were lower than those of the professional 3D facial scanners. Amornvit et al [28] and Liu et al [43] reported that mobile device-compatible face scanners are comparable to professional 3D facial scanners when scanning simple and flat areas of the face such as the forehead, cheeks, and chin. However, scanning accuracy was relatively low when mobile device-compatible face scanners were used to capture complex facial regions, such as the external ears, eyelids, nostrils, and teeth [28,44,45]. Higher inaccuracy was found in the facial areas with defects, depending on the depth of the defect [15]. Thus, the decision to use a mobile device-compatible face scanner should take into account the clinical purpose and the person being scanned.
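The 4-category reliability scheme and the clinical acceptability threshold cited above can be written as a small lookup. The cited ranges share their boundary values (1.5 mm and 2.0 mm), so this sketch assumes that a boundary value falls into the better category; that choice and the function names are illustrative assumptions, not part of the cited classification.

```python
def classify_reliability(deviation_mm):
    """Map a mean deviation in millimetres to the 4-category scheme."""
    if deviation_mm < 0:
        raise ValueError("deviation must be non-negative")
    if deviation_mm < 1.0:
        return "highly reliable"
    if deviation_mm <= 1.5:  # assumption: boundary value assigned to the better category
        return "reliable"
    if deviation_mm <= 2.0:  # assumption: boundary value assigned to the better category
        return "moderately reliable"
    return "unreliable"


def is_clinically_acceptable(deviation_mm):
    """Deviations below 1.5 mm were considered acceptable for clinical use."""
    return 0 <= deviation_mm < 1.5


# The range of mean discrepancies reported in the included articles.
for value in (0.34, 1.40):
    print(value, classify_reliability(value), is_clinically_acceptable(value))
```

Both ends of the reported range (0.34 mm and 1.40 mm) fall below the 1.5 mm threshold, which is the basis for the statement that the scanners were acceptable for clinical use.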
In the preliminary stages, smartphone-based 3D scanners used a multiphotogrammetry approach that captured several photographs of the object from different views and matched common features in the photographs to establish a 3D model of the object by using dedicated smartphone software apps [15,45]. The resolution of a 3D image depended on the number of reconstructed polygons that were calculated by the software algorithm based on the resolution of the captured images [49]. The working principle is similar to that of professional stereophotogrammetry facial scanning systems; however, professional systems usually use digital single-lens reflex cameras that have higher pixel densities with better noise reduction software and higher ISO settings compared with those of smartphone cameras [50]. The accuracy of smartphone multiphotogrammetry in facial data acquisition was reported as 0.605 (SD 0.124) mm by Elbashti et al [15]. In another study by Ross et al [45], the mean discrepancy of scan data obtained using smartphones ranged from 0.9 mm to 1.0 mm, depending on the number of photographs taken during scanning. In recent years, infrared structured-light depth-sensing cameras have been incorporated in mobile devices to facilitate 3D optical scans [51]. 3D depth-sensing cameras work by the time-of-flight principle, measuring the time taken for light emitted by an illumination unit to travel to an object and back to the sensor array [52,53]. The 3D images are then reconstructed based on a depth map of the object and surroundings [54]. Although smartphone depth-sensing cameras share similar working principles with professional laser scanning systems, laser systems are more sensitive to depths because they are built with higher sensitivity sensors [15,23]. 
Amornvit et al [28] reported that the 3D depth-sensing sensor scanner of a smartphone is reliable in linear measurement at the frontal plane, but it has less accuracy in depth measurement compared with that of professional face scanners. Depth-sensing cameras can also be used separately and attached or plugged into smartphones, tablets, or laptop computers to acquire 3D scans [27,29,40-42,44]. Because the quality of facial scanning is also affected by the performance of compatible mobile devices when external depth-sensing cameras are used, the resulting accuracy might vary widely and should be evaluated for each combination of depth-sensing camera and mobile device.
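The time-of-flight principle described above reduces to a single relation: the depth equals the speed of light multiplied by the round-trip time, divided by 2. The sketch below is an idealized illustration with hypothetical function names; it ignores sensor noise and calibration, and many practical sensors infer the travel time indirectly (for example, from the phase shift of modulated light) rather than timing a pulse directly.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact value by SI definition


def depth_from_round_trip(round_trip_time_s):
    """Depth in metres for light that travels to the object and back."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


def round_trip_for_depth(depth_m):
    """Round-trip travel time in seconds for an object at the given depth."""
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    return 2.0 * depth_m / SPEED_OF_LIGHT_M_PER_S


# Timing difference that corresponds to a 1 mm change in depth.
delta_t = round_trip_for_depth(0.001)
print(f"1 mm of depth corresponds to {delta_t * 1e12:.2f} ps of round-trip time")
```

A depth difference of 1 mm corresponds to only a few picoseconds of round-trip time, which gives a sense of why depth accuracy depends strongly on sensor sensitivity.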
Subgroup meta-analysis showed that the accuracy of 3D facial scans performed on living persons differed significantly from that of scans performed on inanimate objects. This result implies that the outcomes of in vitro or laboratory studies could be different from those obtained from people. Thus, based on the findings of this review, we recommend using living persons for related research on mobile device-compatible face scanning. Caution should be exercised when scanning the orbital, nasolabial, and oral regions on the face of a living person to minimize the discrepancies caused by motion artifacts [16,24]. Subconscious nose breathing, eye blinking, and lip twitching should also be carefully considered as these are the main sources of involuntary facial movements [16]. Ozsoy et al [17] reported that changes in facial expressions could affect the reproducibility and reliability of a scan, with the highest error values observed for a frightened facial expression and the lowest value observed for a neutral facial expression. To reduce motion artifacts, the person should be instructed to maintain a neutral facial expression and avoid any head movement during image acquisition [55]. Another concern is that human faces contain complex skin textures, pores, freckles, scars, and wrinkles. Some artifacts or missing scan data appear as holes, originating from surfaces that are difficult to capture, such as eyebrows, eyelashes, and hairlines [29]. Small empty holes can be repaired using image processing software that uses neighboring areas that are morphologically similar; however, large defects can cause difficulties in the stitching process because of the lack of reference [24]. In addition, human faces vary in shape and are not perfectly symmetric, and thus may appear different from different viewing angles [56].
This phenomenon might cause some artifacts when the multiphotogrammetry approach is used because the 3D model of an object is reconstructed by matching common facial features in the captured photographs.
A limitation of this review is that the review protocol was not preregistered on PROSPERO. In addition, most included studies do not relate scanner accuracy directly to clinical treatment outcomes because clinical studies that assess scanner accuracy are difficult to perform. Nevertheless, the findings of this review suggest that mobile device-compatible face scanners hold promise for clinical use. Another limitation of this systematic review is the small number of included studies, which showed high heterogeneity and funnel plot asymmetry. Regarding publication bias, the Egger test result was significant (P=.004). Heterogeneity can cause funnel plot asymmetry when a correlation between intervention effects and study sizes is present [57]. Further examination was performed on the eligibility of a study that showed distinctly larger effect estimates [5], and we included the study [5] in the meta-analysis because it was conducted in an environment of a scanning intervention and was methodologically sound. Although the inclusion of this study [5] increased the heterogeneity among studies and funnel plot asymmetry, these results were attributed primarily to the small number of articles [58]. All eligible papers included in the review were published between 2017 and 2020 due to the novelty of the research topic. A random-effects model is often used in meta-analyses for studies with heterogeneity. Random-effects meta-analyses weight studies more equally than fixed-effect analyses by incorporating the variance between studies [58]. Therefore, in this review, based on heterogeneity and funnel plot asymmetry, random-effects models were selected for global and subgroup analyses. Additional controlled in vitro studies and randomized clinical trials will be needed to strengthen the evidence base for future reviews.
Moreover, considering the rapid development of face scanning in the medical field, diverse investigations with newly developed devices and systems need to be continuously performed.

Conclusions
Overall, the accuracy of mobile device-compatible face scanners in 3D facial acquisition was not comparable to that of professional optical scanning systems, but it was still within the clinically acceptable range of <1.5 mm in dimensional deviation. There were significant differences between 3D facial scans performed on inanimate objects and living persons; thus, caution should be exercised when interpreting the results from studies conducted on inanimate objects.