Sampling of N. barkeri and related species
Phytoseiid mites inhabit a variety of habitats, including plants and soil litter. Individuals were collected from plants, and those on substrates and in soil litter were isolated using Berlese funnels and kept in 95% alcohol. Specimens were mounted in Hoyer’s medium; those with hard bodies were softened and cleaned with lactic acid. Specimens were deposited at the following institutes: GIABR (Guangdong Institute of Applied Biological Resources, Guangzhou, Guangdong, China), HUM (Hokkaido University Museum, Sapporo, Japan), NMNS (National Museum of Nature and Science, Tsukuba, Japan), NTU (Department of Entomology, National Taiwan University, Taipei, Taiwan), and TARL (Taiwan Acari Research Laboratory, Taichung City, Taiwan). The female phytoseiid mites collected included 250 specimens of N. barkeri and 262 specimens of 35 non-target species belonging to 11 genera in 6 tribes of the subfamily Amblyseiinae. The numbers of specimens collected for the non-target species were as follows: 4 of N. baraki, 10 of N. longispinosus, 10 of N. makuwa, 6 of N. taiwanicus, 9 of N. womersleyi, 9 of Amblyseius alpinia, 10 of A. bellatulus, 10 of A. eharai, 10 of A. herbicolus, 2 of A. pascalis, 10 of A. tamatavensis, 10 of Euseius aizawai, 6 of E. circellatus, 7 of E. daluensis, 11 of E. macaranga, 10 of E. ovalis, 6 of E. paraovalis, 3 of E. nicholsi, 6 of E. oolong, 7 of E. sojaensis, 4 of Gynaeseius liturivorus, 3 of G. santosoi, 10 of Okiseius subtropicus, 4 of Paraamblyseius formosanus, 7 of Paraphytoseius chihpenensis, 10 of Parap. cracentis, 3 of Parap. hualienensis, 10 of Parap. orientalis, 6 of Phytoscutus salebrosus, 10 of Proprioseiopsis asetus, 3 of Prop. ovatus, 8 of Scapulaseius anuwati, 10 of S. cantonensis, 10 of S. okinawanus, and 8 of S. tienhsainensis. Specimens of N. barkeri were collected from the United States, China, Israel, Japan, the Netherlands, Taiwan, and Thailand (including specimens intercepted in plant quarantine).
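The per-species counts listed above can be tallied to confirm the reported totals. The following Python snippet is only an illustrative arithmetic check and is not part of the study workflow; the variable names are our own, and the counts are copied from the list above.

```python
# Per-species specimen counts for the non-target species, as listed in the text.
non_target_counts = {
    "N. baraki": 4, "N. longispinosus": 10, "N. makuwa": 10,
    "N. taiwanicus": 6, "N. womersleyi": 9,
    "Amblyseius alpinia": 9, "A. bellatulus": 10, "A. eharai": 10,
    "A. herbicolus": 10, "A. pascalis": 2, "A. tamatavensis": 10,
    "Euseius aizawai": 10, "E. circellatus": 6, "E. daluensis": 7,
    "E. macaranga": 11, "E. ovalis": 10, "E. paraovalis": 6,
    "E. nicholsi": 3, "E. oolong": 6, "E. sojaensis": 7,
    "Gynaeseius liturivorus": 4, "G. santosoi": 3,
    "Okiseius subtropicus": 10, "Paraamblyseius formosanus": 4,
    "Paraphytoseius chihpenensis": 7, "Parap. cracentis": 10,
    "Parap. hualienensis": 3, "Parap. orientalis": 10,
    "Phytoscutus salebrosus": 6,
    "Proprioseiopsis asetus": 10, "Prop. ovatus": 3,
    "Scapulaseius anuwati": 8, "S. cantonensis": 10,
    "S. okinawanus": 10, "S. tienhsainensis": 8,
}
target_count = 250  # specimens of N. barkeri

n_species = len(non_target_counts)
non_target_total = sum(non_target_counts.values())
total_specimens = target_count + non_target_total

print(n_species, non_target_total, total_specimens)  # 35 262 512
```

The tally reproduces the 35 non-target species and 262 non-target specimens stated in the text, giving 512 female specimens overall.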
Quantitative measurements of phytoseiid mites
Specimens were examined under an Olympus BX51 microscope, and measurements were performed using a stage-calibrated ocular micrometer and ImageJ 1.47 [36]. Photos were taken using a Motic Moticam 5+ camera attached to the microscope (Figure S1). All measurements were recorded in micrometres (μm). The general terminology used for morphological descriptions in this study followed that of Chant and McMurtry [20]. The notation for idiosomal setae followed that of Lindquist and Evans [37] and Lindquist [38], as adapted by Rowell et al. [39] and Chant and Yoshida-Shaul [32]. Phytoseiid mites exhibit pronounced sexual dimorphism, and females are more important for identification because of their distinguishing features and greater prevalence. In the present study, 22 quantitative measurements were taken from the female specimens: dorsal shield length and width; lengths of setae j1, j3, j4, j6, J5, z2, z4, z5, Z1, Z4, Z5, s4, r3, and R1; ventrianal shield length and width (at ZV2 level); JV5 length; St IV length; and spermatheca calyx length and width (Fig. 1, Table 1).
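As a minimal sketch of how one specimen's measurements might be organised for later analysis, the Python snippet below lists the 22 features named above and converts an ocular-micrometer reading into micrometres. The feature labels, the function names, and the calibration factor of 2.5 μm per division are assumptions made for illustration; the actual calibration depends on the objective used and is not given in the text.

```python
# The 22 measurements named in the text (the labels are our own shorthand).
FEATURES = [
    "dorsal_shield_length", "dorsal_shield_width",
    "j1", "j3", "j4", "j6", "J5", "z2", "z4", "z5",
    "Z1", "Z4", "Z5", "s4", "r3", "R1",
    "ventrianal_shield_length", "ventrianal_shield_width_ZV2",
    "JV5", "StIV",
    "spermatheca_calyx_length", "spermatheca_calyx_width",
]

def ocular_to_um(divisions, um_per_division):
    """Convert an ocular-micrometer reading to micrometres.

    um_per_division is the factor obtained by calibrating the ocular
    scale against a stage micrometer at a given magnification.
    """
    return divisions * um_per_division

def missing_features(record):
    """Return the names of any of the 22 features absent from a record."""
    return [name for name in FEATURES if name not in record]

# Hypothetical reading: 140 divisions at an assumed 2.5 um per division.
length_um = ocular_to_um(140, 2.5)
print(len(FEATURES), length_um)  # 22 350.0
```

A record with all 22 entries, each in micrometres, then forms one row of the feature table used for classification.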
XGBoost training and computing
We used XGBoost to develop a classification system that separates the target mite species from related species on the basis of their morphological features. XGBoost is a highly efficient implementation of the gradient boosting decision tree algorithm, in which multiple decision trees are created successively. In each iteration, a new tree improves the predictive power of the ensemble by minimising the part left unexplained by the preceding trees. First, we determined the number of decision trees through five-fold cross-validation. The original sample was randomly partitioned into five equally sized subsamples (Table S1). In each round, a single subsample was retained as the validation data and the other four subsamples were used as the training data. Cross-validation was thus performed five times, with each subsample used exactly once as the validation data. The number of decision trees was chosen so that the same level of performance was achieved in training and validation. This number of decision trees was then applied to the full dataset to create the final model, and key morphological features were selected according to their relative importance. Next, we used individual conditional expectation (ICE) plots to show the determinative roles of these key features in classification. In these plots, each line represents one specimen and indicates how the prediction (of the target species) changes as a morphological feature changes. We fitted the XGBoost models and generated the ICE plots using the R packages “xgboost” [40] and “pdp” [41], respectively.
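The three ideas described above (trees fitted successively to what earlier trees left unexplained, five-fold cross-validation to choose the number of trees, and ICE curves) can be illustrated with a small standard-library Python sketch. This is not the analysis code of the study, which used the R packages named above. It assumes one-split trees fitted by squared error rather than XGBoost's regularised objective, it picks the tree count by lowest validation error as a simplification, and it runs on synthetic data. All function names and values are hypothetical.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Randomly partition indices 0..n-1 into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_stump(X, resid, lr):
    """Pick the one-split tree that best fits the residuals (squared error)."""
    mean = sum(resid) / len(resid)
    best = (float("inf"), 0, float("inf"), lr * mean, lr * mean)  # no-split fallback
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [r for row, r in zip(X, resid) if row[f] <= t]
            right = [r for row, r in zip(X, resid) if row[f] > t]
            if not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, f, t, lr * ml, lr * mr)
    return best[1:]  # (feature, threshold, left value, right value)

def predict(model, row):
    """Sum the contributions of all trees for one specimen."""
    return sum(lv if row[f] <= t else rv for f, t, lv, rv in model)

def boost(X, y, n_trees, lr=0.3):
    """Each new tree is fitted to the residual left by the trees before it."""
    model = []
    for _ in range(n_trees):
        resid = [yi - predict(model, row) for row, yi in zip(X, y)]
        model.append(fit_stump(X, resid, lr))
    return model

def cv_error(X, y, n_trees, k=5, seed=0):
    """Mean validation error over k folds, each fold held out exactly once."""
    errors = []
    for held in k_fold_indices(len(y), k, seed):
        held_set = set(held)
        train = [i for i in range(len(y)) if i not in held_set]
        model = boost([X[i] for i in train], [y[i] for i in train], n_trees)
        errors.append(sum((predict(model, X[i]) - y[i]) ** 2 for i in held) / len(held))
    return sum(errors) / k

def ice_curves(model, X, feature, grid):
    """One curve per specimen: vary one feature over the grid, keep the rest fixed."""
    curves = []
    for row in X:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value
            curve.append(predict(model, modified))
        curves.append(curve)
    return curves

# Synthetic example: 60 "specimens", 2 features; label 1 (target) if feature 0 > 50.
rng = random.Random(1)
X = [[rng.uniform(30, 70), rng.uniform(10, 20)] for _ in range(60)]
y = [1.0 if row[0] > 50 else 0.0 for row in X]

scores = {n: cv_error(X, y, n) for n in (1, 5, 20)}
best_n = min(scores, key=scores.get)   # tree count with the lowest validation error
final_model = boost(X, y, best_n)      # refit on the full dataset
grid = [35, 45, 55, 65]
curves = ice_curves(final_model, X, feature=0, grid=grid)
```

Plotting each entry of `curves` against `grid` gives one line per specimen, which is the form of ICE plot described above: a line that rises as the feature increases indicates that larger values push the prediction towards the target species.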
Hand-drawn illustrations (Fig. 1) were made under an optical microscope (Olympus BX51). These drawings were first scanned, then processed and digitised with Photoshop CS6 (Adobe Systems Incorporated, USA).