Molecular Descriptors Family on Structure Activity Relationships
2. Insecticidal Activity of Neonicotinoid Compounds
Sorana BOLBOACĂ, Lorentz JÄNTSCHI
“Iuliu Haţieganu” University of Medicine and Pharmacy, http://sorana.academicdirect.ro
Technical University of Cluj-Napoca, Romania, http://lori.academicdirect.org
Abstract
The neonicotinoids are the newest major class of insecticides, modeled after the basic nicotine molecule, with improved insecticidal activity and generally low toxicity. The insecticidal activities of neonicotinoids were previously studied using 3D and standard partial least squares regression models. The paper describes the ability of the MDF SAR methodology to predict the insecticidal activities of neonicotinoid compounds. The best bivaried MDF SAR model was validated on training and test sets, and its ability to predict insecticidal activity was compared with previously reported models. Even if the MDF SAR methodology is complex and time consuming, the results are worth the effort because they are statistically significantly better than the previously reported results.
Keywords
Molecular Descriptor Family (MDF), Structure Activity Relationships (SAR), Neonicotinoid Compounds, Insecticidal Activity
Introduction
The neonicotinoids are the newest major class of insecticides, modeled after the basic nicotine molecule, with improved insecticidal activity and generally low toxicity [1]. Some benefits of using neonicotinoid compounds are [2]:
· safer to humans and more environmentally friendly than older products;
· used as seed coatings at very low amounts per acre;
· water soluble and highly systemic in seeds and seedlings;
· targeted application (kernel coating) increases efficacy;
· precise application helps minimize environmental effects;
· easy to apply, with no application equipment required;
· reduced storage and transport effort;
· no offensive odors;
· cost competitive.
The product names of the registered neonicotinoid insecticides are [3]: Intruder (active ingredient: acetamiprid), Admire (imidacloprid), Provado (imidacloprid), Actara (thiamethoxam), Centric (thiamethoxam), and Platinum (thiamethoxam).
The insecticidal activities of neonicotinoid compounds were previously studied [4] using 3D and standard partial least squares (PLS) regression models. The insecticidal activity of the neonicotinoid compounds was defined as the logarithm of the reciprocal of the binding constant K_{i} against the nicotinic acetylcholine (ACh) receptor derived from the head of the honeybee. The previously published standard PLS analysis gave the following multiple linear regression equation:
n = 8, A = 3, r^{2} = 0.913, Q^{2} = 0.800
Log(1/K_{i}) = 2.045·C_{1j} + 1.935·C_{2j} + 1.697·C_{3j} + 1.765·C_{4j} + 0.695·C_{5j} + 0.519·C_{6j} - 0.610·C_{7j} + 0.254·C_{8j} (1)
where n is the sample size, A is the number of components, r^{2} is the squared correlation coefficient, C is the Carbo similarity index, and Q^{2} value is defined as follows:
Q^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i,obs} - y_{i,pred})^{2}}{\sum_{i=1}^{n} (y_{i,obs} - y_{ave})^{2}} (2)
where y_{i,obs} and y_{i,pred} are the observed and predicted activity for sample i, respectively, y_{ave} is the average activity and n is the sample size. The cross-validation results indicate that three is the optimal number of components (see Table 1).
Table 1. The 3-way PLS analysis results
Component  1      2      3      4      5
Q^{2}      0.597  0.791  0.855  0.803  0.829
The aim of the research was to estimate the insecticidal activity of neonicotinoid compounds with an original method, the molecular descriptors family on structure-activity relationships (MDF SAR), and to assess its predictive ability by comparing the results with the previously reported 3D-QSAR model.
Material and Method
Eight neonicotinoid compounds with insecticidal activity, previously reported as the data set of a 3D-QSAR study [4], are investigated here. The planar structures and insecticidal activities of the neonicotinoids are given in Figure 1.
Molecule name  X        Z         Insecticidal Activity
na1            NH       CHNO_{2}  7.67
na2            O        CHNO_{2}  6.51
na3            S        CHNO_{2}  6.94
na4            C        CHNO_{2}  6.45
na5            NH       CHCN      4.75
na6            NH       NCN       5.16
na7            NCH_{3}  NNO_{2}   3.79
na8            NH       NNO_{2}   5.81
Figure 1. Eight neonicotinoids acting as insecticides
The MDF SAR methodology for modeling the insecticidal activity of neonicotinoid compounds comprises the following steps [5]:
· Sketch the neonicotinoid compounds;
· Create the neonicotinoid compounds activity file;
· Generate the neonicotinoid compounds MDF members;
· Find the insecticidal activity SAR models;
· Validate the MDF SAR models and compare them with previously reported results;
· Analyze the selected MDF SAR model.
Results
The calculated values for the monovaried and the best bivaried MDF SAR models, together with the previously reported estimated values (3-way PLS model, 3W_PLS, and standard PLS model, S_PLS), are in Table 2.
Table 2. MDF SAR models and estimated insecticidal activity
Monovaried SAR: 19.157 - 77.49·iIDrSMg
Bivaried SAR: 43.34 - 2.21·ImMdsEg + 3.74·lIMMFQt
Trivaried SAR (previously reported [4]): 3W_PLS and S_PLS

Molecule  iIDrSMg  Monovaried MDF SAR  ImMdsEg  lIMMFQt  Bivaried MDF SAR  3W_PLS  S_PLS
na1       0.1574   6.961               6.135    -5.926   7.632             6.94    6.94
na2       0.1531   7.293               6.951    -5.742   6.517             6.75    6.9
na3       0.1566   7.023               5.665    -6.373   6.995             6.36    6.78
na4       0.1662   6.275               5.667    -6.512   6.470             7.02    6.93
na5       0.1852   4.804               6.855    -6.264   4.777             4.77    4.85
na6       0.1843   4.877               6.940    -6.111   5.162             5.18    5.16
na7       0.1938   4.140               6.174    -6.931   3.785             3.3     3.91
na8       0.1736   5.706               5.985    -6.518   5.746             6.15    5.67
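As a quick consistency check, the two model equations can be applied to the tabulated descriptor values. The sketch below assumes that the lIMMFQt values are negative and that the monovaried equation reads 19.157 - 77.49·iIDrSMg; both are assumptions made here because only with these signs do the equations land on the tabulated MDF SAR estimates. The names `DATA`, `monovaried` and `bivaried` are illustrative only.

```python
# Descriptor values and reported estimates taken from Table 2.
# ASSUMPTION: lIMMFQt is negative and the monovaried slope is negative;
# with these signs the equations reproduce the reported estimates.
DATA = {
    # molecule: (iIDrSMg, ImMdsEg, lIMMFQt, reported mono, reported bi)
    "na1": (0.1574, 6.135, -5.926, 6.961, 7.632),
    "na2": (0.1531, 6.951, -5.742, 7.293, 6.517),
    "na3": (0.1566, 5.665, -6.373, 7.023, 6.995),
    "na4": (0.1662, 5.667, -6.512, 6.275, 6.470),
    "na5": (0.1852, 6.855, -6.264, 4.804, 4.777),
    "na6": (0.1843, 6.940, -6.111, 4.877, 5.162),
    "na7": (0.1938, 6.174, -6.931, 4.140, 3.785),
    "na8": (0.1736, 5.985, -6.518, 5.706, 5.746),
}


def monovaried(i_id_r_s_m_g):
    """Monovaried MDF SAR estimate (coefficients as printed, rounded)."""
    return 19.157 - 77.49 * i_id_r_s_m_g


def bivaried(im_md_s_e_g, l_imm_f_q_t):
    """Bivaried MDF SAR estimate (coefficients as printed, rounded)."""
    return 43.34 - 2.21 * im_md_s_e_g + 3.74 * l_imm_f_q_t


# One row per molecule: computed vs. reported estimates.
rows = []
for name, (d1, d2, d3, mono_rep, bi_rep) in DATA.items():
    rows.append((name, monovaried(d1), mono_rep, bivaried(d2, d3), bi_rep))

for name, mono, mono_rep, bi, bi_rep in rows:
    print(f"{name}: mono {mono:.3f} (reported {mono_rep}), "
          f"bi {bi:.3f} (reported {bi_rep})")
```

Small differences in the second decimal are to be expected, since the printed coefficients and descriptor values are rounded.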
The monovaried MDF SAR model has the following statistics:
r = 0.937; r^{2} = 0.878; r^{2}_{adj} = 0.858; r^{2}_{cv} = 0.756; F = 43; p = 0.06% (3)
and the bivaried MDF SAR model has the following statistics:
r = 0.999; r^{2} = 0.999; r^{2}_{adj} = 0.998; r^{2}_{cv} = 0.998; F = 2864; p = 2·10^{-6}% (4)
where r is the correlation coefficient, r^{2} is the squared correlation coefficient, r^{2}_{adj} is the adjusted r^{2}, r^{2}_{cv} is the leave-one-out cross-validation score, F is the Fisher test statistic, and p is the probability of a wrong model associated with the F test.
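The statistics above can be approximately recomputed from the observed activities (Figure 1) and the estimated values (Table 2). The following is a minimal sketch, not the authors' software: r^{2} is taken as 1 - SSE/SST, the adjusted r^{2} and F follow the usual formulas for a linear model with k descriptors, and the p value of the bivaried model uses the closed form of the upper tail of the F distribution that holds only for two numerator degrees of freedom. The function name `fit_statistics` is hypothetical.

```python
# Observed activities (Figure 1) and estimated values (Table 2).
observed = [7.67, 6.51, 6.94, 6.45, 4.75, 5.16, 3.79, 5.81]
mono_est = [6.961, 7.293, 7.023, 6.275, 4.804, 4.877, 4.140, 5.706]
bi_est = [7.632, 6.517, 6.995, 6.470, 4.777, 5.162, 3.785, 5.746]


def fit_statistics(y, y_hat, k):
    """Return (r2, adjusted r2, F) for a linear model with k descriptors."""
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    sst = sum((a - mean_y) ** 2 for a in y)            # total sum of squares
    r2 = 1 - sse / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_value = (r2 / k) / ((1 - r2) / (n - k - 1))
    return r2, r2_adj, f_value


mono_stats = fit_statistics(observed, mono_est, k=1)
bi_stats = fit_statistics(observed, bi_est, k=2)

# Upper-tail probability of F with (2, df2) degrees of freedom:
# P(F > f) = (1 + 2 f / df2) ** (-df2 / 2). Valid for k = 2 only.
df2 = len(observed) - 2 - 1
p_bi_percent = 100 * (1 + 2 * bi_stats[2] / df2) ** (-df2 / 2)

print("monovaried r2, r2_adj, F:", [round(v, 3) for v in mono_stats])
print("bivaried   r2, r2_adj, F:", [round(v, 3) for v in bi_stats])
print(f"bivaried p = {p_bi_percent:.1e} %")
```

Because the inputs are rounded to three decimals, the recomputed F of the bivaried model differs slightly from the reported one, while remaining of the same order.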
The values predicted by the leave-one-out cross-validation procedure for both the monovaried and the bivaried MDF SAR models are in Table 3.
Table 3. Predicted values of insecticidal activity
Model             na1    na2    na3    na4    na5    na6    na7    na8
Monovaried model  6.734  7.672  7.052  6.247  4.821  4.794  4.408  5.69
Bivaried model    7.606  6.522  7.024  6.479  4.788  5.162  3.775  5.73
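A cross-validation score can be computed from these leave-one-out predictions. The sketch below assumes the common definition of Q^{2} as one minus the ratio between the predictive residual sum of squares and the total sum of squares around the average activity; the source does not state how r^{2}_{cv} was computed, so a small gap to the reported values is possible. The function name `q_squared` is illustrative.

```python
def q_squared(y_obs, y_pred):
    """Cross-validation score: 1 - PRESS / SS_total."""
    n = len(y_obs)
    y_ave = sum(y_obs) / n
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_total = sum((o - y_ave) ** 2 for o in y_obs)
    return 1 - press / ss_total


# Observed activities (Figure 1) and leave-one-out predictions (Table 3).
observed = [7.67, 6.51, 6.94, 6.45, 4.75, 5.16, 3.79, 5.81]
loo_mono = [6.734, 7.672, 7.052, 6.247, 4.821, 4.794, 4.408, 5.69]
loo_bi = [7.606, 6.522, 7.024, 6.479, 4.788, 5.162, 3.775, 5.73]

q2_mono = q_squared(observed, loo_mono)
q2_bi = q_squared(observed, loo_bi)
print(f"Q2 monovaried: {q2_mono:.3f}")
print(f"Q2 bivaried:   {q2_bi:.3f}")
```

Under this definition the bivaried score agrees with the reported 0.998 to three decimals, and the monovaried score comes out close to, but slightly below, the reported 0.756.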
The plot of monovaried MDF SAR model values for insecticidal activity of neonicotinoid compounds is in figure 2, and for bivaried MDF SAR model is in figure 3.
Figure 2. Plot of insecticidal activity vs. monovaried MDF SAR model
Figure 3. Plot of insecticidal activity vs. bivaried MDF SAR model
In order to validate the models, the set was split into training and test sets and the MLR models were rebuilt. The number of molecules in the training set was chosen to be four. The experiment was run three times, for the bivaried MDF SAR model only. The results are in Table 4.
Table 4. Training vs. test sets results
No.  Molecules in training set  Model                               r^{2} Train  r^{2} Test  F, p_{F}(%) Train  F, p_{F}(%) Test
1    na3, na7, na8, na2         42.7 - 2.17·ImMdsEg + 3.69·lIMMFQt  0.999        0.999       490, 3.2           382, 3.6
2    na8, na5, na4, na7         43.4 - 2.23·ImMdsEg + 3.72·lIMMFQt  0.999        0.998       452, 3.3           228, 4.7
3    na4, na7, na6, na8         43.3 - 2.21·ImMdsEg + 3.73·lIMMFQt  0.999        0.999       485, 3.2           339, 3.8
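The first split can be retraced with an ordinary least squares fit on the four training molecules, followed by prediction of the four test molecules. This is a sketch under assumptions: the descriptor values are the rounded ones from Table 2, lIMMFQt is taken as negative, the fit is done through the normal equations with a small hand-written solver, and r^{2} is computed as the squared Pearson correlation between observed and predicted values. The helper names are hypothetical.

```python
DATA = {
    # molecule: (ImMdsEg, lIMMFQt, observed activity)
    # ASSUMPTION: lIMMFQt values are negative.
    "na1": (6.135, -5.926, 7.67), "na2": (6.951, -5.742, 6.51),
    "na3": (5.665, -6.373, 6.94), "na4": (5.667, -6.512, 6.45),
    "na5": (6.855, -6.264, 4.75), "na6": (6.940, -6.111, 5.16),
    "na7": (6.174, -6.931, 3.79), "na8": (5.985, -6.518, 5.81),
}


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(x_rows, y):
    """Least squares via normal equations; returns [intercept, b1, b2]."""
    design = [[1.0, *row] for row in x_rows]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def squared_correlation(u, v):
    """Squared Pearson correlation between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv ** 2 / (suu * svv)


train_names = ["na3", "na7", "na8", "na2"]          # split no. 1 of Table 4
test_names = [name for name in DATA if name not in train_names]

coefs = fit_ols([DATA[n][:2] for n in train_names],
                [DATA[n][2] for n in train_names])


def predict(name):
    x1, x2, _ = DATA[name]
    return coefs[0] + coefs[1] * x1 + coefs[2] * x2


r2_train = squared_correlation([DATA[n][2] for n in train_names],
                               [predict(n) for n in train_names])
r2_test = squared_correlation([DATA[n][2] for n in test_names],
                              [predict(n) for n in test_names])
print("coefficients:", [round(c, 2) for c in coefs])
print(f"r2 train = {r2_train:.3f}, r2 test = {r2_test:.3f}")
```

Changing `train_names` to the second or third training set of Table 4 repeats the experiment for the other splits.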
The results obtained with MDF SAR were compared with the previously reported ones (Table 2) using Steiger's Z test. The results of the comparison between the correlated correlations obtained with the bivaried MDF SAR model and the previously reported models are in Table 5.
Table 5. The models comparison results
Statistical parameter(s)  Variable 1             Variable 2        Value(s)
r                         Insecticidal Activity  Bivaried MDF SAR  0.9996
r                         Insecticidal Activity  3W_PLS            0.9325
r                         Insecticidal Activity  S_PLS             0.9556
r                         Bivaried MDF SAR       3W_PLS            0.9312
r                         Bivaried MDF SAR       S_PLS             0.9591
r                         S_PLS                  3W_PLS            0.9698
Z, p_{Z}(%)               Bivaried MDF SAR       3W_PLS            5.5, 2·10^{-5}
Z, p_{Z}(%)               Bivaried MDF SAR       S_PLS             5.1, 2·10^{-4}
Z, p_{Z}(%)               S_PLS                  3W_PLS            0.7, 25
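The Z values can be approximated from the correlation coefficients in Table 5. The source does not say which variant of Steiger's test was used; the sketch below assumes the pooled-correlation form for two dependent correlations that share one variable (here the insecticidal activity), with Fisher z-transformed correlations and n = 8. The function name `steiger_z` and its argument names are illustrative.

```python
import math


def steiger_z(r12, r13, r23, n):
    """Z for comparing r12 and r13 (shared variable 1), given r23.

    ASSUMPTION: pooled-correlation form of Steiger's test.
    """
    z12, z13 = math.atanh(r12), math.atanh(r13)   # Fisher z transforms
    r_bar_sq = ((r12 + r13) / 2) ** 2             # pooled squared correlation
    psi = (r23 * (1 - 2 * r_bar_sq)
           - 0.5 * r_bar_sq * (1 - 2 * r_bar_sq - r23 ** 2))
    s = psi / (1 - r_bar_sq) ** 2                 # covariance term of z12, z13
    return (z12 - z13) * math.sqrt((n - 3) / (2 - 2 * s))


# Variable 1 = insecticidal activity; correlations taken from Table 5.
z_mdf_vs_3w = steiger_z(0.9996, 0.9325, 0.9312, n=8)  # MDF SAR vs 3W_PLS
z_mdf_vs_s = steiger_z(0.9996, 0.9556, 0.9591, n=8)   # MDF SAR vs S_PLS
z_s_vs_3w = steiger_z(0.9556, 0.9325, 0.9698, n=8)    # S_PLS vs 3W_PLS

print(f"MDF SAR vs 3W_PLS: Z = {z_mdf_vs_3w:.2f}")
print(f"MDF SAR vs S_PLS:  Z = {z_mdf_vs_s:.2f}")
print(f"S_PLS vs 3W_PLS:   Z = {z_s_vs_3w:.2f}")
```

With correlations this close to 1 the statistic is sensitive to rounding, so agreement with the tabulated Z values only to about one decimal should be expected.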
Discussions
The monovaried MDF SAR model of the insecticidal activity of the eight studied neonicotinoid compounds uses iIDrSMg as molecular descriptor and gives a probability of a wrong model equal to 0.06%. The iIDrSMg descriptor takes into consideration the geometric distance operator (g) and the relative atomic mass (M). Almost eighty-eight percent of the variation in the insecticidal activity of the studied neonicotinoids is explained by its linear relation with iIDrSMg, and is thus related to molecular geometry.
The bivaried MDF SAR model uses the ImMdsEg and lIMMFQt descriptors and is statistically significant (p = 2·10^{-6}%). This model takes into consideration the geometric distance operator (g), the topological distance operator (t), the atomic electronegativity (E), as well as the partial charge obtained with the semi-empirical Extended Hückel model, Single Point approach (Q), in order to explain the insecticidal activity of the studied neonicotinoids. Ninety-nine percent of the variation in the insecticidal activity of the studied neonicotinoids is explained by its linear relation with the ImMdsEg and lIMMFQt descriptors.
The descriptor from the best monovaried equation does not appear in the best bivaried model. This is an expected observation if we consider the bivaried model as a refinement of the structure-activity dependency, in which the insecticidal activity is decomposed differently into two structural components. The absence of the best monovaried descriptor from the best bivaried pair demonstrates that there is no link between the use of orthogonal descriptors (Principal and/or Dominant Component Analysis) and MDF SAR modeling.
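The observation can be illustrated on a toy scale with an exhaustive search over descriptor subsets. The sketch below is not the MDF SAR search itself: it uses only the three descriptors whose values are printed in Table 2 (the full descriptor family is not listed in this paper), assumes that lIMMFQt is negative, and ranks subsets by the r^{2} of an ordinary least squares fit. All helper names are hypothetical.

```python
from itertools import combinations

ACTIVITY = [7.67, 6.51, 6.94, 6.45, 4.75, 5.16, 3.79, 5.81]
DESCRIPTORS = {
    "iIDrSMg": [0.1574, 0.1531, 0.1566, 0.1662,
                0.1852, 0.1843, 0.1938, 0.1736],
    "ImMdsEg": [6.135, 6.951, 5.665, 5.667, 6.855, 6.940, 6.174, 5.985],
    # ASSUMPTION: lIMMFQt values are negative.
    "lIMMFQt": [-5.926, -5.742, -6.373, -6.512,
                -6.264, -6.111, -6.931, -6.518],
}


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(names):
    """r2 of the least squares fit of ACTIVITY on the named descriptors."""
    design = [[1.0, *vals] for vals in zip(*(DESCRIPTORS[n] for n in names))]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ACTIVITY)) for i in range(p)]
    coefs = solve_linear(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in design]
    mean_y = sum(ACTIVITY) / len(ACTIVITY)
    sse = sum((y - f) ** 2 for y, f in zip(ACTIVITY, fitted))
    sst = sum((y - mean_y) ** 2 for y in ACTIVITY)
    return 1 - sse / sst


def best_model(k):
    """Descriptor subset of size k with the highest r2."""
    return max(combinations(DESCRIPTORS, k), key=r_squared)


best_single = best_model(1)
best_pair = best_model(2)
print("best single descriptor:", best_single, round(r_squared(best_single), 3))
print("best descriptor pair:  ", best_pair, round(r_squared(best_pair), 3))
```

On these three descriptors the best single descriptor is not a member of the best pair, in line with the observation above; the search space of the real method is far larger.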
All multiple linear regression models resulting from splitting the compounds into training and test sets obtained appropriate r-squared values and were statistically significant, which demonstrates the validity of the MDF SAR model.
Our best model uses only two molecular structure descriptors and is significantly better than the previously reported models (p_{Z} = 2·10^{-5}% compared with 3W_PLS, p_{Z} = 2·10^{-4}% compared with S_PLS). The MDF SAR model uses only two structure descriptors (instead of three, as previously reported) and the statistical result is even better (squared correlation coefficient from 0.91 to 0.999 and cross-validation score from 0.8 to 0.998).
The correlation scores obtained with the bivaried model make it impractical to search for better models with three or more molecular structure descriptors.
Even if the described structure-activity relationship methodology is complex and time consuming (it takes about one or two days to complete the bivaried model search on a P3 server and client type machine), the results are worth the effort.
Conclusions
The MDF SAR methodology is a better solution for predicting the insecticidal activity of the studied neonicotinoid compounds, giving better results than the previously reported 3-way PLS method or standard PLS method.
The simplicity of the best model and the ease of its interpretation (it uses just two molecular descriptors), together with its higher performance, support the use of the MDF SAR methodology in the prediction of the insecticidal activity of neonicotinoid compounds.
References
[1]. Tomizawa M., Casida J. E., Neonicotinoid Insecticide Toxicology: Mechanisms of Selective Action, Annual Review of Pharmacology and Toxicology, 2005, 45, p. 247-268.
[2]. Van Duyn J., Neonicotinoid Insecticide Seed Coatings For Protection of Planted Corn Kernels and Seedlings, 78^{th} Annual Meeting of the Southeastern Branch Entomological Society of America, 2004 February 15-18, USA, Internet, http://www.ces.ncsu.edu/plymouth/pubs/ent/CRGROWERS03.html then http://vl.academicdirect.org/molecular_topology/qsar_qspr_s/ref/Neonicotinoid.mht
[3]. Palumbo J. C., Ellsworth P. C., Dennehy T. J., Nichols R. L., Cross-commodity Guidelines for Neonicotinoid Insecticides in Arizona, The University of Arizona Cooperative Extension, 2003, 17, AZ13195, Internet, http://cals.arizona.edu/pubs/insects/az1319.pdf then http://vl.academicdirect.org/molecular_topology/qsar_qspr_s/ref/az1319.pdf
[4]. Hasegawa K., Arakawa M., Funatsu K., 3D-QSAR study of insecticidal neonicotinoid compounds based on 3-way partial least squares model, Chemometrics and Intelligent Laboratory Systems, 1999, 47, p. 33-40.
[5]. Jäntschi L., Molecular Descriptors Family on Structure Activity Relationships 1. The review of Methodology, Leonardo Electronic Journal of Practices and Technologies, AcademicDirect, 2005, Issue 6, p. 76-98.