Changes between Dragon 5.3 and Dragon 5.4
In Dragon 5.4 several bugs have been fixed, so the values of some descriptors differ from those given by the previous Dragon version. Precision in descriptor calculation has been further improved, as has calculation time, especially when single descriptor blocks are selected.
New atom types have been added to the set of atom types recognized by Dragon: Ge, Sb, Bi. Molecules containing these atom types can now be processed correctly.
Molecules with a very large number of cycles, such as fullerenes, are now detected and rejected by Dragon with the message "Too many cycles".
The maximum atom connectivity (i.e., the number of bonds of an atom) has been increased from 10 to 12.
The maximum allowed number of atoms in a molecule has been increased from 300 to 1,000.
The maximum number of processed molecules in a single run has been increased from 10,000 to 50,000.
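The three limits above can be checked before a run is submitted. The following is a minimal sketch, not part of Dragon: the function names and the input shape (an atom count plus a list of bonded atom-index pairs) are assumptions made for illustration, and only the numeric limits come from these notes.

```python
from collections import Counter

# Limits quoted in these release notes for Dragon 5.4.
MAX_CONNECTIVITY = 12   # bonds per atom
MAX_ATOMS = 1000        # atoms per molecule
MAX_MOLECULES = 50000   # molecules per run


def check_molecule(n_atoms, bonds):
    """Return a list of limit violations for one molecule.

    `bonds` is a list of (i, j) atom-index pairs. This input shape is an
    assumption of the sketch, not a Dragon file format.
    """
    problems = []
    if n_atoms > MAX_ATOMS:
        problems.append(f"{n_atoms} atoms exceeds {MAX_ATOMS}")
    # Count how many bonds touch each atom.
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    worst = max(degree.values(), default=0)
    if worst > MAX_CONNECTIVITY:
        problems.append(f"connectivity {worst} exceeds {MAX_CONNECTIVITY}")
    return problems


def check_batch(n_molecules):
    """Return True if the batch fits in a single run."""
    return n_molecules <= MAX_MOLECULES
```

A molecule that passes these checks may still be rejected for other reasons, for example an unsupported atom type or too many cycles.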
The correct citation for Dragon software/documentation has been added in the main Dragon dialog window under the icon 'About'.
Selenium (Se) has been added to the set of heteroatoms that are allowed to belong to an aromatic cycle. Selenium is treated as oxygen by the Dragon aromaticity algorithm. For molecules with aromatic rings that include Se, descriptors related to bond multiplicity will therefore take different values. Examples are: nBM, nAB, nDB, ARR, Ui, Burden eigenvalue descriptors, edge adjacency indices, molecular multiple path counts, Wiener-like and Balaban-like indices from weighted distance matrices, eigenvalue-based indices.
MDL file format
MDL molecule files created by ACD/Labs are now correctly imported. The check for the presence of the final string $$$$ has been removed. The check for the presence of radicals has also been removed, so molecules with radicals can now be processed; they may still be rejected if any atom has an unusual valence. Finally, the charge field (M CHG) is now correctly accounted for.
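As an illustration of the two points above, the sketch below reads formal charges from the charge property lines of a molfile and splits SD file text into records without requiring a trailing $$$$ line. It is not Dragon's reader: the function names are hypothetical, and the charge line is parsed by splitting on whitespace, which assumes the fields are separated by at least one space.

```python
def parse_charge_lines(molblock):
    """Collect formal charges from 'M  CHG' property lines.

    Each line has the form 'M  CHG', an entry count, then pairs of
    (atom number, charge). Returns {atom number (1-based): charge}.
    """
    charges = {}
    for line in molblock.splitlines():
        if not line.startswith("M  CHG"):
            continue
        fields = line[6:].split()
        count = int(fields[0])
        pairs = fields[1:1 + 2 * count]
        for k in range(0, len(pairs), 2):
            charges[int(pairs[k])] = int(pairs[k + 1])
    return charges


def split_records(sdf_text):
    """Split SD file text into records; a final '$$$$' line is optional."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a last record that was not closed by '$$$$'.
    if any(text.strip() for text in current):
        records.append("\n".join(current))
    return records
```

Charges given only in the atom block of the molfile are not read by this sketch.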
Sybyl file format
A new algorithm for reading the charge field has been implemented.
SMILES file format
The algorithm for reading molecules in SMILES notation has been further improved. In particular, an error has been fixed in the recognition of SMILES with a terminal double bond followed by a ring-closure number. Molecules containing the elements Ho, He, Ha, or Hf are now correctly rejected, because these atoms are not encoded by Dragon.
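A pre-filter for the rejected elements can be sketched as follows. This is an assumption-laden illustration rather than Dragon's parser: it only inspects bracket atoms such as [He], which is where these symbols would appear in a SMILES string, and the names `UNSUPPORTED`, `BRACKET_ATOM`, and `unsupported_elements` are hypothetical.

```python
import re

# Elements named in these release notes as not encoded by Dragon.
UNSUPPORTED = {"Ho", "He", "Ha", "Hf"}

# Bracket atom: '[', an optional isotope number, then an element symbol
# (one uppercase letter plus an optional lowercase letter).
BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?)")


def unsupported_elements(smiles):
    """Return the unsupported element symbols found in bracket atoms."""
    found = {m.group(1) for m in BRACKET_ATOM.finditer(smiles)}
    return found & UNSUPPORTED
```

Aromatic bracket atoms such as [nH] start with a lowercase letter and are therefore ignored by the pattern, so they cannot be mistaken for one of the rejected symbols.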
Topological descriptors and connectivity indices
Valence connectivity indices, Zagreb indices by valence vertex degree (ZM1v and ZM2v), SMTIV index, GMTIV index, and Kier benzene-likeliness index (BLI) will have different values for all molecules containing at least one nitro group. In the previous Dragon version, the formal charges of O and N in the nitro group, which are used to calculate valence vertex degrees, were erroneously kept equal to -1 and +1, respectively. Now the formal charge of both O and N is set to zero, according to our internal representation of the nitro group.
The atom covalent radius used in the Kier shape indices (S1K, S2K, and S3K) is now calculated by also accounting for the atom formal charge.
An error has been fixed in the algorithm for TIE; this error occurred in symmetric molecules containing fluorine.
An error has been fixed in the calculation of indices of neighbourhood symmetry (IC, TIC, SIC, BIC, CIC).
Autocorrelation descriptors are now calculated by using carbon-scaled atomic properties in order to reduce the average value of the atomic property. According to Hollas (B. Hollas, Commun.Math.Comp.Chem., 45, pp. 27-33 (2002)), two autocorrelation descriptors derived from two atomic properties with large average values are strongly correlated and therefore encode redundant information.
VRM1 indices and the corresponding average values (VRM2 indices) are now calculated according to a modified Randic-type formula in order to avoid excessively large values for big and symmetric molecules.
A new algorithm has been implemented for the calculation of aromaticity descriptors, which leads to some differences in the values of HOMA, HOMT, AROM and RCI. In particular, while HOMA and HOMT encode information on any conjugated system, AROM and RCI are derived only from aromatic rings. In addition, the L/Bw descriptor, previously provided with two significant figures, is now provided with three significant figures.
ITH and ISH have been changed. Leverage values for molecule atoms are now rounded to a lower precision in order to obtain more consistent atom equivalence classes.
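The effect of carbon scaling can be illustrated with a small sketch. It assumes a Moreau-Broto-type sum of property products over atom pairs at a given topological distance, with atomic mass as the example property. The exact formula, property set, and pair-counting convention used by Dragon are not stated in these notes, so the function names and the `scale_to_carbon` switch are illustrative only.

```python
from collections import deque

# Standard atomic masses; the choice of property is illustrative.
MASS = {"C": 12.011, "N": 14.007, "O": 15.999}


def topological_distances(n_atoms, bonds):
    """All-pairs shortest path lengths (in bonds) by breadth-first search."""
    neighbours = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dist = [[None] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in neighbours[a]:
                if dist[start][b] is None:
                    dist[start][b] = dist[start][a] + 1
                    queue.append(b)
    return dist


def autocorrelation(symbols, bonds, lag, scale_to_carbon=True):
    """Sum of w_i * w_j over atom pairs i < j at topological distance `lag`."""
    # Dividing by the carbon value brings every weight close to 1.
    ref = MASS["C"] if scale_to_carbon else 1.0
    w = [MASS[s] / ref for s in symbols]
    dist = topological_distances(len(symbols), bonds)
    n = len(symbols)
    return sum(w[i] * w[j]
               for i in range(n) for j in range(i + 1, n)
               if dist[i][j] == lag)
```

For the heavy-atom skeleton of propane, `autocorrelation(["C", "C", "C"], [(0, 1), (1, 2)], 1)` sums two carbon-carbon products of 1.0 each, whereas the unscaled version sums two products of 12.011 squared. The scaled weights stay near 1, which keeps the average property value small.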
Functional group counts
An error has been fixed in the recognition of positively charged nitrogen (nN+), aldehydes (nRCHO), sulfones (nS(=O)2), sulfonamides (nSO2N), sulfonates (nSO3), sulfides (nRSR), ethers (nROR), thiophenes (nThiophenes), pyrazoles (nPyrazoles), pyrimidines (nPyrimidines).
According to our internal representation of nitro groups (i.e. two double N=O bonds without charges), missing values are now provided for the molecules that have only charges for N and O in the nitro group.
Topological Polar Surface Area
Errors have been fixed in the PSA algorithm with regard to the recognition of S and P atom types; therefore, for molecules containing S and P, values of TPSA(Tot) could be different.
Several changes have been made in the algorithm for Moriguchi MlogP calculation. The correlation coefficient between MlogP and experimental logP is now 0.935 for the NCI logP data set (3576 compounds) and 0.898 for our own logP data set (10068 compounds). Because MLOGP2, BLTF96, BLTD48, and BLTA96 are derived from MLOGP, their values will also differ for many molecules. The main changes in MlogP computation are:
Ghose-Crippen-Viswanadhan atom-centred fragments and AlogP
Several changes have also been made in the algorithm for Ghose-Crippen-Viswanadhan atom-centred fragment determination and AlogP calculation. The correlation coefficient between AlogP and experimental logP is now 0.931 for the NCI logP data set (3568 compounds) and 0.932 for our own logP data set (9834 compounds). Because AMR, ALOGP2, and the Ghose-Viswanadhan-Wendoloski drug-like indices are derived from the atom-centred fragments and ALOGP, their values will also differ for many molecules. The main changes concern the algorithm for the recognition of fragments containing nitrogen. Moreover, missing values for ALOGP and related descriptors are now provided for molecules containing atoms other than those defined in the AlogP model (i.e. C, H, O, N, S, Se, P, B, Si, and halogens). Atom-centred fragments for P ylids (P-115) and halide ions (F-101, Cl-102, Br-103, I-104), previously unused and thus reported as missing values, are now correctly calculated. In addition, the labels of the following descriptors have been changed: