Changes between Dragon 5.3 and Dragon 5.4

Dragon 5.4 fixes several bugs, so the values of some descriptors differ from those given by the previous Dragon version. Precision in descriptor calculation and calculation time have both been further improved, especially when single descriptor blocks are selected.

 

Three atom types have been added to the set recognized by Dragon: Ge, Sb, and Bi. Molecules containing these atoms can now be processed correctly.

 

Molecules with a very large number of cycles, such as fullerenes, are now detected and rejected by Dragon with the message "Too many cycles".

 

The maximum atom connectivity (i.e., the number of bonds of an atom) has been increased from 10 to 12.

 

The maximum allowed number of atoms in a molecule has been increased from 300 to 1,000.

 

The maximum number of processed molecules in a single run has been increased from 10,000 to 50,000.
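The three limits above can be checked before a run. The following Python fragment is only a sketch and is not part of Dragon: the function names and the adjacency-list input format are assumptions made for illustration, and it assumes that every bond of an atom (including bonds to hydrogen, if present in the input) counts towards its connectivity.

```python
# Limits stated in these release notes for Dragon 5.4.
MAX_CONNECTIVITY = 12
MAX_ATOMS = 1000
MAX_MOLECULES = 50000


def check_molecule(adjacency):
    """Return a list of limit violations for one molecule.

    `adjacency` is a hypothetical input format: a dict mapping each
    atom index to the list of atom indices it is bonded to.
    """
    problems = []
    if len(adjacency) > MAX_ATOMS:
        problems.append(f"{len(adjacency)} atoms exceeds {MAX_ATOMS}")
    for atom, neighbours in adjacency.items():
        if len(neighbours) > MAX_CONNECTIVITY:
            problems.append(
                f"atom {atom} has {len(neighbours)} bonds, "
                f"limit is {MAX_CONNECTIVITY}"
            )
    return problems


def check_batch(molecules):
    """Return True if the number of molecules fits in a single run."""
    return len(molecules) <= MAX_MOLECULES
```

An empty list from `check_molecule` means that the molecule is within the atom-count and connectivity limits; it says nothing about the other reasons for which Dragon may reject a molecule.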

 

The correct citation for the Dragon software and documentation has been added to the main Dragon dialog window, under the 'About' icon.

 

Aromaticity

Selenium (Se) has been added to the set of heteroatoms that are allowed to belong to an aromatic cycle. The Dragon aromaticity algorithm treats selenium like oxygen. For molecules with aromatic rings that include Se, descriptors related to bond multiplicity will therefore take different values. Examples are nBM, nAB, nDB, ARR, Ui, Burden eigenvalue descriptors, edge adjacency indices, molecular multiple path counts, Wiener-like and Balaban-like indices from weighted distance matrices, and eigenvalue-based indices.

 

MDL file format

MDL molecule files created by ACD/Labs are now correctly imported. The check for the final string $$$$ has been removed. The check for radicals has also been removed, so molecules with radicals can now be processed; they may still be rejected if even one atom has an unusual valence. Finally, the charge field (M  CHG) is now correctly accounted for.
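To illustrate what the charge field contains, the following Python sketch reads the M  CHG lines of a V2000 molfile. It is not Dragon's reader: the function name is hypothetical, and the sketch assumes the usual layout of the line, that is, a count followed by that many whitespace-separated (atom number, charge) pairs.

```python
def parse_chg_lines(molfile_text):
    """Collect formal charges from 'M  CHG' property lines.

    Returns a dict mapping 1-based atom numbers to integer charges.
    """
    charges = {}
    for line in molfile_text.splitlines():
        if not line.startswith("M  CHG"):
            continue
        fields = line[6:].split()
        count = int(fields[0])            # number of (atom, charge) pairs
        pairs = fields[1:1 + 2 * count]
        for atom, charge in zip(pairs[0::2], pairs[1::2]):
            charges[int(atom)] = int(charge)
    return charges
```

For example, the line `M  CHG  2   1   1   3  -1` assigns a charge of +1 to atom 1 and a charge of -1 to atom 3.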

 

Sybyl file format

A new algorithm for reading the charge field has been implemented.

 

SMILES file format

The algorithm for reading molecules in SMILES notation has been further improved. In particular, an error has been fixed in the recognition of SMILES with a terminal double bond followed by a ring-closure number. Molecules containing the elements Ho, He, Ha, or Hf are now correctly rejected, because these atoms are not encoded by Dragon.

 

Topological descriptors and connectivity indices

Valence connectivity indices, Zagreb indices by valence vertex degree (ZM1v and ZM2v), SMTIV index, GMTIV index, and Kier benzene-likeliness index (BLI) will have different values for all molecules containing at least one nitro group. In the previous Dragon version, the formal charges of O and N in the nitro group, which are used to calculate valence vertex degrees, were erroneously kept at –1 and +1, respectively. Now the formal charge of both O and N is set to zero, in accordance with our internal representation of the nitro group.

The atom covalent radius used in the Kier shape indices (S1K, S2K, and S3K) is now calculated by also accounting for the atom formal charge.

An error has been fixed in the algorithm for TIE; this error occurred in symmetric molecules containing fluorine.

 

Information indices

An error has been fixed in the calculation of indices of neighbourhood symmetry (IC, TIC, SIC, BIC, CIC).

 

2D Autocorrelations

Autocorrelation descriptors are now calculated from carbon-scaled atomic properties in order to reduce the average value of the atomic property. According to Hollas (B. Hollas, Commun. Math. Comp. Chem., 45, pp. 27-33 (2002)), two autocorrelation descriptors derived from two atomic properties with large average values are strongly correlated and therefore encode redundant information.
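The effect of carbon scaling can be sketched in Python as follows. This is an illustration only, not Dragon's implementation: it assumes the plain Moreau-Broto form (the sum of w_i * w_j over atom pairs at a given topological distance) and takes "carbon-scaled" to mean that each atomic property value is divided by the value for carbon. The function names are hypothetical.

```python
def moreau_broto(distances, weights, lag):
    """Sum w_i * w_j over atom pairs at topological distance `lag`."""
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if distances[i][j] == lag:
                total += weights[i] * weights[j]
    return total


def carbon_scaled(values, carbon_value):
    """Divide each atomic property value by the value for carbon."""
    return [v / carbon_value for v in values]


# Heavy-atom skeleton C-C-C-O (e.g. 1-propanol), atoms numbered along the chain.
distances = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
masses = [12.011, 12.011, 12.011, 15.999]
weights = carbon_scaled(masses, 12.011)    # carbon atoms get weight 1.0
lag1 = moreau_broto(distances, weights, 1)
```

With scaled weights, each C-C pair contributes exactly 1 to the sum, so the descriptor is no longer dominated by the large average of the raw property.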

 

Eigenvalue-based indices

VRM1 indices and the corresponding average values (VRM2 indices) are now calculated according to a modified Randic-type formula in order to avoid excessively large values for big and symmetric molecules.

 

Geometrical descriptors

A new algorithm has been implemented for the calculation of aromaticity descriptors, so the values of HOMA, HOMT, AROM, and RCI may differ from those of the previous version. In particular, HOMA and HOMT encode information on any conjugated system, whereas AROM and RCI are derived only from aromatic rings. In addition, the L/Bw descriptor, previously reported with two significant figures, is now reported with three.

 

GETAWAY descriptors

The values of ITH and ISH have changed: the leverage values of the atoms are now rounded to a lower precision in order to obtain more consistent atom equivalence classes.
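The following Python sketch shows why the rounding precision matters. It is not Dragon's code: the function names and the number of decimals are hypothetical, and the information content is assumed to take the usual form A*log2(A) minus the sum of N_g*log2(N_g) over the equivalence classes, where A is the number of atoms and N_g is the size of class g.

```python
import math
from collections import Counter


def leverage_classes(leverages, decimals):
    """Group atoms whose leverages agree after rounding."""
    return Counter(round(h, decimals) for h in leverages)


def total_information_content(class_sizes):
    """A*log2(A) - sum(N_g*log2(N_g)) over equivalence classes."""
    n_atoms = sum(class_sizes)
    return n_atoms * math.log2(n_atoms) - sum(
        n * math.log2(n) for n in class_sizes
    )


# Made-up leverages: the first two atoms differ only by numerical noise.
leverages = [0.251234, 0.251239, 0.4981, 0.4981]
fine = leverage_classes(leverages, 6)      # three classes
coarse = leverage_classes(leverages, 3)    # two classes
```

With six decimals the two nearly equal atoms fall into separate classes; with three decimals they are merged, which changes the class sizes from (1, 1, 2) to (2, 2) and therefore the information content.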

 

Functional group counts

An error has been fixed in the recognition of positively charged nitrogen (nN+), aldehydes (nRCHO), sulfones (nS(=O)2), sulfonamides (nSO2N), sulfonates (nSO3), sulfides (nRSR), ethers (nROR), thiophenes (nThiophenes), pyrazoles (nPyrazoles), and pyrimidines (nPyrimidines).

 

Charge descriptors

Consistent with our internal representation of nitro groups (i.e. two N=O double bonds without charges), missing values are now returned for molecules whose only charges are those on N and O of a nitro group.

 

Topological Polar Surface Area

Errors have been fixed in the PSA algorithm with regard to the recognition of S and P atom types; therefore, values of TPSA(Tot) may differ for molecules containing S and P.

 

Moriguchi MlogP

Several changes have been made in the algorithm for Moriguchi MlogP calculation. The correlation coefficient between MlogP and experimental logP is now 0.935 for the NCI logP data set (3576 compounds) and 0.898 for our own logP data set (10068 compounds). Because they are derived from MLOGP, the values of MLOGP2, BLTF96, BLTD48, and BLTA96 will also differ for many molecules. The main changes in MlogP computation are:

RNG variable is determined by counting only independent cycles instead of all the cycles of the molecule;
ALK variable can now be equal to one also for molecules containing a hydrocarbon chain with at least 7 carbon atoms;
nitro groups, thiocyanates, isothiocyanates, and beta-lactams are now defined independently and with fewer restrictions than those defined for the functional groups block;
new algorithm for HB estimation;
new definition of N-oxide for QN variable;
alpha-amino acids for AMP variable are now counted only if independent NH2-COOH pairs are present.
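The difference between independent cycles and all cycles in the first item can be made concrete with a short Python sketch. It assumes that "independent cycles" means the cyclomatic number of the molecular graph (bonds minus atoms plus connected components); the function name is hypothetical, and the sketch does not reproduce how Dragon derives the RNG variable from this count.

```python
def independent_cycles(n_atoms, bonds):
    """Cyclomatic number: bonds - atoms + connected components."""
    parent = list(range(n_atoms))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n_atoms
    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(bonds) - n_atoms + components


# Naphthalene skeleton: a ten-membered perimeter plus the fusion bond 4-9.
naphthalene = [(i, (i + 1) % 10) for i in range(10)] + [(4, 9)]
n_rings = independent_cycles(10, naphthalene)   # 2
```

Naphthalene has two independent cycles, whereas counting all cycles gives three (the two six-membered rings and the ten-membered perimeter).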

 

Ghose-Crippen-Viswanadhan atom-centred fragments and AlogP

Several changes have also been made in the algorithm for Ghose-Crippen-Viswanadhan atom-centred fragment determination and AlogP calculation. The correlation coefficient between AlogP and experimental logP is now 0.931 for the NCI logP data set (3568 compounds) and 0.932 for our own logP data set (9834 compounds). Together with ALOGP and the atom-centred fragment values, the AMR, ALOGP2, and Ghose-Viswanadhan-Wendoloski drug-like indices will also differ for many molecules, because these descriptors are derived from the atom-centred fragments and ALOGP. The main changes concern the algorithm for recognizing fragments that contain nitrogen. Moreover, missing values are now returned for ALOGP and related descriptors when a molecule contains atoms other than those defined in the AlogP model (i.e. C, H, O, N, S, Se, P, B, Si, and halogens). Atom-centred fragments for P ylids (P-115) and halide ions (F-101, Cl-102, Br-103, I-104), which were previously unused and therefore reported as missing values, are now correctly calculated. In addition, the labels of the following descriptors have been changed:

 

old      new
C-045    U-045
O-064    Se-064
O-065    Se-065
N-080    U-080
S-101    F-101
S-102    Cl-102
S-103    Br-103
S-104    I-104
S-105    U-105
S-111    Si-111
S-112    B-112
S-113    U-113
S-114    U-114
S-115    P-115
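
When descriptor files written by Dragon 5.3 have to be compared with Dragon 5.4 output, the old/new label pairs listed above can be applied to the column headers. The Python sketch below is only an illustration; the function name and the assumption that the header is available as a list of label strings are hypothetical.

```python
# Old-to-new labels, copied from the list of label changes above.
RENAMED_LABELS = {
    "C-045": "U-045",
    "O-064": "Se-064",
    "O-065": "Se-065",
    "N-080": "U-080",
    "S-101": "F-101",
    "S-102": "Cl-102",
    "S-103": "Br-103",
    "S-104": "I-104",
    "S-105": "U-105",
    "S-111": "Si-111",
    "S-112": "B-112",
    "S-113": "U-113",
    "S-114": "U-114",
    "S-115": "P-115",
}


def update_labels(header):
    """Translate Dragon 5.3 column labels to their Dragon 5.4 names."""
    return [RENAMED_LABELS.get(label, label) for label in header]
```

Labels that are not in the mapping are returned unchanged, so the function can be applied to a complete header row.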