Towards a Universal SMILES representation - A standard method to generate canonical SMILES based on the InChI

Journal of Cheminformatics 4(1): 22

Two line notations for chemical structures have established themselves in the field: the SMILES string and the InChI string. The InChI aims to provide a unique, or canonical, identifier for chemical structures. SMILES strings are widely used for storage and interchange of chemical structures, but no standard exists for generating a canonical SMILES string. I describe how to use the InChI canonicalisation to derive a canonical SMILES string in a straightforward way, either incorporating the InChI normalisations (Inchified SMILES) or not (Universal SMILES). This is the first description of a method to generate canonical SMILES that takes stereochemistry into account. When tested on the 1.1 million compounds in the ChEMBL database and on a 1 million compound subset of the PubChem Substance database, no canonicalisation failures were found with Inchified SMILES. Using Universal SMILES, 99.79% of the ChEMBL database was canonicalised successfully, as was 99.77% of the PubChem subset. The InChI canonicalisation algorithm can successfully be used as the basis for a common standard for canonical SMILES. While challenges remain, such as the development of a standard aromatic model for SMILES, the ability to create the same SMILES using different toolkits will make it possible, for the first time, to easily compare the chemical models used by different toolkits.
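
The core idea in the abstract, deriving the SMILES atom order from an existing canonical labelling, can be illustrated with a toy sketch. This is not the paper's procedure: the function name `smiles_from_labels` and the label values are hypothetical, the labels are supplied by hand as a stand-in for what an InChI canonicalisation would provide, and the code handles only acyclic, single-bonded molecules with no stereochemistry, charges, or aromaticity. It starts at the atom with the lowest label and visits neighbours in label order, so two differently ordered inputs for the same molecule should yield the same string.

```python
def smiles_from_labels(symbols, bonds, labels):
    """Write a SMILES for an acyclic, single-bonded molecule.

    symbols: element symbol for each atom index
    bonds:   list of (i, j) atom index pairs
    labels:  canonical rank for each atom index; a hand-written,
             hypothetical stand-in for InChI canonical labels
    """
    neighbours = {i: [] for i in range(len(symbols))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    def visit(atom, parent):
        # Order the unvisited neighbours by canonical label.
        children = sorted(
            (n for n in neighbours[atom] if n != parent),
            key=lambda n: labels[n],
        )
        out = symbols[atom]
        # Every child except the last is written as a parenthesised branch.
        for child in children[:-1]:
            out += "(" + visit(child, atom) + ")"
        if children:
            out += visit(children[-1], atom)
        return out

    # Start from the atom with the lowest canonical label.
    start = min(range(len(symbols)), key=lambda i: labels[i])
    return visit(start, None)


# 1-aminopropan-2-ol entered with two different atom orderings.
# Assumed labels: methyl C = 1, CH2 C = 2, central C = 3, N = 4, O = 5.
smiles_a = smiles_from_labels(
    ["C", "C", "O", "C", "N"],
    [(0, 1), (1, 2), (1, 3), (3, 4)],
    [1, 3, 5, 2, 4],
)
smiles_b = smiles_from_labels(
    ["N", "O", "C", "C", "C"],
    [(0, 2), (2, 3), (3, 1), (3, 4)],
    [4, 5, 2, 3, 1],
)
print(smiles_a, smiles_b)
```

Because the traversal depends only on the labels and not on the input atom indices, both calls produce the same string. A real implementation would also need ring closures, bond orders, disconnected fragments, and stereochemistry, none of which this sketch attempts.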

Accession: 056616713

PMID: 22989151

DOI: 10.1186/1758-2946-4-22
