Big Data in Chemistry

Edited by Igor V. Tetko, Helmholtz Zentrum München, Germany

The increasing volume of biomedical data in chemistry and life sciences requires the development of new methodologies and approaches for their analysis. Artificial Intelligence (AI) and machine learning, especially neural networks, are increasingly used in the chemical industry, in particular with respect to Big Data.

The goal of this special collection in Journal of Cheminformatics is to show progress and exemplify the current needs, trends and requirements for machine learning in chemical data analysis. In particular, it focuses on the use of chemical informatics and machine learning methodologies to analyse chemical Big Data, e.g. to predict biological activities and physico-chemical properties, facilitate property-oriented data mining, predict biological targets for compounds on a large scale, design new chemical compounds, and analyse large virtual chemical spaces.

The collection mainly contains a selection of articles to be presented during the BIGCHEM special session of the International Conference on Artificial Neural Networks (ICANN2019), which is co-organized by the European Neural Network Society and the Horizon2020 Marie Skłodowska-Curie Innovative Training Networks European Industrial Doctorate "Big Data in Chemistry" project.

New articles will be added to the collection as they are published.


  1. Training neural networks with small and imbalanced datasets often leads to overfitting and disregard of the minority class. For predictive toxicology, however, models with a good balance between sensitivity an...

    Authors: Jennifer Hemmerich, Ece Asilar and Gerhard F. Ecker

    Citation: Journal of Cheminformatics 2020 12:18

    Content type: Research article

  2. We present SMILES-embeddings derived from the internal encoder state of a Transformer [1] model trained to canonize SMILES as a Seq2Seq problem. Using a CharNN [2] architecture upon the embeddings results in high...

    Authors: Pavel Karpov, Guillaume Godin and Igor V. Tetko

    Citation: Journal of Cheminformatics 2020 12:17

    Content type: Research article

  3. Designing a molecule with desired properties is one of the biggest challenges in drug development, as it requires optimization of chemical compound structures with respect to many complex properties. To improv...

    Authors: Łukasz Maziarka, Agnieszka Pocha, Jan Kaczmarczyk, Krzysztof Rataj, Tomasz Danel and Michał Warchoł

    Citation: Journal of Cheminformatics 2020 12:2

    Content type: Research article

  4. Neural Message Passing for graphs is a promising and relatively recent approach for applying Machine Learning to networked data. As molecules can be described intrinsically as a molecular graph, it makes sense...

    Authors: M. Withnall, E. Lindelöf, O. Engkvist and H. Chen

    Citation: Journal of Cheminformatics 2020 12:1

    Content type: Research article

  5. Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generativ...

    Authors: Oleksii Prykhodko, Simon Viet Johansson, Panagiotis-Christos Kotsias, Josep Arús-Pous, Esben Jannik Bjerrum, Ola Engkvist and Hongming Chen

    Citation: Journal of Cheminformatics 2019 11:74

    Content type: Research article

  6. Recurrent Neural Networks (RNNs) trained with a set of molecules represented as unique (canonical) SMILES strings have shown the capacity to create large chemical spaces of valid and meaningful structures. He...

    Authors: Josep Arús-Pous, Simon Viet Johansson, Oleksii Prykhodko, Esben Jannik Bjerrum, Christian Tyrchan, Jean-Louis Reymond, Hongming Chen and Ola Engkvist

    Citation: Journal of Cheminformatics 2019 11:71

    Content type: Research article

  7. This study aims at improving upon existing activity prediction methods by augmenting chemical structure fingerprints with bio-activity based fingerprints derived from high-throughput screening (HTS) data (HTS...

    Authors: Oliver Laufkötter, Noé Sturm, Jürgen Bajorath, Hongming Chen and Ola Engkvist

    Citation: Journal of Cheminformatics 2019 11:54

    Content type: Research article
