Gx2Mol: De Novo Generation of Hit-like Molecules from Gene Expression Profiles
Abstract
De novo generation of hit-like molecules is a challenging task in the drug discovery process. Most methods in previous studies learn the semantics and syntax of molecular structures by analyzing molecular graphs or simplified molecular input line entry system (SMILES) strings; however, they do not take into account the drug responses of the biological systems consisting of genes and proteins. In this study we propose a deep generative model, Gx2Mol, which utilizes gene expression profiles to generate molecular structures with desirable phenotypes for arbitrary target proteins. In the algorithm, a variational autoencoder is employed as a feature extractor to learn the latent feature distribution of the gene expression profiles. Then, a long short-term memory is leveraged as the chemical generator to produce syntactically valid SMILES strings that satisfy the feature conditions of the gene expression profile extracted by the feature extractor. Experimental results demonstrate that Gx2Mol produces new molecules with potential bioactivities and drug-like properties. The source code is available at: https://github.com/naruto7283/Gx2Mol.
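The two-stage pipeline described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not taken from the Gx2Mol repository: the function names (`sample_latent`, `kl_to_standard_normal`, `generate`, `toy_step`) are hypothetical, the latent sampling follows the standard VAE reparameterization trick, and the trained LSTM generator is replaced by a toy stand-in that ignores the latent vector. The sketch only shows how a latent feature vector, assumed here to come from an encoder applied to a gene expression profile, could condition an autoregressive SMILES sampler.

```python
import math
import random


def sample_latent(mu, logvar, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps, eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL divergence between N(mu, sigma^2) and N(0, I), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def generate(step_fn, z, rng, start="<s>", end="</s>", max_len=100):
    """Autoregressive sampling loop conditioned on a latent vector z.

    step_fn(prefix_tokens, z) must return a dict mapping each candidate
    next token to its probability. In a real model this would be one
    step of a trained LSTM; here it is an assumption of the sketch.
    """
    tokens = [start]
    while len(tokens) <= max_len:
        probs = step_fn(tokens, z)
        choices, weights = zip(*probs.items())
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == end:
            break
        tokens.append(token)
    return "".join(tokens[1:])  # drop the start token


def toy_step(prefix, z):
    """Hypothetical stand-in for a trained generator: always spells out 'CCO'."""
    target = ["C", "C", "O", "</s>"]
    return {target[len(prefix) - 1]: 1.0}


rng = random.Random(0)
# mu and logvar would come from the encoder; these values are made up.
z = sample_latent(mu=[0.1, -0.3], logvar=[-2.0, -2.0], rng=rng)
smiles = generate(toy_step, z, rng)
```

In a trained model, `step_fn` would return a full distribution over the SMILES vocabulary, and the sampled strings would still need a validity check with a cheminformatics toolkit.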
Cite
Text
Li and Yamanishi. "Gx2Mol: De Novo Generation of Hit-like Molecules from Gene Expression Profiles." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06066-2_20
Markdown
[Li and Yamanishi. "Gx2Mol: De Novo Generation of Hit-like Molecules from Gene Expression Profiles." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/li2025ecmlpkdd-gx2mol/) doi:10.1007/978-3-032-06066-2_20
BibTeX
@inproceedings{li2025ecmlpkdd-gx2mol,
title = {{Gx2Mol: De Novo Generation of Hit-like Molecules from Gene Expression Profiles}},
author = {Li, Chen and Yamanishi, Yoshihiro},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2025},
pages = {333-349},
doi = {10.1007/978-3-032-06066-2_20},
url = {https://mlanthology.org/ecmlpkdd/2025/li2025ecmlpkdd-gx2mol/}
}