TagSolver is a web-based bioinformatics tool designed to predict the solubility of recombinant proteins, aiding researchers in the rational selection of solubility-enhancing tags.
The platform is powered by a LightGBM machine learning model that analyzes 425 sequence-derived features, including amino acid composition and key physicochemical properties. The model was trained on the updated UESolDS dataset, a comprehensive resource of over 70,000 protein sequences made available by Zhang et al. (2024). Its performance was subsequently validated on the independent PSI-Biology dataset to ensure generalizability.
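To make "sequence-derived features" concrete, the following is a minimal sketch of how amino acid composition and a few simple physicochemical descriptors can be computed from a sequence. It is an illustration only, not TagSolver's actual feature pipeline: the function name `sequence_features`, the feature names, and the choice of descriptors (length, mean Kyte-Doolittle hydropathy, charged-residue fraction) are assumptions, and the 425 features used by the model are not listed here.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used here as one example of a
# physicochemical property. Whether TagSolver uses it is an assumption.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_features(sequence):
    """Return a dict of illustrative sequence-derived features.

    Hypothetical helper: non-standard residues (e.g. X, B, Z) are skipped.
    """
    residues = [aa for aa in sequence.upper() if aa in KD_HYDROPATHY]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")

    n = len(residues)
    counts = Counter(residues)

    # Amino acid composition: fraction of each of the 20 residues.
    features = {f"frac_{aa}": counts[aa] / n for aa in AMINO_ACIDS}

    # A few simple global descriptors.
    features["length"] = n
    features["gravy"] = sum(KD_HYDROPATHY[aa] for aa in residues) / n
    features["frac_charged"] = sum(counts[aa] for aa in "DEKR") / n
    return features


example = sequence_features("MKTAYIAKQRQISFVKSHFSRQ")
print(len(example), "features; GRAVY =", round(example["gravy"], 3))
```

A feature vector of this kind, extended with further descriptors, is the sort of tabular input a gradient-boosted tree model such as LightGBM is trained on.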
The model achieves an Area Under the Curve (AUC) of approximately 0.71, which supports its use for in silico screening. By providing rapid, data-driven predictions, the tool aims to streamline experimental workflows and reduce the trial-and-error burden in protein engineering.
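For readers unfamiliar with the metric, the sketch below shows one way to interpret an AUC value, assuming it refers to the area under the ROC curve: it equals the probability that a randomly chosen soluble protein receives a higher predicted score than a randomly chosen insoluble one. The function `roc_auc` and the toy labels and scores are hypothetical and are not taken from TagSolver or its datasets.

```python
def roc_auc(labels, scores):
    """Pairwise (rank-based) ROC AUC for binary labels (1 = soluble, 0 = insoluble).

    Counts the fraction of (soluble, insoluble) pairs in which the soluble
    protein has the higher score; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: three of the four (soluble, insoluble) pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value near 0.71 means roughly 71% of such pairs are ordered correctly.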
Developed by: Sadegh Zargan, Saba Mesdaghinia, Dr. Hasan Jalili, Dr. Bahareh Dabirmanesh, and Dr. Khosro Khajeh.