---
extra_gated_fields:
  Name: text
  Company: text
  Country: country
  Specific date: date_picker
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
extra_gated_prompt: "PepTune License: https://duke.box.com/s/5ghseh23rpsyou66kg60qr89sxt5twyu"
extra_gated_heading: Acknowledge license to access the repository
extra_gated_button_content: Acknowledge license
---
<div align="center">
<img src="peptune.png" alt="peptune" width="300" height="300">
</div>
# PepTune: *De Novo* Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
arXiv Paper: <https://arxiv.org/abs/2412.17780>
Peptide therapeutics, a major class of medicines, have achieved remarkable success across diseases like diabetes and cancer, with landmark examples such as GLP-1 receptor agonists revolutionizing the treatment of type-2 diabetes and obesity. Despite this success, designing peptides that satisfy multiple conflicting objectives, such as binding affinity, solubility, and membrane permeability, remains a major challenge. Classical drug development and target structure-based design methods are ineffective for such tasks, as they fail to optimize global functional properties critical for therapeutic efficacy. Existing generative frameworks are largely limited to continuous spaces, unconditioned outputs, or single-objective guidance, making them unsuitable for discrete sequence optimization across multiple properties. To address this, we present **PepTune**, a multi-objective discrete diffusion model for the simultaneous generation and optimization of therapeutic peptide SMILES. Built on the Masked Discrete Language Model (MDLM) framework, PepTune ensures valid peptide structures with state-dependent masking schedules and penalty-based objectives. To guide the diffusion process, we propose a Monte Carlo Tree Search (MCTS)-based strategy that balances exploration and exploitation to iteratively refine Pareto-optimal sequences. MCTS integrates classifier-based rewards with search-tree expansion, overcoming the gradient estimation challenges and data sparsity inherent to discrete spaces. Using PepTune, we generate diverse, chemically modified peptides optimized for multiple therapeutic properties, including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling characteristics, on various disease-relevant targets. Overall, our results demonstrate that MCTS-guided discrete diffusion is a powerful and modular approach for multi-objective sequence design in discrete state spaces.
## We build our training framework on top of the [Masked Discrete Language Model](https://huggingface.co/kuleshov-group/mdlm-owt).
![Masked Discrete Language Model Framework](mdlm.png)
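To make the masking idea concrete, here is a minimal sketch of a forward masking step on a tokenized SMILES string. It is an illustration only, not the PepTune implementation: the character-level tokenization, the `[MASK]` symbol, the `mask_tokens` helper, and the choice of a masking probability equal to the time `t` are all assumptions. In particular, the state-dependent masking schedule described above is not reproduced here; every token is masked with the same probability.

```python
import random
from typing import List

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this sketch


def mask_tokens(tokens: List[str], t: float, rng: random.Random) -> List[str]:
    """Independently replace each token with MASK with probability t.

    t = 0 leaves the sequence untouched; t = 1 masks every token.
    This uniform schedule is an assumption, not PepTune's schedule.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [MASK if rng.random() < t else tok for tok in tokens]


# Toy character-level tokens for the glycine SMILES "NCC(=O)O".
tokens = ["N", "C", "C", "(", "=", "O", ")", "O"]

rng = random.Random(0)
clean = mask_tokens(tokens, 0.0, rng)         # no tokens masked
half = mask_tokens(tokens, 0.5, rng)          # roughly half masked
fully_masked = mask_tokens(tokens, 1.0, rng)  # every token masked
```

A denoising model is then trained to recover the original tokens at the masked positions; generation runs the process in reverse, starting from a fully masked sequence.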
## We optimize desired therapeutic properties of generated sequences based on Monte Carlo Tree Search
![Monte Carlo Tree Search Schematic View](mcts.png)
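The notion of Pareto-optimal sequences can be illustrated with a short non-dominated filter. This is a sketch under stated assumptions rather than the search code itself: the `dominates` and `pareto_front` helpers, the candidate names, and the two-objective scores are hypothetical, and every objective is assumed to be scaled so that higher is better.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and
    strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


# Hypothetical (objective_1, objective_2) scores for four candidates.
candidates = {
    "pep_a": (0.9, 0.2),
    "pep_b": (0.6, 0.7),
    "pep_c": (0.5, 0.5),  # dominated by pep_b on both objectives
    "pep_d": (0.3, 0.9),
}

front = pareto_front(candidates)
print(front)
```

Here `pep_c` is dropped because `pep_b` scores higher on both objectives, while the remaining three candidates each represent a different trade-off and stay on the front.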
## Inference API, datasets, and sequences will be freely accessible to the academic community via a non-commercial license upon publication and provisional patent filing
## Interactive Demo
You can try our peptide visualizer directly in your browser; other property classifiers will be added soon:
<https://huggingface.co/spaces/ChatterjeeLab/SMILES2PEPTIDE>
## Usage
To use this repository, you agree to abide by the [PepTune License](https://duke.box.com/s/5ghseh23rpsyou66kg60qr89sxt5twyu).