metadata
extra_gated_fields:
  Name: text
  Company: text
  Country: country
  Specific date: date_picker
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
extra_gated_prompt: 'PepTune License: https://duke.box.com/s/5ghseh23rpsyou66kg60qr89sxt5twyu'
extra_gated_heading: Acknowledge license to access the repository
extra_gated_button_content: Acknowledge license

PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

arXiv Paper: https://arxiv.org/abs/2412.17780

Peptide therapeutics, a major class of medicines, have achieved remarkable success across diseases such as diabetes and cancer, with landmark examples such as GLP-1 receptor agonists revolutionizing the treatment of type-2 diabetes and obesity. Despite their success, designing peptides that satisfy multiple conflicting objectives, such as binding affinity, solubility, and membrane permeability, remains a major challenge. Classical drug development and target structure-based design methods are ineffective for such tasks, as they fail to optimize global functional properties critical for therapeutic efficacy. Existing generative frameworks are largely limited to continuous spaces, unconditioned outputs, or single-objective guidance, making them unsuitable for discrete sequence optimization across multiple properties. To address this, we present PepTune, a multi-objective discrete diffusion model for the simultaneous generation and optimization of therapeutic peptide SMILES. Built on the Masked Discrete Language Model (MDLM) framework, PepTune ensures valid peptide structures with state-dependent masking schedules and penalty-based objectives. To guide the diffusion process, we propose a Monte Carlo Tree Search (MCTS)-based strategy that balances exploration and exploitation to iteratively refine Pareto-optimal sequences. MCTS integrates classifier-based rewards with search-tree expansion, overcoming gradient estimation challenges and data sparsity inherent to discrete spaces. Using PepTune, we generate diverse, chemically modified peptides optimized for multiple therapeutic properties, including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling characteristics on various disease-relevant targets. In total, our results demonstrate that MCTS-guided discrete diffusion is a powerful and modular approach for multi-objective sequence design in discrete state spaces.

We build our training framework on top of the Masked Discrete Language Model (MDLM) framework.

Masked Discrete Language Model Framework
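
To make the masking idea concrete, here is a minimal sketch of the forward (noising) step of a masked discrete diffusion process. It is not PepTune's implementation: it assumes a uniform schedule in which every token is masked independently with probability `t`, whereas PepTune uses state-dependent masking schedules. The `MASK` symbol, the `mask_tokens` helper, and the character-level split of the SMILES string are all hypothetical choices made for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not PepTune's actual vocabulary entry


def mask_tokens(tokens, t, rng):
    """Forward noising step: mask each token independently with probability t.

    Assumption: a uniform schedule shared by all tokens. PepTune's
    state-dependent schedules would make this probability depend on the token.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [MASK if rng.random() < t else tok for tok in tokens]


# Character-level split of a Gly-Ala dipeptide SMILES, for illustration only.
tokens = list("NCC(=O)NC(C)C(=O)O")

rng = random.Random(0)  # fixed seed so repeated runs give the same masks
for t in (0.0, 0.5, 1.0):
    noised = mask_tokens(tokens, t, rng)
    print(f"t={t}: {sum(tok == MASK for tok in noised)} of {len(tokens)} tokens masked")
```

At `t = 0` the sequence is untouched and at `t = 1` it is fully masked; a denoising model would be trained to recover the original tokens at the masked positions.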

We optimize the desired therapeutic properties of generated sequences using Monte Carlo Tree Search (MCTS).

Monte Carlo Tree Search Schematic View
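
Multi-objective guidance relies on the notion of Pareto optimality: a candidate is kept if no other candidate is at least as good on every objective and strictly better on one. The sketch below shows that filter in isolation. It assumes every objective is to be maximized; the `dominates` and `pareto_front` helpers, the peptide names, and the score values are hypothetical and are not taken from PepTune's code or results.

```python
def dominates(a, b):
    """True if score vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the candidates whose score vectors are not dominated by any other candidate."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


# Made-up (binding affinity, solubility) scores; higher is better for both.
scores = {
    "pep_A": (0.9, 0.2),
    "pep_B": (0.6, 0.7),
    "pep_C": (0.5, 0.6),  # dominated by pep_B on both objectives
    "pep_D": (0.3, 0.9),
}

front = pareto_front(scores)
print(sorted(front))  # ['pep_A', 'pep_B', 'pep_D']
```

In a tree search, a filter of this kind could be applied to the classifier scores of newly expanded sequences to decide which ones join the current non-dominated set.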

The inference API, datasets, and sequences will be freely accessible to the academic community under a non-commercial license upon publication and provisional patent filing.

Interactive Demo

You can try out our peptide visualizer directly in your browser. Other property classifiers will be added soon:

https://huggingface.co/spaces/ChatterjeeLab/SMILES2PEPTIDE

Usage

To use this repository, you agree to abide by the PepTune License.