Implementation of the NeurIPS 2023 paper "Graph Denoising Diffusion for Inverse Protein Folding".
To install requirements:
conda env create -f environment.yml
Following denoising-diffusion-pytorch, here is a brief example showing how this discrete diffusion model works:
import sys
sys.path.append('diffusion')
import torch
from torch_geometric.data import Batch
from diffusion.gradeif import GraDe_IF, EGNN_NET
from dataset_src.generate_graph import prepare_graph

# load a processed protein graph and wrap it in a batch
graph = torch.load('dataset/process/test/3fkf.A.pt')
input_graph = Batch.from_data_list([prepare_graph(graph)])

# build the denoising network and the diffusion wrapper
gnn = EGNN_NET(input_feat_dim=input_graph.x.shape[1] + input_graph.extra_x.shape[1], hidden_channels=10, edge_attr_dim=input_graph.edge_attr.shape[1])
diffusion_model = GraDe_IF(gnn)

# one training step: the forward pass returns the diffusion loss
loss = diffusion_model(input_graph)
loss.backward()

# generate a sequence conditioned on the backbone structure
_, sample_seq = diffusion_model.ddim_sample(input_graph)
More details can be found in the Jupyter notebook.
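Beyond this single forward/backward pass, a minimal training loop could look like the sketch below. The optimizer choice, learning rate, number of epochs, and the `train_loader` built here are illustrative assumptions, not the exact training setup used in the paper.

```python
from torch_geometric.loader import DataLoader

# Hypothetical loader: in practice this would iterate over the full processed
# training set, not a single example graph.
train_loader = DataLoader([prepare_graph(graph)], batch_size=1, shuffle=True)
optimizer = torch.optim.Adam(diffusion_model.parameters(), lr=1e-4)  # assumed settings

diffusion_model.train()
for epoch in range(10):  # assumed number of epochs
    for batch in train_loader:
        optimizer.zero_grad()
        loss = diffusion_model(batch)  # forward pass returns the diffusion loss
        loss.backward()
        optimizer.step()
```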
Here is an ablation study of two key parameters of the `ddim_sample` function, `step` and `diverse`, which were used to obtain the improved results reported in the paper. The results below were computed over 50 ensemble runs; the Jupyter notebook shows how to run such ensembles (see also the sketch after the tables).
Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
---|---|---|---|
500 | 0.5341 | 4.02 | 0.505 |
250 | 0.5370 | 4.06 | 0.4679 |
100 | 0.5356 | 4.98 | 0.4213 |
50 | 0.4827 | 8.02 | 0.3745 |
Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
---|---|---|---|
500 | 0.5342 | 4.02 | 0.505 |
250 | 0.5373 | 4.12 | 0.4741 |
100 | 0.5351 | 7.43 | 0.5016 |
50 | 0.4999 | 16.74 | 0.4736 |
Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
---|---|---|---|
500 | 0.5286 | 4.08 | 0.5022 |
250 | 0.5292 | 4.13 | 0.4325 |
100 | 0.5329 | 5.28 | 0.4222 |
50 | 0.5341 | 5.91 | 0.4212 |
Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
---|---|---|---|
500 | 0.5286 | 4.08 | 0.5022 |
250 | 0.5273 | 4.09 | 0.4357 |
100 | 0.5238 | 9.49 | 0.5095 |
50 | 0.5285 | 15.53 | 0.5113 |
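As a rough illustration of how these parameters and the ensembling interact, the sketch below draws several DDIM samples and aggregates them with a simple vote. The keyword arguments `step` and `diverse` follow the description above; the aggregation scheme and the assumption that `ddim_sample` returns per-residue one-hot (or probability) tensors are assumptions for illustration, and the notebook remains the reference for the actual ensembling procedure.

```python
def ensemble_sample(diffusion_model, input_graph, n_ensemble=50, step=250, diverse=True):
    """Draw n_ensemble DDIM samples and aggregate them per residue.

    Assumes ddim_sample returns (trajectory, per-residue one-hot or probability
    tensor); the sum-then-argmax below is a simple majority/mean vote.
    """
    votes = None
    for _ in range(n_ensemble):
        _, sample_seq = diffusion_model.ddim_sample(input_graph, step=step, diverse=diverse)
        votes = sample_seq if votes is None else votes + sample_seq
    return votes.argmax(dim=-1)  # predicted residue-type index per position

pred_idx = ensemble_sample(diffusion_model, input_graph)
```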
- Our codebase for the EGNN models and discrete diffusion builds on EGNN and DiGress. Thanks for open-sourcing!
If you find our code and datasets useful, please cite:
@inproceedings{yi2023graph,
  title={Graph Denoising Diffusion for Inverse Protein Folding},
  author={Kai Yi and Bingxin Zhou and Yiqing Shen and Pietro Lio and Yu Guang Wang},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=u4YXKKG5dX}
}