---
license: apache-2.0
datasets:
- AISE-TUDelft/Capybara
tags:
- code
---
# BinT5
- **Repository:** https://github.com/AISE-TUDelft/Capybara-BinT5
- **Paper:** https://huggingface.co/papers/2301.01701
- **Point of Contact:** https://huggingface.co/aalkaswan
- **Raw Data:** https://zenodo.org/records/7229913
BinT5 is a binary code summarization model. It is based on [CodeT5](https://huggingface.co/Salesforce/codet5-base) and fine-tuned on the [Capybara](https://huggingface.co/datasets/AISE-TUDelft/Capybara) dataset.
We offer five variants of the model (a minimal usage sketch follows the table):
| Name | Training Data |
|-----------------------------------------------------|------------------------------------------------------|
| [BinT5-C](https://huggingface.co/AISE-TUDelft/BinT5-C) | C Source |
| [BinT5-Decom](https://huggingface.co/AISE-TUDelft/BinT5-Decom) | Decompiled C Binaries |
| [BinT5-Stripped](https://huggingface.co/AISE-TUDelft/BinT5-Stripped) | Stripped Decompiled C Binaries |
| [BinT5-Demi](https://huggingface.co/AISE-TUDelft/BinT5-Demi) | Demi-stripped Decompiled C Binaries |
| [BinT5-NoFunName](https://huggingface.co/AISE-TUDelft/BinT5-NoFunName) | Decompiled C Binaries with the Function Name removed |
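### Usage
The checkpoints can be loaded with the Hugging Face `transformers` library. Below is a minimal sketch, assuming the models are compatible with the standard T5 classes (they are CodeT5-based); the example input and generation settings are illustrative, not taken from the paper:
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Pick any of the five variants listed above.
checkpoint = "AISE-TUDelft/BinT5-C"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# Example input: a (decompiled) C function to summarize.
code = "int add(int a, int b) { return a + b; }"

inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```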
### Citation Information
```
@inproceedings{alkaswan2023extending,
title={Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries},
author={Al-Kaswan, Ali and Ahmed, Toufique and Izadi, Maliheh and Sawant, Anand Ashok and Devanbu, Premkumar and van Deursen, Arie},
booktitle={2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)},
pages={260--271},
year={2023},
organization={IEEE}
}
```