Model Card for nl-bert

Provides the TAPT (Task-Adaptive Pretraining) model from the paper "Enhancing Automated Software Traceability by Transfer Learning from Open-World Data".

Model Details

Model Description

This model was trained to predict trace links between issues and commits, using GitHub data from 2016 to 2021.

  • Developed by: Jinfeng Lin, University of Notre Dame
  • Shared by [optional]: Alberto Rodriguez, University of Notre Dame
  • Model type: BertForSequenceClassification
  • Language(s) (NLP): EN
  • License: MIT

Model Sources [optional]

Uses

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Training Details

Please see the cited paper for full training details.

Evaluation

Please see the cited paper for the full evaluation.

Results

The model achieved a MAP (mean average precision) improvement of over 20% compared to baseline models. See the cited paper for full details.
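For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision can be computed over ranked trace-link candidates. It assumes the textbook definition of MAP; the paper's exact variant (for example, a rank cutoff) may differ. The function names, artifact ids, and toy data are hypothetical and are not taken from the paper.

```python
def average_precision(ranked_targets, true_links):
    """Average precision for one source artifact.

    ranked_targets: candidate target ids, best-scored first.
    true_links: set of target ids that are genuinely linked.
    """
    hits = 0
    precision_sum = 0.0
    for rank, target in enumerate(ranked_targets, start=1):
        if target in true_links:
            hits += 1
            # Precision at this rank, counted only where a true link appears.
            precision_sum += hits / rank
    # Normalize by the number of true links (assumed convention).
    return precision_sum / len(true_links) if true_links else 0.0


def mean_average_precision(rankings, gold):
    """Mean of the per-source average precision values."""
    scores = [
        average_precision(ranked, gold.get(source, set()))
        for source, ranked in rankings.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two issues, each with three ranked commits.
rankings = {
    "issue-1": ["c3", "c1", "c2"],
    "issue-2": ["c2", "c1", "c3"],
}
gold = {
    "issue-1": {"c1"},
    "issue-2": {"c2", "c3"},
}
map_score = mean_average_precision(rankings, gold)
print(f"MAP = {map_score:.4f}")
```

Here each issue is a query and its commits are ranked by predicted link score, so a higher MAP means true links tend to appear nearer the top of each ranking.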

Environmental Impact

  • Hardware Type: Distributed machine pool
  • Hours used: 72 hours

Technical Specifications [optional]

Model Architecture and Objective

The model uses a Single-BERT architecture from the TBERT framework, which performs well on traceability tasks by encoding concatenated source and target artifacts.
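As an illustration of the concatenated-input design, the sketch below scores one issue/commit pair with the Transformers library. It rests on several assumptions that this card does not confirm: `model_id` is a placeholder for the Hub repository id, the repository is assumed to ship a tokenizer, the issue text is assumed to come first in the pair, and class index 1 is assumed to mean "linked". Treat it as a starting point rather than the authors' inference code.

```python
import math


def link_probability(logits, linked_index=1):
    """Softmax over the class logits; returns the probability of the
    class at linked_index (assumed here to be the 'linked' class)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # shift for numerical stability
    return exps[linked_index] / sum(exps)


def score_pair(issue_text, commit_text, model_id):
    """Score one issue/commit pair. model_id is a placeholder repo id."""
    # Imported lazily so link_probability works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Passing two texts makes the tokenizer build a single sequence:
    # [CLS] issue tokens [SEP] commit tokens [SEP]
    inputs = tokenizer(
        issue_text,
        commit_text,
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return link_probability(logits)
```

A call might look like `score_pair("Crash when config file is missing", "Handle missing config gracefully", model_id)`, with `model_id` set to wherever the weights are hosted. Ranking all candidate commits for an issue by this score gives the ordering that a metric such as MAP evaluates.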

Compute Infrastructure

Hardware

  • 300 servers in a distributed machine pool

Software

  • Transformers library
  • PyTorch
  • HTCondor for distributed computation

Citation

BibTeX:

@misc{lin2022enhancing,
  title={Enhancing Automated Software Traceability by Transfer Learning from Open-World Data},
  author={Jinfeng Lin and Amrit Poudel and Wenhao Yu and Qingkai Zeng and Meng Jiang and Jane Cleland-Huang},
  year={2022},
  eprint={2207.01084},
  archivePrefix={arXiv},
  primaryClass={cs.SE}
}

Model Card Authors

Alberto Rodriguez

Model Card Contact

Alberto Rodriguez ([email protected])

Safetensors

  • Model size: 109M params
  • Tensor type: F32