---
title: Balanced Accuracy
emoji: 🤗
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  Balanced Accuracy is the average of the recall obtained on each class. It can
  be computed with: Balanced Accuracy = (Recall_1 + ... + Recall_N) / N, where
  Recall_i is the recall on class i and N is the number of classes. In the
  binary case this reduces to (TPR + TNR) / 2, the mean of the true positive
  rate and the true negative rate.
---

# Metric Card for Balanced Accuracy

## Metric Description

Balanced Accuracy is the average of the recall obtained on each class. It can be computed with:

Balanced Accuracy = (Recall_1 + ... + Recall_N) / N

Where:

- Recall_i: recall on class i, i.e. the fraction of class-i samples that are predicted correctly
- N: number of classes

In the binary case this reduces to (TPR + TNR) / 2, the mean of the true positive rate (sensitivity) and the true negative rate (specificity).

## How to Use

At minimum, this metric requires predictions and references as inputs.

```python
>>> import evaluate
>>> balanced_accuracy_metric = evaluate.load("hyperml/balanced_accuracy")
>>> results = balanced_accuracy_metric.compute(references=[0, 1], predictions=[0, 1])
>>> print(results)
{'balanced_accuracy': 1.0}
```

### Inputs

- **predictions** (list of int): Predicted labels.
- **references** (list of int): Ground truth labels.
- **sample_weight** (list of float): Sample weights. Defaults to None.
- **adjusted** (boolean): If set to True, the score is adjusted for chance, so that random predictions score 0 and perfect predictions score 1. Useful for imbalanced datasets. Defaults to False.

### Output Values

- **balanced_accuracy** (float): Balanced accuracy score. The minimum possible value is 0 and the maximum possible value is 1.0; a higher score means higher balanced accuracy. (With `adjusted=True`, worse-than-chance predictions can produce scores below 0.)

Output Example(s):

```python
{'balanced_accuracy': 1.0}
```

This metric outputs a dictionary containing the balanced accuracy score.

#### Values from Popular Papers

Balanced accuracy is often used to report performance on supervised classification tasks such as sentiment analysis or fraud detection, where the classes are severely imbalanced.

### Examples

Example 1 - A simple example:

```python
>>> balanced_accuracy_metric = evaluate.load("hyperml/balanced_accuracy")
>>> results = balanced_accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0])
>>> print(results)
{'balanced_accuracy': 0.5}
```

Example 2 - The same as Example 1, except with `sample_weight` set:

```python
>>> balanced_accuracy_metric = evaluate.load("hyperml/balanced_accuracy")
>>> results = balanced_accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0], sample_weight=[0.5, 2, 0.7, 0.5, 9, 0.4])
>>> print(results)
{'balanced_accuracy': 0.5}
```

The score is unchanged here because the weights do not alter any per-class recall: class 0's two samples carry equal weight, class 1 is predicted entirely correctly, and class 2 entirely incorrectly.

Example 3 - The same as Example 1, except with `adjusted` set to `True`:

```python
>>> balanced_accuracy_metric = evaluate.load("hyperml/balanced_accuracy")
>>> results = balanced_accuracy_metric.compute(references=[0, 1, 2, 0, 1, 2], predictions=[0, 1, 1, 2, 1, 0], adjusted=True)
>>> print(results)
{'balanced_accuracy': 0.25}
```

With three classes the chance level is 1/3, so the raw score of 0.5 from Example 1 becomes (0.5 - 1/3) / (1 - 1/3) = 0.25.

## Limitations and Bias

Balanced accuracy has limitations at the extremes of class balance. On a perfectly balanced dataset it coincides with standard accuracy, so it adds no information there. On a highly imbalanced dataset, where a class has very few samples, a single changed prediction for that class can move the score substantially, as the sketch below shows.
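This sensitivity is easy to demonstrate with scikit-learn's `balanced_accuracy_score`, which follows the same definition. In the hypothetical dataset below, flipping a single prediction in a two-sample minority class drops the score from 1.0 to 0.75, even though plain accuracy only falls from 1.0 to 0.99:

```python
>>> from sklearn.metrics import balanced_accuracy_score
>>> references = [0] * 98 + [1] * 2   # 98 majority-class samples, 2 minority-class samples
>>> predictions = [0] * 98 + [1] * 2  # every prediction correct
>>> balanced_accuracy_score(references, predictions)
1.0
>>> predictions[-1] = 0               # flip one minority-class prediction
>>> balanced_accuracy_score(references, predictions)
0.75
```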
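To make the definition and the `adjusted` correction above concrete, the score can also be reproduced from per-class recalls in a few lines of plain Python. This is a minimal sketch of the computation, not the metric's actual implementation; the helper name `balanced_accuracy` is ours:

```python
>>> def balanced_accuracy(references, predictions, adjusted=False):
...     classes = sorted(set(references))
...     # Recall per class: fraction of that class's samples predicted correctly.
...     recalls = [
...         sum(p == r == c for r, p in zip(references, predictions))
...         / sum(r == c for r in references)
...         for c in classes
...     ]
...     score = sum(recalls) / len(classes)
...     if adjusted:
...         # Rescale so that random guessing scores 0 and perfection scores 1.
...         chance = 1 / len(classes)
...         score = (score - chance) / (1 - chance)
...     return score
...
>>> balanced_accuracy([0, 1, 2, 0, 1, 2], [0, 1, 1, 2, 1, 0])  # matches Example 1
0.5
>>> round(balanced_accuracy([0, 1, 2, 0, 1, 2], [0, 1, 1, 2, 1, 0], adjusted=True), 2)  # matches Example 3
0.25
```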
## Citation(s)

```bibtex
@article{scikit-learn,
  title={Scikit-learn: Machine Learning in {P}ython},
  author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V. and
          Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P. and
          Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
          Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
  journal={Journal of Machine Learning Research},
  volume={12},
  pages={2825--2830},
  year={2011}
}
```

## Further References
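- [`sklearn.metrics.balanced_accuracy_score` documentation](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html)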