mtyrrell committed on
Commit
193a86d
·
1 Parent(s): 6d89039

Update README.md

Files changed (1)
  1. README.md +28 -3
README.md CHANGED
@@ -8,6 +8,12 @@ metrics:
8
  model-index:
9
  - name: IKT_classifier_target_best
10
  results: []
11
  ---
12
 
13
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -27,15 +33,34 @@ It achieves the following results on the evaluation set:
27
 
28
  ## Model description
29
 
30
- More information needed
31
 
32
  ## Intended uses & limitations
33
 
34
- More information needed
35
 
36
  ## Training and evaluation data
37
 
38
- More information needed
39
 
40
  ## Training procedure
41
 
 
8
  model-index:
9
  - name: IKT_classifier_target_best
10
  results: []
11
+
12
+ widget:
13
+ - text: "To reduce greenhouse gas emissions by 37% below 2005 levels in 2025, and by 43% below 2005 levels in 2030."
14
+ example_title: "Target"
15
+ - text: "Change fiscal policies on fossil fuel by 2025 to enable the transition to 100% renewable energy generation in the transportation sector"
16
+ example_title: "Not Target"
17
  ---
18
 
19
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
33
 
34
  ## Model description
35
 
36
+ The model is a binary text classifier based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) and fine-tuned on text sourced from national climate policy documents.
37
 
38
  ## Intended uses & limitations
39
 
40
+ The classifier assigns a class of **'Target' or 'Negative' to denote alignment with stated national targets** as portrayed in extracted passages from the documents. The intended use is for climate policy researchers and analysts seeking to automate the process of reviewing lengthy, non-standardized PDF documents to produce summaries and reports.
41
+
42
+ The classifier performs well. In training, it reached an F1 of ~0.95, evenly balanced between precision (~0.95, i.e. few false positives among predicted targets) and recall (~0.95, i.e. most true targets captured). On unseen real-world test data, performance remained high (F1 ~0.9). However, testing was based on a fairly small out-of-sample dataset, so classification performance will need to be further evaluated on deployment.
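
As a quick check on how the reported scores relate, F1 is the harmonic mean of precision and recall. The sketch below uses the approximate training values quoted above; the function name is illustrative and not part of the model's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Approximate training figures quoted in the model card.
train_f1 = f1_from_precision_recall(0.95, 0.95)
print(round(train_f1, 2))
```

When precision and recall are equal, F1 equals that shared value, which is consistent with the reported training F1 of roughly 0.95.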
43
+
44
 
45
  ## Training and evaluation data
46
 
47
+ The training dataset comprises labelled passages from two sources:
48
+ - [ClimateWatch NDC Sector data](https://www.climatewatchdata.org/data-explorer/historical-emissions?historical-emissions-data-sources=climate-watch&historical-emissions-gases=all-ghg&historical-emissions-regions=All%20Selected&historical-emissions-sectors=total-including-lucf%2Ctotal-including-lucf&page=1).
49
+ - [IKI TraCS Climate Strategies for Transport Tracker](https://changing-transport.org/wp-content/uploads/20220722_Tracker_Database.xlsx) implemented by GIZ and funded by the International Climate Initiative (IKI) of the German Federal Ministry for Economic Affairs and Climate Action (BMWK). Here we utilized the QA dataset (CW_NDC_data_Sector).
50
+
51
+ The combined dataset [GIZ/policy_qa_v0_1](https://huggingface.co/datasets/GIZ/policy_qa_v0_1) contains ~85k rows. Each row appears three times, once per sequence length (denoted by the values 'small', 'medium', and 'large' in the 'strategy' column, which correspond to sequence lengths of 60, 85, and 150 respectively). This effectively means the useful size of the dataset is about one third of its row count, and the 'strategy' value should be selected based on the use case. For this training, we utilized the 'medium' samples. Furthermore, for each row, the 'context' column contains 3 samples of varying quality. The approach used to assess quality and select samples is described below.
52
+
53
+ The pre-processing operations used to produce the final training dataset were as follows:
54
+
55
+ 1. Dataset is filtered based on 'medium' value in 'strategy' column (sequence length = 85).
56
+ 2. For ClimateWatch, all rows are removed because the labels inherent to that dataset were assessed to have no taxonomical alignment with the IKITracs labels.
57
+ 3. For IKITracs, labels are assigned based on the presence of certain substring prefixes ('T_') in the 'parameter' values, which correspond to text containing targets as assessed by human annotators.
58
+ 4. If 'context_translated' is available and the 'language' is not English, 'context' is replaced with 'context_translated'. This results in the model being trained on English translations of original text samples.
59
+ 5. The dataset is "exploded" - i.e., the text samples in the 'context' column, which are lists, are converted into separate rows - and labels are merged to align with the associated samples.
60
+ 6. The 'match_onanswer' and 'answerWordcount' columns are used conditionally to select high quality samples (a high % of word matches in 'match_onanswer' is preferred, but a lower % is accepted if 'answerWordcount' is high).
61
+ 7. No data augmentation was conducted as the number of samples was high for the 'TARGET' class. The end result is a near-equal per-class breakdown of:
62
+ > - TARGET: 777
63
+ > - NEGATIVE: 778
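
The pre-processing steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used to train the model: the 'source' column, the "en" language code, the three threshold constants, the assumption that 'match_onanswer' is a list aligned with 'context' while 'answerWordcount' is a single value per row, and the example rows are all hypothetical.

```python
# Hypothetical thresholds: the model card does not state the actual values.
MIN_MATCH = 90       # preferred minimum % of word matches ('match_onanswer')
RELAXED_MATCH = 70   # lower match % accepted when the answer is long
LONG_ANSWER = 20     # 'answerWordcount' treated as "high"


def keep_sample(match_pct, answer_words):
    """Step 6: prefer a high match %, accept a lower one for long answers."""
    if match_pct >= MIN_MATCH:
        return True
    return match_pct >= RELAXED_MATCH and answer_words >= LONG_ANSWER


def preprocess(rows):
    """Apply steps 1-6 to a list of row dicts; return labelled text samples."""
    samples = []
    for row in rows:
        if row["strategy"] != "medium":    # step 1: keep 'medium' rows only
            continue
        if row["source"] != "IKITracs":    # step 2: drop ClimateWatch rows
            continue
        # Step 3: the 'T_' prefix in 'parameter' marks target passages.
        label = "TARGET" if row["parameter"].startswith("T_") else "NEGATIVE"
        # Step 4: use the English translation when one is available.
        contexts = row["context"]
        if row.get("context_translated") and row["language"] != "en":
            contexts = row["context_translated"]
        # Step 5: "explode" the list of passages into one sample per passage.
        for text, match_pct in zip(contexts, row["match_onanswer"]):
            if keep_sample(match_pct, row["answerWordcount"]):
                samples.append({"text": text, "label": label})
    return samples


# Made-up rows for illustration only.
rows = [
    {"source": "IKITracs", "strategy": "medium", "parameter": "T_Example",
     "language": "en", "context_translated": None,
     "context": ["Reduce transport emissions by 20% by 2030.", "emissions"],
     "match_onanswer": [100, 50], "answerWordcount": 7},
    {"source": "IKITracs", "strategy": "medium", "parameter": "M_Example",
     "language": "es", "context": ["Construir carriles bici."],
     "context_translated": ["Build bicycle lanes."],
     "match_onanswer": [75], "answerWordcount": 25},
    {"source": "ClimateWatch", "strategy": "medium", "parameter": "T_Example",
     "language": "en", "context_translated": None,
     "context": ["Dropped in step 2."],
     "match_onanswer": [100], "answerWordcount": 5},
    {"source": "IKITracs", "strategy": "small", "parameter": "T_Example",
     "language": "en", "context_translated": None,
     "context": ["Dropped in step 1."],
     "match_onanswer": [100], "answerWordcount": 5},
]

samples = preprocess(rows)
for sample in samples:
    print(sample["label"], "|", sample["text"])
```

With a pandas DataFrame, step 5 would typically be a call to `DataFrame.explode` on the 'context' column; the loop here mirrors that behaviour without the dependency.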
64
 
65
  ## Training procedure
66