Commit 3a6fbf2 by DeDeckerThomas · 1 Parent(s): 1b3bfc8

Update README.md

Files changed (1)
  1. README.md +17 -12
README.md CHANGED
@@ -85,18 +85,19 @@ class KeyphraseExtractionPipeline(TokenClassificationPipeline):
 
  ```python
  # Load pipeline
- model_name = "DeDeckerThomas/keyphrase-extraction-kbir-kpcrowd"
  extractor = KeyphraseExtractionPipeline(model=model_name)
  ```
  ```python
  # Inference
  text = """
  Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
- Since this is a time-consuming process, Artificial Intelligence is used to automate it.
- Currently, classical machine learning methods, that use statistics and linguistics, are widely used for the extraction process.
- The fact that these methods have been widely used in the community has the advantage that there are many easy-to-use libraries.
- Now with the recent innovations in deep learning methods (such as recurrent neural networks and transformers, GANS, …),
- keyphrase extraction can be improved. These new methods also focus on the semantics and context of a document, which is quite an improvement.
  """.replace(
  "\n", ""
  )
@@ -108,14 +109,18 @@ print(keyphrases)
 
  ```
  # Output
- ['Artificial Intelligence' 'GANS' 'Keyphrase extraction'
- 'classical machine learning' 'deep learning methods'
- 'keyphrase extraction' 'linguistics' 'recurrent neural networks'
- 'semantics' 'statistics' 'text analysis' 'transformers']
  ```
 
  ## 📚 Training Dataset
- KPCrowd is a keyphrase a broadcast news transcription dataset consisting of 500 English broadcast news stories from 10 different categories (art and culture, business, crime, fashion, health, politics us, politics world, science, sports, technology) with 50 docs per category. This dataset is annotated by multiple annotators that were required to look at the same news story and assign a set of keyphrases from the text itself.
 
  You can find more information here: https://huggingface.co/datasets/midas/kpcrowd and https://github.com/LIAAD/KeywordExtractor-Datasets.
 
@@ -218,4 +223,4 @@ The model achieves the following results on the Inspec test set:
  For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 
  ## 🚨 Issues
- Please feel free to contact Thomas De Decker for any problems with this model.
 
 
  ```python
  # Load pipeline
+ model_name = "ml6team/keyphrase-extraction-kbir-kpcrowd"
  extractor = KeyphraseExtractionPipeline(model=model_name)
  ```
  ```python
  # Inference
  text = """
  Keyphrase extraction is a technique in text analysis where you extract the important keyphrases from a text.
+ Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+ Currently, classical machine learning methods, that use statistics and linguistics,
+ are widely used for the extraction process. The fact that these methods have been widely used in the community
+ has the advantage that there are many easy-to-use libraries. Now with the recent innovations in NLP,
+ transformers can be used to improve keyphrase extraction. Transformers also focus on the semantics
+ and context of a document, which is quite an improvement.
  """.replace(
  "\n", ""
  )
 
 
  ```
  # Output
+ ['Artificial Intelligence', 'Keyphrase extraction', 'NLP',
+ 'Transformers also', 'advantage', 'automate',
+ 'classical machine learning', 'community', 'context', 'document',
+ 'extract', 'extraction', 'extraction process', 'focus',
+ 'important', 'improvement', 'innovations', 'keyphrase',
+ 'keyphrases', 'libraries', 'linguistics', 'methods', 'process',
+ 'recent', 'semantics', 'statistics', 'technique', 'text',
+ 'text analysis', 'time-consuming', 'transformers', 'widely']
  ```
 
  ## 📚 Training Dataset
+ KPCrowd is a broadcast news transcription dataset consisting of 500 English broadcast news stories from 10 different categories (art and culture, business, crime, fashion, health, politics us, politics world, science, sports, technology) with 50 docs per category. This dataset is annotated by multiple annotators that were required to look at the same news story and assign a set of keyphrases from the text itself.
 
  You can find more information here: https://huggingface.co/datasets/midas/kpcrowd and https://github.com/LIAAD/KeywordExtractor-Datasets.
 
 
  For more information on the evaluation process, you can take a look at the keyphrase extraction evaluation notebook.
 
  ## 🚨 Issues
+ Please feel free to start discussions in the Community Tab.
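
Reviewer note on the inference snippet in this diff: the sample text is flattened with `.replace("\n", "")`, which deletes newlines outright. If the lines of the triple-quoted string carry no leading or trailing spaces, as appears to be the case here, the last word of one line fuses with the first word of the next. The sketch below only illustrates that effect and one possible alternative. The helper name `flatten_text` and the shortened `sample` string are hypothetical and not part of the README.

```python
def flatten_text(text: str) -> str:
    """Collapse every whitespace run (newlines included) into a single space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces.
    return " ".join(text.split())


# Hypothetical two-line sample in the same shape as the README's text block.
sample = """
Keyphrase extraction is a technique in text analysis.
Since this is a time-consuming process, Artificial Intelligence is used to automate it.
"""

# Approach used in the README: newlines are removed, so adjacent lines fuse.
fused = sample.replace("\n", "")

# Alternative sketch: newlines become single spaces instead.
flat = flatten_text(sample)

print(fused)
print(flat)
```

Whether the fused form changes the model's predictions is not something this diff shows, so treat the alternative as an option to try rather than a required fix.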