kperkins411 committed on
Commit
623a245
1 Parent(s): d766e51

Add new SentenceTransformer model.

Files changed (3)
  1. README.md +243 -169
  2. config.json +1 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -1,4 +1,5 @@
1
  ---
 
2
  datasets: []
3
  language: []
4
  library_name: sentence-transformers
@@ -39,96 +40,155 @@ tags:
39
  - sentence-similarity
40
  - feature-extraction
41
  - generated_from_trainer
42
- - dataset_size:93522
43
  - loss:TripletLoss
44
  widget:
45
- - source_sentence: What techniques does the platform utilize to remember login credentials,
46
- analyze service usage, personalize user experience, and manage advertisement delivery?
47
  sentences:
48
- - 'Cookies and other electronic technologies. When you use the Services, we use
49
- persistent and session cookies and other similar technologies to: (a) store your
50
- username and password; (b) analyze the usage of our sites and Services by collecting
51
- the information discussed above; (c) customize the Services to your preferences;
52
- and (d) display advertising on the Services. We may also use other Internet technologies,
53
- such as web beacons or pixel tags and other similar technologies, to deliver or
54
- communicate with cookies and analyze your use of the Services. We also may include
55
- Web beacons in e-mail messages or newsletters to determine whether the message
56
- has been opened and for other analytics.'
57
- - We also use cookies to gather information regarding the date and time of your
58
- visit and what your search and view. Cookies are computer files that are saved
59
- to your device and which may store information about you, your device, and your
60
- browser, but they do not include individual's names, or other information that
61
- is personally identifiable. For example, cookies are used to identify user's device
62
- and record the visit so that the UCWeb Services can allow you access to your account
63
- without requiring you to reenter the information.
64
- - With respect to Qwik-Fix Pro delivered by PivX to Detto on CD-Rom, PivX warrants
65
- that for a period of thirty (30) days following delivery to Detto, the media on
66
- which Qwik-Fix Pro is furnished to Detto will be free from defects in materials
67
- and workmanship during normal use.
68
- - source_sentence: What type of license is Valeant granting to Dova?
69
  sentences:
70
- - 1.8. The term "Licensed Intellectual Property" means individually, collectively
71
- or in any combination, T&B's copyrights (whether registered or not), including,
72
- without limitation, the Educational Materials and any and all copyrightable literary
73
- works and audio-visual works developed for use in the Business, trademarks and
74
- trade names (whether registered or unregistered) used in connection with the Business;
75
- as well as customer lists, concepts, developments, trade secrets, methods, systems,
76
- programs, improvements, data and information (whether in perceivable or machine-readable
77
- form), and works of authorship including, but not limited to the (a) the Licensed
78
- Marks and (b) the name, image, and likeness of the T&B Personality.
79
- - '[***], Valeant hereby grants to Dova a fully paid-up, royalty free, non-transferable,
80
- non- exclusive license (with a limited right to sub-license to its Affiliates)
81
- to any Valeant Property that appears on, embodied on or contained in the Product
82
- materials or Product Labeling solely for use in connection with Dova''s promotion
83
- or other commercialization of the Product in the Territory.'
84
- - VOTOCAST expressly reserves the right to change its rates charged hereunder for
85
- the Services during any Renewal Term (as detined herein) but agrees that rates
86
- may not increase by more than ten percent (10%) during any Renewal Term.
87
- - source_sentence: Can the company legally obtain the user's mobile device name and
88
- type?
89
  sentences:
90
- - '(d) As used in this Agreement, the term "Post-Term Period" shall mean a continuous
91
- uninterrupted period of two (2) years from the date of:'
92
- - You will be required to enter a valid phone number while placing an order on the
93
- App. By registering your phone number with us, you consent to be contacted by
94
- us via phone calls and/or SMS notifications, in case of any order or shipment
95
- or delivery related updates. We will not use your personal information to initiate
96
- any promotional phone calls or SMS.
97
- - 3.5. By accessing or using the Services via Your mobile phone, or other mobile
98
- device, You are authorising Us to collect its unique device identifier and IP
99
- address. We may also collect the name You have associated with Your mobile phone
100
- or other mobile device, device type, telephone number, country, and any other
101
- information You choose to provide. We may also access Your contacts to enable
102
- You to invite friends to join You on the Services if you request that we do this
103
- on your behalf.
104
- - source_sentence: How are Contact Center Products priced?
105
  sentences:
106
- - If circumstances require VOTOCAST to raise its rates more than ten percent (10%)
107
- during any Renewal Term, VOTOCAST will provide Licensee cost related supporting
108
- documentation to justify the rate increase.
109
- - 1.32 "Maximum Capacity" has the meaning set forth in Section 2.6.
110
- - 2.12 Contact Center Products
111
- - source_sentence: Are device-level setting opt outs effective against all types of
112
- targeted advertising?
113
  sentences:
114
- - A User may prevent or limit targeted advertising from device settings which may
115
- vary from device to device. For example, if you are using an iOS device with iOS
116
- 7 or a newer version, this can be done from settings/privacy/advertising by resetting
117
- advertising identifier or limiting ad tracking setting. If you are using Google
118
- Android device with Android 2.3 or a later version, you can find the ad identifier
119
- settings in the app drawer under Google Settings> Ads. Please note that activation
120
- of any applicable "Do-not-track" settings on the User's mobile device or any other
121
- device-level setting opt outs for targeted advertising, (to the extent that such
122
- options exist) may not cease all tracking activities by TabTale's third parties'
123
- service providers.
124
- - We may collect information that You give to us, for example when You sign up to
125
- use our Services (in the Service or via third party login/connect). This information
126
- may include your name, unique username, pictures of yourself, e-mail address,
127
- date of birth, phone number.
128
- - Cookies and Advertising Please refer to our Cookie Statement for more information
129
- about your choices around cookies and related technologies.
130
  model-index:
131
- - name: SentenceTransformer
132
  results:
133
  - task:
134
  type: information-retrieval
@@ -138,106 +198,106 @@ model-index:
138
  type: msmarco-distilbert-base-v2
139
  metrics:
140
  - type: cosine_accuracy@1
141
- value: 0.6938676381299332
142
  name: Cosine Accuracy@1
143
  - type: cosine_accuracy@3
144
- value: 0.8775956284153006
145
  name: Cosine Accuracy@3
146
  - type: cosine_accuracy@5
147
- value: 0.9355191256830601
148
  name: Cosine Accuracy@5
149
  - type: cosine_accuracy@10
150
- value: 0.9754705525197328
151
  name: Cosine Accuracy@10
152
  - type: cosine_precision@1
153
- value: 0.6938676381299332
154
  name: Cosine Precision@1
155
  - type: cosine_precision@3
156
- value: 0.2925318761384335
157
  name: Cosine Precision@3
158
  - type: cosine_precision@5
159
- value: 0.18710382513661206
160
  name: Cosine Precision@5
161
  - type: cosine_precision@10
162
- value: 0.09754705525197328
163
  name: Cosine Precision@10
164
  - type: cosine_recall@1
165
- value: 0.6938676381299332
166
  name: Cosine Recall@1
167
  - type: cosine_recall@3
168
- value: 0.8775956284153006
169
  name: Cosine Recall@3
170
  - type: cosine_recall@5
171
- value: 0.9355191256830601
172
  name: Cosine Recall@5
173
  - type: cosine_recall@10
174
- value: 0.9754705525197328
175
  name: Cosine Recall@10
176
  - type: cosine_ndcg@10
177
- value: 0.8405203190247353
178
  name: Cosine Ndcg@10
179
  - type: cosine_mrr@10
180
- value: 0.7965159356598299
181
  name: Cosine Mrr@10
182
  - type: cosine_map@100
183
- value: 0.7978263921544507
184
  name: Cosine Map@100
185
  - type: dot_accuracy@1
186
- value: 0.694474802671524
187
  name: Dot Accuracy@1
188
  - type: dot_accuracy@3
189
- value: 0.8751669702489374
190
  name: Dot Accuracy@3
191
  - type: dot_accuracy@5
192
- value: 0.9306618093503339
193
  name: Dot Accuracy@5
194
  - type: dot_accuracy@10
195
- value: 0.9748633879781421
196
  name: Dot Accuracy@10
197
  - type: dot_precision@1
198
- value: 0.694474802671524
199
  name: Dot Precision@1
200
  - type: dot_precision@3
201
- value: 0.2917223234163125
202
  name: Dot Precision@3
203
  - type: dot_precision@5
204
- value: 0.18613236187006682
205
  name: Dot Precision@5
206
  - type: dot_precision@10
207
- value: 0.0974863387978142
208
  name: Dot Precision@10
209
  - type: dot_recall@1
210
- value: 0.694474802671524
211
  name: Dot Recall@1
212
  - type: dot_recall@3
213
- value: 0.8751669702489374
214
  name: Dot Recall@3
215
  - type: dot_recall@5
216
- value: 0.9306618093503339
217
  name: Dot Recall@5
218
  - type: dot_recall@10
219
- value: 0.9748633879781421
220
  name: Dot Recall@10
221
  - type: dot_ndcg@10
222
- value: 0.8396499937012809
223
  name: Dot Ndcg@10
224
  - type: dot_mrr@10
225
- value: 0.7956702421911849
226
  name: Dot Mrr@10
227
  - type: dot_map@100
228
- value: 0.7970281120329098
229
  name: Dot Map@100
230
  ---
231
 
232
- # SentenceTransformer
233
 
234
- This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
235
 
236
  ## Model Details
237
 
238
  ### Model Description
239
  - **Model Type:** Sentence Transformer
240
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
241
  - **Maximum Sequence Length:** 350 tokens
242
  - **Output Dimensionality:** 768 tokens
243
  - **Similarity Function:** Cosine Similarity
@@ -278,9 +338,9 @@ from sentence_transformers import SentenceTransformer
278
  model = SentenceTransformer("kperkins411/msmarco-distilbert-base-v2_triplet_legal")
279
  # Run inference
280
  sentences = [
281
- 'Are device-level setting opt outs effective against all types of targeted advertising?',
282
- 'A User may prevent or limit targeted advertising from device settings which may vary from device to device. For example, if you are using an iOS device with iOS 7 or a newer version, this can be done from settings/privacy/advertising by resetting advertising identifier or limiting ad tracking setting. If you are using Google Android device with Android 2.3 or a later version, you can find the ad identifier settings in the app drawer under Google Settings> Ads. Please note that activation of any applicable "Do-not-track" settings on the User\'s mobile device or any other device-level setting opt outs for targeted advertising, (to the extent that such options exist) may not cease all tracking activities by TabTale\'s third parties\' service providers.',
283
- 'Cookies and Advertising Please refer to our Cookie Statement for more information about your choices around cookies and related technologies.',
284
  ]
285
  embeddings = model.encode(sentences)
286
  print(embeddings.shape)
@@ -326,36 +386,36 @@ You can finetune this model on your own dataset.
326
 
327
  | Metric | Value |
328
  |:--------------------|:-----------|
329
- | cosine_accuracy@1 | 0.6939 |
330
- | cosine_accuracy@3 | 0.8776 |
331
- | cosine_accuracy@5 | 0.9355 |
332
- | cosine_accuracy@10 | 0.9755 |
333
- | cosine_precision@1 | 0.6939 |
334
- | cosine_precision@3 | 0.2925 |
335
- | cosine_precision@5 | 0.1871 |
336
- | cosine_precision@10 | 0.0975 |
337
- | cosine_recall@1 | 0.6939 |
338
- | cosine_recall@3 | 0.8776 |
339
- | cosine_recall@5 | 0.9355 |
340
- | cosine_recall@10 | 0.9755 |
341
- | cosine_ndcg@10 | 0.8405 |
342
- | cosine_mrr@10 | 0.7965 |
343
- | **cosine_map@100** | **0.7978** |
344
- | dot_accuracy@1 | 0.6945 |
345
- | dot_accuracy@3 | 0.8752 |
346
- | dot_accuracy@5 | 0.9307 |
347
- | dot_accuracy@10 | 0.9749 |
348
- | dot_precision@1 | 0.6945 |
349
- | dot_precision@3 | 0.2917 |
350
- | dot_precision@5 | 0.1861 |
351
- | dot_precision@10 | 0.0975 |
352
- | dot_recall@1 | 0.6945 |
353
- | dot_recall@3 | 0.8752 |
354
- | dot_recall@5 | 0.9307 |
355
- | dot_recall@10 | 0.9749 |
356
- | dot_ndcg@10 | 0.8396 |
357
- | dot_mrr@10 | 0.7957 |
358
- | dot_map@100 | 0.797 |
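
The precision rows in this table follow from the accuracy rows: each precision@k equals accuracy@k divided by k (for example 0.8776 / 3 ≈ 0.2925), and each recall@k equals accuracy@k. That pattern is what one would expect if every evaluation query has exactly one relevant passage. A minimal sketch under that assumption, using toy rankings and a hypothetical helper `metrics_at_k` (an illustration only, not part of sentence-transformers):

```python
def metrics_at_k(rankings, relevant_ids, k):
    """Accuracy, precision and recall at k.

    Assumption: each query has exactly ONE relevant passage, so a query
    contributes either 0 or 1 hit to its top-k list.
    """
    hits = [1 if rel in ranking[:k] else 0
            for ranking, rel in zip(rankings, relevant_ids)]
    accuracy = sum(hits) / len(hits)   # share of queries with a hit in the top k
    precision = accuracy / k           # mean of (hits / k) over all queries
    recall = accuracy                  # mean of (hits / 1) over all queries
    return accuracy, precision, recall


# Toy data (hypothetical passage ids): three queries, top-3 rankings each.
rankings = [["p1", "p7", "p3"], ["p9", "p2", "p4"], ["p5", "p6", "p8"]]
relevant_ids = ["p1", "p4", "p0"]

acc1, prec1, rec1 = metrics_at_k(rankings, relevant_ids, 1)
acc3, prec3, rec3 = metrics_at_k(rankings, relevant_ids, 3)

# The same relation applied to the values reported in the removed metadata.
table_consistent = abs(0.8775956284153006 / 3 - 0.2925318761384335) < 1e-12
```

If a query could have several relevant passages, precision@k and recall@k would no longer be simple functions of accuracy@k, and the sketch would not apply.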
359
 
360
  <!--
361
  ## Bias, Risks and Limitations
@@ -376,7 +436,7 @@ You can finetune this model on your own dataset.
376
  #### Unnamed Dataset
377
 
378
 
379
- * Size: 93,522 training samples
380
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
381
  * Approximate statistics based on the first 1000 samples:
382
  | | anchor | positive | negative |
@@ -402,19 +462,19 @@ You can finetune this model on your own dataset.
402
  #### Unnamed Dataset
403
 
404
 
405
- * Size: 1,055 evaluation samples
406
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
407
  * Approximate statistics based on the first 1000 samples:
408
  | | anchor | positive | negative |
409
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
410
  | type | string | string | string |
411
- | details | <ul><li>min: 7 tokens</li><li>mean: 18.05 tokens</li><li>max: 152 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 98.99 tokens</li><li>max: 350 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 105.03 tokens</li><li>max: 350 tokens</li></ul> |
412
  * Samples:
413
- | anchor | positive | negative |
414
- |:---------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
415
- | <code>Do pixel tags remain invisible when embedded in emails?</code> | <code>Pixel tags. We embed pixel tags (also called web beacons or clear GIFs) on web pages, ads, and emails. These tiny, invisible graphics are used to access cookies and track user activities (such as how many times a page is viewed). We use pixel tags to measure the popularity of our features and services. Ad companies also use pixel tags to measure the number of ads displayed and their performance (such as how many people clicked on an ad).</code> | <code>Information Collected by Cookies and Other Tracking Technologies: We use various technologies to collect information, and this may include sending cookies to your computer or mobile device. Cookies are small data files stored on your hard drive or in device memory that helps us to improve our Services and your experience, see which areas and features of our Services are popular and count visits. For more information about cookies, and how to disable them, please see "Your Choices" below. We may also collect information using web beacons (also known as "tracking pixels"). Web beacons are electronic images that may be used in our Services or emails and help deliver cookies, count visits, understand usage and campaign effectiveness and determine whether an email has been opened and acted upon.</code> |
416
- | <code>Do pixel tags remain invisible when embedded in emails?</code> | <code>Pixel tags. We embed pixel tags (also called web beacons or clear GIFs) on web pages, ads, and emails. These tiny, invisible graphics are used to access cookies and track user activities (such as how many times a page is viewed). We use pixel tags to measure the popularity of our features and services. Ad companies also use pixel tags to measure the number of ads displayed and their performance (such as how many people clicked on an ad).</code> | <code>Changes to this Privacy Policy We may update this Privacy Policy in the future. We will notify you about material changes to this Privacy Policy by sending a notice to the email address you provided to us or by placing a prominent notice on our website.</code> |
417
- | <code>Do pixel tags remain invisible when embedded in emails?</code> | <code>Pixel tags. We embed pixel tags (also called web beacons or clear GIFs) on web pages, ads, and emails. These tiny, invisible graphics are used to access cookies and track user activities (such as how many times a page is viewed). We use pixel tags to measure the popularity of our features and services. Ad companies also use pixel tags to measure the number of ads displayed and their performance (such as how many people clicked on an ad).</code> | <code>You can prevent Peel from showing you targeted ads by sending an email to [email protected] and asking to opt-out of targeted advertising. Opting-out will only prevent targeted ads from being displayed so you may continue to see generic (non-targeted) ads from Peel after you opt-out. For more information on Interest-Based Ads or to stop use of tracking technologies for these purposes, go to www.aboutads.info or www.networkadvertising.org.</code> |
418
  * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
419
  ```json
420
  {
@@ -430,7 +490,6 @@ You can finetune this model on your own dataset.
430
  - `per_device_train_batch_size`: 128
431
  - `per_device_eval_batch_size`: 128
432
  - `learning_rate`: 2e-05
433
- - `num_train_epochs`: 1
434
  - `warmup_ratio`: 0.1
435
  - `fp16`: True
436
  - `load_best_model_at_end`: True
@@ -455,7 +514,7 @@ You can finetune this model on your own dataset.
455
  - `adam_beta2`: 0.999
456
  - `adam_epsilon`: 1e-08
457
  - `max_grad_norm`: 1.0
458
- - `num_train_epochs`: 1
459
  - `max_steps`: -1
460
  - `lr_scheduler_type`: linear
461
  - `lr_scheduler_kwargs`: {}
@@ -551,17 +610,32 @@ You can finetune this model on your own dataset.
551
  </details>
552
 
553
  ### Training Logs
554
- | Epoch | Step | Training Loss | loss | msmarco-distilbert-base-v2_cosine_map@100 |
555
- |:-------:|:-------:|:-------------:|:----------:|:-----------------------------------------:|
556
- | 0 | 0 | - | - | 0.7970 |
557
- | 0.1368 | 100 | 0.1498 | - | - |
558
- | 0.2736 | 200 | 0.0868 | - | - |
559
- | 0.4104 | 300 | 0.0955 | - | - |
560
- | 0.5472 | 400 | 0.1114 | - | - |
561
- | 0.6840 | 500 | 0.1218 | - | - |
562
- | 0.8208 | 600 | 0.1339 | - | - |
563
- | 0.9576 | 700 | 0.1557 | - | - |
564
- | **1.0** | **731** | **-** | **0.3184** | **0.7978** |
565
 
566
  * The bold row denotes the saved checkpoint.
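
The step counts in the removed log are consistent with the removed dataset size: 93,522 triplets at `per_device_train_batch_size` 128 give ceil(93522 / 128) = 731 steps per epoch, so step 100 falls at epoch 100 / 731 ≈ 0.1368. A small sketch of that arithmetic, assuming a single device and no gradient accumulation (neither setting is shown in this diff):

```python
import math

dataset_size = 93_522   # "Size: 93,522 training samples" in the removed lines
batch_size = 128        # per_device_train_batch_size

# Assumption: one device, gradient_accumulation_steps == 1.
steps_per_epoch = math.ceil(dataset_size / batch_size)


def epoch_at(step):
    """Fractional epoch reached after `step` steps, rounded as in the log."""
    return round(step / steps_per_epoch, 4)


logged = {step: epoch_at(step) for step in (100, 200, 700)}
```

With these inputs `steps_per_epoch` is 731, matching the final row of the log, and `logged` reproduces the Epoch column for steps 100, 200 and 700.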
567
 
 
1
  ---
2
+ base_model: sentence-transformers/msmarco-distilbert-base-v2
3
  datasets: []
4
  language: []
5
  library_name: sentence-transformers
 
40
  - sentence-similarity
41
  - feature-extraction
42
  - generated_from_trainer
43
+ - dataset_size:88018
44
  - loss:TripletLoss
45
  widget:
46
+ - source_sentence: How should SpinRecords.com notify NETTAXI of a potential indemnifiable
47
+ claim?
48
  sentences:
49
+ - '4. The Company shall have no obligations to Verenium with respect to the use
50
+ of such information, or disclosure to others not party to this Agreement, of such
51
+ information which: (d) is rightfully and in good faith developed by Company independently
52
+ of any disclosures made under this Agreement, as evidenced by Company’s competent
53
+ written records; or '
54
+ - 13. Changes To This Privacy Policy We may update this Privacy Policy to reflect
55
+ changes to our information practices. If we make any material changes we will
56
+ notify you by email (sent to the e-mail address specified in your account) or
57
+ by means of a notice on the Services prior to the change becoming effective. We
58
+ encourage you to periodically review this page for the latest information on our
59
+ privacy practices.
60
+ - 7.2 Indemnification by NETTAXI. NETTAXI shall defend, indemnify and ----------------------------
61
+ hold SpinRecords.com harmless from any and all damages, liabilities, costs and
62
+ expenses (including, but not limited to reasonable attorneys' fees) incurred
63
+ by SpinRecords.com as a result of (1) any breach of this Agreement; (ii) any
64
+ claim that the NETTAXI Brand Features or any part thereof, infringes or
65
+ misappropriates any Intellectual Property Right of a third party; or (iii) any
66
+ claim arising out of Spinrecords.com's display of the NETTAXI Brand Features
67
+ SpinRecords.com shall provide NETTAXI with written notice of the claim and
68
+ permit NETTAXI to control the defense, settlement, adjustment or compromise
69
+ of any such claim. SpinRecords.com may employ counsel at its own expense to assist
70
+ it with respect to any such claim; provided, however, that if such counsel
71
+ is necessary because of a conflict of interest of either NETTAXI or its counsel
72
+ or because NETTAXI does not assume control, NETTAXI will bear the expense of
73
+ such counsel.
74
+ - source_sentence: What types of advertisements does Crazy Labs accept within their
75
+ apps?
76
  sentences:
77
+ - Each party each agrees that it will not knowingly do anything inconsistent with
78
+ the other party's ownership of such party's intellectual property, including without
79
+ limitation, questioning the validity of that party's Trademarks or registering
80
+ or attempting to register the other party's Trademarks in its own name or that
81
+ of any other firm, person or corporation.
82
+ - '11. Advertisements We accept advertisements, in various formats (such as banners,
83
+ interstitials, rewarded videos, etc.) from third parties ad networks which may
84
+ be displayed in our Crazy Labs Apps. These third parties ad networks may collect
85
+ and use, inter alia (i) information about your visits to Crazy Labs Apps in connection
86
+ with such marketing, sales and advertising activities; and (ii) geographic tracking
87
+ and carrier network preferences. (iii) information, such as age, gender and logged
88
+ from device to ensure that appropriate advertising is presented within the App
89
+ and calculate or control the number of views of an ad, and/or deliver advertisements
90
+ relating to User''s interests, and measure the effectiveness of advertisements
91
+ campaigns. The delivery of advertisements to you may be based on IP address, device
92
+ identifiers and other Personal Information gathered during your use of the Crazy
93
+ Labs Apps. Note that third parties ad networks which are referred to in relation
94
+ to the Crazy Labs Apps may include third parties service providers, such as Facebook
95
+ and other ad networks, in addition to those which are listed in the following
96
+ link: https://www.tabtale.com/3rdparties/. Note that if you click on any of these
97
+ advertisements, the advertisers may use cookies and other web-tracking technologies
98
+ (such as tracking pixel agent or visitor identification technology, etc.) on your
99
+ device to collect data regarding advertisement performance, your interaction with
100
+ such advertisements and our Crazy Labs Apps and your interests (which may include,
101
+ non-personal and/or personal information (such as, device and network information,
102
+ unique identifiers, gender, age and geo-location) about you) in order to serve
103
+ you advertisements, including targeted advertisements, and for the legitimate
104
+ business interests of such Third Parties ad networks. We recommend that you review
105
+ the terms of use and privacy policy of any third party advertisers with whom you
106
+ are interacting before doing so. Their privacy policy, not ours, will apply to
107
+ any of those interactions.'
108
+ - We want our advertising to be as relevant and interesting as the other information
109
+ you find on our Services. With this in mind, we use all of the information we
110
+ have about you to show you relevant ads. We do not share information that personally
111
+ identifies you (personally identifiable information is information like name or
112
+ email address that can by itself be used to contact you or identifies who you
113
+ are) with advertising, measurement or analytics partners unless you give us permission.
114
+ We may provide these partners with information about the reach and effectiveness
115
+ of their advertising without providing information that personally identifies
116
+ you, or if we have aggregated the information so that it does not personally identify
117
+ you. For example, we may tell an advertiser how its ads performed, or how many
118
+ people viewed their ads or installed an app after seeing an ad, or provide non-personally
119
+ identifying demographic information (such as 25 year old female, in Madrid, who
120
+ likes software engineering) to these partners to help them understand their audience
121
+ or customers, but only after the advertiser has agreed to abide by our advertiser
122
+ guidelines.
123
+ - source_sentence: 9.2 Nature of the Association. The parties herein are engaged
124
+ as independent entities in accordance with this Agreement, and there exists no
125
+ intention to forge any alternate form of association, such as a partnership, franchise,
126
+ joint venture, agency, employer/employee relationship, fiduciary connection, or
127
+ any other specific relationship. Each party is precluded from conducting themselves
128
+ in any way that might suggest or insinuate any association different from that
129
+ of an independent entity, nor shall either party possess the authority to obligate
130
+ or commit the other party in any manner.
131
  sentences:
132
+ - 'The parties hereby grant to each other non-exclusive, fully-paid, royalty-free
133
+ licenses to utilize the other party''s trademarks, as follows: (a) Biocept Trademarks.
134
+ To facilitate the promotion and performance of Tests, during the Term Biocept
135
+ hereby grants Life Technologies a non-exclusive, royalty-free, non-transferable
136
+ license to use the Biocept Trademarks solely for<omitted>use in connection with
137
+ the promotion and performance of the Tests in the Territory.'
138
+ - This Agreement may not be assigned or otherwise transferred, nor may any right
139
+ or obligations hereunder be assigned or transferred, by either Party without the
140
+ prior written consent of the other Party; provided, however, that Licensor may,
141
+ without such consent, assign this Agreement and its rights and obligations hereunder,
142
+ in whole or in part, to an Affiliate or in connection with the transfer or sale
143
+ of all or substantially all of its assets related to the Licensed Product or the
144
+ business relating thereto, or in the event of its merger or consolidation or change
145
+ in control or similar transaction.
146
+ - 9.2 Relationship of Parties. The parties are independent contractors -------------------------
147
+ under this Agreement and no other relationship is intended, including
148
+ a partnership, franchise, joint venture, agency, employer/employee, fiduciary,
149
+ master/servant relationship, or other special relationship. Neither party shall
150
+ act in a manner which expresses or implies a relationship other than that
151
+ of independent contractor, nor bind the other party.
152
+ - source_sentence: In which section can I find the specifics of the 'Initial Term'?
153
  sentences:
154
+ - 1.22 "Initial Term" has the meaning set forth in Section 8.1.
155
+ - If the Reseller sells less than 50% of any year's Annual Milestone, Todos, in
156
+ its sole discretion, may either (a) cancel the Reseller's exclusivity, and market,
157
+ distribute, and sell the Products in the Territory directly or indirectly through
158
+ other distributors and resellers, while leaving the Reseller with a non-exclusive
159
+ right to distribute and sell the Products for the remainder of the term, or (b)
160
+ terminate the Agreement upon one hundred eighty (180) days prior written notice,
161
+ provided that the Reseller does not cure its failure to achieve 50% of the applicable
162
+ year's Annual Milestone within the 180-day notice period.
163
+ - 6 Term and Termination.
164
+ - source_sentence: In what circumstances can FCE assume responsibility for a Program
165
+ Patent?
166
  sentences:
167
+ - We may also collect anonymous, statistical data from users of the Services, such
168
+ as a user's browser version, operating system version, country, page loading time,
169
+ type of device, number of visits, time using the Services, network, demographic
170
+ estimates, flow-through the website and/or Services or referral source, which
171
+ may then be aggregated. We may use non-personal data that we collect from you
172
+ to improve the Services or to support advertising services. For registered users,
173
+ this anonymous, statistical data may include that relating to their activities,
174
+ such as high scores, game rankings, league rankings, game challenges, avatars
175
+ etc.
176
+ - Notwithstanding the foregoing, in the event ExxonMobil decides not to prosecute,
177
+ defend, enforce, maintain or decides to abandon any Program Patent, then ExxonMobil
178
+ will provide notice thereof to FCE, and FCE will then have the right, but not
179
+ the obligation, to prosecute or maintain the Program Patent and sole responsibility
180
+ for the continuing costs, taxes, legal fees, maintenance fees and other fees associated
181
+ with that Program Patent.
182
+ - 4. Limitation of Liability of the Sponsor. The Sponsor shall not be liable for
183
+ any error of judgment or mistake of law or for any act or omission in the oversight,
184
+ administration or management of the Trust or the performance of its duties hereunder,
185
+ except for willful misfeasance, bad faith or gross negligence in the performance
186
+ of its duties, or by reason of the reckless disregard of its obligations and duties
187
+ hereunder. As used in this Section 4, the term "Sponsor" shall include Domini
188
+ and/or any of its affiliates and the directors, officers and employees of Domini
189
+ and/or any of its affiliates.
190
  model-index:
191
+ - name: SentenceTransformer based on sentence-transformers/msmarco-distilbert-base-v2
192
  results:
193
  - task:
194
  type: information-retrieval
 
198
  type: msmarco-distilbert-base-v2
199
  metrics:
200
  - type: cosine_accuracy@1
201
+ value: 0.6422082459818309
202
  name: Cosine Accuracy@1
203
  - type: cosine_accuracy@3
204
+ value: 0.8230607966457023
205
  name: Cosine Accuracy@3
206
  - type: cosine_accuracy@5
207
+ value: 0.872816212438854
208
  name: Cosine Accuracy@5
209
  - type: cosine_accuracy@10
210
+ value: 0.9382250174703005
211
  name: Cosine Accuracy@10
212
  - type: cosine_precision@1
213
+ value: 0.6422082459818309
214
  name: Cosine Precision@1
215
  - type: cosine_precision@3
216
+ value: 0.27435359888190075
217
  name: Cosine Precision@3
218
  - type: cosine_precision@5
219
+ value: 0.1745632424877708
220
  name: Cosine Precision@5
221
  - type: cosine_precision@10
222
+ value: 0.09382250174703004
223
  name: Cosine Precision@10
224
  - type: cosine_recall@1
225
+ value: 0.6422082459818309
226
  name: Cosine Recall@1
227
  - type: cosine_recall@3
228
+ value: 0.8230607966457023
229
  name: Cosine Recall@3
230
  - type: cosine_recall@5
231
+ value: 0.872816212438854
232
  name: Cosine Recall@5
233
  - type: cosine_recall@10
234
+ value: 0.9382250174703005
235
  name: Cosine Recall@10
236
  - type: cosine_ndcg@10
237
+ value: 0.790195916846684
238
  name: Cosine Ndcg@10
239
  - type: cosine_mrr@10
240
+ value: 0.7427224274289222
241
  name: Cosine Mrr@10
242
  - type: cosine_map@100
243
+ value: 0.7454587747656682
244
  name: Cosine Map@100
245
  - type: dot_accuracy@1
246
+ value: 0.6317260656883298
247
  name: Dot Accuracy@1
248
  - type: dot_accuracy@3
249
+ value: 0.8204053109713487
250
  name: Dot Accuracy@3
251
  - type: dot_accuracy@5
252
+ value: 0.8735150244584207
253
  name: Dot Accuracy@5
254
  - type: dot_accuracy@10
255
+ value: 0.9375262054507337
256
  name: Dot Accuracy@10
257
  - type: dot_precision@1
258
+ value: 0.6317260656883298
259
  name: Dot Precision@1
260
  - type: dot_precision@3
261
+ value: 0.27346843699044954
262
  name: Dot Precision@3
263
  - type: dot_precision@5
264
+ value: 0.17470300489168414
265
  name: Dot Precision@5
266
  - type: dot_precision@10
267
+ value: 0.09375262054507337
268
  name: Dot Precision@10
269
  - type: dot_recall@1
270
+ value: 0.6317260656883298
271
  name: Dot Recall@1
272
  - type: dot_recall@3
273
+ value: 0.8204053109713487
274
  name: Dot Recall@3
275
  - type: dot_recall@5
276
+ value: 0.8735150244584207
277
  name: Dot Recall@5
278
  - type: dot_recall@10
279
+ value: 0.9375262054507337
280
  name: Dot Recall@10
281
  - type: dot_ndcg@10
282
+ value: 0.7853441093620476
283
  name: Dot Ndcg@10
284
  - type: dot_mrr@10
285
+ value: 0.7364890242143864
286
  name: Dot Mrr@10
287
  - type: dot_map@100
288
+ value: 0.7392413927907737
289
  name: Dot Map@100
290
  ---
291
 
292
+ # SentenceTransformer based on sentence-transformers/msmarco-distilbert-base-v2
293
 
294
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/msmarco-distilbert-base-v2](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
295
 
296
  ## Model Details
297
 
298
  ### Model Description
299
  - **Model Type:** Sentence Transformer
300
+ - **Base model:** [sentence-transformers/msmarco-distilbert-base-v2](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v2) <!-- at revision 741fcf2d6eabaf0927bfe49c6d9c577df95d3c40 -->
301
  - **Maximum Sequence Length:** 350 tokens
302
  - **Output Dimensionality:** 768 dimensions
303
  - **Similarity Function:** Cosine Similarity
 
338
  model = SentenceTransformer("kperkins411/msmarco-distilbert-base-v2_triplet_legal")
339
  # Run inference
340
  sentences = [
341
+ 'In what circumstances can FCE assume responsibility for a Program Patent?',
342
+ 'Notwithstanding the foregoing, in the event ExxonMobil decides not to prosecute, defend, enforce, maintain or decides to abandon any Program Patent, then ExxonMobil will provide notice thereof to FCE, and FCE will then have the right, but not the obligation, to prosecute or maintain the Program Patent and sole responsibility for the continuing costs, taxes, legal fees, maintenance fees and other fees associated with that Program Patent.',
343
+ '4. Limitation of Liability of the Sponsor. The Sponsor shall not be liable for any error of judgment or mistake of law or for any act or omission in the oversight, administration or management of the Trust or the performance of its duties hereunder, except for willful misfeasance, bad faith or gross negligence in the performance of its duties, or by reason of the reckless disregard of its obligations and duties hereunder. As used in this Section 4, the term "Sponsor" shall include Domini and/or any of its affiliates and the directors, officers and employees of Domini and/or any of its affiliates.',
344
  ]
345
  embeddings = model.encode(sentences)
346
  print(embeddings.shape)
 
386
 
387
  | Metric | Value |
  |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.6422 |
+ | cosine_accuracy@3 | 0.8231 |
+ | cosine_accuracy@5 | 0.8728 |
+ | cosine_accuracy@10 | 0.9382 |
+ | cosine_precision@1 | 0.6422 |
+ | cosine_precision@3 | 0.2744 |
+ | cosine_precision@5 | 0.1746 |
+ | cosine_precision@10 | 0.0938 |
+ | cosine_recall@1 | 0.6422 |
+ | cosine_recall@3 | 0.8231 |
+ | cosine_recall@5 | 0.8728 |
+ | cosine_recall@10 | 0.9382 |
+ | cosine_ndcg@10 | 0.7902 |
+ | cosine_mrr@10 | 0.7427 |
+ | **cosine_map@100** | **0.7455** |
+ | dot_accuracy@1 | 0.6317 |
+ | dot_accuracy@3 | 0.8204 |
+ | dot_accuracy@5 | 0.8735 |
+ | dot_accuracy@10 | 0.9375 |
+ | dot_precision@1 | 0.6317 |
+ | dot_precision@3 | 0.2735 |
+ | dot_precision@5 | 0.1747 |
+ | dot_precision@10 | 0.0938 |
+ | dot_recall@1 | 0.6317 |
+ | dot_recall@3 | 0.8204 |
+ | dot_recall@5 | 0.8735 |
+ | dot_recall@10 | 0.9375 |
+ | dot_ndcg@10 | 0.7853 |
+ | dot_mrr@10 | 0.7365 |
+ | dot_map@100 | 0.7392 |
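
A note on reading these numbers: recall@k equals accuracy@k at every cutoff, and precision@k looks like accuracy@k divided by k. That pattern is what you would expect if each evaluation query has exactly one relevant passage. This is an inference from the reported values, not something the card states. A minimal sketch that checks the relationship against the full-precision cosine values from the front matter (the helper name `implied_precision` is hypothetical):

```python
# Full-precision cosine metrics copied from the model-index front matter.
accuracy_at_k = {
    1: 0.6422082459818309,
    3: 0.8230607966457023,
    5: 0.872816212438854,
    10: 0.9382250174703005,
}
precision_at_k = {
    1: 0.6422082459818309,
    3: 0.27435359888190075,
    5: 0.1745632424877708,
    10: 0.09382250174703004,
}


def implied_precision(accuracy: float, k: int) -> float:
    """Precision@k when every query has exactly one relevant passage.

    Under that assumption a query contributes 1/k to precision@k if its
    single relevant passage is in the top k, and 0 otherwise, so the mean
    precision is accuracy@k divided by k.
    """
    return accuracy / k


for k, acc in accuracy_at_k.items():
    diff = abs(implied_precision(acc, k) - precision_at_k[k])
    print(f"k={k:>2}  implied={implied_precision(acc, k):.6f}  "
          f"reported={precision_at_k[k]:.6f}  |diff|={diff:.1e}")
```

If the assumption holds, the low precision@10 (about 0.094) is not a weakness in itself: with a single relevant passage per query, precision@10 cannot exceed 0.1.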
419
 
420
  <!--
421
  ## Bias, Risks and Limitations
 
436
  #### Unnamed Dataset
437
 
438
 
439
+ * Size: 88,018 training samples
440
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
441
  * Approximate statistics based on the first 1000 samples:
442
  | | anchor | positive | negative |
 
462
  #### Unnamed Dataset
463
 
464
 
465
+ * Size: 1,084 evaluation samples
466
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
467
  * Approximate statistics based on the first 1000 samples:
468
  | | anchor | positive | negative |
469
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
470
  | type | string | string | string |
471
+ | details | <ul><li>min: 6 tokens</li><li>mean: 20.24 tokens</li><li>max: 124 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 97.01 tokens</li><li>max: 350 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 105.03 tokens</li><li>max: 350 tokens</li></ul> |
472
  * Samples:
473
+ | anchor | positive | negative |
474
+ |:--------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
475
+ | <code>Are Capital Contributions categorized as either 'Initial' or 'Additional' in the accounts?</code> | <code>Capital Accounts<br><br>An individual capital account (the "Capital Accounts") will be maintained for each Participant and their Initial Capital Contribution will be credited to this account. Any Additional Capital Contributions made by any Participant will be credited to that Participant's individual Capital Account.</code> | <code>Section 4.3 Deposits and Payments 19</code> |
476
+ | <code>Are Capital Contributions categorized as either 'Initial' or 'Additional' in the accounts?</code> | <code>Capital Accounts<br><br>An individual capital account (the "Capital Accounts") will be maintained for each Participant and their Initial Capital Contribution will be credited to this account. Any Additional Capital Contributions made by any Participant will be credited to that Participant's individual Capital Account.</code> | <code>Section 2.1 The Fund agrees at its own expense to execute any and all documents, to furnish any and all information, and to take any other actions that may be reasonably necessary in connection with the qualification of the Shares for sale in those states that Integrity may designate.</code> |
477
+ | <code>Are Capital Contributions categorized as either 'Initial' or 'Additional' in the accounts?</code> | <code>Capital Accounts<br><br>An individual capital account (the "Capital Accounts") will be maintained for each Participant and their Initial Capital Contribution will be credited to this account. Any Additional Capital Contributions made by any Participant will be credited to that Participant's individual Capital Account.</code> | <code>Section 1.9 Integrity shall prepare and deliver reports to the Treasurer of the Fund and to the Investment Adviser on a regular, at least quarterly, basis, showing the distribution expenses incurred pursuant to this Agreement and the Plan and the purposes therefore, as well as any supplemental reports as the Trustees, from time to time, may reasonably request.</code> |
478
  * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
479
  ```json
480
  {
 
490
  - `per_device_train_batch_size`: 128
491
  - `per_device_eval_batch_size`: 128
492
  - `learning_rate`: 2e-05
 
493
  - `warmup_ratio`: 0.1
494
  - `fp16`: True
495
  - `load_best_model_at_end`: True
 
514
  - `adam_beta2`: 0.999
515
  - `adam_epsilon`: 1e-08
516
  - `max_grad_norm`: 1.0
517
+ - `num_train_epochs`: 3
518
  - `max_steps`: -1
519
  - `lr_scheduler_type`: linear
520
  - `lr_scheduler_kwargs`: {}
 
610
  </details>
611
 
612
  ### Training Logs
613
+ | Epoch | Step | Training Loss | loss | msmarco-distilbert-base-v2_cosine_map@100 |
+ |:----------:|:--------:|:-------------:|:----------:|:-----------------------------------------:|
+ | 0 | 0 | - | - | 0.6601 |
+ | 0.1453 | 100 | 1.5696 | - | - |
+ | 0.2907 | 200 | 0.7941 | - | - |
+ | 0.4360 | 300 | 0.6151 | - | - |
+ | 0.5814 | 400 | 0.5458 | - | - |
+ | 0.7267 | 500 | 0.5085 | - | - |
+ | 0.8721 | 600 | 0.4601 | - | - |
+ | 1.0131 | 697 | - | 0.3492 | - |
+ | 1.0044 | 700 | 0.4055 | - | - |
+ | 1.1497 | 800 | 0.3538 | - | - |
+ | 1.2951 | 900 | 0.2245 | - | - |
+ | 1.4404 | 1000 | 0.1821 | - | - |
+ | 1.5858 | 1100 | 0.1761 | - | - |
+ | 1.7311 | 1200 | 0.1872 | - | - |
+ | 1.8765 | 1300 | 0.169 | - | - |
+ | 2.0131 | 1394 | - | 0.2674 | - |
+ | 2.0087 | 1400 | 0.1502 | - | - |
+ | 2.1541 | 1500 | 0.1416 | - | - |
+ | 2.2994 | 1600 | 0.0914 | - | - |
+ | 2.4448 | 1700 | 0.0868 | - | - |
+ | 2.5901 | 1800 | 0.0854 | - | - |
+ | 2.7355 | 1900 | 0.0905 | - | - |
+ | 2.8808 | 2000 | 0.0888 | - | - |
+ | **2.9738** | **2064** | **-** | **0.2272** | **0.7455** |
639
 
640
  * The bold row denotes the saved checkpoint.
641
 
config.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "_name_or_path": "models/msmarco-distilbert-base-v2_triplet/final",
3
  "activation": "gelu",
4
  "architectures": [
5
  "DistilBertModel"
 
1
  {
2
+ "_name_or_path": "models/msmarco-distilbert-base-v2_triplet_legal/final",
3
  "activation": "gelu",
4
  "architectures": [
5
  "DistilBertModel"
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:0b43312ce769567621a34b39511377b9c3cbb6f85f1740d8c33c72d88d7acbe7
3
  size 265462608
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5b0dcaa159de5c2800f1c4623503b8f88d7bba2e9d2dcd5aefb59d7f3474597f
3
  size 265462608