Add essential files for deployment
- .gitattributes +9 -0
- docs/faiss/document_lookup.txt +463 -0
- docs/faiss/index.faiss +3 -0
- docs/faiss/index.pkl +3 -0
- docs/faiss/metadata.pkl +3 -0
- docs/pdfs/paper.pdf +3 -0
- docs/pdfs/resume.pdf +3 -0
- docs/youtube/BinThere.ai.m4a +3 -0
- docs/youtube/BinThere.ai_transcript.txt +1 -0
- docs/youtube/Synthia by Nuvela-AI.m4a +3 -0
- docs/youtube/Synthia by Nuvela-AI_transcript.txt +1 -0
- gunicorn_config.py +32 -0
.gitattributes
ADDED
@@ -0,0 +1,9 @@
+*.faiss filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+docs/faiss/*.faiss filter=lfs diff=lfs merge=lfs -text
+docs/faiss/*.pkl filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.jpeg filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.pdf filter=lfs diff=lfs merge=lfs -text
+*.m4a filter=lfs diff=lfs merge=lfs -text
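The nine attribute lines in the .gitattributes hunk route matching paths through Git LFS. As a rough illustration only, the sketch below checks which of this commit's files those patterns would catch. It approximates Git's matching with Python's `fnmatchcase`: a pattern without a slash is compared against the file name, and a pattern with a slash against the whole path. Git's real matching rules are richer than this, and the helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, not part of this repository.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# The attribute lines exactly as added in this commit.
GITATTRIBUTES = """\
*.faiss filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
docs/faiss/*.faiss filter=lfs diff=lfs merge=lfs -text
docs/faiss/*.pkl filter=lfs diff=lfs merge=lfs -text
*.jpg filter=lfs diff=lfs merge=lfs -text
*.jpeg filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.pdf filter=lfs diff=lfs merge=lfs -text
*.m4a filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in fields[1:]:
            patterns.append(fields[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: does any LFS pattern match this repo-relative path?"""
    for pattern in patterns:
        # Assumption: slash-free patterns match the file name at any depth,
        # patterns containing a slash match the full relative path.
        target = path if "/" in pattern else basename(path)
        if fnmatchcase(target, pattern):
            return True
    return False


if __name__ == "__main__":
    patterns = lfs_patterns(GITATTRIBUTES)
    for path in (
        "docs/faiss/document_lookup.txt",
        "docs/faiss/index.faiss",
        "docs/youtube/Synthia by Nuvela-AI.m4a",
        "gunicorn_config.py",
    ):
        print(path, is_lfs_tracked(path, patterns))
```

Because a slash-free pattern such as `*.faiss` already applies at any directory depth, the two `docs/faiss/...` rules appear to be redundant with `*.faiss` and `*.pkl`; they are harmless but could likely be dropped.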
docs/faiss/document_lookup.txt
ADDED
@@ -0,0 +1,463 @@
+Document 0:
+Source: https://julien-ser.github.io/JulienSerbanescu/
+Type: Unknown
+Content Preview: Julien Serbanescu
+
+
+
+
+Julien Serbanescu...
+--------------------------------------------------------------------------------
+Document 1:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: UnAnswGen: A Systematic Approach for Generating
+Unanswerable Questions in Machine Reading Comprehens...
+--------------------------------------------------------------------------------
+Document 2:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Unlike existing datasets like SQuAD2.0, which do not account for
+the reasons behind question unanswe...
+--------------------------------------------------------------------------------
+Document 3:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: query reformulation. The resulting UnAnswGen dataset and asso-
+ciated software workflow are made pub...
+--------------------------------------------------------------------------------
+Document 4:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: on the first page. Copyrights for components of this work owned by others than the
+author(s) must be...
+--------------------------------------------------------------------------------
+Document 5:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Development in Information Retrieval in the Asia Pacific Region (SIGIR-AP
+’24), December 9–12, 2024,...
+--------------------------------------------------------------------------------
+Document 6:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: should avoid responding rather than making uncertain guesses,
+demonstrating their language comprehen...
+--------------------------------------------------------------------------------
+Document 7:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: systems advance to meet the complexity of real-world information
+needs, there is an increasing deman...
+--------------------------------------------------------------------------------
+Document 8:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: SIGIR-AP ’24, December 9–12, 2024, Tokyo, Japan Hadiseh Moradisani, Fattane Zarrinkalam, Julien Serb...
+--------------------------------------------------------------------------------
+Document 9:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: SQuAD2-CR
+[17] Wikip
+edia Cr
+owdsourcing (86,821
+- 43,498) 6 [19] 2020
+Dur
+eader [11] Chinese
+search...
+--------------------------------------------------------------------------------
+Document 10:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: instance, it contains only 3,350 unanswerable questions labeled
+with No Information. Moreover, as th...
+--------------------------------------------------------------------------------
+Document 11:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: ing unanswerable questions and enables the exploration of various
+causes of unanswerability. The onl...
+--------------------------------------------------------------------------------
+Document 12:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: unanswerable questions into answerable ones.
+To develop a multi-label MRC dataset with unanswerable ...
+--------------------------------------------------------------------------------
+Document 13:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: mation for each input question. Second, the generated candidate
+unanswerable questions are evaluated...
+--------------------------------------------------------------------------------
+Document 14:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: The advantages of our work are twofold: (1) Our implementation
+of the proposed software workflow all...
+--------------------------------------------------------------------------------
+Document 15:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: of SQuAD2.0 that includes multi-labeled unanswerable questions.
+Figure 1 presents the overview of ou...
+--------------------------------------------------------------------------------
+Document 16:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: UnAnswGen: A Systematic Approach for Generating Unanswerable Questions in Machine Reading Comprehens...
+--------------------------------------------------------------------------------
+Document 17:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: linguistic dimensions such as entity swap, number swap, negation,
+antonym, mutual exclusion, and no ...
+--------------------------------------------------------------------------------
+Document 18:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: passage, and its candidate unanswerable questions. We have imple-
+mented and integrated a comprehens...
+--------------------------------------------------------------------------------
+Document 19:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: placement: For each entity in the input answerable question, we
+replace it with another entity of th...
+--------------------------------------------------------------------------------
+Document 20:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: corresponding context.
+Number Swap. Number Swap involves modifying a question to
+potentially render ...
+--------------------------------------------------------------------------------
+Document 21:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Time magazine named her one of the most 100 influential people of the
+century? could be Time magazin...
+--------------------------------------------------------------------------------
+Document 22:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: (2) Replacement: Replace each identified word with its antonym,
+ensuring the modified question remai...
+--------------------------------------------------------------------------------
+Document 23:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: SIGIR-AP ’24, December 9–12, 2024, Tokyo, Japan Hadiseh Moradisani, Fattane Zarrinkalam, Julien Serb...
+--------------------------------------------------------------------------------
+Document 24:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: of her music?, utilizing the Detection and Removal approach might
+lead to a question such as Beyoncé...
+--------------------------------------------------------------------------------
+Document 25:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: formation available in the given context, the question becomes
+inherently unanswerable. For instance...
+--------------------------------------------------------------------------------
+Document 26:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: No Information. Similar to [33], to modify the original answer-
+able questions by considering this c...
+--------------------------------------------------------------------------------
+Document 27:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: California is also home to a large homegrown surf and skateboard cul-
+ture.... This method ensures t...
+--------------------------------------------------------------------------------
+Document 28:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: 𝑎𝑖 , and for each candidate unanswerable question (𝑐𝑗,𝑙𝑗 ) ∈𝐶𝑞𝑖 , we
+conduct the following evaluatio...
+--------------------------------------------------------------------------------
+Document 29:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: 𝑞𝑖 , denoted as 𝑞′
+𝑖 , to 𝑈𝑞𝑖 , and attribute 𝑙𝑗 as the reason for the
+unanswerability of 𝑞′
+𝑖 .
+The...
+--------------------------------------------------------------------------------
+Document 30:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: are answerable. This dataset, developed through crowdsourcing,
+consists of a training set with 130,3...
+--------------------------------------------------------------------------------
+Document 31:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: swerable candidate questions from a single modification process.
+Consequently, from the 86,821 answe...
+--------------------------------------------------------------------------------
+Document 32:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: UnAnswGen: A Systematic Approach for Generating Unanswerable Questions in Machine Reading Comprehens...
+--------------------------------------------------------------------------------
+Document 33:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: ele
+ctra-base-squad2 74.8 84.7 84.7 67.9 87.8 93.5 72.2 81.6 89.9
+r
+oberta-large-squad 78.7 90 90 69...
+--------------------------------------------------------------------------------
+Document 34:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: questions by returning a null or empty string when no appropri-
+ate answer is found within the conte...
+--------------------------------------------------------------------------------
+Document 35:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: score indicating the model’s certainty in its provided answer. For
+unanswerable questions, these mod...
+--------------------------------------------------------------------------------
+Document 36:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Table 4: Statistics on UnAnswGen dataset.
+Unansw
+erability Classes #
+of Questions Per
+centage A
+vera...
+--------------------------------------------------------------------------------
+Document 37:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: of the final unanswerable question set. Specifically, questions from
+the Negation category account f...
+--------------------------------------------------------------------------------
+Document 38:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: under the Negation label, 31.95 unanswerable questions under the
+Antonym label, and only 2.6 unanswe...
+--------------------------------------------------------------------------------
+Document 39:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: indicates that the question is completely unrelated to the context,
+whereas a score of 1 indicates s...
+--------------------------------------------------------------------------------
+Document 40:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: SIGIR-AP ’24, December 9–12, 2024, Tokyo, Japan Hadiseh Moradisani, Fattane Zarrinkalam, Julien Serb...
+--------------------------------------------------------------------------------
+Document 41:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: SQuAD2-CR+UnAnsw
+Gen 71.93 71.93 92.48 96.09 51.43 58.27 46.83 63.78 69.17 81.77 64.86 78.69 86.8 92...
+--------------------------------------------------------------------------------
+Document 42:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: to evaluate the UnAnswerGen dataset against these criteria. Table
+6 presents the results of Krippend...
+--------------------------------------------------------------------------------
+Document 43:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: CR, which already includes multi-class labeling of unanswerable
+questions, and (2) the SQuAD2-CR tra...
+--------------------------------------------------------------------------------
+Document 44:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: underwent training on both the enhanced SQuAD2.0+ UnAnswGen
+and the original SQuAD2-CR datasets, wit...
+--------------------------------------------------------------------------------
+Document 45:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: BERTa, and 1% for Electra were observed. The balanced dataset
+successfully mitigates issues related ...
+--------------------------------------------------------------------------------
+Document 46:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: hanced MRC datasets, with a focus on including multi-label unan-
+swerable questions. We have develop...
+--------------------------------------------------------------------------------
+Document 47:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: flow to enrich other datasets, such as HotPotQA [35] and Natural
+Questions [16], with multi-label un...
+--------------------------------------------------------------------------------
+Document 48:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: UnAnswGen: A Systematic Approach for Generating Unanswerable Questions in Machine Reading Comprehens...
+--------------------------------------------------------------------------------
+Document 49:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: 3115–3119.
+[4] Christopher Clark and Matt Gardner. 2017. Simple and effective multi-paragraph
+readin...
+--------------------------------------------------------------------------------
+Document 50:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: ciation for Computational Linguistics: EMNLP 2023. 7349–7360.
+[9] Kilem L Gwet. 2011. On the Krippen...
+--------------------------------------------------------------------------------
+Document 51:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: questions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33.
+6529–6537.
+[13...
+--------------------------------------------------------------------------------
+Document 52:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: [16] Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur
+Parikh, Chris Alb...
+--------------------------------------------------------------------------------
+Document 53:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: 2022. Ptau: Prompt tuning for attributing unanswerable questions. In Proceedings
+of the 45th Interna...
+--------------------------------------------------------------------------------
+Document 54:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Answer Networks for Machine Reading Comprehension. In Proceedings of the
+56th Annual Meeting of the ...
+--------------------------------------------------------------------------------
+Document 55:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Majumder, and Li Deng. 2016. Ms marco: A human-generated machine reading
+comprehension dataset. (201...
+--------------------------------------------------------------------------------
+Document 56:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: Unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822 (2018).
+[30] Pranav Rajpurkar, Jia...
+--------------------------------------------------------------------------------
+Document 57:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: 7th CCF International Conference, NLPCC 2018, Hohhot, China, August 26–30, 2018,
+Proceedings, Part I...
+--------------------------------------------------------------------------------
+Document 58:
+Source: docs/pdfs\paper.pdf
+Type: Unknown
+Content Preview: [36] Changchang Zeng, Shaobo Li, Qin Li, Jie Hu, and Jianjun Hu. 2020. A survey
+on machine reading c...
+--------------------------------------------------------------------------------
+Document 59:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: Julien Serbanescu
+437-260-3435 | [email protected] | linkedin.com/in/julien-serbanescu-6ba52a241 ...
+--------------------------------------------------------------------------------
+Document 60:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: Innovation/Creativity, Technical Communication, Mentoring
+Education
+Computer Engineering Co-op Major...
+--------------------------------------------------------------------------------
+Document 61:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: on publications. /external-link-altUtilized various NLP methods such as NLTK and SpaCy in Python to ...
+--------------------------------------------------------------------------------
+Document 62:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: application (Windows EXE) for cybersecurity threat detection and testing
+Organizations
+Guelph AI Clu...
+--------------------------------------------------------------------------------
+Document 63:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: • /external-link-altLed a team to develop an AI assistant inspired by Jarvis from Iron Man, ensuring...
+--------------------------------------------------------------------------------
+Document 64:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: • Developing robotics software using Docker and Linux, implementing Python-based control for Webots
+...
+--------------------------------------------------------------------------------
+Document 65:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: codes and provide medicine information, working on backend, delegating frontend and bridging
+• /exte...
+--------------------------------------------------------------------------------
+Document 66:
+Source: docs/pdfs\resume.pdf
+Type: Unknown
+Content Preview: with a ReactJS frontend dashboard for configuring and managing academic research projects
+GAN to Gen...
+--------------------------------------------------------------------------------
+Document 67:
+Source: BinThere.ai.m4a
+Type: audio_transcription
+Content Preview: All right, so today we're going to be quickly demoing binthere.ai. Now, what this does, it will det...
+--------------------------------------------------------------------------------
+Document 68:
+Source: BinThere.ai.m4a
+Type: audio_transcription
+Content Preview: list it as biodegradable piece. So if I lower it down just because it's getting the white backgroun...
+--------------------------------------------------------------------------------
+Document 69:
+Source: BinThere.ai.m4a
+Type: audio_transcription
+Content Preview: This is our new max score. And then as I listed there, and then it says it typically goes in a comp...
+--------------------------------------------------------------------------------
+Document 70:
+Source: Synthia by Nuvela-AI.m4a
+Type: audio_transcription
+Content Preview: This is the user interface review of Cynthia, which is a service that makes research papers smarter...
+--------------------------------------------------------------------------------
+Document 71:
+Source: Synthia by Nuvela-AI.m4a
+Type: audio_transcription
+Content Preview: load in certain fragments of other papers that have relevant pieces of information to what exactly...
+--------------------------------------------------------------------------------
+Document 72:
+Source: Synthia by Nuvela-AI.m4a
+Type: audio_transcription
+Content Preview: machinery reading comprehension in order to find user sentiment. Something like that. And then we...
+--------------------------------------------------------------------------------
+Document 73:
+Source: Synthia by Nuvela-AI.m4a
+Type: audio_transcription
+Content Preview: be another tool that the model context protocol system using and topic would actually be able to e...
+--------------------------------------------------------------------------------
+Document 74:
+Source: Synthia by Nuvela-AI.m4a
+Type: audio_transcription
+Content Preview: to use. And it helps just make research smarter, more efficient, and better for users overall. No...
+--------------------------------------------------------------------------------
+Document 75:
+Source: Synthia by Nuvela-AI.m4a
+Type: audio_transcription
+Content Preview: on our existing formatted proxy late-tech code. So that's my overview for our Cynthia front-end or...
+--------------------------------------------------------------------------------
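The lookup file follows a regular layout: a `Document N:` line, then `Source:`, `Type:`, and a `Content Preview:` that may span several lines, with records closed by a row of 80 hyphens. A minimal sketch of reading that layout back into records might look like the following. The function name `parse_lookup` and the dictionary keys are hypothetical, the sample reuses two records from the file, and the backslash-to-slash normalization is an assumption added here because the PDF sources mix separators (`docs/pdfs\paper.pdf`).

```python
from collections import Counter

SEPARATOR = "-" * 80  # record delimiter used in document_lookup.txt

# Two records copied from the lookup file, joined the way the file joins them.
SAMPLE_RECORDS = [
    "Document 1:\nSource: docs/pdfs\\paper.pdf\nType: Unknown\n"
    "Content Preview: UnAnswGen: A Systematic Approach for Generating\n"
    "Unanswerable Questions in Machine Reading Comprehens...",
    "Document 67:\nSource: BinThere.ai.m4a\nType: audio_transcription\n"
    "Content Preview: All right, so today we're going to be quickly demoing "
    "binthere.ai. Now, what this does, it will det...",
]
SAMPLE = "".join(record + "\n" + SEPARATOR + "\n" for record in SAMPLE_RECORDS)


def parse_lookup(text):
    """Split the lookup text into records; hypothetical helper, not repo code."""
    records = []
    for block in text.split(SEPARATOR):
        lines = block.strip("\n").split("\n")
        if len(lines) < 4 or not lines[0].startswith("Document "):
            continue  # skip the empty tail after the last separator
        records.append({
            "id": int(lines[0][len("Document "):].rstrip(":")),
            # Assumption: normalize Windows-style separators in source paths.
            "source": lines[1].partition(": ")[2].replace("\\", "/"),
            "type": lines[2].partition(": ")[2],
            # The preview keeps its internal line breaks.
            "preview": "\n".join(lines[3:]).partition(": ")[2],
        })
    return records


if __name__ == "__main__":
    records = parse_lookup(SAMPLE)
    print(Counter(record["type"] for record in records))
```

Grouping by `type` in this way makes it easy to see that only the audio transcripts carry a meaningful type label, while the web page and PDF chunks are all recorded as `Unknown`.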
docs/faiss/index.faiss
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba19108639e5fbf109bd22d4b856c4b50c6b1a91302f07bab21215029ed95b83
+size 311341
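What the diff shows for each binary file is a Git LFS pointer rather than the file itself: three `key value` lines giving the spec version, a SHA-256 object id, and the size in bytes. As a hedged sketch, the snippet below parses such a pointer and checks whether a downloaded blob matches it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical; the pointer text is the one recorded for docs/faiss/index.faiss.

```python
import hashlib

# Pointer text as committed for docs/faiss/index.faiss.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ba19108639e5fbf109bd22d4b856c4b50c6b1a91302f07bab21215029ed95b83
size 311341
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of an LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True when the blob has the size and SHA-256 digest the pointer records."""
    if pointer["algorithm"] != "sha256":
        return False  # this sketch only handles SHA-256 object ids
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer["algorithm"], pointer["size"])
```

A check like this can help confirm that a deployment actually pulled the LFS objects: if the file on disk is still the small pointer text, its size and digest will not match the values in the pointer.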
docs/faiss/index.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ac8f74a32c54fd338376934a9cd7b6691f26569029ff505181dbaa7d401e675
+size 79972
docs/faiss/metadata.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:30f47cde2aa55557b2170a4497e54ba46aeb4e65b04834da3397dad759807006
+size 1134
docs/pdfs/paper.pdf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad1c9232f3adff51a2b33725926753212e4ba90ffc0f0e6d0ba44224dae07ce3
+size 1000812
docs/pdfs/resume.pdf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3ff52c5befd12bbe850e9ab81e9844c71bd4557a72d27fea36c11252f24dbba9
+size 135227
docs/youtube/BinThere.ai.m4a
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c8f178deba8ddb2eb8cbdf2a60e8b4d4a4124891188ec0084d9f55d4675f252
+size 2210006
docs/youtube/BinThere.ai_transcript.txt
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
All right, so today we're going to be quickly demoing BinThere.ai. What this does is detect your garbage, categorize it, and give you suggestions to reuse it in a fun, interactive way. So let's just get right into it and see what it does. First, we have a streamlined interface. We changed the colors to make it fit in seamlessly with the background, for starters. But let's get right into the detection. This will actually detect all types of waste live, and we're running a model that we got from YOLO. If I go down onto a white background, this works better because it contrasts well. For example, if I put a bottle in, it will detect it as plastic, as that kind of waste, and we'll have that section there. If I replace it with, say, this apple, it will list it as biodegradable. If I lower it down, just so it's against the white background, it will detect that. And then when I press stop detection, it will stop detecting and save the last known item that was available, which was biodegradable. Now from here, we can press analyze and it will give us some insights on this. It takes a second because we're using OpenAI's API to do this. But once we get the results, you will see the text below. So it tells us the energy saved. You can also get a source showing roughly how we did the calculation. And that's where it is. So if we go back, I have to press analyze again. It'll take a second, but yeah, I should open it in a new tab. But it should give us the results quickly. Yeah, exactly. Joules of energy saved. This is our new max score. And then, as listed there, it says it typically goes in a compost bin, to check local guidelines, and to avoid contaminants in it as well. It gives us options of what to do. It also gives us local laws in the city. So it's a bit more applicable to the user than, say, other existing technologies that are similar. We also have ways to reuse it.
For example, composting at home, making apple cider vinegar, or feeding it to animals or livestock, to help reduce waste. And it gives you more constructive options than just simply throwing it out in the right bin. So we actually want to add some extra steps for you, so you can actually reach your energy-saving amount. And yeah, that was a quick demo of BinThere.ai. I hope you enjoyed it.
docs/youtube/Synthia by Nuvela-AI.m4a
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b36fac7c00c1d10b03f573acb806315037ad8a8b666b630697d96d8c86195200
size 4736121
docs/youtube/Synthia by Nuvela-AI_transcript.txt
ADDED
@@ -0,0 +1 @@
This is the user interface review of Synthia, which is a service that makes research papers smarter through AI, specifically the Model Context Protocol. So first, let's create a project. We're going to create our paper. I have this previously loaded example, which is unanswerability in question answering frameworks. That'll be my name. And this will be the summary of what the paper entails. This is similar to a task I worked on over the summer, so this relates to me a bit, and I understand this task a bit more. Let's create this new project. As we create a new project, we'll see it's under the list of papers. We have our details. And then under fragments, this is where the Model Context Protocol really sets in. Based on a Pinecone database, it will be able to load in certain fragments of other papers that have relevant pieces of information for what exactly we're looking for. For example, this fragment, this segment of this introduction to an unanswerability paper, is going to be very important in our paper, so it suggests we can use that. Then we have causes of hallucination in LLMs, hallucination of large language models, et cetera. And they each have their own unique features as fragments. We start with 10 by default. You could add your own fragments. You could also delete existing fragments, like I deleted those two. Let's say I want to do MRC sentiment analysis. And then we're going to enter it in. Author, we're going to do Mark Noble. Let's just say Mark Noble. And then year, 2015. Summary: using machine reading comprehension in order to find user sentiment. Something like that. And then we'll just enter a certain link, like a Devpost link, or we'll put in a GitHub link for now. If we add the fragment, you'll see it's here. And if we go to source, it can be accessed. Now, these other sources are just proxy papers for now. But as we implement it and we load our Pinecone database, it will have actual papers within it.
Now from here, there are many things we could do. Let's start with generating citations. This will take our existing paper fragments. They have predefined authors and years, and based on that, we loaded them in citation form. And now, an LLM would help us use different methods as well. This would also be another tool that the Model Context Protocol system using Anthropic would actually be able to execute. So as you see, these are all the citations we have. We also have another feature called source analysis where, again, we do one more semantic match with our project summary and title. We semantically match them with each of these fragments, user-added and normal. And then we would get sort of a percent match. Like, oh, for example, using the SQuAD 2.0 database is 94% accurate to what we're actually detailing with our paper; detecting unanswerable questions, similar story there. As you can see, it's a very robust system that gives more insights to the paper author as to, oh, here are some sources that you can use, and here's how accurate they are to what you want to use. And it just helps make research smarter, more efficient, and better for users overall. Now, this last feature gives you a generated paper. It will generate some LaTeX code, and it'll also show you a preview of what this may look like. Again, this is sort of proxy data based off of our existing fragments, and another tool from the Model Context Protocol will be able to do this. Again, we also implement a Cohere embedding as well, just to make sure we have a vector database so every semantic match would work perfectly. Even this generation would work well as well. But I think a more general LLM would be implemented here. And finally, for a PDF preview, it also gives us an actual PDF-looking document of this sort, based on our existing formatted proxy LaTeX code.
So that's my overview of our Synthia front-end, or the user experience throughout the process. Obviously, the backend would make this a bit more robust and customizable towards the user, using features such as Model Context Protocol, Cohere embeddings, and Pinecone database storage. And obviously, running it on Anthropic would be the thing with Model Context Protocol. Thank you very much for listening.
gunicorn_config.py
ADDED
@@ -0,0 +1,32 @@
# Gunicorn configuration file
import multiprocessing

# Worker processes
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = 'sync'
worker_connections = 1000  # has no effect with the 'sync' worker class

# Timeouts
timeout = 120  # 2 minutes
graceful_timeout = 120
keepalive = 5

# Logging
accesslog = '-'  # '-' logs to stdout
errorlog = '-'  # '-' logs to stderr
loglevel = 'info'

# Process naming
proc_name = 'julien-serbanescu-app'

# Server mechanics
daemon = False
pidfile = None
umask = 0  # no permission bits masked on files Gunicorn writes
user = None
group = None
tmp_upload_dir = None

# SSL
keyfile = None
certfile = None
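As a quick sanity check on the `workers` expression in gunicorn_config.py, the sketch below evaluates the same `cpu_count * 2 + 1` formula for a given core count. The helper name `default_workers` is hypothetical and is not part of the committed file; it only mirrors the arithmetic.

```python
import multiprocessing


def default_workers(cpu_count: int) -> int:
    """Mirror the config's formula: two workers per core, plus one."""
    return cpu_count * 2 + 1


if __name__ == "__main__":
    # Report what the config would resolve to on the current machine.
    cores = multiprocessing.cpu_count()
    print(f"{cores} cores -> {default_workers(cores)} workers")
```

With the 'sync' worker class each worker serves one request at a time, so this number is also roughly the ceiling on concurrent requests. Each worker is a separate process, so memory use may scale with the worker count if the app loads large assets such as the FAISS index at import time.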