Update wrapup.md #1, opened by veneres
wrapup.md CHANGED
@@ -1,6 +1,6 @@
 ### Putting it all together
 
-When you use the document encoder in an indexing pipeline, the
+When you use the document encoder in an indexing pipeline, the rewritten document contents are indexed:
 
 <div class="pipeline">
 <div class="df" title="Document Frame">D</div>
@@ -18,7 +18,7 @@ import pyt_splade
 dataset = pt.get_dataset('irds:msmarco-passage')
 splade = pyt_splade.SpladeFactory()
 
-indexer = pt.IterDictIndexer('./msmarco_psg',
+indexer = pt.IterDictIndexer('./msmarco_psg', pretokenised=True)
 
 indxer_pipe = splade.indexing() >> indexer
 indxer_pipe.index(dataset.get_corpus_iter())
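
Note on the `pretokenised=True` change: the sketch below illustrates, without PyTerrier or a real model, the kind of record a pretokenised indexer consumes. It assumes the document encoder emits a `toks` field mapping each term to a weight; `toy_document_encoder` is a hypothetical stand-in for `splade.indexing()`, not part of `pyt_splade`, and it uses plain term counts where a learned encoder would produce expanded, re-weighted terms.

```python
from collections import Counter


def toy_document_encoder(doc):
    """Hypothetical stand-in for a learned sparse document encoder.

    Returns the document with a 'toks' dict (term -> weight) in place of
    raw text. Here the weights are just whitespace-token counts.
    """
    counts = Counter(doc["text"].lower().split())
    return {"docno": doc["docno"], "toks": dict(counts)}


# A one-document corpus in the iter-dict shape used by the indexing pipeline.
corpus = [{"docno": "d1", "text": "sparse retrieval with sparse encoders"}]

# The "rewritten document contents" are what gets handed to the indexer.
encoded = [toy_document_encoder(d) for d in corpus]
print(encoded[0])
```

Because the terms and weights are already decided by the encoder, the indexer should store them as given instead of tokenising the text again, which is the reason for passing `pretokenised=True`.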