znation HF Staff committed on
Commit
eb4214d
·
1 Parent(s): 990a2a3

clarifications

Files changed (2)
  1. index.html +2 -1
  2. xorbs.json +1 -1
index.html CHANGED
@@ -18,7 +18,8 @@
  <body>
  <div class="card">
  <h1>Visualizing Repo-level Dedupe</h1>
- <p>This visualization demonstrates the amount of <a target="_blank" rel="noopener noreferrer" href="https://huggingface.co/blog/from-files-to-chunks">chunk-level dedupe</a> within a repo or across a selection of repos. (For now, demonstrates a hardcoded selection.)</p>
+ <p>This visualization demonstrates the amount of <a target="_blank" rel="noopener noreferrer" href="https://huggingface.co/blog/from-files-to-chunks">chunk-level dedupe</a> across all public repos.</p>
+ <p>"Dedupe factor" is defined as the number of re-uses of a given "xorb". A "xorb" is a collection of content-defined chunks, typically around 1,000 chunks comprising up to 64 MB of total data.</p>
  </div>
  <div id="vis"></div>
  <script>
xorbs.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:efd8947a02dabec8b1b0228fb97d12ba923d4c33335e36598096fad3cde7f96f
+ oid sha256:feaca61dcd2c355fa10b3a69d844299b798587574d985693b31234ed550b1a66
  size 629739
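
Note on the new "dedupe factor" paragraph in index.html: the definition (number of re-uses of a given xorb) can be illustrated with a small sketch. This is not the page's actual script, and the schema of xorbs.json is not visible in this commit, so the input shape below (`path`, `xorbs`) and the function name `dedupeFactors` are hypothetical. It also assumes that "re-uses" means the total number of references to a xorb; the page may count differently (for example, references beyond the first).

```javascript
// Hypothetical input: each file lists the hashes of the xorbs it references.
// The field names are illustrative and are NOT the real schema of xorbs.json.
const files = [
  { path: "model-a/weights.bin", xorbs: ["x1", "x2", "x3"] },
  { path: "model-b/weights.bin", xorbs: ["x1", "x2", "x4"] },
  { path: "model-c/weights.bin", xorbs: ["x1", "x5"] },
];

// Count how many times each xorb is referenced across all files.
// Assumption: the dedupe factor of a xorb is its total reference count.
function dedupeFactors(fileList) {
  const counts = new Map();
  for (const file of fileList) {
    for (const xorb of file.xorbs) {
      counts.set(xorb, (counts.get(xorb) || 0) + 1);
    }
  }
  return counts;
}

const factors = dedupeFactors(files);
for (const [xorb, factor] of factors) {
  console.log(`${xorb}: referenced ${factor} time(s)`);
}
```

In this toy data, `x1` is shared by all three files and `x2` by two, while the remaining xorbs are each referenced once. A histogram of these counts is one plausible input for a visualization like the one this page describes.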