grostaco committed on
Commit d54ea16 · 1 Parent(s): 34f2c31

feat: add implementation details page

Files changed (1)
  1. pages/losses.py +39 -0
pages/losses.py ADDED
@@ -0,0 +1,39 @@
+ import streamlit as st
+ from st_pages import add_indentation
+
+ add_indentation()
+
+ st.title('Loss functions')
+ st.subheader('SDM Loss')
+ st.markdown('''
+ The similarity distribution matching (SDM) loss is the KL divergence between the predicted
+ image-to-text and text-to-image similarity distributions and the label distribution.
+
+ We define $f^v$ and $f^t$ to be the global representations of the visual and textual features, respectively.
+ The cosine similarity $sim(u, v) = \\frac{u \\cdot v}{\\lVert u \\rVert \\lVert v \\rVert}$ is used to compute the predicted label probabilities.
+
+ We define $y_{i, j}=1$ if the visual feature $f^v_i$ matches the textual feature $f^t_j$, else $y_{i, j}=0$.
+ The predicted label distribution can be formulated as''')
+ st.latex(r'''
+ \mathbf{p_i} = \sigma(sim(f^v_i, f^t))
+ ''')
+
+ st.markdown('''
+ We can define the image-to-text loss as
+ ''')
+
+ st.latex(r'''
+ \mathcal{L}_{i2t} = \mathrm{KL}(\mathbf{p_i} \| \mathbf{q_i})
+ ''')
+
+ st.markdown('where $\\mathbf{q_i}$, the true probability distribution, is defined as')
+
+ st.latex(r'''
+ q_{i, j} = \frac{y_{i, j}}{\sum_{k=1}^{N} y_{i, k}}
+ ''')
+
+ st.markdown('This normalization is needed because an image can have multiple correct labels, so each row of $y$ is rescaled to sum to 1.')
+
+
+ st.subheader('IRR (MLM) Loss')
+ st.subheader('ID Loss')
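
The SDM formulas on this page can be sketched numerically. The block below is a minimal plain-Python illustration, not code from this commit. It assumes that $\sigma$ is a softmax over the text index $j$, that no temperature scaling is applied, and that the loss is averaged over images. The function names and the `eps` term (added so that `log` stays finite where $q_{i, j}=0$) are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """sim(u, v) = (u . v) / (||u|| ||v||) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Softmax over a list of scores (assumed meaning of sigma on the page)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sdm_i2t_loss(image_feats, text_feats, labels, eps=1e-8):
    """Mean over images of KL(p_i || q_i).

    image_feats: list of visual feature vectors f^v_i
    text_feats:  list of textual feature vectors f^t_j
    labels:      labels[i][j] = 1 if image i matches text j, else 0
                 (each row is assumed to contain at least one 1)
    eps:         hypothetical smoothing term, not part of the page's formulas
    """
    total = 0.0
    for f_v, y_row in zip(image_feats, labels):
        # p_i: predicted distribution over texts for image i
        p = softmax([cosine_similarity(f_v, f_t) for f_t in text_feats])
        # q_i: label row normalized so that multiple matches still sum to 1
        y_sum = sum(y_row)
        q = [y / y_sum for y in y_row]
        total += sum(p_j * math.log(p_j / (q_j + eps)) for p_j, q_j in zip(p, q))
    return total / len(image_feats)
```

Calling `sdm_i2t_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]], [[1, 0], [0, 1]])` scores a batch in which each image matches the text with the same index. A text-to-image term could be obtained under the same assumptions by swapping the roles of the two feature lists and transposing the labels.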