Spaces: bethgelab/lm-similarity

Joschka Strueber committed 15 days ago

Commit 0d09d9a (1 parent: 5623280)

[Ref] switch to KaTeX Css in html

Files changed (1)

app.py +20 -10

@@ -78,17 +78,27 @@ with gr.Blocks(title="LLM Similarity Analyzer", css=app_util.custom_css) as demo
     )
     gr.Markdown("## Information")
-    gr.HTML("""
-<script type="text/javascript" async
-  src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/MathJax.js?config=TeX-MML-AM_CHTML">
-</script>
-<p>We propose Chance Adjusted Probabilistic Agreement (<span>\(\operatorname{CAPA}\)</span>, or <span>\(\kappa_p\)</span>), a novel metric
-for model similarity which adjusts for chance agreement due to accuracy. Using CAPA, we find: (1) LLM-as-a-judge scores are \
-biased towards more similar models controlling for the model's capability. (2) Gain from training strong models on annotations \
-of weak supervisors (weak-to-strong generalization) is higher when the two models are more different. (3) Concerningly, model \
-errors are getting more correlated as capabilities increase.</p>
-""")
+    metric_info_html = r"""
+<!-- Include KaTeX CSS for styling -->
+<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.css" integrity="sha384-vZTGXXFDvM1R7zDKx2g5N5S4FcoFdTJuFTz1Xj2A2/J1j4fGmS7a6hLQ6ZPfF1sk" crossorigin="anonymous">
+<!-- Include KaTeX and its auto-render extension -->
+<script defer src="https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.js" integrity="sha384-6R6ckgSpF6yXUHg9+KJGXN9I+ik5U9dviDuzhSxrtk4AUaGr8/8Qovm6N9fl/hkz" crossorigin="anonymous"></script>
+<script defer src="https://cdn.jsdelivr.net/npm/[email protected]/dist/contrib/auto-render.min.js" integrity="sha384-mll67QQ8ErU7t8/QqU3m0Cq56E7i2xUeFYSv6O9V3CRjNdqPzqxK9z6gS9GQFj8D" crossorigin="anonymous"
+    onload="renderMathInElement(document.body);"></script>
+<div>
+  <p>
+    We propose Chance Adjusted Probabilistic Agreement ($\operatorname{CAPA}$, or $\kappa_p$), a novel metric
+    for model similarity which adjusts for chance agreement due to accuracy. Using CAPA, we find:
+  </p>
+  <ol>
+    <li>LLM-as-a-judge scores are biased towards more similar models controlling for the model's capability.</li>
+    <li>Gain from training strong models on annotations of weak supervisors (weak-to-strong generalization) is higher when the two models are more different.</li>
+    <li>Concerningly, model errors are getting more correlated as capabilities increase.</li>
+  </ol>
+</div>
+"""
+    gr.HTML(value=metric_info_html)
     with gr.Row():
         gr.Image(value="data/table_capa.png", label="Comparison of different similarity metrics for multiple-choice questions", elem_classes="image_container", interactive=False)
     gr.Markdown("""
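
A note on the added lines: the new HTML writes math as `$...$`, but calls `renderMathInElement(document.body)` without options. As far as the KaTeX auto-render documentation describes it, single-dollar delimiters are not in the default delimiter list, so they may need to be passed explicitly. The following is a minimal sketch of one way to build such a snippet, not the author's implementation. `build_katex_html`, `KATEX_BASE`, and `DELIMITERS` are hypothetical names, and `VERSION` is a placeholder because the pinned KaTeX version is not recoverable from the diff. The `integrity` attributes are omitted for the same reason.

```python
import json

# Hypothetical base URL: the KaTeX version is not recoverable from the diff,
# so VERSION is a placeholder to be replaced with a pinned release.
KATEX_BASE = "https://cdn.jsdelivr.net/npm/katex@VERSION/dist"

# Delimiters passed explicitly to renderMathInElement.
# "$$" is listed before "$" so display math is matched first.
DELIMITERS = [
    {"left": "$$", "right": "$$", "display": True},
    {"left": "$", "right": "$", "display": False},
]


def build_katex_html(body_html: str, base_url: str = KATEX_BASE) -> str:
    """Return an HTML snippet that loads KaTeX and auto-renders math in body_html."""
    # json.dumps emits double quotes only, so the options object can sit
    # safely inside a single-quoted onload attribute.
    options = json.dumps({"delimiters": DELIMITERS})
    return (
        f'<link rel="stylesheet" href="{base_url}/katex.min.css">\n'
        f'<script defer src="{base_url}/katex.min.js"></script>\n'
        f'<script defer src="{base_url}/contrib/auto-render.min.js"\n'
        f"    onload='renderMathInElement(document.body, {options});'></script>\n"
        f"<div>{body_html}</div>\n"
    )


# Raw string so that backslashes in the LaTeX source are kept verbatim.
metric_info_html = build_katex_html(
    r"<p>We propose Chance Adjusted Probabilistic Agreement "
    r"($\operatorname{CAPA}$, or $\kappa_p$).</p>"
)
# In app.py this string would be passed on as in the commit:
#     gr.HTML(value=metric_info_html)
```

Whether `<script>` tags placed inside a `gr.HTML` component are executed at all may depend on the Gradio version; that is not verified here. If they are not, loading the scripts through the page head, or using the `$`-style delimiters directly in a `gr.Markdown` component, are alternatives worth checking against the Gradio documentation.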