Commit bf3e769 by SinaAhmadi
1 Parent(s): 704e535
Update app.py
app.py
CHANGED
@@ -95,29 +95,29 @@ def normalize(text, language_script):
     return hypotheses[0]
 
 title = """
-
-
-
-
-
-
-
+<center><strong><font size='8'>Script Normalization for Unconventional Writing</font></strong></center>
+<h3 style="font-weight: 450; font-size: 1rem; margin: 0rem">
+[<a href="https://sinaahmadi.github.io/docs/articles/ahmadi2023acl.pdf" style="color:blue;">Paper (ACL 2023)</a>]
+[<a href="https://sinaahmadi.github.io/docs/slides/ahmadi2023acl_slides.pdf" style="color:blue;">Slides</a>]
+[<a href="https://github.com/sinaahmadi/ScriptNormalization" style="color:blue;">GitHub</a>]
+[<a href="https://s3.amazonaws.com/pf-user-files-01/u-59356/uploads/2023-06-04/rw32pwp/ACL2023.mp4" style="color:blue;">Presentation</a>]
+</h3>
 """
 
 description = """
-
-
-
-
-
-
-
-
-
-
-
-
-
+<ul>
+<li style="font-size:160%;">"<em>mar7aba!</em>"</li>
+<li style="font-size:160%;">"<em>هاو ئار یوو؟</em>"</li>
+<li style="font-size:160%;">"<em>Μπιάνβενου α σετ ντεμό!</em>"</li>
+</ul>
+
+<p style="font-size:160%;">What do all these sentences have in common? You are greeted in Arabic with "<em>mar7aba</em>" written in the Latin script, then asked how you are ("<em>هاو ئار یوو؟</em>") in English using the Perso-Arabic script of Kurdish, and then welcomed to this demo in French ("<em>Μπιάνβενου α σετ ντεμό!</em>") written in the Greek script. All these sentences are written in an <strong>unconventional</strong> script.</p>
+
+<p style="font-size:160%;">Although you may find these sentences risible, unconventional writing is a common practice among millions of speakers in bilingual communities. In our paper entitled "<a href="https://sinaahmadi.github.io/docs/articles/ahmadi2023acl.pdf" target="_blank"><strong>Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities</strong></a>", we shed light on this problem and propose an approach to normalize noisy text in unconventional writing.</p>
+
+<p style="font-size:160%;">This demo deploys a few models that are trained for <strong>the normalization of unconventional writing</strong>. Please note that this tool is not a spell-checker and cannot correct errors beyond character normalization. For better performance, you can apply hard-coded rules to the input and then pass it to the models, which makes a hybrid system.</p>
+
+For more information, check out the project on GitHub: <a href="https://github.com/sinaahmadi/ScriptNormalization" target="_blank"><strong>https://github.com/sinaahmadi/ScriptNormalization</strong></a>
 """
 
 examples = [
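The description text in this commit mentions a hybrid setup: apply hard-coded rules to the input, then pass the result to the models. A minimal sketch of that idea follows, under stated assumptions. The rule table `ARABIZI_RULES`, the helpers `apply_rules` and `hybrid_normalize`, and the stand-in `identity_model` are all hypothetical and do not appear in app.py. The model step is passed in as any callable with the `normalize(text, language_script)` shape shown in the hunk header, so the sketch makes no claim about how the real models are loaded or called.

```python
from typing import Callable, Dict

# Hypothetical rule table (an assumption for illustration, not from app.py):
# digits often used in Latin-script Arabic chat writing, mapped to Arabic letters.
ARABIZI_RULES: Dict[str, str] = {
    "7": "\u062d",  # Arabic letter hah
    "3": "\u0639",  # Arabic letter ain
}


def apply_rules(text: str, rules: Dict[str, str]) -> str:
    """Replace every occurrence of each rule key with its mapped value."""
    for source, target in rules.items():
        text = text.replace(source, target)
    return text


def hybrid_normalize(
    text: str,
    language_script: str,
    model_normalize: Callable[[str, str], str],
    rules: Dict[str, str],
) -> str:
    """Apply the hard-coded rules first, then hand the result to the model step."""
    preprocessed = apply_rules(text, rules)
    return model_normalize(preprocessed, language_script)


def identity_model(text: str, language_script: str) -> str:
    """Stand-in for the real model call; returns its input unchanged."""
    return text


if __name__ == "__main__":
    # "placeholder" is a made-up language_script value, used only for this demo.
    print(hybrid_normalize("mar7aba", "placeholder", identity_model, ARABIZI_RULES))
```

Passing the model step in as a callable keeps the rule layer testable on its own; in the app, the existing `normalize` function could presumably be supplied in place of `identity_model`.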