Kevin Klyman

kklyman

kevin_klyman

AI & ML interests

Usage policies for foundation models

Organizations

None yet

kklyman's activity

liked a dataset 3 months ago

fka/awesome-chatgpt-prompts

Viewer • Updated Jan 6 • 203 • 10.8k • 7.67k

updated a dataset 6 months ago

kklyman/acceptableusepoliciesforfoundationmodels

Viewer • Updated Oct 14, 2024 • 5 • 42

liked a Space 8 months ago

lmarena-ai/gpt-4o-mini_battles (Gpt-4o-mini Battles)

Filter and display conversations between models

liked a dataset 10 months ago

stanford-crfm/air-bench-2024

Viewer • Updated Aug 14, 2024 • 21.9k • 941 • 20

updated a collection 10 months ago

playground data

Collection

3 items • Updated Jun 1, 2024

liked a Space 11 months ago

CyberSecEvalTest

Evaluate LLM cybersecurity risks

liked a model 12 months ago

stabilityai/stable-video-diffusion-img2vid-xt

Image-to-Video • Updated Jul 10, 2024 • 340k • 3k

reacted to sted97's post with 🔥 12 months ago

Post

📣 I'm thrilled to announce "ALERT: A Comprehensive #Benchmark for Assessing #LLMs’ Safety through #RedTeaming" 🚨

📄 Paper: https://arxiv.org/pdf/2404.08676.pdf
🗃️ Repo: https://github.com/Babelscape/ALERT
🤗 ALERT benchmark: Babelscape/ALERT
🤗 ALERT DPO data: Babelscape/ALERT_DPO

As a key design principle for ALERT, we developed a fine-grained safety risk taxonomy (Fig. 2). This taxonomy serves as the foundation for the benchmark to provide detailed insights about a model’s weaknesses and vulnerabilities as well as inform targeted safety enhancements 🛡️

To collect our prompts, we started from Anthropic's popular HH-RLHF data and used automated strategies to filter/classify prompts. We then designed templates to create new prompts (providing sufficient support for each category, cf. Fig. 3) and implemented adversarial attacks.

In our experiments, we extensively evaluated several open- and closed-source LLMs (e.g. #ChatGPT, #Llama and #Mistral), highlighting their strengths and weaknesses (Table 1).

For more details, check out our preprint: https://arxiv.org/pdf/2404.08676.pdf 🤓

Huge thanks to @felfri, @PSaiml, Kristian Kersting, @navigli, @huu-ontocord and @BoLi-aisecure (and all the organizations involved: Babelscape, Sapienza NLP, TU Darmstadt, Hessian.AI, DFKI, Ontocord.AI, UChicago and UIUC) 🫂

authored a paper 12 months ago

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Paper • 2404.12241 • Published Apr 18, 2024 • 11

updated a collection 12 months ago

playground data

Collection

3 items • Updated Jun 1, 2024

New activity in allenai/OLMo-7B-0424 12 months ago

Update README.md

#1 opened 12 months ago by kklyman

liked a dataset 12 months ago

Anthropic/persuasion

Viewer • Updated Apr 9, 2024 • 3.94k • 612 • 189

authored a paper 12 months ago

The Foundation Model Transparency Index

Paper • 2310.12941 • Published Oct 19, 2023

liked a model about 1 year ago

aurora-m/aurora-m-base

Text Generation • Updated Mar 26, 2024 • 3 • 16

liked 2 Spaces about 1 year ago

License: The BigScience RAIL License

BigCode Model License Agreement: Display PDF Document

liked a model about 1 year ago

aurora-m/aurora-m-biden-harris-redteamed

Text Generation • Updated Nov 11, 2024 • 49 • 20

authored a paper about 1 year ago

On the Societal Impact of Open Foundation Models

Paper • 2403.07918 • Published Feb 27, 2024 • 17

reacted to yjernite's post with 🤗 about 1 year ago

Post

👷🏽‍♀️📚🔨 Announcing the Foundation Model Development Cheatsheet!

My first 🤗Post🤗 ever, to announce the release of a fantastic collaborative resource to support model developers across the full development stack: the FM Development Cheatsheet, available here: https://fmcheatsheet.org/

The cheatsheet is a growing database of the many crucial resources coming from open research and development efforts to support the responsible development of models. This new resource highlights essential yet often underutilized tools in order to make it as easy as possible for developers to adopt best practices, covering among other aspects:
🧑🏼‍🤝‍🧑🏼 data selection, curation, and governance;
📖 accurate and limitations-aware documentation;
⚡ energy efficiency throughout the training phase;
📊 thorough capability assessments and risk evaluations;
🌏 environmentally and socially conscious deployment strategies.

We strongly encourage developers working on creating and improving models to make full use of the tools listed here, and to help keep the resource up to date by adding the resources that you yourself have developed or found useful in your own practice 🤗

Congrats to all the participants in this effort for the release! Read more about it from:
@Shayne - https://twitter.com/ShayneRedford/status/1763215814860186005
@hails and @stellaathena - https://blog.eleuther.ai/fm-dev-cheatsheet/
@alon-albalak - http://nlp.cs.ucsb.edu/blog/a-new-guide-for-the-responsible-development-of-foundation-models.html

And also to @gabrielilharco @sayashk @kklyman @kylel @mbrauh @fauxneticien @avi-skowron @Bertievidgen Laura Weidinger, Arvind Narayanan, @VictorSanh @Davlan @percyliang Rishi Bommasani, @breakend @sasha 🔥