Ontocord.AI

AI & ML interests

We dedicate ourselves to bringing lawful and effective data to AI training so that everyone can benefit from human knowledge. Our focus is data-centric ML focused on performance and legal compliance, pre-training and safety for large multimodal foundation models.

Recent Activity

Harsh1729 updated a model about 20 hours ago

ontocord/wide_3b-stage1_shuf_sample1_jsonl-pretrained

Harsh1729 updated a model about 20 hours ago

ontocord/ontocord_wide_3b-stage1_shuf_sample1_jsonl-pretrained-autoredteam_helpful-0.25_helpful

Harsh1729 published a model about 20 hours ago

ontocord/ontocord_wide_3b-stage1_shuf_sample1_jsonl-pretrained-autoredteam_helpful-0.25_helpful

View all activity

ontocord's activity

Harsh1729

updated 2 models about 20 hours ago

ontocord/wide_3b-stage1_shuf_sample1_jsonl-pretrained

Updated about 20 hours ago • 2

ontocord/ontocord_wide_3b-stage1_shuf_sample1_jsonl-pretrained-autoredteam_helpful-0.25_helpful

Updated about 20 hours ago • 7

Harsh1729

published a model about 20 hours ago

ontocord/ontocord_wide_3b-stage1_shuf_sample1_jsonl-pretrained-autoredteam_helpful-0.25_helpful

Updated about 20 hours ago • 7

Harsh1729

published a model about 22 hours ago

ontocord/wide_3b-stage1_shuf_sample1_jsonl-pretrained

Updated about 20 hours ago • 2

huu-ontocord

updated a dataset 4 days ago

ontocord/quotes

Viewer • Updated 4 days ago • 684 • 6

huu-ontocord

published a dataset 4 days ago

ontocord/quotes

Viewer • Updated 4 days ago • 684 • 6

huu-ontocord

updated a dataset 6 days ago

aurora-m/aurora-m-dataset-1

Viewer • Updated 6 days ago • 122M • 4

huu-ontocord

updated a dataset 6 days ago

ontocord/megawiki_with_gov_docs

Preview • Updated 6 days ago • 15

huu-ontocord

published a dataset 6 days ago

ontocord/megawiki_with_gov_docs

Preview • Updated 6 days ago • 15

huu-ontocord

updated a model 7 days ago

ontocord/wide_3b

Text Generation • Updated 7 days ago • 12

PSaiml

authored a paper about 1 month ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Paper • 2501.10057 • Published Jan 17 • 8

felfri

authored a paper about 1 month ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Paper • 2501.10057 • Published Jan 17 • 8

anoperson

authored a paper about 2 months ago

LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models

Paper • 2501.00874 • Published Jan 1 • 13

felfri

authored a paper 2 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

PSaiml

authored a paper 2 months ago

LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Paper • 2412.15035 • Published Dec 19, 2024 • 4

anoperson

authored a paper 3 months ago

Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering

Paper • 2411.09213 • Published Nov 14, 2024 • 7

ajibawa-2023

posted an update 4 months ago

Post

3155

New Dataset: Software-Architecture
Link: ajibawa-2023/Software-Architecture

I am releasing a Large Dataset covering topics related to Software-Architecture. This dataset consists of around 450,000 lines of data in jsonl.

I have included following topics:

Architectural Frameworks

Architectural Patterns for Reliability

Architectural Patterns for Scalability

Architectural Patterns

Architectural Quality Attributes

Architectural Testing

Architectural Views

Architectural Decision-Making

Advanced Research

Cloud-Based Architectures

Component-Based Architecture

Data Architecture

Emerging Trends

Event-Driven Architecture

Evolvability and Maintainability

Microservices and Monolithic

Microservices Architecture

Security Architecture

Service-Oriented Architecture

Software Design Principles

and Many More!

This dataset is useful in LLM development. Also those who are working on developing Software development related LLMs then this dataset can be useful.

This dataset is very useful to Researchers as well.

4 replies

Taishi-N324

authored a paper 4 months ago

Agent Skill Acquisition for Large Language Models via CycleQD

Paper • 2410.14735 • Published Oct 16, 2024 • 2

Taishi-N324

authored 2 papers 8 months ago

Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities

Paper • 2404.17790 • Published Apr 27, 2024 • 5

Building a Large Japanese Web Corpus for Large Language Models

Paper • 2404.17733 • Published Apr 27, 2024 • 4

AI & ML interests

Recent Activity

Team members 15

ontocord's activity