t.d.a.g. PRO

sequelbox

AI & ML interests

open source, infinite games. (they/them)

Recent Activity

Organizations

Valiant Labs

sequelbox's activity

posted an update 6 days ago
Do YOU want faster releases and more new, bigger and better datasets and models? Donate to support my open source work:

- 100% of donations used for me to build open source AI, including datasets, model finetunes, and new experimental models! (Apache-2.0 licensing is default.)
- Directly support the progress of open source AI and a future we all get to have a voice in, instead of big tech companies speaking for you :)

Current ways to donate:
Ko-fi: https://ko-fi.com/sequelbox
ETH: 0x9542FEa2621F85fEe5c50ff05B5201af05f7D075

Specific short-term priorities include:
- Advanced reasoning versions of the Tachibana, Supernova, Celestia, and Titanium datasets
- training Esper 3 and Shining Valiant 3 on multiple architectures
- IMPORTANT: working on experimental reasoning datasets and models, including multi-stage structured reasoning, new architectures, and other Fun Ideas. it is crucial that the next major LLM innovation comes from an open lab, not a closed one.

open source needs to win. you don't want to live in the world where we lose. it's worse than you think.

let's build faster, together!

thank you :)
allegra
reacted to mkurman's post with ❤️ 7 days ago
Introducing a new architecture, MedIT One – a single-token transformer with LSTM-like recurrence.

It is extremely fast in training and inference, but we lack funding for large-scale training. Enjoy 🍓

https://github.com/MedITSolutionsKurman/medit-one

reacted to singhsidhukuldeep's post with 👍 7 days ago
Exciting New Tool for Knowledge Graph Extraction from Plain Text!

I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.

KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.

The technical approach is fascinating:

1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph

The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
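To make the aggregate-then-cluster idea concrete, here is a minimal toy sketch. It is not KGGen's code or API: the extraction step is skipped (triples are written by hand), and the LM-based clustering is replaced by a crude rule-based normalizer (lowercasing plus stripping a trailing "s"). The function names `normalize`, `aggregate`, and `cluster` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# A triple is (subject, relation, object). In the real tool these would come
# from an LLM extraction step; here they are assumed to be given.
Triple = tuple[str, str, str]


def normalize(label: str) -> str:
    """Crude stand-in for LM-based clustering (an assumption, not KGGen's method):
    lowercase the label and strip a single trailing plural/verb 's'."""
    word = label.strip().lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word


def aggregate(graphs: list[list[Triple]]) -> set[Triple]:
    """Merge the per-source graphs into one set, dropping exact duplicates."""
    merged: set[Triple] = set()
    for graph in graphs:
        merged.update(graph)
    return merged


def cluster(triples: set[Triple]) -> tuple[set[Triple], dict[str, set[str]]]:
    """Rewrite every node and edge label to its canonical form.

    Returns the refined triples and a map from canonical label to the
    raw variants that were folded into it.
    """
    clusters: dict[str, set[str]] = defaultdict(set)
    refined: set[Triple] = set()
    for triple in triples:
        canonical = tuple(normalize(label) for label in triple)
        for raw, canon in zip(triple, canonical):
            clusters[canon].add(raw)
        refined.add(canonical)
    return refined, dict(clusters)


if __name__ == "__main__":
    source_a = [("Labors", "improve", "Wages")]
    source_b = [("labor", "improves", "wage")]
    refined, clusters = cluster(aggregate([source_a, source_b]))
    print(refined)
    print(clusters["labor"])
```

In this toy version, two sources that describe the same fact with different capitalization and plurality collapse into a single triple, which is the sparsity reduction the post describes. A real implementation would ask a language model to decide which labels belong together instead of relying on string rules.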

The researchers from Stanford and University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.

For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.

The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
New activity in open-thoughts/OpenThoughts-114k 23 days ago

license

5
#2 opened about 1 month ago by sequelbox
New activity in sequelbox/Raiden-DeepSeek-R1 25 days ago
posted an update 26 days ago