As mentioned, we’ve open-sourced our benchmarking code here: https://github.com/keyboardAnt/hf-bench
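The linked repository contains the full harness. The snippet below is only a minimal sketch of the kind of measurement it automates (comparing wall-clock latency of standard vs. assisted decoding in transformers); it is not code from hf-bench, and the checkpoints are placeholders.

```python
# Minimal latency-comparison sketch; NOT the hf-bench harness itself.
# The checkpoints are illustrative placeholders that share a tokenizer.
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "gpt2-large"  # placeholder target model
draft_id = "gpt2"         # placeholder assistant (draft) model

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id)
draft = AutoModelForCausalLM.from_pretrained(draft_id)

inputs = tokenizer("Speculative decoding works by", return_tensors="pt")

def timed_generate(**kwargs):
    """Greedy-decode 64 new tokens and return the elapsed seconds."""
    start = time.perf_counter()
    target.generate(**inputs, max_new_tokens=64, do_sample=False, **kwargs)
    return time.perf_counter() - start

baseline_s = timed_generate()
assisted_s = timed_generate(assistant_model=draft)
print(f"baseline: {baseline_s:.2f}s, assisted: {assisted_s:.2f}s")
```

A real harness would also control for warm-up, device placement, and repeated runs; the point here is only the shape of the comparison.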
Nadav Timor (Nadav-Timor)
AI & ML interests: None yet
Recent Activity
commented on an article 14 days ago: "Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques"
commented on their article 14 days ago: "Universal Assisted Generation: Faster Decoding with Any Assistant Model"
published an article 15 days ago: "Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques"
Nadav-Timor's activity

commented on "Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques" (14 days ago)

commented on "Universal Assisted Generation: Faster Decoding with Any Assistant Model" (14 days ago)
Citation
@article{timor2025acceleratingllminferencelossless,
  title={Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies},
  author={Nadav Timor and Jonathan Mamou and Daniel Korat and Moshe Berchansky and Oren Pereg and Gaurav Jain and Roy Schwartz and Moshe Wasserblat and David Harel},
  year={2025},
  eprint={2502.05202},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.05202},
}

published an article 15 days ago
Article: "Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques"
By Nadav-Timor and 8 others • 17 upvotes

Vocab size in config.json mismatches the actual tokenizer size
5 comments · #4 opened 2 months ago by Fizzarolli
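The discussion title above points at a common pitfall: `vocab_size` in `config.json` describes the model's embedding matrix, which can legitimately differ from the tokenizer's actual vocabulary (for example, when embeddings are padded to a convenient multiple). A minimal check, with an arbitrary placeholder checkpoint:

```python
# Check whether config.json's vocab_size matches the tokenizer.
# The checkpoint name is an arbitrary placeholder.
from transformers import AutoConfig, AutoTokenizer

checkpoint = "gpt2"  # placeholder
config = AutoConfig.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

print("config.vocab_size:", config.vocab_size)  # embedding-matrix rows
print("len(tokenizer):", len(tokenizer))        # tokens, incl. added ones
if config.vocab_size != len(tokenizer):
    print("Mismatch: the embedding matrix and the tokenizer disagree.")
```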


upvoted a paper 4 months ago

published an article 5 months ago
Article: "Universal Assisted Generation: Faster Decoding with Any Assistant Model"
By Nadav-Timor and 7 others • 55 upvotes
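The article above introduces universal assisted generation in transformers, where the drafter may use a different tokenizer than the target model. A sketch of that usage follows; the `tokenizer` and `assistant_tokenizer` arguments to `generate` assume a sufficiently recent transformers release, and the checkpoints are illustrative:

```python
# Universal assisted generation: a draft model with a different vocabulary.
# Requires a recent transformers release; checkpoints are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "google/gemma-2-9b"      # target model
assistant_id = "double7/vicuna-68m"  # small drafter with its own tokenizer

tokenizer = AutoTokenizer.from_pretrained(target_id)
assistant_tokenizer = AutoTokenizer.from_pretrained(assistant_id)

model = AutoModelForCausalLM.from_pretrained(target_id)
assistant_model = AutoModelForCausalLM.from_pretrained(assistant_id)

inputs = tokenizer("Alice and Bob", return_tensors="pt")
outputs = model.generate(
    **inputs,
    assistant_model=assistant_model,
    tokenizer=tokenizer,                      # target's tokenizer
    assistant_tokenizer=assistant_tokenizer,  # drafter's tokenizer
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

Because the two vocabularies differ, drafts are re-tokenized between the models; per the article, decoding remains lossless with respect to the target model.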
upvoted an article 6 months ago
Article: "Faster Assisted Generation with Dynamic Speculation"
By Nadav-Timor and 6 others • 46 upvotes
published an article 6 months ago
Article: "Faster Assisted Generation with Dynamic Speculation"
By Nadav-Timor and 6 others • 46 upvotes
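Dynamic speculation adjusts the number of drafted tokens on the fly, and per the article it became the default assisted-generation behavior in transformers. To pin a fixed draft length instead, the assistant's generation config exposes a schedule switch; the attribute names below assume a recent transformers version, and the checkpoints are placeholders:

```python
# Switching dynamic speculation off in favor of a constant draft length.
# Attribute names follow recent transformers GenerationConfig fields.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")   # placeholder target
model = AutoModelForCausalLM.from_pretrained("gpt2-large")
assistant = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder drafter

# The default schedule is "heuristic" (dynamic); "constant" fixes the length.
assistant.generation_config.num_assistant_tokens_schedule = "constant"
assistant.generation_config.num_assistant_tokens = 5

inputs = tokenizer("Dynamic speculation", return_tensors="pt")
outputs = model.generate(**inputs, assistant_model=assistant, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```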
upvoted a paper 10 months ago

upvoted a collection 11 months ago

upvoted a paper 11 months ago

`llama3p-70b-rc3_vr_mid_3` & `llama3p-7b-rc3_vr_mid_2`?
#2 opened 11 months ago by Nadav-Timor

`max_position_embeddings=32768` and `precompute_freqs_cis` with `end=128_000`
1 comment · #6 opened over 1 year ago by Nadav-Timor

`max_position_embeddings=32768` with "attention span of 131K tokens"
1 comment · #57 opened over 1 year ago by Nadav-Timor
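Both `max_position_embeddings` threads above ask how a checkpoint's advertised attention span squares with the values in its config. A quick way to inspect the relevant fields; the checkpoint below is a placeholder, and `sliding_window` and `rope_theta` exist only for architectures that define them:

```python
# Inspect context-length-related config fields; checkpoint is a placeholder.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print("max_position_embeddings:", config.max_position_embeddings)
print("sliding_window:", getattr(config, "sliding_window", "n/a"))
print("rope_theta:", getattr(config, "rope_theta", "n/a"))
```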
