Cognitive Computations

community

https://erichartford.com

erhartford

ehartford

AI & ML interests

Supervised Fine Tuning, DPO, and unalignment

Recent Activity

v2ray new activity 1 day ago

cognitivecomputations/DeepSeek-R1-AWQ:Can't get 48 TPS on 8x H800

v2ray new activity 2 days ago

cognitivecomputations/DeepSeek-R1-AWQ:Pipeline Parallellism

v2ray new activity 2 days ago

cognitivecomputations/DeepSeek-R1-AWQ:8*a100 OUT OF MEMORY

cognitivecomputations's activity

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 1 day ago

Can't get 48 TPS on 8x H800

#21 opened 1 day ago by

Light4Bear

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 2 days ago

Pipeline Parallellism

#20 opened 2 days ago by

leo98xh

8*a100 OUT OF MEMORY

#19 opened 2 days ago by

Jaren

requests get stuck when sending long prompts (already solved, but still don't know why?)

#18 opened 2 days ago by

uv0xab

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 3 days ago

Significant Speed Drop with Increasing Input Length on H800 GPUs

#17 opened 3 days ago by

wangkkk956

v2ray

in cognitivecomputations/DeepSeek-V3-AWQ 3 days ago

Docker start with vllm failed. Official vllm docker image 0.7.3

#7 opened 3 days ago by

kuliev-vitaly

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 3 days ago

when i use vllm v0.7.2 to deploy r1 awq, i got empty content

#10 opened 11 days ago by

bupalinyu

why "MLA is not supported with awq_marlin quantization. Disabling MLA." with 4090 * 32 (4 node / vllm 0.7.2)

#14 opened 4 days ago by

FightLLM

when i run command ,it didnot work. ( via vllm 0.7.3)

#16 opened 3 days ago by

xueshuai

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 4 days ago

skips the thinking process

#5 opened 16 days ago by

muzizon

Any one can run this model with SGlang framework？

#13 opened 4 days ago by

muziyongshixin

v2ray

in cognitivecomputations/DeepSeek-V3-AWQ 7 days ago

GPTQ Support

#1 opened about 2 months ago by

warlock-edward

vllm support a100

#2 opened about 1 month ago by

HuggingLianWang

Code used to convert this / could you do v3 base?

#3 opened 30 days ago by

deltanym

What calibration dataset do you use when applying AWQ?

#5 opened 11 days ago by

HandH1998

v2ray

in cognitivecomputations/DeepSeek-R1-AWQ 7 days ago

Deployment framework

#2 opened about 1 month ago by

xro7

MLA is not supported with moe_wna16 quantization. Disabling MLA.

#7 opened 12 days ago by

AMOSE

triton.runtime.errors.OutOfResources: out of resource: shared memory, Required: 163840, Hardware limit: 101376. Reducing block sizes or `num_stages` may help

#9 opened 11 days ago by

wuhanzaina

Has anyone evaluated the performance of the AWQ version of the model on benchmarks?

#8 opened 11 days ago by

liuqianchao

louisbrulenaudet

posted an update 8 days ago

I am pleased to introduce my first project built upon Hugging Face’s smolagents framework, integrated with Alpaca for financial market analysis automation 🦙🤗

The project implements technical indicators such as the Relative Strength Index (RSI) and Bollinger Bands to provide momentum and volatility analysis. Market data is retrieved through the Alpaca API, enabling access to historical price information across various timeframes.

AI-powered insights are generated with Hugging Face’s inference API, which analyzes market trends through natural language processing. DuckDuckGo search integration supplies financial news for real-time sentiment analysis 🦆

Link to the GitHub project: https://github.com/louisbrulenaudet/agentic-market-tool
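
For readers unfamiliar with the two indicators named in the post, here is a minimal sketch of how they are commonly computed. It is not taken from the linked project: the function names `rsi` and `bollinger_bands`, the default periods (14 and 20), and the use of a simple-average RSI rather than Wilder's smoothed version are all assumptions made for illustration. It works on a plain list of closing prices and does not call the Alpaca API.

```python
from statistics import mean, pstdev


def rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes.

    Assumption: uses plain averages of gains and losses (Cutler's variant),
    not Wilder's exponential smoothing.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    window = closes[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losing moves in the window; by convention RSI saturates at 100.
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def bollinger_bands(closes, period=20, num_std=2.0):
    """Return (lower, middle, upper) bands for the last `period` closes.

    Assumption: the middle band is a simple moving average and the spread
    uses the population standard deviation of the same window.
    """
    if len(closes) < period:
        raise ValueError("need at least `period` closing prices")
    window = closes[-period:]
    middle = mean(window)
    spread = num_std * pstdev(window)
    return middle - spread, middle, middle + spread


if __name__ == "__main__":
    prices = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up closing prices
    print(rsi(prices, period=4))
    print(bollinger_bands(prices, period=8))
```

A high RSI suggests strong recent upward momentum, while wide Bollinger Bands indicate high recent volatility. The thresholds an application acts on (for example 70 and 30 for RSI) are a design choice and are not specified in the post.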

Team members 140
