Orkut Murat Yılmaz

orkut

orkutmuratyilmaz

AI & ML interests

Geo Sciences, Free Software

Recent Activity

liked a model 22 days ago

TencentARC/InstantMesh

orkut's activity

liked a model 13 days ago

manycore-research/SpatialLM-Llama-1B

Text Generation • Updated 15 days ago • 14.3k • 901

liked a dataset 22 days ago

alibayram/yapay_zeka_turkce_mmlu_model_cevaplari

Viewer • Updated 22 days ago • 6.2k • 243 • 4

liked 3 models 22 days ago

liked a Space 25 days ago

FireDetection

⚡

Fire and smoke detections from video with fine-tuned Yolov12

liked a model about 1 month ago

Wan-AI/Wan2.1-I2V-14B-480P

Image-to-Video • Updated Feb 26 • 68.5k • 134

liked a Space about 1 month ago

1.39k

Wan2.1

💻

Wan: Open and Advanced Large-Scale Video Generative Models

liked a dataset about 2 months ago

yusufbaykaloglu/University_Mevzuat_QA_v2

Viewer • Updated Feb 6 • 14.3k • 116 • 8

upvoted an article about 2 months ago

Article

Open-source DeepResearch – Freeing our search agents

Feb 4 • 1.21k

liked a model 2 months ago

deepseek-ai/Janus-Pro-1B

Any-to-Any • Updated Feb 1 • 37.9k • 415

liked a model 3 months ago

Comfy-Org/stable-diffusion-v1-5-archive

Updated Aug 29, 2024 • 45

liked 2 models 4 months ago

CohereForAI/c4ai-command-r7b-12-2024

Text Generation • Updated Feb 20 • 9.43k • 379

artificialguybr/LogoRedmond-LogoLoraForSDXL-V2

Text-to-Image • Updated Oct 7, 2023 • 3.62k • 71

liked a Space 4 months ago

LOGO SDXL LORA FREE DEMO

💻

liked a model 5 months ago

nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

Text Generation • Updated Oct 25, 2024 • 168k • 2.03k

reacted to merve's post with 🔥 8 months ago

Post

2293

We have recently merged Video-LLaVA to transformers! 🤗🎞️
What makes this model different?

Demo: llava-hf/video-llava
Model: LanguageBind/Video-LLaVA-7B-hf

Other models that take image and video input either project the two separately or downsample the video and project only selected frames. Video-LLaVA instead converts images and videos to a unified representation and projects them using a shared projection layer.

It uses Vicuna 1.5 as the language model and LanguageBind's own encoders, which are based on OpenCLIP. These encoders project the modalities to a unified representation before passing them to the projection layer.

I feel like one of the coolest features of this model is joint understanding, which many models have also introduced recently.

It's a relatively older model, but it was ahead of its time and works very well! Joint understanding means, e.g., you can pass the model an image of a cat and a video of a cat and ask questions like whether the cat in the image also appears in the video 🤩
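
The shared projection idea described in the post can be illustrated with a toy sketch. This is not the Video-LLaVA implementation: the dimensions, the helper names (`make_matrix`, `matmul`), and the random numbers standing in for encoder outputs are all assumptions made for illustration. The only point it shows is that image tokens and video tokens, once they live in the same feature space, can pass through one projection matrix and be concatenated into a single sequence for the language model.

```python
import random

def make_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of random floats."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def matmul(tokens, weight):
    """Multiply an (n x d_in) token matrix by a (d_in x d_out) weight matrix."""
    columns = list(zip(*weight))  # d_out columns, each of length d_in
    return [[sum(t * w for t, w in zip(row, col)) for col in columns]
            for row in tokens]

rng = random.Random(0)

# Assumed toy sizes; the real model uses far larger dimensions.
FEAT_DIM = 4  # size of the unified visual feature space
LM_DIM = 6    # size of the language model's embedding space

# Random stand-ins for encoder outputs that already share one feature space.
image_tokens = make_matrix(3, FEAT_DIM, rng)  # 3 tokens for one image
video_tokens = make_matrix(8, FEAT_DIM, rng)  # 8 tokens for one video clip

# A single projection is applied to both modalities.
shared_projection = make_matrix(FEAT_DIM, LM_DIM, rng)

image_embeds = matmul(image_tokens, shared_projection)
video_embeds = matmul(video_tokens, shared_projection)

# Both modalities end up in one sequence the language model could attend over.
visual_sequence = image_embeds + video_embeds
print(len(visual_sequence), len(visual_sequence[0]))
```

Because both inputs share the projection, a question about the image and the video refers to embeddings in the same space, which is what makes the joint cat-in-image versus cat-in-video comparison from the post plausible.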

liked a model 10 months ago

alibayram/Doktor-Llama

Text Generation • Updated Jul 5, 2024 • 11

liked a Space 10 months ago

1.93k

Stable Diffusion XL on TPUv5e

🏋

Generate images from text prompts with various styles

liked a model 10 months ago

mistralai/Codestral-22B-v0.1

Text Generation • Updated Jul 31, 2024 • 11.7k • 1.24k