Dana Aubakirova

danaaubakirova

AI & ML interests

DocumentAI, Deep Learning, Multimodal Learning, Computer Vision, Image Processing, NLP

Articles

Introducing TextImage Augmentation for Document Images

Aug 6

• 32

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Jul 25

• 18

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

May 16

• 17

danaaubakirova's activity

upvoted a collection 13 days ago

ISSAI-KazLLM-1.0

Collection

7 items • Updated 15 days ago • 21

upvoted a paper 4 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 124

upvoted 2 articles 4 months ago

Article

How to communicate in a Pull Request?

Aug 22

• 18

Article

Introducing TextImage Augmentation for Document Images

Aug 6

• 32

posted an update 5 months ago

Post

872

🚀 We are thrilled to introduce TextImage Data Augmentation, developed in collaboration with Albumentations AI! ✨ This multimodal technique modifies document images and text simultaneously, enhancing Vision Language Models (VLMs) for high-text datasets.

👩‍💻 Learn how this innovative approach can improve your document AI projects by checking out our full blog post here: https://huggingface.co/blog/doc_aug_hf_alb

1 reply

upvoted an article 5 months ago

Article

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Jul 25

• 18

updated 2 datasets 5 months ago

danaaubakirova/docmatix-subset

Viewer • Updated Jul 23 • 2.13k • 69

hf-internal-testing/fixtures-captioning

Updated Jul 16 • 2.54k

New activity in danaaubakirova/mplugdocowl1.5-Omni-hf 5 months ago

Is this ready to be used with the transformers library?

#1 opened 5 months ago by

MihaiATK

New activity in mPLUG/DocOwl 6 months ago

Update model_worker.py

#4 opened 6 months ago by

danaaubakirova

updated a dataset 6 months ago

danaaubakirova/patfig

Preview • Updated Jul 10 • 188 • 4

updated 4 models 6 months ago

upvoted 2 articles 7 months ago

Article

Let's talk about LLM evaluation

May 23

• 140

Article

MobileNet-V4 (now in timm)

Jun 17

• 39

posted an update 7 months ago

Post

1314

The Document AI team (@Molbap, @rwightman, @danaaubakirova) at Hugging Face is developing a new multimodal data augmentation pipeline utilising both visual and textual aspects of document images.

Check out my latest blog post for more details:
https://huggingface.co/blog/danaaubakirova/doc-augmentation

Please share your thoughts and suggestions with us.
Stay tuned for updates!

upvoted an article 7 months ago

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14

• 226

published an article 8 months ago

Article

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

May 16

• 17