Hey everyone 🤗! We (finegrain) have created some custom ComfyUI nodes to use our refiners micro-framework inside comfy! 🎉
We only support our new Box Segmenter at the moment, but we're thinking of adding more nodes since there seems to be a demand for it. We leverage the new (beta) Comfy Registry to host our nodes. They are available at: https://registry.comfy.org/publishers/finegrain/nodes/comfyui-refiners. You can install them by running:
comfy noderegistry-install comfyui-refiners
Or by downloading the archive via "Download Latest" and unzipping it into your Comfy custom_nodes folder. We are eager to hear your feedback and suggestions for new nodes, and how you'll use them! 🙏
We (finegrain) trained this new model in partnership with Nfinite, using some of their synthetic data, and the resulting model is incredibly accurate 🚀. It's all open source under the MIT license (finegrain/finegrain-box-segmenter), complete with a test set tailored for e-commerce (finegrain/finegrain-product-masks-lite). Have fun experimenting with it!
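If you want to poke at the weights and the test set straight from the Hub, here is a minimal Python sketch using huggingface_hub and datasets; the "test" split name is an assumption, so check the dataset card.

```python
from datasets import load_dataset
from huggingface_hub import snapshot_download

# Grab the MIT-licensed segmenter weights locally.
model_dir = snapshot_download("finegrain/finegrain-box-segmenter")

# Load the e-commerce test set; the "test" split name is an assumption.
masks = load_dataset("finegrain/finegrain-product-masks-lite", split="test")
print(model_dir, masks)
```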
Under the hood, it's a pipeline of models (currently exposed via an API) that lets you erase any object from your image just by naming it or selecting it! Not only will the object disappear, but so will its effects on the scene, like shadows and reflections. It's built on top of Refiners, our micro-framework for simple foundation model adaptation (feel free to star it on GitHub if you like it: https://github.com/finegrain-ai/refiners).
Unfortunately, it doesn't look like it can easily be preloaded with some models, so you'll have to bring your own. Here is a quick selection of models you can use:
- openai-community/gpt2
- qualcomm/ResNet50
- qualcomm/VIT
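If you need the files locally rather than just a repo id, here is a quick hedged sketch with huggingface_hub (repo ids taken from the list above):

```python
from huggingface_hub import snapshot_download

# Download each repo from the Hub to a local folder so you can bring it yourself.
for repo_id in ["openai-community/gpt2", "qualcomm/ResNet50", "qualcomm/VIT"]:
    local_dir = snapshot_download(repo_id)
    print(repo_id, "->", local_dir)
```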
A new paper, "Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning," was just published. The approach improves VLMs' decision-making abilities in goal-directed tasks.
This is accomplished with chain-of-thought (CoT) reasoning, which significantly enhances performance. Removing CoT reasoning, however, reduces effectiveness, highlighting its crucial role.
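As a generic illustration (not the paper's exact template), a CoT-style decision prompt asks the VLM to reason in free text and then emit a parseable action the environment can reward:

```python
import json
import re

# Hypothetical prompt template: the VLM first reasons step by step,
# then commits to a machine-parseable action.
PROMPT = (
    "You are an agent in a goal-directed task.\n"
    "Observation: {observation}\n"
    "Think step by step about the best move, then end your answer with "
    'a JSON object like {{"action": "<move>"}}.'
)

def parse_action(model_output: str) -> str:
    """Extract the final action from a CoT-formatted model response."""
    match = re.search(r"\{.*?\}", model_output, re.DOTALL)
    return json.loads(match.group(0))["action"] if match else ""
```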
✨️ Introducing SDXL Flash (Mini), our new fast model. We learned that all fast XL models are indeed fast, but their quality drops, so we made our own fast model: it is not as fast as LCM, Turbo, Lightning, and Hyper, but the quality is higher. Below you will see the study with steps and CFG.
🚀 Features of the Mini model: it weighs less, consumes less video memory and other resources, and the quality has not dropped much.
👑 Even though it is faster than a regular model, its quality beats the coolest modern models such as JuggernautXL X, FluentlyXL v4, and others.
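For reference, here is a hedged diffusers sketch: the repo id, scheduler choice, and the step/CFG values are assumptions for illustration, so check the model card and the study below for the actual sweet spot.

```python
import torch
from diffusers import StableDiffusionXLPipeline, DPMSolverSinglestepScheduler

# Repo id is an assumption for illustration; check the model card for the real one.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "sd-community/sdxl-flash-mini", torch_dtype=torch.float16
).to("cuda")

# Few-step sampling: this scheduler and trailing spacing are typical for
# distilled SDXL models, but treat them as placeholders.
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

# Example values only; the steps/CFG study below shows what actually works best.
image = pipe(
    "a photo of a red vintage car, studio lighting",
    num_inference_steps=7,
    guidance_scale=3.0,
).images[0]
image.save("sdxl_flash_mini.png")
```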
Red Hat and IBM have announced InstructLab, an open-source project for contributing to LLMs. InstructLab offers a model-agnostic approach for the community to contribute "skills" and/or "knowledge" to LLMs via a CLI and a tuning backend.
This community-driven approach to GenAI model development is novel, to say the least. It will be interesting to see how effective it is in the long run, especially on models beyond the initial Granite and Merlinite families.
It is amazing that the first group of students has completed the course, and in record time!
We look forward to seeing more submissions from the course soon.
A nice swag item that students get when they complete the course and make their submission is this cool Hugging Face Certificate of Completion. (It's suitable for framing) 🤗 👇
It will be interesting to add the results of the just-announced Med-Gemini model to the Leaderboard to see how it compares and whether its stated 91.1% MedQA benchmark result holds up.
We are happy to introduce InstantStyle, a framework that employs straightforward yet potent techniques to achieve effective disentanglement of style and content from reference images.
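If you want to try the idea in diffusers, here is a minimal sketch of the typical InstantStyle-style setup using per-block IP-Adapter scales; the specific block and scale values are illustrative assumptions, so see the InstantStyle repo for the recommended configuration.

```python
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Inject the reference image through IP-Adapter, but only into the
# attention blocks associated with style (per-block scales).
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
)
pipe.set_ip_adapter_scale({"up": {"block_0": [0.0, 1.0, 0.0]}})  # illustrative values

style_image = load_image("style_reference.png")  # hypothetical local file
image = pipe(
    "a cat, masterpiece, best quality",
    ip_adapter_image=style_image,
    num_inference_steps=30,
    guidance_scale=5.0,
).images[0]
image.save("instantstyle_out.png")
```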