As we advance on the path towards true Artificial General Intelligence (AGI), it's crucial to recognize and address the limitations inherent in current technologies, particularly in large language models (LLMs) like those developed by OpenAI. While LLMs excel in processing and generating text, their capabilities are largely constrained to the domains of natural language understanding and generation. This poses significant limitations when dealing with more complex, abstract mathematical concepts such as topological analysis, 3D geometry, and homotopy type theory.
Topological Analysis and 3D Geometry: LLMs currently do not possess the inherent ability to understand or interpret the spatial and geometric data that is critical in fields like robotics, architecture, and advanced physics. These models lack the capacity to visualize or manipulate three-dimensional objects or comprehend the underlying properties that govern these forms.
Homotopy Type Theory: This branch of mathematics combines homotopy theory and type theory. It provides tools for a more robust handling of equivalences and transformations, something that LLMs are not designed to handle directly.
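For readers unfamiliar with the formalism, the key device is the identity (path) type together with Voevodsky's univalence axiom, which identifies equalities between types with equivalences between them. A minimal sketch in HoTT-Book-style notation (the notation below is a standard presentation, added here for illustration):

```latex
% Identity (path) types: for a, b : A, the type  a =_A b  of identifications of a with b.
% Univalence: for types A and B in a universe U, the canonical map from identifications
% to equivalences is itself an equivalence, so "equal types" and "equivalent types" coincide:
\[
  (A =_{\mathcal{U}} B) \;\simeq\; (A \simeq B)
\]
```

It is this structural treatment of equivalence, rather than anything linguistic, that the post argues current LLMs lack.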
For the development of AGI, it is not sufficient merely to enhance existing models' capacities within their linguistic domains. Instead, a synthesis of symbolic AI with homotopy type theory could pave the way forward. Symbolic AI, which manipulates symbols and performs logical operations, combined with the abstract mathematical reasoning of homotopy type theory, could lead to breakthroughs in how machines understand and interact with the world.
To address these limitations, we have developed Tenzin, a one-of-a-kind model planned for release within the next 1-2 weeks. To learn more, join the waitlist at https://octave-x.com/.
Do you remember IBIS? Not a fancy bird, but the open challenge on Inferring Binding Specificities of unexplored human transcription factors. Check our site (https://ibis.autosome.org/) and have a sip of fresh news below.
More than 100 teams have registered for the challenge, yet only two dozen are using the opportunity to test their models on the Leaderboard. Don't miss the chance to participate in the Leaderboard stage, although you can submit a final solution independently of it.
Remember, the training data for the Leaderboard and Final stages are available online, and you are free to mix and match them in any combination.
For the Leaderboard, we have received 650 AAA (advanced ML) submissions and 296 PWM submissions (a whopping 6,682 PWMs in total).
For PWMs, the baseline has been left far behind, but some TFs remain tough nuts to crack (see the attached Figure 1).
For AAAs, there is a solid improvement over the best submitted PWMs in A2G, but the G2A discipline remains unpopular (see the attached Figure 2). Free hint: this is your chance!
Another free hint: if your model tends to overfit given the limited data available for some TFs, don't forget to use reverse-complement and shift augmentations (a minimal sketch follows after these hints). Also, don't hesitate to use multi-target models, i.e., models predicting the binding of multiple TFs at the same time.
Last but not least, try to combine knowledge from all accessible experiment types in a single model, especially for the G2A discipline (ChIP-Seq & genomic HT-SELEX)!
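To make the augmentation hint above concrete, here is a minimal Python sketch; the function names, the 200-bp window, and the shift range are illustrative assumptions rather than anything prescribed by IBIS:

```python
import random

# Uppercase DNA alphabet only; extend the table if your sequences contain N or lowercase bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (TF binding is strand-symmetric)."""
    return seq.translate(COMPLEMENT)[::-1]

def shift_augment(seq: str, window: int, max_shift: int) -> str:
    """Crop a fixed-length window at a randomly shifted offset around the sequence centre."""
    max_start = max(len(seq) - window, 0)
    centre = max_start // 2
    start = random.randint(max(0, centre - max_shift), min(max_start, centre + max_shift))
    return seq[start:start + window]

def augment(seq: str, window: int = 200, max_shift: int = 10) -> str:
    """Apply a random shift and, with probability 0.5, a reverse complement."""
    out = shift_augment(seq, window, max_shift)
    if random.random() < 0.5:
        out = reverse_complement(out)
    return out
```

For the multi-target suggestion, the same augmented window can simply be paired with a vector of labels covering several TFs instead of a single binary label.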
Finally and importantly, following requests from the community, we have decided to EXTEND the Leaderboard stage until the final submission deadline.
The final submission deadline is also EXTENDED until Aug 15. The final submission form and details will be posted on the IBIS website in the first half of July; follow our Telegram group and mailing list (see the links at https://ibis.autosome.org).
It's trendy to share models "fine-tuned for function calling"; but from my observations, this fine-tuning is neither necessary nor sufficient to build good agent systems. To name only a few: Nexusflow/NexusRaven-V2-13B, CohereForAI/c4ai-command-r-plus, mistralai/Mixtral-8x22B-Instruct-v0.1. "Fine-tuned for function calling" generally means "fine-tuned to generate function calls in correct JSON for extremely simple tasks". In other words, it means "improve the formatting of the tool calls".
Yet I discovered two things while improving Transformers Agents:
- Even when used as JSON agents, these fine-tuned models don't perform very well.
- Good base models (Llama-3-70B-Instruct, GPT-4o, Claude-3.5-Sonnet) perform better without any fine-tuning, just plain prompting.
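As a rough illustration of what "plain prompting" means here, the sketch below asks an instruction-tuned model to answer with a single JSON tool call and parses the reply. The tool schema is invented for the example, and query_model is a hypothetical placeholder for whatever chat-completion client you use; this is not the Transformers Agents API:

```python
import json

# Invented single-tool schema for illustration only.
SYSTEM_PROMPT = """You can use the following tool:
- web_search(query: str): search the web and return the top results.
Reply ONLY with a JSON object of the form {"tool": "<tool name>", "arguments": {...}}."""

def query_model(system: str, user: str) -> str:
    """Hypothetical placeholder for any chat-completion client (OpenAI, HF Inference, ...)."""
    raise NotImplementedError

def get_tool_call(task: str) -> dict:
    """Prompt a strong instruction-tuned model for a tool call and parse the JSON reply."""
    reply = query_model(SYSTEM_PROMPT, task)
    try:
        return json.loads(reply)
    except json.JSONDecodeError as err:
        # With capable base models this branch is rarely taken, which is the point of the post:
        # correct call formatting is mostly a prompting problem, not a fine-tuning problem.
        raise ValueError(f"Model did not return valid JSON: {reply!r}") from err
```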
The graph below shows the count of errors for my GPT-4o validation run on the GAIA benchmark: AgentParsingError and AgentExecutionError are the ones caused by incorrect formatting. As you can see, their count is already close to 0! And given that GPT-4o is certainly not fine-tuned for our code tool-calling format, this shows that "function calling fine-tuning" is not necessary!
The hardest thing to get right in an agent is still to plan good task-solving trajectories over several steps. To improve this, we could:
- Use more powerful base models
- Build tool-calling datasets with complex solving trajectories
- Use RL! cc @lvwerra
New Research Alert - ECCV 2024 (Avatars Collection)! Title: Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture
Description: Topo4D is a novel method for automated, high-fidelity 4D head tracking that optimizes dynamic topological meshes and 8K texture maps from multi-view time-series images.
Authors: @Dazz1e, Y. Cheng, @Ryan-sjtu, H. Jia, D. Xu, W. Zhu, Y. Yan
I trained this model on a new spot I'm really excited to share (soon!)
This Monday I will be posting my first beginning-to-end blog post showing the tools I've used, the dataset, the captioning techniques, and the parameters used to fine-tune this LoRA.