Seq-to-Pheno

community

AI & ML interests

Genomes & Proteins

Recent Activity

Tonic updated a Space 2 months ago
seq-to-pheno/README
Tonic updated a dataset 2 months ago
seq-to-pheno/Mutated_Protein_Embeddings
Tonic updated a dataset 2 months ago
seq-to-pheno/wildtype_proteins

seq-to-pheno's activity

Tonic
posted an update about 2 months ago
🙋🏻‍♂️ hey there folks,

periodic reminder: if you are experiencing ⚠️ 500 errors ⚠️ or ⚠️ abnormal Spaces behavior on load or launch ⚠️

we have a thread 👉🏻 https://discord.com/channels/879548962464493619/1295847667515129877

if you can record the problem and share it there, or on the forums in your own post, please don't be shy: i'm not sure, but i do think it helps 🤗🤗🤗
  • 2 replies
Tonic
posted an update 2 months ago
boomers still pick zenodo.org instead of huggingface??? absolutely clownish nonsense, my random datasets have 30x more downloads and views than front-page zenodo datasets... gonna write a comparison blog, but yeah... cringe.
  • 1 reply
Tonic
posted an update 2 months ago
🙋🏻‍♂️ hey there folks,

really enjoying sharing cool genomics and protein datasets on the hub these days, check out our cool new org: https://huggingface.co/seq-to-pheno

scroll down for the datasets. still figuring out how to optimize for discoverability, but i do think on that front it will be better than zenodo.org. it would be nice to write a tutorial about that and compare: we already have more downloads than most zenodo datasets from famous researchers!
Tonic
updated a Space 2 months ago
Tonic
posted an update 2 months ago
hey there folks,

twitter is awful, isn't it? just getting into the habit of using hf/posts for shares 🦙🦙

Tonic/on-device-granite-3.0-1b-a400m-instruct

new granite on-device instruct model demo, hope you like it 🚀🚀
Tonic
posted an update 2 months ago
Tonic
posted an update 3 months ago
Tonic
posted an update 3 months ago
🙋🏻‍♂️ Hey there folks,

🦎 Salamandra release by @mvillegas and team
@BSC_CNS https://huggingface.co/BSC-LT is absolutely impressive so far!

perhaps the largest single training dataset of high-quality text to date: 7.8 trillion tokens in 35 European languages and code.

the best part: the data was correctly licensed, so it's actually future-proof!

the completions model is really creative, and the instruct fine-tuned version is very good too.

now you can use such models for multilingual enterprise applications with further fine-tunes; long response generation and structured outputs (coding) also work.

check out 👇🏻
the collection : BSC-LT/salamandra-66fc171485944df79469043a
the repo : https://github.com/langtech-bsc/salamandra
7B-Instruct demo : Tonic/Salamandra-7B
Tonic
posted an update 3 months ago
@mlabonne hey there 🙋🏻‍♂️ I kinda got obsessed with your great model, and i found the endpoint for it on lambda labs, but basically i got rate limited / banned for trying to make my DPO dataset project. i was wondering if you all had an OpenAI-compatible solution for me to make a great "thinking" sft + dpo dataset with all the splits 🙏🏻🙏🏻 kinda desperate, it's true, but was looking forward to a nice write-up 🚀🚀🚀
  • 1 reply
Simran27
updated a dataset 3 months ago
Tonic
posted an update 3 months ago
Tonic
posted an update 3 months ago
🙋🏻‍♂️ Hey there folks,

stepfun-ai/GOT-OCR2_0 is in top trending and spaces of the week for the second week straight!!

This is madness 😱

🚀🚀 check out my demo here : Tonic/GOT-OCR