|
--- |
|
license: mit |
|
--- |
|
This model needs further fine-tuning. |
|
|
|
See: |
|
txtai-rag.py for a RAG implementation using txtai with Wikipedia. |
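
The script itself is not reproduced here. As a rough illustration of the retrieve-then-prompt pattern it refers to, here is a minimal sketch. The names `build_rag_prompt` and `stub_retrieve`, the passage shape, and the prompt template are all assumptions made for this example and are not taken from txtai-rag.py; the stub retriever stands in for a real index so the sketch runs offline.

```python
from typing import Callable, Dict, List

# A retriever takes (query, limit) and returns passages shaped like
# {"text": ..., "score": ...}. This shape is an assumption for the sketch.
Retriever = Callable[[str, int], List[Dict]]


def build_rag_prompt(question: str, retrieve: Retriever, limit: int = 3) -> str:
    """Fetch up to `limit` passages and fold them into a single prompt string."""
    passages = retrieve(question, limit)
    context = "\n".join(f"- {p['text']}" for p in passages)
    # Hypothetical template; the template the model was trained on
    # is not documented in this card.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def stub_retrieve(query: str, limit: int) -> List[Dict]:
    """Stand-in retriever with two hard-coded passages (ignores the query)."""
    corpus = [
        {"text": "Paris is the capital of France.", "score": 0.9},
        {"text": "Berlin is the capital of Germany.", "score": 0.4},
    ]
    return corpus[:limit]


if __name__ == "__main__":
    print(build_rag_prompt("What is the capital of France?", stub_retrieve, limit=1))
```

In a real setup, the stub could presumably be swapped for a function that queries a txtai Wikipedia index and returns passages in the same shape; the resulting prompt string would then be passed to the model for generation.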
|
|
|
Llamafied version of Qwen 0.5B, further fine-tuned for 1 epoch on wiki, math, science, and chat datasets. Based on Cinder data. The model was then fine-tuned for 1 more epoch on RAG data. |
|
|
|
Rough list of final datasets: |
|
formatted_beaugogh-openorca-multiplechoice-10k.txt |
|
formatted_BYC-Sophie-samsum-chatgpt-summary.txt |
|
formatted_conversation_bio.txt |
|
formatted_conversation_create_cinder_1.txt |
|
formatted_conversation_Electrical-engineering.txt |
|
formatted_conversation_multiturn_stem.txt |
|
formatted_conversation_physics.txt |
|
formatted_conversation_qa_rag_chem_prog_dataset.txt |
|
formatted_conversation_qa_robot_ai_dataset.txt |
|
formatted_conversation_qa_shopify_dataset1_rag.txt |
|
formatted_conversation_qa_shopify_dataset_rag.txt |
|
formatted_dyumat-databricks-dolly-5k-rag-split.txt |
|
formatted_Hypoxiic-wikipedia-summary-subset1k-summary_token.txt |
|
formatted_neural-bridge-rag-dataset-12000.txt |
|
formatted_rachid16-rag_finetuning_data.txt |
|
formatted_tiny_stories_1_summary_token_tag_token-xaa.txt |
|
formatted_tiny_stories_2_summary_token_assistant-xah.txt |
|
med_rag_small.txt |
|
z_formatted_cinder_test.txt |