Create README.md
        README.md
    ADDED
    
    | @@ -0,0 +1,27 @@ | |
| 1 | 
            +
            ---
         | 
| 2 | 
            +
            license: apache-2.0
         | 
| 3 | 
            +
            base_model:
         | 
| 4 | 
            +
            - Qwen/Qwen2.5-7B
         | 
| 5 | 
            +
            pipeline_tag: text-generation
         | 
| 6 | 
            +
            language:
         | 
| 7 | 
            +
            - en
         | 
| 8 | 
            +
            library_name: transformers
         | 
| 9 | 
            +
            tags:
         | 
| 10 | 
            +
            - text-generation-inference
         | 
| 11 | 
            +
            ---
         | 
| 12 | 
            +
             | 
| 13 | 
            +
            ## Model Description
         | 
| 14 | 
            +
             | 
| 15 | 
            +
            Optimized Layer Merging (OLM)
         | 
| 16 | 
            +
            is a transformer optimization framework that implements automated layer recombination.
         | 
| 17 | 
            +
             | 
| 18 | 
            +
            OLM creates a Frankenstein's monster out of language models: it cherry-picks the best-performing layers across different models to create a superior hybrid.
         | 
| 19 | 
            +
            The core mechanism:
         | 
| 20 | 
            +
             | 
| 21 | 
            +
            - Takes multiple language models as input
         | 
| 22 | 
            +
            - Uses a base model as the foundation
         | 
| 23 | 
            +
            - Iteratively replaces individual layers, evaluating performance on specified datasets
         | 
| 24 | 
            +
            - Keeps the best-performing layer at each position, judged by metrics such as perplexity, exact match, and a custom "quality" score
         | 
| 25 | 
            +
            - Builds a fusion model layer by layer while maintaining or improving performance
         | 
| 26 | 
            +
             | 
| 27 | 
            +
            https://github.com/jeffmeloy/olm
         |
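
The core mechanism listed above can be illustrated with a toy sketch. This is not the OLM implementation: the names `greedy_layer_merge` and `toy_score` are hypothetical, each "layer" is a single float standing in for real layer weights, all models are assumed to have the same number of layers, and a single lower-is-better score (in the spirit of perplexity) replaces the combination of perplexity, exact match, and "quality" described above.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-ins: a "layer" is one float, a "model" is a list of layers.
Layer = float
Model = List[Layer]


def greedy_layer_merge(
    base: Model,
    candidates: Sequence[Model],
    score: Callable[[Model], float],
) -> Model:
    """Greedy per-position layer selection (illustrative only).

    Starts from a copy of the base model and, for each layer position,
    tries the layer at that position from every candidate model. A swap
    is kept only if it strictly lowers the score, so the merged model
    never scores worse than the base model.
    """
    merged = list(base)
    best = score(merged)
    for pos in range(len(merged)):
        for cand in candidates:
            trial = list(merged)
            trial[pos] = cand[pos]  # replace one layer, keep the rest
            trial_score = score(trial)
            if trial_score < best:  # lower is better in this sketch
                merged, best = trial, trial_score
    return merged


# Toy evaluation: distance of each layer from an arbitrary "ideal" value.
# A real evaluation would run the model on the specified datasets.
TARGET = [1.0, 2.0, 3.0]


def toy_score(model: Model) -> float:
    return sum(abs(layer - ideal) for layer, ideal in zip(model, TARGET))


base_model = [1.0, 5.0, 0.0]
model_a = [9.0, 2.0, 7.0]
model_b = [4.0, 4.0, 3.0]

fused = greedy_layer_merge(base_model, [model_a, model_b], toy_score)
print(fused, toy_score(fused))
```

Because a swap is accepted only when it improves the score, the loop mirrors the "maintaining or improving performance" property: position 0 stays with the base model, position 1 comes from `model_a`, and position 2 from `model_b`.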