MPT-7b-InstructAndStorywriting-50_50-Merge

A merge between the long-context Storywriter and the short-context Instruct MPT-7b models.

Model Details:

This model is a merge between the following two MPT-7b models:

2048 CTX MPT-7b Instruct: https://huggingface.co/TehVenom/MPT-7b-instruct-V
65k CTX MPT-7b Storywriter: https://huggingface.co/TehVenom/MPT-7b-storywriter-Apache-2.0/

This merge was done using a weighted average merge strategy, and the end result is a model composed of:

MPT-7b Storywriter [50%] + MPT-7b Instruct [50%]
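
For readers unfamiliar with the idea, a weighted average merge can be sketched as a per-parameter blend of two checkpoints that share the same architecture. The snippet below is only an illustration under that assumption, not the script used to build this model: the function name `weighted_average_merge`, the parameter names, and the toy values are all hypothetical, and plain floats stand in for weight tensors so the example runs without extra dependencies.

```python
def weighted_average_merge(state_a, state_b, weight_a=0.5):
    """Blend two parameter dicts key by key.

    Hypothetical helper for illustration. Values may be floats or any
    array type that supports scalar multiplication and addition.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    if state_a.keys() != state_b.keys():
        raise ValueError("both models must expose the same parameter names")
    weight_b = 1.0 - weight_a
    return {
        name: weight_a * state_a[name] + weight_b * state_b[name]
        for name in state_a
    }


# Toy stand-ins for the two checkpoints (made-up names and values).
storywriter = {"layer.weight": 2.0, "layer.bias": -1.0}
instruct = {"layer.weight": 4.0, "layer.bias": 3.0}

# A 50/50 merge gives each parameter the midpoint of the two sources.
merged = weighted_average_merge(storywriter, instruct, weight_a=0.5)
print(merged)
```

With `weight_a=0.5` every merged parameter is the plain mean of the two source values, which corresponds to the 50% / 50% composition listed above.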

This was done to test how long-context tunes affect attention when merged with a model that has been trained for a different purpose on a shorter context span.

The end result is intended to be a model that is capable of long prose while inheriting some of the Instruct base's Assistant / Instruct / Helpful properties.

Because MPT-7b Storywriter was tuned on a wide array of books, this model may generate content that is considered NSFW.

The best prompt format is unknown, but try approaching it with a story / text completion prompt style first, then a mix of that and Alpaca's instruct format, to see what gives the most interesting results.
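
The three prompt styles suggested here could be built roughly as follows. This is a sketch for experimentation only: the template text follows the commonly used Alpaca instruct format (the variant without an input field), the helper names are hypothetical, and the "mixed" layout is just one guess at how to combine the two styles, not a format the model is known to prefer.

```python
# Alpaca-style instruct template (variant without an "Input" field).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def completion_prompt(story_opening):
    """Plain text completion: the model simply continues the passage."""
    return story_opening


def alpaca_prompt(instruction):
    """Pure Alpaca-style instruct prompt."""
    return ALPACA_TEMPLATE.format(instruction=instruction)


def mixed_prompt(instruction, story_opening):
    """Assumed hybrid: an instruct header, then the response is seeded
    with the first words of the story for the model to continue."""
    return alpaca_prompt(instruction) + story_opening


opening = "The lighthouse had been dark for eleven years when"
print(completion_prompt(opening))
print(alpaca_prompt("Write a short story about a lighthouse."))
print(mixed_prompt("Continue this story in a calm, descriptive voice.", opening))
```

Comparing the outputs of all three on the same story opening is a simple way to see which style this merge responds to best.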

Read the original model card to understand how to run inference on this model.