How to start?

#7
by martin3000 - opened

Is there any description of how to start? What should be downloaded, and what should be run? I am a software developer, but these pages are confusing to me :-)

In gptx_tokenizer.py I found the following code:
info_file = save_dir / "tokenizer_config.json"
What does the "/" operator do with strings?
Or was this Python program generated by AI and has never worked?

OpenGPT-X org

Hey martin3000, I'm assuming that you already found the corresponding section in the README page:
https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4#usage

First, you should install the required Python packages. Ideally, you'd install those in a so-called virtual environment, so that the packages are nicely encapsulated:

python -m pip install numpy torch huggingface_hub transformers sentencepiece

Then you should be able to follow the rest of the section, remembering to always activate the virtual environment should you decide to use one (highly recommended!). :)
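
In case it helps, here is a rough sketch of what the usage section boils down to. Treat it as an outline rather than the canonical snippet; the exact arguments (e.g. the dtype and how the chat template is applied) are in the linked README:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "openGPT-X/Teuken-7B-instruct-research-v0.4"

# The first call downloads the weights, tokenizer files and custom model code
# from the Hub and caches them locally (usually under ~/.cache/huggingface).
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype=torch.bfloat16)
model = model.eval()

# Prompting and generation then follow the chat-template example shown in the README.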

As for the / operator: for strings, it is indeed not implemented. However, save_dir is supposed to be of type Path according to the Python type hints of the save_tokenizer_config method. For Path objects, the / operator is implemented and joins paths with the correct path separator for your operating system.
It is simple to improve the implementation of save_tokenizer_config to also handle strings passed as save_dir (we just need to add save_dir = Path(save_dir)), so thanks for making us aware! We'll improve this!
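
Purely for illustration (this is not the actual gptx_tokenizer.py code), the pathlib behaviour and the one-line fix look roughly like this:

from pathlib import Path

save_dir = Path("some_output_dir")
info_file = save_dir / "tokenizer_config.json"  # Path's / operator joins path components
print(info_file)  # some_output_dir/tokenizer_config.json (backslashes on Windows)

# Hypothetical sketch of the fix: normalize first, then both str and Path work.
def save_tokenizer_config(save_dir):
    save_dir = Path(save_dir)
    info_file = save_dir / "tokenizer_config.json"
    ...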

OpenGPT-X org

You should be able to git pull the repository now so that you can pass strings to save_tokenizer_config.

Hey Jaeb, thanks for helping. I did not realize that "openGPT-X/Teuken-7B-instruct-research-v0.4" is a model already registered on Hugging Face and that transformers.AutoModelForCausalLM automagically finds the model and downloads everything. Before, I was looking for a way to download model*.safetensors, not knowing that this happens automatically with the example code from https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4#usage
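
(For anyone who, like me, would rather fetch the files explicitly instead of relying on the automatic download: the huggingface_hub package installed above seems to offer snapshot_download for that.)

from huggingface_hub import snapshot_download

# Downloads model*.safetensors, tokenizer files etc. into the local Hugging Face cache;
# transformers will then reuse the cached copy instead of downloading again.
local_dir = snapshot_download(repo_id="openGPT-X/Teuken-7B-instruct-research-v0.4")
print(local_dir)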
Thanks
Martin

OpenGPT-X org

Yeah, there's a lot of magic happening on the HuggingFace side, so I very much understand the confusion... :)
Glad it works now!

Is there a way to run it on fly.io?

OpenGPT-X org

I am not sure if this is a question that we can answer. Maybe you have to ask in the fly.io community.

With Ollama it is very easy to run it on fly.io, because Ollama is pre-installed: https://fly.io/docs/python/do-more/add-ollama/

mfromm changed discussion status to closed
