How to start?
Is there any documentation on how to get started? What should be downloaded, and what should be run? I am a software developer, but these pages are confusing to me :-)
In gptx_tokenizer.py I found the following code:
info_file = save_dir / "tokenizer_config.json"
What does the `/` operator do with strings?
Or was this Python program generated by AI and never worked?
Hey martin3000, I'm assuming that you already found the corresponding section in the README page:
https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4#usage
First, you should install the required Python packages. Ideally, you'd install them in a so-called virtual environment, so that the packages are nicely encapsulated:
python -m pip install numpy torch huggingface_hub transformers sentencepiece
Then you should be able to follow the rest of the section, remembering to always activate the virtual environment should you decide to use one (highly recommended!). :)
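In case it helps, the virtual environment can also be created from Python itself via the standard-library `venv` module (the directory name `teuken-env` is just an example, not anything the README prescribes):

```python
# Create a virtual environment with pip available inside it.
import venv

venv.create("teuken-env", with_pip=True)

# Afterwards, activate it from your shell, e.g. on Linux/macOS:
#   source teuken-env/bin/activate
# or on Windows:
#   teuken-env\Scripts\activate
# and then run the pip install command from above inside it.
```

Once activated, `python` and `pip` resolve to the environment's own copies, so the installed packages don't pollute your system Python.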
As for the `/` operator: for strings, it is indeed not implemented. However, `save_dir` is supposed to be of type `Path` according to the Python type hints for the `save_tokenizer_config` method. For `Path`s, the `/` operator is implemented and joins paths with the correct path separator for your operating system.
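To illustrate the difference, here is a minimal standalone sketch of how `pathlib.Path` overloads `/` (the directory and file names are made up for the example):

```python
from pathlib import Path

save_dir = Path("checkpoints/tokenizer")
# Path.__truediv__ joins path segments with the OS-appropriate separator.
info_file = save_dir / "tokenizer_config.json"
print(info_file)  # e.g. checkpoints/tokenizer/tokenizer_config.json on POSIX

# With a plain string on the left-hand side, "/" is not defined:
try:
    "checkpoints/tokenizer" / "tokenizer_config.json"
except TypeError:
    print("'/' is not implemented for two strings")
```

This is why the line from `gptx_tokenizer.py` works as long as `save_dir` really is a `Path`, and fails only when a plain string is passed in.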
It is simple to improve the implementation of `save_tokenizer_config` to handle strings being passed as `save_dir` (we just need to add `save_dir = Path(save_dir)`), so thanks for making us aware! We'll improve this!
You should be able to `git pull` the repository now so that you can pass strings to `save_tokenizer_config`.
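For completeness, a sketch of what that kind of fix looks like. The function name mirrors `save_tokenizer_config`, but the body and the example config dict are illustrative, not the actual repository code:

```python
import json
from pathlib import Path
from typing import Union

def save_tokenizer_config(save_dir: Union[str, Path], config: dict) -> Path:
    # Coerce strings to Path; Path(Path(...)) is a no-op, so Path callers
    # are unaffected and str callers now work too.
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    info_file = save_dir / "tokenizer_config.json"
    info_file.write_text(json.dumps(config, indent=2))
    return info_file

# Both call styles now work (the config content here is just an example):
save_tokenizer_config("out/tokenizer", {"model_type": "sentencepiece"})
save_tokenizer_config(Path("out/tokenizer"), {"model_type": "sentencepiece"})
```

The one-line coercion at the top of the function is the whole fix; everything else stays unchanged.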
Hey Jaeb, thanks for helping. I did not realize that "openGPT-X/Teuken-7B-instruct-research-v0.4" is a model already registered with Hugging Face, and that `transformers.AutoModelForCausalLM` automagically finds the model and downloads everything. Before, I was looking for a way to download `model*.safetensors`, not knowing that this happens automatically via the example code from https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4#usage
Thanks
Martin
Yeah, there's a lot of magic happening on the HuggingFace side, so I very much understand the confusion... :)
Glad if it works now!
Is there a way to run it on fly.io?
I am not sure if this is a question that we can answer. Maybe you have to ask in the fly.io community.
With Ollama it is very easy to run models on fly.io, because Ollama is pre-installed there: https://fly.io/docs/python/do-more/add-ollama/