	---
	license: llama3.2
	---

	# Introduction

	This repository hosts the Llama 3.2 models for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes the 1B and 3B versions of the model, along with their quantized variants, all in `.pte` format and ready for use with the ExecuTorch runtime.

	If you'd like to run these models in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.

	## Compatibility

	If you intend to use these models outside of React Native ExecuTorch, make sure your runtime is compatible with the ExecuTorch version used to export the `.pte` files. For more details, see the compatibility note in the [ExecuTorch GitHub repository](https://github.com/pytorch/executorch/blob/11d1742fdeddcf05bc30a6cfac321d2a2e3b6768/runtime/COMPATIBILITY.md?plain=1#L4). If you work with React Native ExecuTorch, the constants exported by the library guarantee compatibility with the runtime used behind the scenes.

	These models were exported using ExecuTorch commit `fe20be98c`, and no forward compatibility is guaranteed. Older versions of the runtime may not work with these files.

	## Repository Structure

	The repository is organized into two main directories:

	- `llama-3.2-1B`
	- `llama-3.2-3B`

	Each directory contains several versions of the model: QLoRA, SpinQuant, and the original (unquantized) model.

	- The `.pte` file should be passed to the `modelSource` parameter.
	- The corresponding `.bin` file should be used for `tokenizerSource`.
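As a rough illustration of the two parameters above, the sketch below builds download URLs for a `.pte` file and its `.bin` tokenizer using Hugging Face's standard `resolve` URL layout. The helper names (`resolveUrl`, `llamaSources`) and the file names in the usage example are hypothetical placeholders, not actual files in this repository; check the directory listing for the real paths. It also assumes that the library accepts remote URLs for `modelSource` and `tokenizerSource`.

```typescript
// Repository id as shown on this model card's page; "main" is the default branch.
const REPO_ID = "software-mansion/react-native-executorch-llama-3.2";

interface LlamaSources {
  modelSource: string;
  tokenizerSource: string;
}

// Builds a Hugging Face "resolve" URL for a path inside the repository.
function resolveUrl(pathInRepo: string, revision: string = "main"): string {
  const encoded = pathInRepo.split("/").map(encodeURIComponent).join("/");
  return `https://huggingface.co/${REPO_ID}/resolve/${revision}/${encoded}`;
}

// Hypothetical helper: pairs a .pte model with its .bin tokenizer and
// rejects paths with the wrong extension before any download is attempted.
function llamaSources(ptePath: string, tokenizerPath: string): LlamaSources {
  if (!ptePath.endsWith(".pte")) {
    throw new Error(`modelSource must point to a .pte file, got: ${ptePath}`);
  }
  if (!tokenizerPath.endsWith(".bin")) {
    throw new Error(`tokenizerSource must point to a .bin file, got: ${tokenizerPath}`);
  }
  return {
    modelSource: resolveUrl(ptePath),
    tokenizerSource: resolveUrl(tokenizerPath),
  };
}

// Usage: the file names below are placeholders, not real files in the repository.
const exampleSources = llamaSources(
  "llama-3.2-1B/PLACEHOLDER_MODEL.pte",
  "llama-3.2-1B/PLACEHOLDER_TOKENIZER.bin",
);
console.log(exampleSources.modelSource);
console.log(exampleSources.tokenizerSource);
```

The resulting object can then be passed to whichever React Native ExecuTorch API takes `modelSource` and `tokenizerSource`. If you use the constants shipped with the library instead, this step is unnecessary.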

	If you wish to export the model yourself, you'll need to obtain the model weights and the `params.json` file from the official [Llama 3.2 collection](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf) on Hugging Face.

	For the best performance-to-quality ratio, we recommend the QLoRA version, which is optimized for speed without sacrificing too much model quality.