Memory is completely exhausted while generating embeddings, crashing the server
Hi all
I am trying to create embeddings for 15 lakh (1.5 million) rows of data using sentence-transformers/all-MiniLM-L6-v2 for an application and upload the embeddings to a pgvector database.
While creating the embeddings, the server's memory is completely exhausted and the server crashes.
Please help me here.
Hello!
I'm aware of this issue. The gist is that as more of the texts get turned into embeddings, the already processed embeddings all remain in memory until all texts have been processed. This can lead to high memory usage. My recommendation at this time is to chunk your texts and only process e.g. 1 lakh sentences at a time, upload those embeddings, and then do the next chunk.
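For example, here is a rough sketch of that chunked flow. It assumes psycopg2 plus the `pgvector` Python package, a placeholder table `items(content text, embedding vector(384))` (all-MiniLM-L6-v2 produces 384-dimensional embeddings), and that `texts` holds your input strings; adjust the names and connection string for your setup:

```python
import psycopg2
from pgvector.psycopg2 import register_vector
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
register_vector(conn)  # lets psycopg2 send numpy arrays as pgvector values

texts = ["..."]  # placeholder for your 15 lakh input strings

CHUNK_SIZE = 100_000  # 1 lakh texts per chunk

for start in range(0, len(texts), CHUNK_SIZE):
    chunk = texts[start : start + CHUNK_SIZE]
    # Only this chunk's embeddings are held in memory at any one time
    embeddings = model.encode(chunk, batch_size=256, show_progress_bar=True)
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO items (content, embedding) VALUES (%s, %s)",
            list(zip(chunk, embeddings)),
        )
    conn.commit()
    del embeddings  # release this chunk before encoding the next
```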
Hope this helps.
- Tom Aarsen
Hey @tomaarsen, thank you for your reply!
For now I am just doing a POC. If it is successful, I will scale the same approach up to 5 crore+ (50 million+) rows of data, and at that scale this way of implementing it is not advisable.
Is there any way to parallelize the creation of embeddings?
Yes, you can use https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode_multi_process for encoding on multiple processes or multiple GPUs, but the memory issue might still persist then. Chunking remains a good option I think.
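Roughly, the pattern from that page looks like this (a minimal sketch; `texts` again stands in for your input strings, and the `__main__` guard is required because the pool spawns worker processes):

```python
from sentence_transformers import SentenceTransformer

if __name__ == "__main__":
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    texts = ["..."]  # placeholder for your input strings

    # Starts one worker per available GPU, or several CPU workers otherwise
    pool = model.start_multi_process_pool()

    # The texts are split across the pool's processes and encoded in parallel
    embeddings = model.encode_multi_process(texts, pool)

    model.stop_multi_process_pool(pool)
```

Note that the full embeddings array is still collected in memory at the end, so for very large datasets you would combine this with the chunking above.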
Will check the above link and get back to you ASAP.
Thanks!