How can I use this model offline?

#81
by BenWan - opened

Downloading the code of xlm-roberta-flash-implementation raises a lot of errors.

Hi @BenWan , have you tried cloning jina-embeddings-v3?

I would also like to be able to use it offline with Ollama 3.1 8b running in Docker.

I guess the author and I encountered the same problem:

I downloaded jina-embeddings-v3 and tried to use it:

import logging

from sentence_transformers import SentenceTransformer

logger = logging.getLogger(__name__)


class WordEmbeddingUtil(object):
    def __init__(self, wd_emb, model_path, device='cpu'):
        # jina-embeddings-v3
        # https://huggingface.co/jinaai/jina-embeddings-v3
        # !pip install transformers torch einops
        # !pip install 'numpy<2'
        # !pip install sentence-transformers
        # !pip install flash-attn --no-build-isolation
        self.wd_emb = wd_emb
        if self.wd_emb in ['all-mpnet-base-v2', 'jina-embeddings-v3']:
            # trust_remote_code=True allows the model's custom modeling code to run
            self.model = SentenceTransformer(model_path, trust_remote_code=True, device=device)
            logger.info('%s model loaded.', self.wd_emb)
        else:
            raise NotImplementedError('word embedding model [%s] not supported.' % self.wd_emb)

    def get_phrase_emb(self, phrase):
        if self.wd_emb == 'jina-embeddings-v3':
            embeddings = self.model.encode(phrase, truncate_dim=768, max_length=128)
        else:
            embeddings = self.model.encode(phrase)
        return embeddings


if __name__ == '__main__':
    # test jina-embeddings-v3 loaded from a local directory
    word_embeding_utils = WordEmbeddingUtil('jina-embeddings-v3', '/root/autodl-tmp/Mention_kg_graph/kg_graph_gen_modify/sentence-transformers/jina-embeddings-v3-origin', device='cuda:1')
    sentence = 'Follow the white rabbit into the rave party of memories'
    embedds = word_embeding_utils.get_phrase_emb(sentence)
    print(embedds.shape)

I confirm that the directory /root/autodl-tmp/Mention_kg_graph/kg_graph_gen_modify/sentence-transformers/jina-embeddings-v3-origin has been downloaded, but I encountered an error upon execution:

Could not locate the configuration_xlm_roberta.py inside jinaai/xlm-roberta-flash-implementation.
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connection.py", line 169, in _new_conn
    conn = connection.create_connection(
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connection.py", line 353, in connect
    conn = self._new_conn()
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connection.py", line 174, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7f3e8a081520>, 'Connection to huggingface.co timed out. (connect timeout=10)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/root/miniconda3/lib/python3.8/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /jinaai/xlm-roberta-flash-implementation/resolve/main/configuration_xlm_roberta.py (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f3e8a081520>, 'Connection to huggingface.co timed out. (connect timeout=10)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1746, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1666, in get_hf_file_metadata
    r = _request_wrapper(
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 364, in _request_wrapper
    response = _request_wrapper(
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 387, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/root/miniconda3/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/utils/_http.py", line 93, in send
    return super().send(request, *args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/requests/adapters.py", line 504, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /jinaai/xlm-roberta-flash-implementation/resolve/main/configuration_xlm_roberta.py (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f3e8a081520>, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: ff0895c6-6fe5-4be9-9551-92982529cea9)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/transformers/utils/hub.py", line 403, in cached_file
    resolved_file = hf_hub_download(
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1232, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1339, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/root/miniconda3/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1857, in _raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/autodl-tmp/Mention_kg_graph/kg_graph_gen_modify/word_emb.py", line 43, in <module>
    word_embeding_utils = WordEmbeddingUtil('jina-embeddings-v3', '/root/autodl-tmp/Mention_kg_graph/kg_graph_gen_modify/sentence-transformers/jina-embeddings-v3-origin',device='cuda:1')
  File "/root/autodl-tmp/Mention_kg_graph/kg_graph_gen_modify/word_emb.py", line 23, in __init__
    self.model = SentenceTransformer(model_path, trust_remote_code=True, device=device)
  File "/root/miniconda3/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 306, in __init__
    modules, self.module_kwargs = self._load_sbert_model(
  File "/root/miniconda3/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 1722, in _load_sbert_model
    module = module_class(model_name_or_path, cache_dir=cache_folder, backend=self.backend, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/jina-embeddings-v3-origin/custom_st.py", line 66, in __init__
    self.config = AutoConfig.from_pretrained(model_name_or_path, **config_args, cache_dir=cache_dir)
  File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 1015, in from_pretrained
    config_class = get_class_from_dynamic_module(
  File "/root/miniconda3/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 540, in get_class_from_dynamic_module
    final_module = get_cached_module_file(
  File "/root/miniconda3/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 344, in get_cached_module_file
    resolved_module_file = cached_file(
  File "/root/miniconda3/lib/python3.8/site-packages/transformers/utils/hub.py", line 446, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like jinaai/xlm-roberta-flash-implementation is not the path to a directory containing a file named configuration_xlm_roberta.py.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

This seems to indicate that we also need to download jinaai/xlm-roberta-flash-implementation (https://huggingface.co/jinaai/xlm-roberta-flash-implementation). Am I supposed to download those files into the same directory as jina-embeddings-v3? I tried that, but it had no effect.
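For reference, here is a minimal sketch of fetching both repositories on a machine that can reach the Hub, so the folders can then be copied to the offline machine. It assumes `huggingface_hub` is installed and uses its `snapshot_download` function; the helper names and the `<root>/<repo name>` layout are hypothetical choices, not something the model authors prescribe. Downloading alone may not be enough, since the model's config still refers to the implementation by its Hub repo id.

```python
from pathlib import Path

# Both repositories are needed: the traceback above shows the model trying to
# fetch configuration_xlm_roberta.py from the second one.
REPO_IDS = [
    "jinaai/jina-embeddings-v3",
    "jinaai/xlm-roberta-flash-implementation",
]


def local_dir_for(repo_id: str, root: str) -> Path:
    """Map 'org/name' to '<root>/name' (a hypothetical folder layout)."""
    return Path(root) / repo_id.split("/")[-1]


def download_all(root: str) -> list:
    """Download every repo in REPO_IDS below `root` and return the folders."""
    # Imported lazily so local_dir_for() works without the package installed.
    from huggingface_hub import snapshot_download

    folders = []
    for repo_id in REPO_IDS:
        target = local_dir_for(repo_id, root)
        snapshot_download(repo_id=repo_id, local_dir=str(target))
        folders.append(target)
    return folders
```

Calling `download_all("/some/models/root")` on the online machine should leave one folder per repository under that root.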

I solved the problem by putting xlm-roberta-flash-implementation under HF_HOME/jinaai/
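A small sketch for checking that layout, in case it helps. It assumes HF_HOME falls back to `~/.cache/huggingface` when the variable is unset, and the list of required files is inferred from the module names in this thread (the traceback and the auto_map entries), so it may be incomplete. The function names are hypothetical.

```python
import os
from pathlib import Path

# Module files referenced in this thread; the repo may contain more.
REQUIRED_FILES = [
    "configuration_xlm_roberta.py",
    "modeling_lora.py",
    "modeling_xlm_roberta.py",
]


def implementation_dir() -> Path:
    """Return HF_HOME/jinaai/xlm-roberta-flash-implementation."""
    hf_home = os.environ.get("HF_HOME", "~/.cache/huggingface")
    return Path(hf_home).expanduser() / "jinaai" / "xlm-roberta-flash-implementation"


def missing_files(directory: Path) -> list:
    """List the required module files that are not present in `directory`."""
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    target = implementation_dir()
    print("checking", target)
    print("missing:", missing_files(target))
```

An empty "missing" list only confirms the files are in place; it does not prove that the loader will pick them up.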

1. Put xlm-roberta-flash-implementation in the same folder, as XYZliang did.
2. Edit the auto_map entry in config.json:
"auto_map": {
    "AutoConfig": "path/xlm-roberta-flash-implementation--configuration_xlm_roberta.XLMRobertaFlashConfig",
    "AutoModel": "path/xlm-roberta-flash-implementation--modeling_lora.XLMRobertaLoRA",
    "AutoModelForMaskedLM": "path/xlm-roberta-flash-implementation--modeling_xlm_roberta.XLMRobertaForMaskedLM",
    "AutoModelForPreTraining": "path/xlm-roberta-flash-implementation--modeling_xlm_roberta.XLMRobertaForPreTraining"
}
3. Load the model with SentenceTransformer(model_path, trust_remote_code=True, local_files_only=True).
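Step 2 can be scripted. The sketch below rewrites every auto_map entry of the form `<repo>--<module>.<Class>` so that the part before `--` becomes a local folder. It assumes the shipped config.json uses the Hub repo id as that prefix, which the traceback suggests but which is not shown in this thread; `patch_auto_map` is a hypothetical helper name.

```python
import json
from pathlib import Path


def patch_auto_map(config_path, impl_dir) -> dict:
    """Point auto_map entries at a local copy of the implementation code.

    Entries look like '<repo>--<module>.<Class>'. Only the part before '--'
    is replaced; entries without '--' are left unchanged.
    """
    config_file = Path(config_path)
    config = json.loads(config_file.read_text(encoding="utf-8"))

    patched = {}
    for key, ref in config.get("auto_map", {}).items():
        if "--" in ref:
            # Keep '<module>.<Class>' and swap in the local folder as prefix.
            target = ref.split("--", 1)[1]
            patched[key] = f"{impl_dir}--{target}"
        else:
            patched[key] = ref

    config["auto_map"] = patched
    config_file.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return patched
```

After patching, loading as in step 3 should resolve the custom classes from the local folder, provided the installed transformers version accepts a directory in that position. Keeping a backup of the original config.json is a good idea.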

I had the same problem: the roberta download failed when I used this model with sentence-transformers. Can the authors patch the download config bug?

I had the same issue but wanted to add the model to my Docker image. I solved it by loading the model once during the Docker build. This is not ideal, but it works for now. I would also appreciate a fix from the authors in the repository. I download both models and then run:
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('jinaai/jina-embeddings-v3', local_files_only=True, trust_remote_code=True)"
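At container runtime, a sketch like the following may help keep the libraries from contacting the Hub at all. It relies on the HF_HOME, HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE environment variables; the helper names are hypothetical, and it assumes the cache under `hf_home` was already filled during the build.

```python
import os


def offline_env(hf_home: str) -> dict:
    """Environment variables for running against a pre-filled cache."""
    return {
        "HF_HOME": hf_home,
        "HF_HUB_OFFLINE": "1",
        "TRANSFORMERS_OFFLINE": "1",
    }


def load_offline(model_id: str, hf_home: str):
    """Load a SentenceTransformer model without network access."""
    # Set the variables before the Hugging Face libraries are imported,
    # because they read these settings at import time.
    os.environ.update(offline_env(hf_home))
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(
        model_id, trust_remote_code=True, local_files_only=True
    )
```

The same variables could instead be set with ENV lines in the Dockerfile, which avoids any dependence on import order.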

Is the tokenizer failing to load because the vocab was not saved with the model? @BenWan
