2024-02-26 22:09:14 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40002, worker_address='http://localhost:40002', controller_address='http://localhost:10000', model_path='MBZUAI/MobiLlama-1B', revision='main', device='cuda', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=False, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, exllama_cache_8bit=False, enable_xft=False, xft_max_seq_len=4096, xft_dtype=None, model_names=None, conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None, debug=False, ssl=False)
2024-02-26 22:09:14 | INFO | model_worker | Loading the model ['MobiLlama-1B'] on worker 23461ac5 ...
2024-02-26 22:09:15 | ERROR | stderr | 
Loading checkpoint shards:   0%|                                                                                   | 0/2 [00:00<?, ?it/s]
2024-02-26 22:09:22 | ERROR | stderr | 
Loading checkpoint shards:  50%|█████████████████████████████████████▌                                     | 1/2 [00:07<00:07,  7.22s/it]
2024-02-26 22:09:23 | ERROR | stderr | 
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████| 2/2 [00:07<00:00,  3.24s/it]
2024-02-26 22:09:23 | ERROR | stderr | 
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████| 2/2 [00:07<00:00,  3.84s/it]
2024-02-26 22:09:23 | ERROR | stderr | 
2024-02-26 22:09:23 | INFO | model_worker | Register to controller
2024-02-26 22:09:23 | ERROR | stderr | INFO:     Started server process [458643]
2024-02-26 22:09:23 | ERROR | stderr | INFO:     Waiting for application startup.
2024-02-26 22:09:23 | ERROR | stderr | INFO:     Application startup complete.
2024-02-26 22:09:23 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:40002 (Press CTRL+C to quit)
2024-02-26 22:10:01 | INFO | stdout | INFO:     127.0.0.1:51206 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 22:10:08 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:10:53 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:11:09 | INFO | stdout | INFO:     127.0.0.1:58710 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 22:11:38 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:12:23 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:13:08 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:13:53 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:14:39 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:14:41 | INFO | stdout | INFO:     127.0.0.1:60484 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 22:14:59 | INFO | stdout | INFO:     127.0.0.1:45638 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 22:15:24 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:16:09 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:16:54 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:17:39 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:17:57 | INFO | stdout | INFO:     127.0.0.1:42542 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 22:18:14 | INFO | stdout | INFO:     127.0.0.1:60442 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-02-26 22:18:24 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:19:09 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:19:54 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:20:39 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:21:24 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:22:09 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:22:54 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:23:39 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:24:24 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:25:09 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:25:54 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:26:39 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:27:24 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:28:09 | INFO | model_worker | Send heart beat. Models: ['MobiLlama-1B']. Semaphore: None. call_ct: 0. worker_id: 23461ac5. 
2024-02-26 22:28:09 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a00a0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:14 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a0b50>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:19 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a1420>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:24 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a1cf0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:29 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a1000>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:34 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a0550>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:39 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a00a0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:44 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a2530>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:49 | ERROR | model_worker | heart beat error: HTTPConnectionPool(host='localhost', port=10000): Max retries exceeded with url: /receive_heart_beat (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff4ec7a2dd0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2024-02-26 22:28:52 | ERROR | stderr | INFO:     Shutting down
2024-02-26 22:28:52 | ERROR | stderr | INFO:     Waiting for application shutdown.
2024-02-26 22:28:52 | ERROR | stderr | INFO:     Application shutdown complete.
2024-02-26 22:28:52 | ERROR | stderr | INFO:     Finished server process [458643]