[2025-01-15 18:15:45,507 I 517589 517589] (raylet) main.cc:180: Setting cluster ID to: c36f90a03eb214af71608b721c24e70055c82cf4a8c1f87ce389b92c
[2025-01-15 18:15:45,516 I 517589 517589] (raylet) main.cc:289: Raylet is not set to kill unknown children.
[2025-01-15 18:15:45,516 I 517589 517589] (raylet) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-15 18:15:45,517 I 517589 517589] (raylet) main.cc:419: Setting node ID node_id=594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be
[2025-01-15 18:15:45,517 I 517589 517589] (raylet) store_runner.cc:32: Allowing the Plasma store to use up to 2.14748GB of memory.
[2025-01-15 18:15:45,517 I 517589 517589] (raylet) store_runner.cc:48: Starting object store with directory /dev/shm, fallback /tmp/ray, and huge page support disabled
[2025-01-15 18:15:45,518 I 517589 517618] (raylet) dlmalloc.cc:154: create_and_mmap_buffer(2147483656, /dev/shm/plasmaXXXXXX)
[2025-01-15 18:15:45,519 I 517589 517618] (raylet) store.cc:564: Plasma store debug dump:
Current usage: 0 / 2.14748 GB
- num bytes created total: 0
0 pending objects of total size 0MB
- objects spillable: 0
- bytes spillable: 0
- objects unsealed: 0
- bytes unsealed: 0
- objects in use: 0
- bytes in use: 0
- objects evictable: 0
- bytes evictable: 0
- objects created by worker: 0
- bytes created by worker: 0
- objects restored: 0
- bytes restored: 0
- objects received: 0
- bytes received: 0
- objects errored: 0
- bytes errored: 0
[2025-01-15 18:15:46,523 I 517589 517589] (raylet) grpc_server.cc:134: ObjectManager server started, listening on port 33173.
[2025-01-15 18:15:46,526 I 517589 517589] (raylet) worker_killing_policy.cc:101: Running GroupByOwner policy.
[2025-01-15 18:15:46,527 I 517589 517589] (raylet) memory_monitor.cc:47: MemoryMonitor initialized with usage threshold at 94999994368 bytes (0.95 system memory), total system memory bytes: 99999997952
[2025-01-15 18:15:46,527 I 517589 517589] (raylet) node_manager.cc:287: Initializing NodeManager node_id=594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be
[2025-01-15 18:15:46,528 I 517589 517589] (raylet) grpc_server.cc:134: NodeManager server started, listening on port 39451.
[2025-01-15 18:15:46,537 I 517589 517682] (raylet) agent_manager.cc:77: Monitor agent process with name dashboard_agent/424238335
[2025-01-15 18:15:46,538 I 517589 517684] (raylet) agent_manager.cc:77: Monitor agent process with name runtime_env_agent
[2025-01-15 18:15:46,538 I 517589 517589] (raylet) event.cc:493: Ray Event initialized for RAYLET
[2025-01-15 18:15:46,538 I 517589 517589] (raylet) event.cc:324: Set ray event level to warning
[2025-01-15 18:15:46,540 I 517589 517589] (raylet) raylet.cc:134: Raylet of id, 594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be started. Raylet consists of node_manager and object_manager. node_manager address: 192.168.0.2:39451 object_manager address: 192.168.0.2:33173 hostname: 0cd925b1f73b
[2025-01-15 18:15:46,543 I 517589 517589] (raylet) node_manager.cc:525: [state-dump] NodeManager:
[state-dump] Node ID: 594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be
[state-dump] Node name: 192.168.0.2
[state-dump] InitialConfigResources: {node:192.168.0.2: 10000, node:__internal_head__: 10000, accelerator_type:A40: 10000, memory: 864744902660000, object_store_memory: 21474836480000, CPU: 200000, GPU: 20000}
[state-dump] ClusterTaskManager:
[state-dump] ========== Node: 594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be =================
[state-dump] Infeasible queue length: 0
[state-dump] Schedule queue length: 0
[state-dump] Dispatch queue length: 0
[state-dump] num_waiting_for_resource: 0
[state-dump] num_waiting_for_plasma_memory: 0
[state-dump] num_waiting_for_remote_node_resources: 0
[state-dump] num_worker_not_started_by_job_config_not_exist: 0
[state-dump] num_worker_not_started_by_registration_timeout: 0
[state-dump] num_tasks_waiting_for_workers: 0
[state-dump] num_cancelled_tasks: 0
[state-dump] cluster_resource_scheduler state:
[state-dump] Local id: 5613091048481760916 Local resources: {"total":{node:__internal_head__: [10000], node:192.168.0.2: [10000], GPU: [10000, 10000], CPU: [200000], memory: [864744902660000], object_store_memory: [21474836480000], accelerator_type:A40: [10000]}}, "available": {node:__internal_head__: [10000], node:192.168.0.2: [10000], GPU: [10000, 10000], CPU: [200000], memory: [864744902660000], object_store_memory: [21474836480000], accelerator_type:A40: [10000]}}, "labels":{"ray.io/node_id":"594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be",} is_draining: 0 is_idle: 1 Cluster resources: node id: 5613091048481760916{"total":{object_store_memory: 21474836480000, memory: 864744902660000, CPU: 200000, accelerator_type:A40: 10000, node:__internal_head__: 10000, node:192.168.0.2: 10000, GPU: 20000}}, "available": {object_store_memory: 21474836480000, memory: 864744902660000, CPU: 200000, accelerator_type:A40: 10000, node:__internal_head__: 10000, node:192.168.0.2: 10000, GPU: 20000}}, "labels":{"ray.io/node_id":"594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be",}, "is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []}
[state-dump] Waiting tasks size: 0
[state-dump] Number of executing tasks: 0
[state-dump] Number of pinned task arguments: 0
[state-dump] Number of total spilled tasks: 0
[state-dump] Number of spilled waiting tasks: 0
[state-dump] Number of spilled unschedulable tasks: 0
[state-dump] Resource usage {
[state-dump] }
[state-dump] Backlog Size per scheduling descriptor :{workerId: num backlogs}:
[state-dump]
[state-dump] Running tasks by scheduling class:
[state-dump] ==================================================
[state-dump]
[state-dump] ClusterResources:
[state-dump] LocalObjectManager:
[state-dump] - num pinned objects: 0
[state-dump] - pinned objects size: 0
[state-dump] - num objects pending restore: 0
[state-dump] - num objects pending spill: 0
[state-dump] - num bytes pending spill: 0
[state-dump] - num bytes currently spilled: 0
[state-dump] - cumulative spill requests: 0
[state-dump] - cumulative restore requests: 0
[state-dump] - spilled objects pending delete: 0
[state-dump]
[state-dump] ObjectManager:
[state-dump] - num local objects: 0
[state-dump] - num unfulfilled push requests: 0
[state-dump] - num object pull requests: 0
[state-dump] - num chunks received total: 0
[state-dump] - num chunks received failed (all): 0
[state-dump] - num chunks received failed / cancelled: 0
[state-dump] - num chunks received failed / plasma error: 0
[state-dump] Event stats:
[state-dump] Global stats: 0 total (0 active)
[state-dump] Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] Execution time: mean = -nan s, total = 0.000 s
[state-dump] Event stats:
[state-dump] PushManager:
[state-dump] - num pushes in flight: 0
[state-dump] - num chunks in flight: 0
[state-dump] - num chunks remaining: 0
[state-dump] - max chunks allowed: 409
[state-dump] OwnershipBasedObjectDirectory:
[state-dump] - num listeners: 0
[state-dump] - cumulative location updates: 0
[state-dump] - num location updates per second: 69998594105052000.000
[state-dump] - num location lookups per second: 69998594105040000.000
[state-dump] - num locations added per second: 0.000
[state-dump] - num locations removed per second: 0.000
[state-dump] BufferPool:
[state-dump] - create buffer state map size: 0
[state-dump] PullManager:
[state-dump] - num bytes available for pulled objects: 2147483648
[state-dump] - num bytes being pulled (all): 0
[state-dump] - num bytes being pulled / pinned: 0
[state-dump] - get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable}
[state-dump] - wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable}
[state-dump] - task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable}
[state-dump] - first get request bundle: N/A
[state-dump] - first wait request bundle: N/A
[state-dump] - first task request bundle: N/A
[state-dump] - num objects queued: 0
[state-dump] - num objects actively pulled (all): 0
[state-dump] - num objects actively pulled / pinned: 0
[state-dump] - num bundles being pulled: 0
[state-dump] - num pull retries: 0
[state-dump] - max timeout seconds: 0
[state-dump] - max timeout request is already processed. No entry.
[state-dump]
[state-dump] WorkerPool:
[state-dump] - registered jobs: 0
[state-dump] - process_failed_job_config_missing: 0
[state-dump] - process_failed_rate_limited: 0
[state-dump] - process_failed_pending_registration: 0
[state-dump] - process_failed_runtime_env_setup_failed: 0
[state-dump] - num PYTHON workers: 0
[state-dump] - num PYTHON drivers: 0
[state-dump] - num PYTHON pending start requests: 0
[state-dump] - num PYTHON pending registration requests: 0
[state-dump] - num object spill callbacks queued: 0
[state-dump] - num object restore queued: 0
[state-dump] - num util functions queued: 0
[state-dump] - num idle workers: 0
[state-dump] TaskDependencyManager:
[state-dump] - task deps map size: 0
[state-dump] - get req map size: 0
[state-dump] - wait req map size: 0
[state-dump] - local objects map size: 0
[state-dump] WaitManager:
[state-dump] - num active wait requests: 0
[state-dump] Subscriber:
[state-dump] Channel WORKER_OBJECT_LOCATIONS_CHANNEL
[state-dump] - cumulative subscribe requests: 0
[state-dump] - cumulative unsubscribe requests: 0
[state-dump] - active subscribed publishers: 0
[state-dump] - cumulative published messages: 0
[state-dump] - cumulative processed messages: 0
[state-dump] Channel WORKER_OBJECT_EVICTION
[state-dump] - cumulative subscribe requests: 0
[state-dump] - cumulative unsubscribe requests: 0
[state-dump] - active subscribed publishers: 0
[state-dump] - cumulative published messages: 0
[state-dump] - cumulative processed messages: 0
[state-dump] Channel WORKER_REF_REMOVED_CHANNEL
[state-dump] - cumulative subscribe requests: 0
[state-dump] - cumulative unsubscribe requests: 0
[state-dump] - active subscribed publishers: 0
[state-dump] - cumulative published messages: 0
[state-dump] - cumulative processed messages: 0
[state-dump] num async plasma notifications: 0
[state-dump] Remote node managers:
[state-dump] Event stats:
[state-dump] Global stats: 28 total (13 active)
[state-dump] Queueing time: mean = 1.467 ms, max = 11.475 ms, min = 28.733 us, total = 41.081 ms
[state-dump] Execution time: mean = 36.782 ms, total = 1.030 s
[state-dump] Event stats:
[state-dump] PeriodicalRunner.RunFnPeriodically - 11 total (2 active, 1 running), Execution time: mean = 170.919 us, total = 1.880 ms, Queueing time: mean = 3.701 ms, max = 11.475 ms, min = 28.733 us, total = 40.710 ms
[state-dump] NodeManager.ScheduleAndDispatchTasks - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] NodeManager.deadline_timer.spill_objects_when_over_threshold - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 1.723 ms, total = 1.723 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] NodeManager.deadline_timer.flush_free_objects - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 1 total (0 active), Execution time: mean = 2.309 ms, total = 2.309 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] NodeManager.deadline_timer.debug_state_dump - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] NodeManager.deadline_timer.record_metrics - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch.OnReplyReceived - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] ObjectManager.UpdateAvailableMemory - 1 total (0 active), Execution time: mean = 4.329 us, total = 4.329 us, Queueing time: mean = 140.059 us, max = 140.059 us, min = 140.059 us, total = 140.059 us
[state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.022 s, total = 1.022 s, Queueing time: mean = 109.142 us, max = 109.142 us, min = 109.142 us, total = 109.142 us
[state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 1 total (0 active), Execution time: mean = 1.589 ms, total = 1.589 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] RayletWorkerPool.deadline_timer.kill_idle_workers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 247.102 us, total = 247.102 us, Queueing time: mean = 121.669 us, max = 121.669 us, min = 121.669 us, total = 121.669 us
[state-dump] ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[state-dump] DebugString() time ms: 0
[state-dump]
[state-dump]
[2025-01-15 18:15:46,545 I 517589 517589] (raylet) accessor.cc:762: Received notification for node, IsAlive = 1 node_id=594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be
[2025-01-15 18:15:46,607 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517721, the token is 0
[2025-01-15 18:15:46,611 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517722, the token is 1
[2025-01-15 18:15:46,613 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517723, the token is 2
[2025-01-15 18:15:46,615 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517724, the token is 3
[2025-01-15 18:15:46,617 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517725, the token is 4
[2025-01-15 18:15:46,619 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517726, the token is 5
[2025-01-15 18:15:46,622 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517727, the token is 6
[2025-01-15 18:15:46,624 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517728, the token is 7
[2025-01-15 18:15:46,626 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517729, the token is 8
[2025-01-15 18:15:46,629 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517730, the token is 9
[2025-01-15 18:15:46,632 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517731, the token is 10
[2025-01-15 18:15:46,634 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517732, the token is 11
[2025-01-15 18:15:46,636 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517733, the token is 12
[2025-01-15 18:15:46,638 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517734, the token is 13
[2025-01-15 18:15:46,640 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517735, the token is 14
[2025-01-15 18:15:46,642 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517736, the token is 15
[2025-01-15 18:15:46,645 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517737, the token is 16
[2025-01-15 18:15:46,648 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517738, the token is 17
[2025-01-15 18:15:46,650 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517739, the token is 18
[2025-01-15 18:15:46,652 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 517740, the token is 19
[2025-01-15 18:15:47,355 I 517589 517618] (raylet) object_store.cc:35: Object store current usage 8e-09 / 2.14748 GB.
[2025-01-15 18:15:47,464 I 517589 517589] (raylet) worker_pool.cc:692: Job 01000000 already started in worker pool.
[2025-01-15 18:15:48,253 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=3, has creation task exception = false
[2025-01-15 18:15:48,579 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,579 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,580 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,580 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,580 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,581 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,581 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,581 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,581 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,587 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,589 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,590 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,591 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,591 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,591 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,592 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,592 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,593 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:48,956 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=3, has creation task exception = false
[2025-01-15 18:15:48,964 I 517589 517589] (raylet) worker_pool.cc:501: Started worker process with pid 519419, the token is 20
[2025-01-15 18:15:49,498 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=3, has creation task exception = false
[2025-01-15 18:15:49,499 I 517589 517589] (raylet) node_manager.cc:1586: Driver (pid=517323) is disconnected. worker_id=01000000ffffffffffffffffffffffffffffffffffffffffffffffff job_id=01000000
[2025-01-15 18:15:49,501 I 517589 517589] (raylet) node_manager.cc:1111: The leased worker 755a376c5f34c660099a786e9c0a24496c4ff184dab5588ae567219d is killed because the owner process 01000000ffffffffffffffffffffffffffffffffffffffffffffffff died.
[2025-01-15 18:15:49,503 I 517589 517589] (raylet) worker_pool.cc:692: Job 01000000 already started in worker pool.
[2025-01-15 18:15:49,503 I 517589 517589] (raylet) node_manager.cc:633: The leased worker is killed because the job 01000000 finished. worker_id=755a376c5f34c660099a786e9c0a24496c4ff184dab5588ae567219d
[2025-01-15 18:15:49,512 I 517589 517589] (raylet) node_manager.cc:1481: NodeManager::DisconnectClient, disconnect_type=1, has creation task exception = false
[2025-01-15 18:15:49,708 I 517589 517589] (raylet) main.cc:454: received SIGTERM. Existing local drain request = None
[2025-01-15 18:15:49,708 I 517589 517589] (raylet) main.cc:255: Raylet graceful shutdown triggered, reason = EXPECTED_TERMINATION, reason message = received SIGTERM
[2025-01-15 18:15:49,708 I 517589 517589] (raylet) main.cc:258: Shutting down...
[2025-01-15 18:15:49,708 I 517589 517589] (raylet) accessor.cc:510: Unregistering node node_id=594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be
[2025-01-15 18:15:49,711 I 517589 517589] (raylet) accessor.cc:523: Finished unregistering node info, status = OK node_id=594aea7169520c22e97f1719928454f2113460e3e5c4982ab417b9be
[2025-01-15 18:15:49,715 I 517589 517589] (raylet) agent_manager.cc:112: Killing agent dashboard_agent/424238335, pid 517681.
[2025-01-15 18:15:49,727 I 517589 517682] (raylet) agent_manager.cc:79: Agent process with name dashboard_agent/424238335 exited, exit code 0.
[2025-01-15 18:15:49,728 I 517589 517589] (raylet) agent_manager.cc:112: Killing agent runtime_env_agent, pid 517683.
[2025-01-15 18:15:49,736 I 517589 517684] (raylet) agent_manager.cc:79: Agent process with name runtime_env_agent exited, exit code 0.
[2025-01-15 18:15:49,737 I 517589 517589] (raylet) io_service_pool.cc:47: IOServicePool is stopped.
[2025-01-15 18:15:49,850 I 517589 517589] (raylet) stats.h:120: Stats module has shutdown.