[2025-01-21 05:58:20,105 I 20537 20537] (raylet) main.cc:180: Setting cluster ID to: 9f6a3ba5c12bdaae5991e0c6ce9a10b1c06e51056583b7869135876e
[2025-01-21 05:58:20,111 I 20537 20537] (raylet) main.cc:289: Raylet is not set to kill unknown children.
[2025-01-21 05:58:20,111 I 20537 20537] (raylet) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-21 05:58:20,112 I 20537 20537] (raylet) main.cc:419: Setting node ID node_id=331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c
[2025-01-21 05:58:20,112 I 20537 20537] (raylet) store_runner.cc:32: Allowing the Plasma store to use up to 2.14748GB of memory.
[2025-01-21 05:58:20,112 I 20537 20537] (raylet) store_runner.cc:48: Starting object store with directory /dev/shm, fallback /tmp/ray, and huge page support disabled
[2025-01-21 05:58:20,112 I 20537 20566] (raylet) dlmalloc.cc:154: create_and_mmap_buffer(2147483656, /dev/shm/plasmaXXXXXX)
[2025-01-21 05:58:20,113 I 20537 20566] (raylet) store.cc:564: Plasma store debug dump:
Current usage: 0 / 2.14748 GB
- num bytes created total: 0
0 pending objects of total size 0MB
- objects spillable: 0
- bytes spillable: 0
- objects unsealed: 0
- bytes unsealed: 0
- objects in use: 0
- bytes in use: 0
- objects evictable: 0
- bytes evictable: 0
- objects created by worker: 0
- bytes created by worker: 0
- objects restored: 0
- bytes restored: 0
- objects received: 0
- bytes received: 0
- objects errored: 0
- bytes errored: 0
[2025-01-21 05:58:21,116 I 20537 20537] (raylet) grpc_server.cc:134: ObjectManager server started, listening on port 46851.
[2025-01-21 05:58:21,119 I 20537 20537] (raylet) worker_killing_policy.cc:101: Running GroupByOwner policy.
[2025-01-21 05:58:21,120 I 20537 20537] (raylet) memory_monitor.cc:47: MemoryMonitor initialized with usage threshold at 94999994368 bytes (0.95 system memory), total system memory bytes: 99999997952
[2025-01-21 05:58:21,120 I 20537 20537] (raylet) node_manager.cc:287: Initializing NodeManager node_id=331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c
[2025-01-21 05:58:21,121 I 20537 20537] (raylet) grpc_server.cc:134: NodeManager server started, listening on port 45183.
[2025-01-21 05:58:21,130 I 20537 20630] (raylet) agent_manager.cc:77: Monitor agent process with name dashboard_agent/424238335
[2025-01-21 05:58:21,130 I 20537 20632] (raylet) agent_manager.cc:77: Monitor agent process with name runtime_env_agent
[2025-01-21 05:58:21,130 I 20537 20537] (raylet) event.cc:493: Ray Event initialized for RAYLET
[2025-01-21 05:58:21,130 I 20537 20537] (raylet) event.cc:324: Set ray event level to warning
[2025-01-21 05:58:21,132 I 20537 20537] (raylet) raylet.cc:134: Raylet of id, 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c started. Raylet consists of node_manager and object_manager.
node_manager address: 192.168.0.2:45183 object_manager address: 192.168.0.2:46851 hostname: 0cd925b1f73b [2025-01-21 05:58:21,134 I 20537 20537] (raylet) node_manager.cc:525: [state-dump] NodeManager: [state-dump] Node ID: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c [state-dump] Node name: 192.168.0.2 [state-dump] InitialConfigResources: {node:192.168.0.2: 10000, node:__internal_head__: 10000, memory: 797346525190000, accelerator_type:A40: 10000, object_store_memory: 21474836480000, CPU: 200000, GPU: 20000} [state-dump] ClusterTaskManager: [state-dump] ========== Node: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c ================= [state-dump] Infeasible queue length: 0 [state-dump] Schedule queue length: 0 [state-dump] Dispatch queue length: 0 [state-dump] num_waiting_for_resource: 0 [state-dump] num_waiting_for_plasma_memory: 0 [state-dump] num_waiting_for_remote_node_resources: 0 [state-dump] num_worker_not_started_by_job_config_not_exist: 0 [state-dump] num_worker_not_started_by_registration_timeout: 0 [state-dump] num_tasks_waiting_for_workers: 0 [state-dump] num_cancelled_tasks: 0 [state-dump] cluster_resource_scheduler state: [state-dump] Local id: -6890545921723833623 Local resources: {"total":{node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [200000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "available": {node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [200000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",} is_draining: 0 is_idle: 1 Cluster resources: node id: -6890545921723833623{"total":{object_store_memory: 21474836480000, memory: 797346525190000, CPU: 200000, node:192.168.0.2: 10000, node:__internal_head__: 10000, GPU: 20000, accelerator_type:A40: 10000}}, 
"available": {object_store_memory: 21474836480000, memory: 797346525190000, CPU: 200000, node:192.168.0.2: 10000, node:__internal_head__: 10000, GPU: 20000, accelerator_type:A40: 10000}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",}, "is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []} [state-dump] Waiting tasks size: 0 [state-dump] Number of executing tasks: 0 [state-dump] Number of pinned task arguments: 0 [state-dump] Number of total spilled tasks: 0 [state-dump] Number of spilled waiting tasks: 0 [state-dump] Number of spilled unschedulable tasks: 0 [state-dump] Resource usage { [state-dump] } [state-dump] Backlog Size per scheduling descriptor :{workerId: num backlogs}: [state-dump] [state-dump] Running tasks by scheduling class: [state-dump] ================================================== [state-dump] [state-dump] ClusterResources: [state-dump] LocalObjectManager: [state-dump] - num pinned objects: 0 [state-dump] - pinned objects size: 0 [state-dump] - num objects pending restore: 0 [state-dump] - num objects pending spill: 0 [state-dump] - num bytes pending spill: 0 [state-dump] - num bytes currently spilled: 0 [state-dump] - cumulative spill requests: 0 [state-dump] - cumulative restore requests: 0 [state-dump] - spilled objects pending delete: 0 [state-dump] [state-dump] ObjectManager: [state-dump] - num local objects: 0 [state-dump] - num unfulfilled push requests: 0 [state-dump] - num object pull requests: 0 [state-dump] - num chunks received total: 0 [state-dump] - num chunks received failed (all): 0 [state-dump] - num chunks received failed / cancelled: 0 [state-dump] - num chunks received failed / plasma error: 0 [state-dump] Event stats: [state-dump] Global stats: 0 total (0 active) [state-dump] Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Execution time: mean = -nan s, total = 0.000 s 
[state-dump] Event stats: [state-dump] PushManager: [state-dump] - num pushes in flight: 0 [state-dump] - num chunks in flight: 0 [state-dump] - num chunks remaining: 0 [state-dump] - max chunks allowed: 409 [state-dump] OwnershipBasedObjectDirectory: [state-dump] - num listeners: 0 [state-dump] - cumulative location updates: 0 [state-dump] - num location updates per second: 69816253030172000.000 [state-dump] - num location lookups per second: 69816253030160000.000 [state-dump] - num locations added per second: 0.000 [state-dump] - num locations removed per second: 0.000 [state-dump] BufferPool: [state-dump] - create buffer state map size: 0 [state-dump] PullManager: [state-dump] - num bytes available for pulled objects: 2147483648 [state-dump] - num bytes being pulled (all): 0 [state-dump] - num bytes being pulled / pinned: 0 [state-dump] - get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - first get request bundle: N/A [state-dump] - first wait request bundle: N/A [state-dump] - first task request bundle: N/A [state-dump] - num objects queued: 0 [state-dump] - num objects actively pulled (all): 0 [state-dump] - num objects actively pulled / pinned: 0 [state-dump] - num bundles being pulled: 0 [state-dump] - num pull retries: 0 [state-dump] - max timeout seconds: 0 [state-dump] - max timeout request is already processed. No entry. 
[state-dump] [state-dump] WorkerPool: [state-dump] - registered jobs: 0 [state-dump] - process_failed_job_config_missing: 0 [state-dump] - process_failed_rate_limited: 0 [state-dump] - process_failed_pending_registration: 0 [state-dump] - process_failed_runtime_env_setup_failed: 0 [state-dump] - num PYTHON workers: 0 [state-dump] - num PYTHON drivers: 0 [state-dump] - num PYTHON pending start requests: 0 [state-dump] - num PYTHON pending registration requests: 0 [state-dump] - num object spill callbacks queued: 0 [state-dump] - num object restore queued: 0 [state-dump] - num util functions queued: 0 [state-dump] - num idle workers: 0 [state-dump] TaskDependencyManager: [state-dump] - task deps map size: 0 [state-dump] - get req map size: 0 [state-dump] - wait req map size: 0 [state-dump] - local objects map size: 0 [state-dump] WaitManager: [state-dump] - num active wait requests: 0 [state-dump] Subscriber: [state-dump] Channel WORKER_OBJECT_LOCATIONS_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_OBJECT_EVICTION [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_REF_REMOVED_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] num async plasma notifications: 0 [state-dump] Remote node managers: [state-dump] Event stats: [state-dump] Global stats: 28 total (13 active) [state-dump] Queueing time: mean = 1.369 ms, max = 
10.761 ms, min = 8.247 us, total = 38.324 ms [state-dump] Execution time: mean = 36.610 ms, total = 1.025 s [state-dump] Event stats: [state-dump] PeriodicalRunner.RunFnPeriodically - 11 total (2 active, 1 running), Execution time: mean = 168.117 us, total = 1.849 ms, Queueing time: mean = 3.480 ms, max = 10.761 ms, min = 21.417 us, total = 38.283 ms [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.020 s, total = 1.020 s, Queueing time: mean = 8.247 us, max = 8.247 us, min = 8.247 us, total = 8.247 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 1 total (0 active), Execution time: mean = 710.476 us, total = 710.476 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ObjectManager.UpdateAvailableMemory - 1 total (0 active), Execution time: mean = 1.654 us, total = 1.654 us, Queueing time: mean = 17.825 us, max = 17.825 us, min = 17.825 us, total = 17.825 us [state-dump] NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.deadline_timer.flush_free_objects - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.deadline_timer.record_metrics - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.deadline_timer.debug_state_dump - 1 total (1 active), 
Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch.OnReplyReceived - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.ScheduleAndDispatchTasks - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.deadline_timer.spill_objects_when_over_threshold - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] RayletWorkerPool.deadline_timer.kill_idle_workers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 1 total (0 active), Execution time: mean = 1.806 ms, total = 1.806 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 933.039 us, total = 933.039 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] 
ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 177.035 us, total = 177.035 us, Queueing time: mean = 15.337 us, max = 15.337 us, min = 15.337 us, total = 15.337 us
[state-dump] DebugString() time ms: 0
[state-dump]
[state-dump]
[2025-01-21 05:58:21,136 I 20537 20537] (raylet) accessor.cc:762: Received notification for node, IsAlive = 1 node_id=331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c
[2025-01-21 05:58:21,162 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20669, the token is 0
[2025-01-21 05:58:21,165 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20670, the token is 1
[2025-01-21 05:58:21,167 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20671, the token is 2
[2025-01-21 05:58:21,169 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20672, the token is 3
[2025-01-21 05:58:21,171 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20673, the token is 4
[2025-01-21 05:58:21,174 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20674, the token is 5
[2025-01-21 05:58:21,176 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20675, the token is 6
[2025-01-21 05:58:21,177 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20676, the token is 7
[2025-01-21 05:58:21,179 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20677, the token is 8
[2025-01-21 05:58:21,181 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20678, the token is 9
[2025-01-21 05:58:21,183 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20679, the token is 10
[2025-01-21 05:58:21,185 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20680, the token is 11
[2025-01-21 05:58:21,186 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20681, the token is 12
[2025-01-21 05:58:21,188 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20682, the token is 13
[2025-01-21 05:58:21,189 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20683, the token is 14
[2025-01-21 05:58:21,191 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20684, the token is 15
[2025-01-21 05:58:21,192 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20685, the token is 16
[2025-01-21 05:58:21,194 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20686, the token is 17
[2025-01-21 05:58:21,196 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20687, the token is 18
[2025-01-21 05:58:21,197 I 20537 20537] (raylet) worker_pool.cc:501: Started worker process with pid 20688, the token is 19
[2025-01-21 05:58:21,777 I 20537 20566] (raylet) object_store.cc:35: Object store current usage 8e-09 / 2.14748 GB.
[2025-01-21 05:58:21,923 I 20537 20537] (raylet) worker_pool.cc:692: Job 01000000 already started in worker pool.
[2025-01-21 05:58:30,125 W 20537 20560] (raylet) metric_exporter.cc:105: [1] Export metrics to agent failed: RpcError: RPC Error message: failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:58636: Failed to connect to remote host: Connection refused; RPC Error details: . This won't affect Ray, but you can lose metrics from the cluster.
[2025-01-21 05:59:20,113 I 20537 20566] (raylet) store.cc:564: Plasma store debug dump: Current usage: 0 / 2.14748 GB - num bytes created total: 168 0 pending objects of total size 0MB - objects spillable: 0 - bytes spillable: 0 - objects unsealed: 0 - bytes unsealed: 0 - objects in use: 0 - bytes in use: 0 - objects evictable: 0 - bytes evictable: 0 - objects created by worker: 0 - bytes created by worker: 0 - objects restored: 0 - bytes restored: 0 - objects received: 0 - bytes received: 0 - objects errored: 0 - bytes errored: 0 [2025-01-21 05:59:21,137 I 20537 20537] (raylet) node_manager.cc:525: [state-dump] NodeManager: [state-dump] Node ID: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c [state-dump] Node name: 192.168.0.2 [state-dump] InitialConfigResources: {node:192.168.0.2: 10000, node:__internal_head__: 10000, memory: 797346525190000, accelerator_type:A40: 10000, object_store_memory: 21474836480000, CPU: 200000, GPU: 20000} [state-dump] ClusterTaskManager: [state-dump] ========== Node: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c ================= [state-dump] Infeasible queue length: 0 [state-dump] Schedule queue length: 0 [state-dump] Dispatch queue length: 0 [state-dump] num_waiting_for_resource: 0 [state-dump] num_waiting_for_plasma_memory: 0 [state-dump] num_waiting_for_remote_node_resources: 0 [state-dump] num_worker_not_started_by_job_config_not_exist: 0 [state-dump] num_worker_not_started_by_registration_timeout: 0 [state-dump] num_tasks_waiting_for_workers: 0 [state-dump] num_cancelled_tasks: 0 [state-dump] cluster_resource_scheduler state: [state-dump] Local id: -6890545921723833623 Local resources: {"total":{node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [200000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "available": {node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [190000], memory: 
[797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",} is_draining: 0 is_idle: 0 Cluster resources: node id: -6890545921723833623{"total":{node:192.168.0.2: 10000, GPU: 20000, accelerator_type:A40: 10000, CPU: 200000, memory: 797346525190000, object_store_memory: 21474836480000, node:__internal_head__: 10000}}, "available": {GPU: 20000, node:192.168.0.2: 10000, node:__internal_head__: 10000, accelerator_type:A40: 10000, CPU: 190000, object_store_memory: 21474836480000, memory: 797346525190000}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",}, "is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []} [state-dump] Waiting tasks size: 0 [state-dump] Number of executing tasks: 1 [state-dump] Number of pinned task arguments: 0 [state-dump] Number of total spilled tasks: 0 [state-dump] Number of spilled waiting tasks: 0 [state-dump] Number of spilled unschedulable tasks: 0 [state-dump] Resource usage { [state-dump] - (language=PYTHON actor_or_task=process_csv_file pid=20684 worker_id=ccafd42c4da13863c8b878c2c6c5c5d04e6b29a0f376eeb80da59a5f): {CPU: 10000} [state-dump] } [state-dump] Backlog Size per scheduling descriptor :{workerId: num backlogs}: [state-dump] [state-dump] Running tasks by scheduling class: [state-dump] - {depth=1 function_descriptor={type=PythonFunctionDescriptor, module_name=__main__, class_name=, function_name=process_csv_file, function_hash=6efd4765ebe3481f9e016260841b31ad} scheduling_strategy=default_scheduling_strategy { [state-dump] } [state-dump] resource_set={CPU : 1, }}: 1/20 [state-dump] ================================================== [state-dump] [state-dump] ClusterResources: [state-dump] LocalObjectManager: [state-dump] - num pinned objects: 0 [state-dump] - pinned objects size: 0 [state-dump] - num objects pending restore: 0 
[state-dump] - num objects pending spill: 0 [state-dump] - num bytes pending spill: 0 [state-dump] - num bytes currently spilled: 0 [state-dump] - cumulative spill requests: 0 [state-dump] - cumulative restore requests: 0 [state-dump] - spilled objects pending delete: 0 [state-dump] [state-dump] ObjectManager: [state-dump] - num local objects: 0 [state-dump] - num unfulfilled push requests: 0 [state-dump] - num object pull requests: 0 [state-dump] - num chunks received total: 0 [state-dump] - num chunks received failed (all): 0 [state-dump] - num chunks received failed / cancelled: 0 [state-dump] - num chunks received failed / plasma error: 0 [state-dump] Event stats: [state-dump] Global stats: 0 total (0 active) [state-dump] Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Execution time: mean = -nan s, total = 0.000 s [state-dump] Event stats: [state-dump] PushManager: [state-dump] - num pushes in flight: 0 [state-dump] - num chunks in flight: 0 [state-dump] - num chunks remaining: 0 [state-dump] - max chunks allowed: 409 [state-dump] OwnershipBasedObjectDirectory: [state-dump] - num listeners: 0 [state-dump] - cumulative location updates: 0 [state-dump] - num location updates per second: 0.000 [state-dump] - num location lookups per second: 0.000 [state-dump] - num locations added per second: 0.000 [state-dump] - num locations removed per second: 0.000 [state-dump] BufferPool: [state-dump] - create buffer state map size: 0 [state-dump] PullManager: [state-dump] - num bytes available for pulled objects: 2147483648 [state-dump] - num bytes being pulled (all): 0 [state-dump] - num bytes being pulled / pinned: 0 [state-dump] - get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 
unpullable} [state-dump] - first get request bundle: N/A [state-dump] - first wait request bundle: N/A [state-dump] - first task request bundle: N/A [state-dump] - num objects queued: 0 [state-dump] - num objects actively pulled (all): 0 [state-dump] - num objects actively pulled / pinned: 0 [state-dump] - num bundles being pulled: 0 [state-dump] - num pull retries: 0 [state-dump] - max timeout seconds: 0 [state-dump] - max timeout request is already processed. No entry. [state-dump] [state-dump] WorkerPool: [state-dump] - registered jobs: 1 [state-dump] - process_failed_job_config_missing: 0 [state-dump] - process_failed_rate_limited: 0 [state-dump] - process_failed_pending_registration: 0 [state-dump] - process_failed_runtime_env_setup_failed: 0 [state-dump] - num PYTHON workers: 20 [state-dump] - num PYTHON drivers: 1 [state-dump] - num PYTHON pending start requests: 0 [state-dump] - num PYTHON pending registration requests: 0 [state-dump] - num object spill callbacks queued: 0 [state-dump] - num object restore queued: 0 [state-dump] - num util functions queued: 0 [state-dump] - num idle workers: 19 [state-dump] TaskDependencyManager: [state-dump] - task deps map size: 0 [state-dump] - get req map size: 0 [state-dump] - wait req map size: 0 [state-dump] - local objects map size: 0 [state-dump] WaitManager: [state-dump] - num active wait requests: 0 [state-dump] Subscriber: [state-dump] Channel WORKER_OBJECT_LOCATIONS_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_OBJECT_EVICTION [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel 
WORKER_REF_REMOVED_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] num async plasma notifications: 0 [state-dump] Remote node managers: [state-dump] Event stats: [state-dump] Global stats: 5532 total (35 active) [state-dump] Queueing time: mean = 312.525 us, max = 723.022 ms, min = 67.000 ns, total = 1.729 s [state-dump] Execution time: mean = 506.863 us, total = 2.804 s [state-dump] Event stats: [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog.HandleRequestImpl - 1260 total (0 active), Execution time: mean = 37.778 us, total = 47.600 ms, Queueing time: mean = 126.479 us, max = 9.238 ms, min = 3.786 us, total = 159.363 ms [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog - 1260 total (0 active), Execution time: mean = 521.961 us, total = 657.671 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ObjectManager.UpdateAvailableMemory - 600 total (0 active), Execution time: mean = 5.323 us, total = 3.194 ms, Queueing time: mean = 124.001 us, max = 15.839 ms, min = 5.000 us, total = 74.401 ms [state-dump] RaySyncer.OnDemandBroadcasting - 600 total (1 active), Execution time: mean = 10.163 us, total = 6.098 ms, Queueing time: mean = 107.272 us, max = 11.586 ms, min = 11.472 us, total = 64.363 ms [state-dump] NodeManager.CheckGC - 600 total (1 active), Execution time: mean = 3.026 us, total = 1.816 ms, Queueing time: mean = 113.586 us, max = 11.587 ms, min = 7.381 us, total = 68.152 ms [state-dump] RayletWorkerPool.deadline_timer.kill_idle_workers - 300 total (1 active), Execution time: mean = 18.038 us, total = 5.411 ms, Queueing time: mean = 126.209 us, max = 15.957 ms, min = 19.723 us, total = 37.863 ms [state-dump] MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 240 total (1 
active), Execution time: mean = 437.606 us, total = 105.025 ms, Queueing time: mean = 136.179 us, max = 15.659 ms, min = 11.930 us, total = 32.683 ms [state-dump] ClientConnection.async_read.ProcessMessageHeader - 85 total (21 active), Execution time: mean = 4.990 us, total = 424.148 us, Queueing time: mean = 13.672 ms, max = 723.022 ms, min = 16.524 us, total = 1.162 s [state-dump] ClientConnection.async_read.ProcessMessage - 64 total (0 active), Execution time: mean = 851.051 us, total = 54.467 ms, Queueing time: mean = 240.193 us, max = 9.118 ms, min = 3.826 us, total = 15.372 ms [state-dump] NodeManagerService.grpc_server.GetResourceLoad - 60 total (0 active), Execution time: mean = 623.230 us, total = 37.394 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.GetResourceLoad.HandleRequestImpl - 60 total (0 active), Execution time: mean = 120.492 us, total = 7.229 ms, Queueing time: mean = 107.724 us, max = 203.468 us, min = 18.573 us, total = 6.463 ms [state-dump] NodeManager.deadline_timer.flush_free_objects - 60 total (1 active), Execution time: mean = 8.194 us, total = 491.660 us, Queueing time: mean = 176.052 us, max = 1.408 ms, min = 22.520 us, total = 10.563 ms [state-dump] NodeManager.deadline_timer.spill_objects_when_over_threshold - 60 total (1 active), Execution time: mean = 3.064 us, total = 183.859 us, Queueing time: mean = 179.588 us, max = 1.404 ms, min = 21.258 us, total = 10.775 ms [state-dump] NodeManager.ScheduleAndDispatchTasks - 60 total (1 active), Execution time: mean = 15.790 us, total = 947.410 us, Queueing time: mean = 64.739 us, max = 121.635 us, min = 21.724 us, total = 3.884 ms [state-dump] ClientConnection.async_write.DoAsyncWrites - 22 total (0 active), Execution time: mean = 767.364 ns, total = 16.882 us, Queueing time: mean = 49.290 us, max = 318.614 us, min = 14.430 us, total = 1.084 ms [state-dump] 
NodeManagerService.grpc_server.GetSystemConfig.HandleRequestImpl - 21 total (0 active), Execution time: mean = 88.640 us, total = 1.861 ms, Queueing time: mean = 16.877 us, max = 30.971 us, min = 10.447 us, total = 354.414 us [state-dump] ObjectManager.ObjectAdded - 21 total (0 active), Execution time: mean = 6.379 us, total = 133.950 us, Queueing time: mean = 513.373 us, max = 8.534 ms, min = 3.220 us, total = 10.781 ms [state-dump] ClusterResourceManager.ResetRemoteNodeView - 21 total (1 active), Execution time: mean = 8.753 us, total = 183.810 us, Queueing time: mean = 66.445 us, max = 149.424 us, min = 28.220 us, total = 1.395 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig - 21 total (0 active), Execution time: mean = 759.112 us, total = 15.941 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ObjectManager.ObjectDeleted - 21 total (0 active), Execution time: mean = 11.946 us, total = 250.872 us, Queueing time: mean = 589.997 us, max = 9.059 ms, min = 23.448 us, total = 12.390 ms [state-dump] PeriodicalRunner.RunFnPeriodically - 13 total (0 active), Execution time: mean = 195.572 us, total = 2.542 ms, Queueing time: mean = 3.255 ms, max = 10.761 ms, min = 21.417 us, total = 42.315 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive.OnReplyReceived - 12 total (0 active), Execution time: mean = 47.548 us, total = 570.571 us, Queueing time: mean = 90.629 us, max = 179.847 us, min = 14.883 us, total = 1.088 ms [state-dump] NodeManager.GcsCheckAlive - 12 total (1 active), Execution time: mean = 255.250 us, total = 3.063 ms, Queueing time: mean = 582.129 us, max = 1.438 ms, min = 172.307 us, total = 6.986 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive - 12 total (0 active), Execution time: mean = 1.373 ms, total = 16.472 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] 
NodeManager.deadline_timer.record_metrics - 12 total (1 active), Execution time: mean = 529.348 us, total = 6.352 ms, Queueing time: mean = 333.925 us, max = 1.173 ms, min = 34.912 us, total = 4.007 ms [state-dump] NodeManager.deadline_timer.debug_state_dump - 6 total (1 active), Execution time: mean = 1.547 ms, total = 9.283 ms, Queueing time: mean = 42.018 us, max = 70.256 us, min = 16.766 us, total = 252.108 us [state-dump] - 2 total (0 active), Execution time: mean = 493.500 ns, total = 987.000 ns, Queueing time: mean = 25.826 us, max = 27.369 us, min = 24.284 us, total = 51.653 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 2 total (1 active), Execution time: mean = 395.258 ms, total = 790.517 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] RaySyncer.BroadcastMessage - 2 total (0 active), Execution time: mean = 116.754 us, total = 233.508 us, Queueing time: mean = 431.000 ns, max = 757.000 ns, min = 105.000 ns, total = 862.000 ns [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 2 total (0 active), Execution time: mean = 735.737 us, total = 1.471 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch.OnReplyReceived - 2 total (0 active), Execution time: mean = 96.942 us, total = 193.883 us, Queueing time: mean = 849.112 us, max = 1.686 ms, min = 12.374 us, total = 1.698 ms [state-dump] RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.321 us, total = 2.642 us, Queueing time: mean = 222.000 ns, max = 377.000 ns, min = 67.000 ns, total = 444.000 ns [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob - 1 total (0 active), Execution time: mean = 1.411 ms, total = 1.411 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] 
ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 177.035 us, total = 177.035 us, Queueing time: mean = 15.337 us, max = 15.337 us, min = 15.337 us, total = 15.337 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob.OnReplyReceived - 1 total (0 active), Execution time: mean = 46.990 us, total = 46.990 us, Queueing time: mean = 159.464 us, max = 159.464 us, min = 159.464 us, total = 159.464 us [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 933.039 us, total = 933.039 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 110.954 us, total = 110.954 us, Queueing time: mean = 39.567 us, max = 39.567 us, min = 39.567 us, total = 39.567 us [state-dump] NodeManagerService.grpc_server.RequestWorkerLease.HandleRequestImpl - 1 total (0 active), Execution time: mean = 224.288 us, total = 224.288 us, Queueing time: mean = 23.760 us, max = 23.760 us, min = 23.760 us, total = 23.760 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 12.530 us, total = 12.530 us, Queueing time: mean = 10.668 us, max = 10.668 us, min = 10.668 us, total = 10.668 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll.OnReplyReceived - 1 total (0 active), Execution time: mean = 208.070 us, total = 208.070 us, Queueing time: mean = 22.206 us, max = 22.206 us, min = 22.206 us, total = 22.206 us [state-dump] NodeManager.deadline_timer.print_event_loop_stats - 1 total (1 active, 1 running), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 
1 total (0 active), Execution time: mean = 1.806 ms, total = 1.806 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.RequestWorkerLease - 1 total (0 active), Execution time: mean = 498.490 us, total = 498.490 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.020 s, total = 1.020 s, Queueing time: mean = 8.247 us, max = 8.247 us, min = 8.247 us, total = 8.247 us [state-dump] Subscriber.HandlePublishedMessage_GCS_JOB_CHANNEL - 1 total (0 active), Execution time: mean = 52.501 us, total = 52.501 us, Queueing time: mean = 199.519 us, max = 199.519 us, min = 199.519 us, total = 199.519 us [state-dump] NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo - 1 total (0 active), Execution time: mean = 859.855 us, total = 859.855 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo - 1 total (0 active), Execution time: mean = 938.962 us, total = 938.962 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] WorkerPool.PopWorkerCallback - 1 total (0 active), Execution time: mean = 41.143 us, total = 41.143 us, Queueing time: mean = 29.210 us, max = 29.210 us, min = 29.210 us, total = 29.210 us [state-dump] DebugString() time ms: 2 [state-dump] [state-dump] [2025-01-21 06:00:20,114 I 20537 20566] (raylet) store.cc:564: Plasma store debug dump: Current usage: 0 / 2.14748 GB - num bytes created total: 168 0 pending objects of total size 0MB - objects spillable: 0 - 
bytes spillable: 0 - objects unsealed: 0 - bytes unsealed: 0 - objects in use: 0 - bytes in use: 0 - objects evictable: 0 - bytes evictable: 0 - objects created by worker: 0 - bytes created by worker: 0 - objects restored: 0 - bytes restored: 0 - objects received: 0 - bytes received: 0 - objects errored: 0 - bytes errored: 0 [2025-01-21 06:00:21,140 I 20537 20537] (raylet) node_manager.cc:525: [state-dump] NodeManager: [state-dump] Node ID: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c [state-dump] Node name: 192.168.0.2 [state-dump] InitialConfigResources: {node:192.168.0.2: 10000, node:__internal_head__: 10000, memory: 797346525190000, accelerator_type:A40: 10000, object_store_memory: 21474836480000, CPU: 200000, GPU: 20000} [state-dump] ClusterTaskManager: [state-dump] ========== Node: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c ================= [state-dump] Infeasible queue length: 0 [state-dump] Schedule queue length: 0 [state-dump] Dispatch queue length: 0 [state-dump] num_waiting_for_resource: 0 [state-dump] num_waiting_for_plasma_memory: 0 [state-dump] num_waiting_for_remote_node_resources: 0 [state-dump] num_worker_not_started_by_job_config_not_exist: 0 [state-dump] num_worker_not_started_by_registration_timeout: 0 [state-dump] num_tasks_waiting_for_workers: 0 [state-dump] num_cancelled_tasks: 0 [state-dump] cluster_resource_scheduler state: [state-dump] Local id: -6890545921723833623 Local resources: {"total":{node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [200000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "available": {node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [190000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",} is_draining: 0 is_idle: 0 Cluster 
resources: node id: -6890545921723833623{"total":{node:192.168.0.2: 10000, GPU: 20000, accelerator_type:A40: 10000, CPU: 200000, memory: 797346525190000, object_store_memory: 21474836480000, node:__internal_head__: 10000}}, "available": {GPU: 20000, node:192.168.0.2: 10000, node:__internal_head__: 10000, accelerator_type:A40: 10000, CPU: 190000, object_store_memory: 21474836480000, memory: 797346525190000}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",}, "is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []} [state-dump] Waiting tasks size: 0 [state-dump] Number of executing tasks: 1 [state-dump] Number of pinned task arguments: 0 [state-dump] Number of total spilled tasks: 0 [state-dump] Number of spilled waiting tasks: 0 [state-dump] Number of spilled unschedulable tasks: 0 [state-dump] Resource usage { [state-dump] - (language=PYTHON actor_or_task=process_csv_file pid=20684 worker_id=ccafd42c4da13863c8b878c2c6c5c5d04e6b29a0f376eeb80da59a5f): {CPU: 10000} [state-dump] } [state-dump] Backlog Size per scheduling descriptor :{workerId: num backlogs}: [state-dump] [state-dump] Running tasks by scheduling class: [state-dump] - {depth=1 function_descriptor={type=PythonFunctionDescriptor, module_name=__main__, class_name=, function_name=process_csv_file, function_hash=6efd4765ebe3481f9e016260841b31ad} scheduling_strategy=default_scheduling_strategy { [state-dump] } [state-dump] resource_set={CPU : 1, }}: 1/20 [state-dump] ================================================== [state-dump] [state-dump] ClusterResources: [state-dump] LocalObjectManager: [state-dump] - num pinned objects: 0 [state-dump] - pinned objects size: 0 [state-dump] - num objects pending restore: 0 [state-dump] - num objects pending spill: 0 [state-dump] - num bytes pending spill: 0 [state-dump] - num bytes currently spilled: 0 [state-dump] - cumulative spill requests: 0 [state-dump] - cumulative restore 
requests: 0 [state-dump] - spilled objects pending delete: 0 [state-dump] [state-dump] ObjectManager: [state-dump] - num local objects: 0 [state-dump] - num unfulfilled push requests: 0 [state-dump] - num object pull requests: 0 [state-dump] - num chunks received total: 0 [state-dump] - num chunks received failed (all): 0 [state-dump] - num chunks received failed / cancelled: 0 [state-dump] - num chunks received failed / plasma error: 0 [state-dump] Event stats: [state-dump] Global stats: 0 total (0 active) [state-dump] Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Execution time: mean = -nan s, total = 0.000 s [state-dump] Event stats: [state-dump] PushManager: [state-dump] - num pushes in flight: 0 [state-dump] - num chunks in flight: 0 [state-dump] - num chunks remaining: 0 [state-dump] - max chunks allowed: 409 [state-dump] OwnershipBasedObjectDirectory: [state-dump] - num listeners: 0 [state-dump] - cumulative location updates: 0 [state-dump] - num location updates per second: 0.000 [state-dump] - num location lookups per second: 0.000 [state-dump] - num locations added per second: 0.000 [state-dump] - num locations removed per second: 0.000 [state-dump] BufferPool: [state-dump] - create buffer state map size: 0 [state-dump] PullManager: [state-dump] - num bytes available for pulled objects: 2147483648 [state-dump] - num bytes being pulled (all): 0 [state-dump] - num bytes being pulled / pinned: 0 [state-dump] - get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - first get request bundle: N/A [state-dump] - first wait request bundle: N/A [state-dump] - first task request bundle: N/A [state-dump] - num objects queued: 0 [state-dump] - num objects 
actively pulled (all): 0 [state-dump] - num objects actively pulled / pinned: 0 [state-dump] - num bundles being pulled: 0 [state-dump] - num pull retries: 0 [state-dump] - max timeout seconds: 0 [state-dump] - max timeout request is already processed. No entry. [state-dump] [state-dump] WorkerPool: [state-dump] - registered jobs: 1 [state-dump] - process_failed_job_config_missing: 0 [state-dump] - process_failed_rate_limited: 0 [state-dump] - process_failed_pending_registration: 0 [state-dump] - process_failed_runtime_env_setup_failed: 0 [state-dump] - num PYTHON workers: 20 [state-dump] - num PYTHON drivers: 1 [state-dump] - num PYTHON pending start requests: 0 [state-dump] - num PYTHON pending registration requests: 0 [state-dump] - num object spill callbacks queued: 0 [state-dump] - num object restore queued: 0 [state-dump] - num util functions queued: 0 [state-dump] - num idle workers: 19 [state-dump] TaskDependencyManager: [state-dump] - task deps map size: 0 [state-dump] - get req map size: 0 [state-dump] - wait req map size: 0 [state-dump] - local objects map size: 0 [state-dump] WaitManager: [state-dump] - num active wait requests: 0 [state-dump] Subscriber: [state-dump] Channel WORKER_OBJECT_LOCATIONS_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_OBJECT_EVICTION [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_REF_REMOVED_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published 
messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] num async plasma notifications: 0 [state-dump] Remote node managers: [state-dump] Event stats: [state-dump] Global stats: 10764 total (35 active) [state-dump] Queueing time: mean = 195.840 us, max = 723.022 ms, min = 67.000 ns, total = 2.108 s [state-dump] Execution time: mean = 336.054 us, total = 3.617 s [state-dump] Event stats: [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog - 2520 total (0 active), Execution time: mean = 487.192 us, total = 1.228 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog.HandleRequestImpl - 2520 total (0 active), Execution time: mean = 36.659 us, total = 92.380 ms, Queueing time: mean = 114.277 us, max = 9.238 ms, min = 3.786 us, total = 287.977 ms [state-dump] NodeManager.CheckGC - 1199 total (1 active), Execution time: mean = 3.005 us, total = 3.603 ms, Queueing time: mean = 103.040 us, max = 11.587 ms, min = 7.381 us, total = 123.545 ms [state-dump] ObjectManager.UpdateAvailableMemory - 1199 total (0 active), Execution time: mean = 5.221 us, total = 6.260 ms, Queueing time: mean = 115.481 us, max = 15.839 ms, min = 4.917 us, total = 138.462 ms [state-dump] RaySyncer.OnDemandBroadcasting - 1199 total (1 active), Execution time: mean = 9.827 us, total = 11.783 ms, Queueing time: mean = 97.015 us, max = 11.586 ms, min = 11.472 us, total = 116.321 ms [state-dump] RayletWorkerPool.deadline_timer.kill_idle_workers - 600 total (1 active), Execution time: mean = 17.110 us, total = 10.266 ms, Queueing time: mean = 95.870 us, max = 15.957 ms, min = 19.060 us, total = 57.522 ms [state-dump] MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 480 total (1 active), Execution time: mean = 433.894 us, total = 208.269 ms, Queueing time: mean = 101.216 us, max = 15.659 ms, min = 11.930 us, total = 48.584 ms [state-dump] 
NodeManagerService.grpc_server.GetResourceLoad.HandleRequestImpl - 120 total (0 active), Execution time: mean = 118.000 us, total = 14.160 ms, Queueing time: mean = 111.858 us, max = 203.468 us, min = 18.573 us, total = 13.423 ms [state-dump] NodeManager.deadline_timer.spill_objects_when_over_threshold - 120 total (1 active), Execution time: mean = 2.930 us, total = 351.574 us, Queueing time: mean = 166.895 us, max = 1.404 ms, min = 7.005 us, total = 20.027 ms [state-dump] NodeManagerService.grpc_server.GetResourceLoad - 120 total (0 active), Execution time: mean = 609.802 us, total = 73.176 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.ScheduleAndDispatchTasks - 120 total (1 active), Execution time: mean = 15.046 us, total = 1.805 ms, Queueing time: mean = 76.203 us, max = 1.671 ms, min = 17.548 us, total = 9.144 ms [state-dump] NodeManager.deadline_timer.flush_free_objects - 120 total (1 active), Execution time: mean = 7.965 us, total = 955.780 us, Queueing time: mean = 163.462 us, max = 1.408 ms, min = 11.553 us, total = 19.615 ms [state-dump] ClientConnection.async_read.ProcessMessageHeader - 85 total (21 active), Execution time: mean = 4.990 us, total = 424.148 us, Queueing time: mean = 13.672 ms, max = 723.022 ms, min = 16.524 us, total = 1.162 s [state-dump] ClientConnection.async_read.ProcessMessage - 64 total (0 active), Execution time: mean = 851.051 us, total = 54.467 ms, Queueing time: mean = 240.193 us, max = 9.118 ms, min = 3.826 us, total = 15.372 ms [state-dump] ClusterResourceManager.ResetRemoteNodeView - 41 total (1 active), Execution time: mean = 8.438 us, total = 345.944 us, Queueing time: mean = 72.958 us, max = 155.711 us, min = 14.260 us, total = 2.991 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive.OnReplyReceived - 24 total (0 active), Execution time: mean = 46.144 us, total = 1.107 ms, Queueing time: mean = 88.641 us, max = 179.847 us, min = 14.883 
us, total = 2.127 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive - 24 total (0 active), Execution time: mean = 1.291 ms, total = 30.990 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.GcsCheckAlive - 24 total (1 active), Execution time: mean = 251.691 us, total = 6.041 ms, Queueing time: mean = 555.534 us, max = 1.438 ms, min = 105.289 us, total = 13.333 ms [state-dump] NodeManager.deadline_timer.record_metrics - 24 total (1 active), Execution time: mean = 499.505 us, total = 11.988 ms, Queueing time: mean = 320.242 us, max = 1.173 ms, min = 25.842 us, total = 7.686 ms [state-dump] ClientConnection.async_write.DoAsyncWrites - 22 total (0 active), Execution time: mean = 767.364 ns, total = 16.882 us, Queueing time: mean = 49.290 us, max = 318.614 us, min = 14.430 us, total = 1.084 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig.HandleRequestImpl - 21 total (0 active), Execution time: mean = 88.640 us, total = 1.861 ms, Queueing time: mean = 16.877 us, max = 30.971 us, min = 10.447 us, total = 354.414 us [state-dump] ObjectManager.ObjectAdded - 21 total (0 active), Execution time: mean = 6.379 us, total = 133.950 us, Queueing time: mean = 513.373 us, max = 8.534 ms, min = 3.220 us, total = 10.781 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig - 21 total (0 active), Execution time: mean = 759.112 us, total = 15.941 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ObjectManager.ObjectDeleted - 21 total (0 active), Execution time: mean = 11.946 us, total = 250.872 us, Queueing time: mean = 589.997 us, max = 9.059 ms, min = 23.448 us, total = 12.390 ms [state-dump] PeriodicalRunner.RunFnPeriodically - 13 total (0 active), Execution time: mean = 195.572 us, total = 2.542 ms, Queueing time: mean = 3.255 ms, max = 10.761 ms, min = 21.417 us, total = 42.315 ms [state-dump] 
NodeManager.deadline_timer.debug_state_dump - 12 total (1 active), Execution time: mean = 1.539 ms, total = 18.470 ms, Queueing time: mean = 46.835 us, max = 102.351 us, min = 16.766 us, total = 562.023 us [state-dump] - 2 total (0 active), Execution time: mean = 493.500 ns, total = 987.000 ns, Queueing time: mean = 25.826 us, max = 27.369 us, min = 24.284 us, total = 51.653 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 2 total (1 active), Execution time: mean = 395.258 ms, total = 790.517 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] RaySyncer.BroadcastMessage - 2 total (0 active), Execution time: mean = 116.754 us, total = 233.508 us, Queueing time: mean = 431.000 ns, max = 757.000 ns, min = 105.000 ns, total = 862.000 ns [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 2 total (0 active), Execution time: mean = 735.737 us, total = 1.471 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch.OnReplyReceived - 2 total (0 active), Execution time: mean = 96.942 us, total = 193.883 us, Queueing time: mean = 849.112 us, max = 1.686 ms, min = 12.374 us, total = 1.698 ms [state-dump] NodeManager.deadline_timer.print_event_loop_stats - 2 total (1 active, 1 running), Execution time: mean = 1.314 ms, total = 2.628 ms, Queueing time: mean = 26.427 us, max = 52.854 us, min = 52.854 us, total = 52.854 us [state-dump] RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.321 us, total = 2.642 us, Queueing time: mean = 222.000 ns, max = 377.000 ns, min = 67.000 ns, total = 444.000 ns [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob - 1 total (0 active), Execution time: mean = 1.411 ms, total = 1.411 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] 
ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 177.035 us, total = 177.035 us, Queueing time: mean = 15.337 us, max = 15.337 us, min = 15.337 us, total = 15.337 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob.OnReplyReceived - 1 total (0 active), Execution time: mean = 46.990 us, total = 46.990 us, Queueing time: mean = 159.464 us, max = 159.464 us, min = 159.464 us, total = 159.464 us [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 933.039 us, total = 933.039 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 110.954 us, total = 110.954 us, Queueing time: mean = 39.567 us, max = 39.567 us, min = 39.567 us, total = 39.567 us [state-dump] NodeManagerService.grpc_server.RequestWorkerLease.HandleRequestImpl - 1 total (0 active), Execution time: mean = 224.288 us, total = 224.288 us, Queueing time: mean = 23.760 us, max = 23.760 us, min = 23.760 us, total = 23.760 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 12.530 us, total = 12.530 us, Queueing time: mean = 10.668 us, max = 10.668 us, min = 10.668 us, total = 10.668 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll.OnReplyReceived - 1 total (0 active), Execution time: mean = 208.070 us, total = 208.070 us, Queueing time: mean = 22.206 us, max = 22.206 us, min = 22.206 us, total = 22.206 us [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 1 total (0 active), Execution time: mean = 1.806 ms, total = 1.806 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.RequestWorkerLease - 1 total 
(0 active), Execution time: mean = 498.490 us, total = 498.490 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.020 s, total = 1.020 s, Queueing time: mean = 8.247 us, max = 8.247 us, min = 8.247 us, total = 8.247 us [state-dump] Subscriber.HandlePublishedMessage_GCS_JOB_CHANNEL - 1 total (0 active), Execution time: mean = 52.501 us, total = 52.501 us, Queueing time: mean = 199.519 us, max = 199.519 us, min = 199.519 us, total = 199.519 us [state-dump] NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo - 1 total (0 active), Execution time: mean = 859.855 us, total = 859.855 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo - 1 total (0 active), Execution time: mean = 938.962 us, total = 938.962 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] WorkerPool.PopWorkerCallback - 1 total (0 active), Execution time: mean = 41.143 us, total = 41.143 us, Queueing time: mean = 29.210 us, max = 29.210 us, min = 29.210 us, total = 29.210 us [state-dump] DebugString() time ms: 1 [state-dump] [state-dump] [2025-01-21 06:01:20,114 I 20537 20566] (raylet) store.cc:564: Plasma store debug dump: Current usage: 0 / 2.14748 GB - num bytes created total: 168 0 pending objects of total size 0MB - objects spillable: 0 - bytes spillable: 0 - objects unsealed: 0 - bytes unsealed: 0 - objects in use: 0 - bytes in use: 0 - objects evictable: 0 - bytes evictable: 0 - objects created by worker: 0 - bytes created by worker: 0 - objects restored: 0 - bytes 
restored: 0 - objects received: 0 - bytes received: 0 - objects errored: 0 - bytes errored: 0 [2025-01-21 06:01:21,143 I 20537 20537] (raylet) node_manager.cc:525: [state-dump] NodeManager: [state-dump] Node ID: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c [state-dump] Node name: 192.168.0.2 [state-dump] InitialConfigResources: {node:192.168.0.2: 10000, node:__internal_head__: 10000, memory: 797346525190000, accelerator_type:A40: 10000, object_store_memory: 21474836480000, CPU: 200000, GPU: 20000} [state-dump] ClusterTaskManager: [state-dump] ========== Node: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c ================= [state-dump] Infeasible queue length: 0 [state-dump] Schedule queue length: 0 [state-dump] Dispatch queue length: 0 [state-dump] num_waiting_for_resource: 0 [state-dump] num_waiting_for_plasma_memory: 0 [state-dump] num_waiting_for_remote_node_resources: 0 [state-dump] num_worker_not_started_by_job_config_not_exist: 0 [state-dump] num_worker_not_started_by_registration_timeout: 0 [state-dump] num_tasks_waiting_for_workers: 0 [state-dump] num_cancelled_tasks: 0 [state-dump] cluster_resource_scheduler state: [state-dump] Local id: -6890545921723833623 Local resources: {"total":{node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [200000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "available": {node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [190000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",} is_draining: 0 is_idle: 0 Cluster resources: node id: -6890545921723833623{"total":{node:192.168.0.2: 10000, GPU: 20000, accelerator_type:A40: 10000, CPU: 200000, memory: 797346525190000, object_store_memory: 21474836480000, node:__internal_head__: 10000}}, "available": 
{GPU: 20000, node:192.168.0.2: 10000, node:__internal_head__: 10000, accelerator_type:A40: 10000, CPU: 190000, object_store_memory: 21474836480000, memory: 797346525190000}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",}, "is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []} [state-dump] Waiting tasks size: 0 [state-dump] Number of executing tasks: 1 [state-dump] Number of pinned task arguments: 0 [state-dump] Number of total spilled tasks: 0 [state-dump] Number of spilled waiting tasks: 0 [state-dump] Number of spilled unschedulable tasks: 0 [state-dump] Resource usage { [state-dump] - (language=PYTHON actor_or_task=process_csv_file pid=20684 worker_id=ccafd42c4da13863c8b878c2c6c5c5d04e6b29a0f376eeb80da59a5f): {CPU: 10000} [state-dump] } [state-dump] Backlog Size per scheduling descriptor :{workerId: num backlogs}: [state-dump] [state-dump] Running tasks by scheduling class: [state-dump] - {depth=1 function_descriptor={type=PythonFunctionDescriptor, module_name=__main__, class_name=, function_name=process_csv_file, function_hash=6efd4765ebe3481f9e016260841b31ad} scheduling_strategy=default_scheduling_strategy { [state-dump] } [state-dump] resource_set={CPU : 1, }}: 1/20 [state-dump] ================================================== [state-dump] [state-dump] ClusterResources: [state-dump] LocalObjectManager: [state-dump] - num pinned objects: 0 [state-dump] - pinned objects size: 0 [state-dump] - num objects pending restore: 0 [state-dump] - num objects pending spill: 0 [state-dump] - num bytes pending spill: 0 [state-dump] - num bytes currently spilled: 0 [state-dump] - cumulative spill requests: 0 [state-dump] - cumulative restore requests: 0 [state-dump] - spilled objects pending delete: 0 [state-dump] [state-dump] ObjectManager: [state-dump] - num local objects: 0 [state-dump] - num unfulfilled push requests: 0 [state-dump] - num object pull requests: 0 
[state-dump] - num chunks received total: 0 [state-dump] - num chunks received failed (all): 0 [state-dump] - num chunks received failed / cancelled: 0 [state-dump] - num chunks received failed / plasma error: 0 [state-dump] Event stats: [state-dump] Global stats: 0 total (0 active) [state-dump] Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Execution time: mean = -nan s, total = 0.000 s [state-dump] Event stats: [state-dump] PushManager: [state-dump] - num pushes in flight: 0 [state-dump] - num chunks in flight: 0 [state-dump] - num chunks remaining: 0 [state-dump] - max chunks allowed: 409 [state-dump] OwnershipBasedObjectDirectory: [state-dump] - num listeners: 0 [state-dump] - cumulative location updates: 0 [state-dump] - num location updates per second: 0.000 [state-dump] - num location lookups per second: 0.000 [state-dump] - num locations added per second: 0.000 [state-dump] - num locations removed per second: 0.000 [state-dump] BufferPool: [state-dump] - create buffer state map size: 0 [state-dump] PullManager: [state-dump] - num bytes available for pulled objects: 2147483648 [state-dump] - num bytes being pulled (all): 0 [state-dump] - num bytes being pulled / pinned: 0 [state-dump] - get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - first get request bundle: N/A [state-dump] - first wait request bundle: N/A [state-dump] - first task request bundle: N/A [state-dump] - num objects queued: 0 [state-dump] - num objects actively pulled (all): 0 [state-dump] - num objects actively pulled / pinned: 0 [state-dump] - num bundles being pulled: 0 [state-dump] - num pull retries: 0 [state-dump] - max timeout seconds: 0 [state-dump] - max timeout request is 
already processed. No entry. [state-dump] [state-dump] WorkerPool: [state-dump] - registered jobs: 1 [state-dump] - process_failed_job_config_missing: 0 [state-dump] - process_failed_rate_limited: 0 [state-dump] - process_failed_pending_registration: 0 [state-dump] - process_failed_runtime_env_setup_failed: 0 [state-dump] - num PYTHON workers: 20 [state-dump] - num PYTHON drivers: 1 [state-dump] - num PYTHON pending start requests: 0 [state-dump] - num PYTHON pending registration requests: 0 [state-dump] - num object spill callbacks queued: 0 [state-dump] - num object restore queued: 0 [state-dump] - num util functions queued: 0 [state-dump] - num idle workers: 19 [state-dump] TaskDependencyManager: [state-dump] - task deps map size: 0 [state-dump] - get req map size: 0 [state-dump] - wait req map size: 0 [state-dump] - local objects map size: 0 [state-dump] WaitManager: [state-dump] - num active wait requests: 0 [state-dump] Subscriber: [state-dump] Channel WORKER_OBJECT_LOCATIONS_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_OBJECT_EVICTION [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_REF_REMOVED_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] num async plasma notifications: 0 [state-dump] Remote node managers: [state-dump] Event stats: [state-dump] Global stats: 15998 total (35 active) [state-dump] 
Queueing time: mean = 158.545 us, max = 723.022 ms, min = 67.000 ns, total = 2.536 s [state-dump] Execution time: mean = 286.066 us, total = 4.576 s [state-dump] Event stats: [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog - 3780 total (0 active), Execution time: mean = 507.099 us, total = 1.917 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog.HandleRequestImpl - 3780 total (0 active), Execution time: mean = 38.093 us, total = 143.991 ms, Queueing time: mean = 114.575 us, max = 9.238 ms, min = 3.786 us, total = 433.093 ms [state-dump] ObjectManager.UpdateAvailableMemory - 1799 total (0 active), Execution time: mean = 5.461 us, total = 9.824 ms, Queueing time: mean = 115.036 us, max = 15.839 ms, min = 3.651 us, total = 206.949 ms [state-dump] NodeManager.CheckGC - 1799 total (1 active), Execution time: mean = 3.039 us, total = 5.467 ms, Queueing time: mean = 102.664 us, max = 11.587 ms, min = 7.381 us, total = 184.692 ms [state-dump] RaySyncer.OnDemandBroadcasting - 1799 total (1 active), Execution time: mean = 10.164 us, total = 18.285 ms, Queueing time: mean = 96.364 us, max = 11.586 ms, min = 11.472 us, total = 173.359 ms [state-dump] RayletWorkerPool.deadline_timer.kill_idle_workers - 900 total (1 active), Execution time: mean = 17.490 us, total = 15.741 ms, Queueing time: mean = 89.560 us, max = 15.957 ms, min = 17.360 us, total = 80.604 ms [state-dump] MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 719 total (1 active), Execution time: mean = 439.191 us, total = 315.778 ms, Queueing time: mean = 94.430 us, max = 15.659 ms, min = 11.391 us, total = 67.895 ms [state-dump] NodeManagerService.grpc_server.GetResourceLoad.HandleRequestImpl - 180 total (0 active), Execution time: mean = 121.108 us, total = 21.800 ms, Queueing time: mean = 116.854 us, max = 203.468 us, min = 18.573 us, total = 21.034 ms [state-dump] 
NodeManager.deadline_timer.spill_objects_when_over_threshold - 180 total (1 active), Execution time: mean = 2.966 us, total = 533.947 us, Queueing time: mean = 181.883 us, max = 2.119 ms, min = 7.005 us, total = 32.739 ms [state-dump] NodeManager.ScheduleAndDispatchTasks - 180 total (1 active), Execution time: mean = 15.702 us, total = 2.826 ms, Queueing time: mean = 76.707 us, max = 1.671 ms, min = 17.548 us, total = 13.807 ms [state-dump] NodeManagerService.grpc_server.GetResourceLoad - 180 total (0 active), Execution time: mean = 638.471 us, total = 114.925 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.deadline_timer.flush_free_objects - 180 total (1 active), Execution time: mean = 8.267 us, total = 1.488 ms, Queueing time: mean = 178.344 us, max = 2.115 ms, min = 9.804 us, total = 32.102 ms [state-dump] ClientConnection.async_read.ProcessMessageHeader - 85 total (21 active), Execution time: mean = 4.990 us, total = 424.148 us, Queueing time: mean = 13.672 ms, max = 723.022 ms, min = 16.524 us, total = 1.162 s [state-dump] ClientConnection.async_read.ProcessMessage - 64 total (0 active), Execution time: mean = 851.051 us, total = 54.467 ms, Queueing time: mean = 240.193 us, max = 9.118 ms, min = 3.826 us, total = 15.372 ms [state-dump] ClusterResourceManager.ResetRemoteNodeView - 61 total (1 active), Execution time: mean = 8.909 us, total = 543.469 us, Queueing time: mean = 77.377 us, max = 168.384 us, min = 14.260 us, total = 4.720 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive.OnReplyReceived - 36 total (0 active), Execution time: mean = 48.421 us, total = 1.743 ms, Queueing time: mean = 102.656 us, max = 179.847 us, min = 14.883 us, total = 3.696 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive - 36 total (0 active), Execution time: mean = 1.356 ms, total = 48.804 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 
s [state-dump] NodeManager.GcsCheckAlive - 36 total (1 active), Execution time: mean = 258.275 us, total = 9.298 ms, Queueing time: mean = 595.186 us, max = 1.877 ms, min = 71.740 us, total = 21.427 ms [state-dump] NodeManager.deadline_timer.record_metrics - 36 total (1 active), Execution time: mean = 516.047 us, total = 18.578 ms, Queueing time: mean = 345.764 us, max = 1.682 ms, min = 16.167 us, total = 12.447 ms [state-dump] ClientConnection.async_write.DoAsyncWrites - 22 total (0 active), Execution time: mean = 767.364 ns, total = 16.882 us, Queueing time: mean = 49.290 us, max = 318.614 us, min = 14.430 us, total = 1.084 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig.HandleRequestImpl - 21 total (0 active), Execution time: mean = 88.640 us, total = 1.861 ms, Queueing time: mean = 16.877 us, max = 30.971 us, min = 10.447 us, total = 354.414 us [state-dump] ObjectManager.ObjectAdded - 21 total (0 active), Execution time: mean = 6.379 us, total = 133.950 us, Queueing time: mean = 513.373 us, max = 8.534 ms, min = 3.220 us, total = 10.781 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig - 21 total (0 active), Execution time: mean = 759.112 us, total = 15.941 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ObjectManager.ObjectDeleted - 21 total (0 active), Execution time: mean = 11.946 us, total = 250.872 us, Queueing time: mean = 589.997 us, max = 9.059 ms, min = 23.448 us, total = 12.390 ms [state-dump] NodeManager.deadline_timer.debug_state_dump - 18 total (1 active), Execution time: mean = 1.630 ms, total = 29.335 ms, Queueing time: mean = 59.874 us, max = 127.557 us, min = 16.766 us, total = 1.078 ms [state-dump] PeriodicalRunner.RunFnPeriodically - 13 total (0 active), Execution time: mean = 195.572 us, total = 2.542 ms, Queueing time: mean = 3.255 ms, max = 10.761 ms, min = 21.417 us, total = 42.315 ms [state-dump] NodeManager.deadline_timer.print_event_loop_stats - 3 total 
(1 active, 1 running), Execution time: mean = 1.903 ms, total = 5.708 ms, Queueing time: mean = 37.629 us, max = 60.032 us, min = 52.854 us, total = 112.886 us [state-dump] - 2 total (0 active), Execution time: mean = 493.500 ns, total = 987.000 ns, Queueing time: mean = 25.826 us, max = 27.369 us, min = 24.284 us, total = 51.653 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 2 total (1 active), Execution time: mean = 395.258 ms, total = 790.517 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] RaySyncer.BroadcastMessage - 2 total (0 active), Execution time: mean = 116.754 us, total = 233.508 us, Queueing time: mean = 431.000 ns, max = 757.000 ns, min = 105.000 ns, total = 862.000 ns [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 2 total (0 active), Execution time: mean = 735.737 us, total = 1.471 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch.OnReplyReceived - 2 total (0 active), Execution time: mean = 96.942 us, total = 193.883 us, Queueing time: mean = 849.112 us, max = 1.686 ms, min = 12.374 us, total = 1.698 ms [state-dump] RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.321 us, total = 2.642 us, Queueing time: mean = 222.000 ns, max = 377.000 ns, min = 67.000 ns, total = 444.000 ns [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 12.530 us, total = 12.530 us, Queueing time: mean = 10.668 us, max = 10.668 us, min = 10.668 us, total = 10.668 us [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 177.035 us, total = 177.035 us, Queueing time: mean = 15.337 us, max = 15.337 us, min = 15.337 us, total = 15.337 us [state-dump] 
ray::rpc::JobInfoGcsService.grpc_client.AddJob.OnReplyReceived - 1 total (0 active), Execution time: mean = 46.990 us, total = 46.990 us, Queueing time: mean = 159.464 us, max = 159.464 us, min = 159.464 us, total = 159.464 us [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 933.039 us, total = 933.039 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 110.954 us, total = 110.954 us, Queueing time: mean = 39.567 us, max = 39.567 us, min = 39.567 us, total = 39.567 us [state-dump] NodeManagerService.grpc_server.RequestWorkerLease.HandleRequestImpl - 1 total (0 active), Execution time: mean = 224.288 us, total = 224.288 us, Queueing time: mean = 23.760 us, max = 23.760 us, min = 23.760 us, total = 23.760 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll.OnReplyReceived - 1 total (0 active), Execution time: mean = 208.070 us, total = 208.070 us, Queueing time: mean = 22.206 us, max = 22.206 us, min = 22.206 us, total = 22.206 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob - 1 total (0 active), Execution time: mean = 1.411 ms, total = 1.411 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 1 total (0 active), Execution time: mean = 1.806 ms, total = 1.806 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.RequestWorkerLease - 1 total (0 active), Execution time: mean = 498.490 us, total = 498.490 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Subscriber.HandlePublishedMessage_GCS_JOB_CHANNEL - 1 total (0 active), Execution time: mean = 52.501 
us, total = 52.501 us, Queueing time: mean = 199.519 us, max = 199.519 us, min = 199.519 us, total = 199.519 us [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.020 s, total = 1.020 s, Queueing time: mean = 8.247 us, max = 8.247 us, min = 8.247 us, total = 8.247 us [state-dump] WorkerPool.PopWorkerCallback - 1 total (0 active), Execution time: mean = 41.143 us, total = 41.143 us, Queueing time: mean = 29.210 us, max = 29.210 us, min = 29.210 us, total = 29.210 us [state-dump] NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo - 1 total (0 active), Execution time: mean = 859.855 us, total = 859.855 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo - 1 total (0 active), Execution time: mean = 938.962 us, total = 938.962 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] DebugString() time ms: 1 [state-dump] [state-dump] [2025-01-21 06:02:20,114 I 20537 20566] (raylet) store.cc:564: Plasma store debug dump: Current usage: 0 / 2.14748 GB - num bytes created total: 168 0 pending objects of total size 0MB - objects spillable: 0 - bytes spillable: 0 - objects unsealed: 0 - bytes unsealed: 0 - objects in use: 0 - bytes in use: 0 - objects evictable: 0 - bytes evictable: 0 - objects created by worker: 0 - bytes created by worker: 0 - objects restored: 0 - bytes restored: 0 - objects received: 0 - bytes received: 0 - objects errored: 0 - bytes errored: 0 [2025-01-21 06:02:21,146 I 20537 20537] (raylet) node_manager.cc:525: [state-dump] NodeManager: [state-dump] Node ID: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c 
[state-dump] Node name: 192.168.0.2 [state-dump] InitialConfigResources: {node:192.168.0.2: 10000, node:__internal_head__: 10000, memory: 797346525190000, accelerator_type:A40: 10000, object_store_memory: 21474836480000, CPU: 200000, GPU: 20000} [state-dump] ClusterTaskManager: [state-dump] ========== Node: 331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c ================= [state-dump] Infeasible queue length: 0 [state-dump] Schedule queue length: 0 [state-dump] Dispatch queue length: 0 [state-dump] num_waiting_for_resource: 0 [state-dump] num_waiting_for_plasma_memory: 0 [state-dump] num_waiting_for_remote_node_resources: 0 [state-dump] num_worker_not_started_by_job_config_not_exist: 0 [state-dump] num_worker_not_started_by_registration_timeout: 0 [state-dump] num_tasks_waiting_for_workers: 0 [state-dump] num_cancelled_tasks: 0 [state-dump] cluster_resource_scheduler state: [state-dump] Local id: -6890545921723833623 Local resources: {"total":{node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [200000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "available": {node:__internal_head__: [10000], accelerator_type:A40: [10000], GPU: [10000, 10000], CPU: [190000], memory: [797346525190000], object_store_memory: [21474836480000], node:192.168.0.2: [10000]}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",} is_draining: 0 is_idle: 0 Cluster resources: node id: -6890545921723833623{"total":{node:192.168.0.2: 10000, GPU: 20000, accelerator_type:A40: 10000, CPU: 200000, memory: 797346525190000, object_store_memory: 21474836480000, node:__internal_head__: 10000}}, "available": {GPU: 20000, node:192.168.0.2: 10000, node:__internal_head__: 10000, accelerator_type:A40: 10000, CPU: 190000, object_store_memory: 21474836480000, memory: 797346525190000}}, "labels":{"ray.io/node_id":"331bf2382daee9e1be6a8d365bb49297cc7c87f5fbcfb7e5998c831c",}, 
"is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []} [state-dump] Waiting tasks size: 0 [state-dump] Number of executing tasks: 1 [state-dump] Number of pinned task arguments: 0 [state-dump] Number of total spilled tasks: 0 [state-dump] Number of spilled waiting tasks: 0 [state-dump] Number of spilled unschedulable tasks: 0 [state-dump] Resource usage { [state-dump] - (language=PYTHON actor_or_task=process_csv_file pid=20684 worker_id=ccafd42c4da13863c8b878c2c6c5c5d04e6b29a0f376eeb80da59a5f): {CPU: 10000} [state-dump] } [state-dump] Backlog Size per scheduling descriptor :{workerId: num backlogs}: [state-dump] [state-dump] Running tasks by scheduling class: [state-dump] - {depth=1 function_descriptor={type=PythonFunctionDescriptor, module_name=__main__, class_name=, function_name=process_csv_file, function_hash=6efd4765ebe3481f9e016260841b31ad} scheduling_strategy=default_scheduling_strategy { [state-dump] } [state-dump] resource_set={CPU : 1, }}: 1/20 [state-dump] ================================================== [state-dump] [state-dump] ClusterResources: [state-dump] LocalObjectManager: [state-dump] - num pinned objects: 0 [state-dump] - pinned objects size: 0 [state-dump] - num objects pending restore: 0 [state-dump] - num objects pending spill: 0 [state-dump] - num bytes pending spill: 0 [state-dump] - num bytes currently spilled: 0 [state-dump] - cumulative spill requests: 0 [state-dump] - cumulative restore requests: 0 [state-dump] - spilled objects pending delete: 0 [state-dump] [state-dump] ObjectManager: [state-dump] - num local objects: 0 [state-dump] - num unfulfilled push requests: 0 [state-dump] - num object pull requests: 0 [state-dump] - num chunks received total: 0 [state-dump] - num chunks received failed (all): 0 [state-dump] - num chunks received failed / cancelled: 0 [state-dump] - num chunks received failed / plasma error: 0 [state-dump] Event stats: [state-dump] Global stats: 0 
total (0 active) [state-dump] Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Execution time: mean = -nan s, total = 0.000 s [state-dump] Event stats: [state-dump] PushManager: [state-dump] - num pushes in flight: 0 [state-dump] - num chunks in flight: 0 [state-dump] - num chunks remaining: 0 [state-dump] - max chunks allowed: 409 [state-dump] OwnershipBasedObjectDirectory: [state-dump] - num listeners: 0 [state-dump] - cumulative location updates: 0 [state-dump] - num location updates per second: 0.000 [state-dump] - num location lookups per second: 0.000 [state-dump] - num locations added per second: 0.000 [state-dump] - num locations removed per second: 0.000 [state-dump] BufferPool: [state-dump] - create buffer state map size: 0 [state-dump] PullManager: [state-dump] - num bytes available for pulled objects: 2147483648 [state-dump] - num bytes being pulled (all): 0 [state-dump] - num bytes being pulled / pinned: 0 [state-dump] - get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable} [state-dump] - first get request bundle: N/A [state-dump] - first wait request bundle: N/A [state-dump] - first task request bundle: N/A [state-dump] - num objects queued: 0 [state-dump] - num objects actively pulled (all): 0 [state-dump] - num objects actively pulled / pinned: 0 [state-dump] - num bundles being pulled: 0 [state-dump] - num pull retries: 0 [state-dump] - max timeout seconds: 0 [state-dump] - max timeout request is already processed. No entry. 
[state-dump] [state-dump] WorkerPool: [state-dump] - registered jobs: 1 [state-dump] - process_failed_job_config_missing: 0 [state-dump] - process_failed_rate_limited: 0 [state-dump] - process_failed_pending_registration: 0 [state-dump] - process_failed_runtime_env_setup_failed: 0 [state-dump] - num PYTHON workers: 20 [state-dump] - num PYTHON drivers: 1 [state-dump] - num PYTHON pending start requests: 0 [state-dump] - num PYTHON pending registration requests: 0 [state-dump] - num object spill callbacks queued: 0 [state-dump] - num object restore queued: 0 [state-dump] - num util functions queued: 0 [state-dump] - num idle workers: 19 [state-dump] TaskDependencyManager: [state-dump] - task deps map size: 0 [state-dump] - get req map size: 0 [state-dump] - wait req map size: 0 [state-dump] - local objects map size: 0 [state-dump] WaitManager: [state-dump] - num active wait requests: 0 [state-dump] Subscriber: [state-dump] Channel WORKER_OBJECT_LOCATIONS_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_OBJECT_EVICTION [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] Channel WORKER_REF_REMOVED_CHANNEL [state-dump] - cumulative subscribe requests: 0 [state-dump] - cumulative unsubscribe requests: 0 [state-dump] - active subscribed publishers: 0 [state-dump] - cumulative published messages: 0 [state-dump] - cumulative processed messages: 0 [state-dump] num async plasma notifications: 0 [state-dump] Remote node managers: [state-dump] Event stats: [state-dump] Global stats: 21230 total (35 active) [state-dump] Queueing time: mean = 135.334 us, 
max = 723.022 ms, min = 67.000 ns, total = 2.873 s [state-dump] Execution time: mean = 249.042 us, total = 5.287 s [state-dump] Event stats: [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog - 5040 total (0 active), Execution time: mean = 476.092 us, total = 2.400 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.ReportWorkerBacklog.HandleRequestImpl - 5040 total (0 active), Execution time: mean = 36.302 us, total = 182.961 ms, Queueing time: mean = 104.117 us, max = 9.238 ms, min = 3.786 us, total = 524.749 ms [state-dump] ObjectManager.UpdateAvailableMemory - 2398 total (0 active), Execution time: mean = 5.162 us, total = 12.379 ms, Queueing time: mean = 103.001 us, max = 15.839 ms, min = 3.651 us, total = 246.995 ms [state-dump] RaySyncer.OnDemandBroadcasting - 2398 total (1 active), Execution time: mean = 9.793 us, total = 23.484 ms, Queueing time: mean = 94.835 us, max = 12.998 ms, min = 11.472 us, total = 227.414 ms [state-dump] NodeManager.CheckGC - 2398 total (1 active), Execution time: mean = 3.028 us, total = 7.260 ms, Queueing time: mean = 100.811 us, max = 13.000 ms, min = 7.381 us, total = 241.744 ms [state-dump] RayletWorkerPool.deadline_timer.kill_idle_workers - 1200 total (1 active), Execution time: mean = 17.233 us, total = 20.680 ms, Queueing time: mean = 96.829 us, max = 17.595 ms, min = 15.850 us, total = 116.195 ms [state-dump] MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 959 total (1 active), Execution time: mean = 434.660 us, total = 416.838 ms, Queueing time: mean = 85.164 us, max = 15.659 ms, min = 10.208 us, total = 81.672 ms [state-dump] NodeManagerService.grpc_server.GetResourceLoad.HandleRequestImpl - 240 total (0 active), Execution time: mean = 116.744 us, total = 28.018 ms, Queueing time: mean = 106.486 us, max = 203.468 us, min = 17.670 us, total = 25.557 ms [state-dump] NodeManager.ScheduleAndDispatchTasks - 240 total (1 active), 
Execution time: mean = 15.226 us, total = 3.654 ms, Queueing time: mean = 71.509 us, max = 1.671 ms, min = 15.828 us, total = 17.162 ms [state-dump] NodeManagerService.grpc_server.GetResourceLoad - 240 total (0 active), Execution time: mean = 598.138 us, total = 143.553 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.deadline_timer.spill_objects_when_over_threshold - 240 total (1 active), Execution time: mean = 2.910 us, total = 698.413 us, Queueing time: mean = 178.102 us, max = 2.119 ms, min = 7.005 us, total = 42.744 ms [state-dump] NodeManager.deadline_timer.flush_free_objects - 240 total (1 active), Execution time: mean = 8.087 us, total = 1.941 ms, Queueing time: mean = 174.657 us, max = 2.115 ms, min = 9.804 us, total = 41.918 ms [state-dump] ClientConnection.async_read.ProcessMessageHeader - 85 total (21 active), Execution time: mean = 4.990 us, total = 424.148 us, Queueing time: mean = 13.672 ms, max = 723.022 ms, min = 16.524 us, total = 1.162 s [state-dump] ClusterResourceManager.ResetRemoteNodeView - 81 total (1 active), Execution time: mean = 8.489 us, total = 687.605 us, Queueing time: mean = 72.754 us, max = 168.384 us, min = 14.260 us, total = 5.893 ms [state-dump] ClientConnection.async_read.ProcessMessage - 64 total (0 active), Execution time: mean = 851.051 us, total = 54.467 ms, Queueing time: mean = 240.193 us, max = 9.118 ms, min = 3.826 us, total = 15.372 ms [state-dump] NodeManager.deadline_timer.record_metrics - 48 total (1 active), Execution time: mean = 517.895 us, total = 24.859 ms, Queueing time: mean = 375.455 us, max = 1.702 ms, min = 16.167 us, total = 18.022 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive.OnReplyReceived - 48 total (0 active), Execution time: mean = 47.892 us, total = 2.299 ms, Queueing time: mean = 97.735 us, max = 179.847 us, min = 14.883 us, total = 4.691 ms [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.CheckAlive - 48 
total (0 active), Execution time: mean = 1.283 ms, total = 61.600 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManager.GcsCheckAlive - 48 total (1 active), Execution time: mean = 259.386 us, total = 12.451 ms, Queueing time: mean = 628.167 us, max = 1.971 ms, min = 71.740 us, total = 30.152 ms [state-dump] NodeManager.deadline_timer.debug_state_dump - 24 total (1 active), Execution time: mean = 1.699 ms, total = 40.781 ms, Queueing time: mean = 58.882 us, max = 127.557 us, min = 16.766 us, total = 1.413 ms [state-dump] ClientConnection.async_write.DoAsyncWrites - 22 total (0 active), Execution time: mean = 767.364 ns, total = 16.882 us, Queueing time: mean = 49.290 us, max = 318.614 us, min = 14.430 us, total = 1.084 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig.HandleRequestImpl - 21 total (0 active), Execution time: mean = 88.640 us, total = 1.861 ms, Queueing time: mean = 16.877 us, max = 30.971 us, min = 10.447 us, total = 354.414 us [state-dump] ObjectManager.ObjectAdded - 21 total (0 active), Execution time: mean = 6.379 us, total = 133.950 us, Queueing time: mean = 513.373 us, max = 8.534 ms, min = 3.220 us, total = 10.781 ms [state-dump] NodeManagerService.grpc_server.GetSystemConfig - 21 total (0 active), Execution time: mean = 759.112 us, total = 15.941 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ObjectManager.ObjectDeleted - 21 total (0 active), Execution time: mean = 11.946 us, total = 250.872 us, Queueing time: mean = 589.997 us, max = 9.059 ms, min = 23.448 us, total = 12.390 ms [state-dump] PeriodicalRunner.RunFnPeriodically - 13 total (0 active), Execution time: mean = 195.572 us, total = 2.542 ms, Queueing time: mean = 3.255 ms, max = 10.761 ms, min = 21.417 us, total = 42.315 ms [state-dump] NodeManager.deadline_timer.print_event_loop_stats - 4 total (1 active, 1 running), Execution time: mean = 2.131 ms, 
total = 8.526 ms, Queueing time: mean = 43.319 us, max = 60.390 us, min = 52.854 us, total = 173.276 us [state-dump] - 2 total (0 active), Execution time: mean = 493.500 ns, total = 987.000 ns, Queueing time: mean = 25.826 us, max = 27.369 us, min = 24.284 us, total = 51.653 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 2 total (1 active), Execution time: mean = 395.258 ms, total = 790.517 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] RaySyncer.BroadcastMessage - 2 total (0 active), Execution time: mean = 116.754 us, total = 233.508 us, Queueing time: mean = 431.000 ns, max = 757.000 ns, min = 105.000 ns, total = 862.000 ns [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 2 total (0 active), Execution time: mean = 735.737 us, total = 1.471 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch.OnReplyReceived - 2 total (0 active), Execution time: mean = 96.942 us, total = 193.883 us, Queueing time: mean = 849.112 us, max = 1.686 ms, min = 12.374 us, total = 1.698 ms [state-dump] RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.321 us, total = 2.642 us, Queueing time: mean = 222.000 ns, max = 377.000 ns, min = 67.000 ns, total = 444.000 ns [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 12.530 us, total = 12.530 us, Queueing time: mean = 10.668 us, max = 10.668 us, min = 10.668 us, total = 10.668 us [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 177.035 us, total = 177.035 us, Queueing time: mean = 15.337 us, max = 15.337 us, min = 15.337 us, total = 15.337 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob.OnReplyReceived - 1 
total (0 active), Execution time: mean = 46.990 us, total = 46.990 us, Queueing time: mean = 159.464 us, max = 159.464 us, min = 159.464 us, total = 159.464 us [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 933.039 us, total = 933.039 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo.OnReplyReceived - 1 total (0 active), Execution time: mean = 110.954 us, total = 110.954 us, Queueing time: mean = 39.567 us, max = 39.567 us, min = 39.567 us, total = 39.567 us [state-dump] NodeManagerService.grpc_server.RequestWorkerLease.HandleRequestImpl - 1 total (0 active), Execution time: mean = 224.288 us, total = 224.288 us, Queueing time: mean = 23.760 us, max = 23.760 us, min = 23.760 us, total = 23.760 us [state-dump] ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll.OnReplyReceived - 1 total (0 active), Execution time: mean = 208.070 us, total = 208.070 us, Queueing time: mean = 22.206 us, max = 22.206 us, min = 22.206 us, total = 22.206 us [state-dump] ray::rpc::JobInfoGcsService.grpc_client.AddJob - 1 total (0 active), Execution time: mean = 1.411 ms, total = 1.411 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 1 total (0 active), Execution time: mean = 1.806 ms, total = 1.806 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] NodeManagerService.grpc_server.RequestWorkerLease - 1 total (0 active), Execution time: mean = 498.490 us, total = 498.490 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] Subscriber.HandlePublishedMessage_GCS_JOB_CHANNEL - 1 total (0 active), Execution time: mean = 52.501 us, total = 52.501 us, Queueing time: mean = 199.519 us, max = 
199.519 us, min = 199.519 us, total = 199.519 us [state-dump] ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.020 s, total = 1.020 s, Queueing time: mean = 8.247 us, max = 8.247 us, min = 8.247 us, total = 8.247 us [state-dump] WorkerPool.PopWorkerCallback - 1 total (0 active), Execution time: mean = 41.143 us, total = 41.143 us, Queueing time: mean = 29.210 us, max = 29.210 us, min = 29.210 us, total = 29.210 us [state-dump] NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::NodeInfoGcsService.grpc_client.GetAllNodeInfo - 1 total (0 active), Execution time: mean = 859.855 us, total = 859.855 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] ray::rpc::JobInfoGcsService.grpc_client.GetAllJobInfo - 1 total (0 active), Execution time: mean = 938.962 us, total = 938.962 us, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s [state-dump] DebugString() time ms: 1 [state-dump] [state-dump]