[2025-01-15 18:14:57,076 I 515138 515138] (gcs_server) gcs_server_main.cc:52: Ray cluster metadata ray_version=2.40.0 ray_commit=22541c38dbef25286cd6d19f1c151bf4fd62f2ed
[2025-01-15 18:14:57,076 I 515138 515138] (gcs_server) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-15 18:14:57,081 I 515138 515138] (gcs_server) event.cc:493: Ray Event initialized for GCS
[2025-01-15 18:14:57,081 I 515138 515138] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_NODE
[2025-01-15 18:14:57,081 I 515138 515138] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_ACTOR
[2025-01-15 18:14:57,081 I 515138 515138] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_DRIVER_JOB
[2025-01-15 18:14:57,081 I 515138 515138] (gcs_server) event.cc:324: Set ray event level to warning
[2025-01-15 18:14:57,083 I 515138 515138] (gcs_server) gcs_server.cc:73: GCS storage type is StorageType::IN_MEMORY
[2025-01-15 18:14:57,084 I 515138 515138] (gcs_server) gcs_init_data.cc:42: Loading job table data.
[2025-01-15 18:14:57,084 I 515138 515138] (gcs_server) gcs_init_data.cc:54: Loading node table data.
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:80: Loading actor table data.
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:93: Loading actor task spec table data.
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:66: Loading placement group table data.
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:46: Finished loading job table data, size = 0
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:58: Finished loading node table data, size = 0
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:84: Finished loading actor table data, size = 0
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:97: Finished loading actor task spec table data, size = 0
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_init_data.cc:71: Finished loading placement group table data, size = 0
[2025-01-15 18:14:57,085 I 515138 515138] (gcs_server) gcs_server.cc:162: No existing server cluster ID found. Generating new ID: a419fe8f44f8ad2b636439cf673e9c3439ef0bc9e5cdaaf6ae4e6f0e
[2025-01-15 18:14:57,086 I 515138 515138] (gcs_server) gcs_server.cc:644: Autoscaler V2 enabled: 0
[2025-01-15 18:14:57,089 I 515138 515138] (gcs_server) grpc_server.cc:134: GcsServer server started, listening on port 54681.
[2025-01-15 18:14:57,305 I 515138 515138] (gcs_server) gcs_server.cc:245: Gcs Debug state:
GcsNodeManager:
- RegisterNode request count: 0
- DrainNode request count: 0
- GetAllNodeInfo request count: 0
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_restart_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetAllAvailableResources request count: 0
- GetAllTotalResources request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
Publisher:
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GcsAutoscalerStateManager:
- last_seen_autoscaler_state_version_: 0
- last_cluster_resource_state_version_: 0
- pending demands:
[2025-01-15 18:14:57,305 I 515138 515138] (gcs_server) gcs_server.cc:843: Main service Event stats:
Global stats: 25 total (5 active)
Queueing time: mean = 78.389 ms, max = 217.674 ms, min = 4.793 us, total = 1.960 s
Execution time: mean = 8.792 ms, total = 219.796 ms
Event stats:
GcsInMemoryStore.Put - 9 total (0 active), Execution time: mean = 24.192 ms, total = 217.731 ms, Queueing time: mean = 168.296 ms, max = 216.843 ms, min = 4.793 us, total = 1.515 s
GcsInMemoryStore.GetAll - 5 total (0 active), Execution time: mean = 18.340 us, total = 91.699 us, Queueing time: mean = 128.029 us, max = 142.769 us, min = 115.660 us, total = 640.144 us
PeriodicalRunner.RunFnPeriodically - 4 total (2 active, 1 running), Execution time: mean = 5.202 us, total = 20.808 us, Queueing time: mean = 108.787 ms, max = 217.674 ms, min = 217.476 ms, total = 435.150 ms
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 21.390 us, total = 42.781 us, Queueing time: mean = 4.237 ms, max = 8.129 ms, min = 346.040 us, total = 8.475 ms
NodeInfoGcsService.grpc_server.GetClusterId.HandleRequestImpl - 1 total (0 active), Execution time: mean = 1.878 ms, total = 1.878 ms, Queueing time: mean = 787.398 us, max = 787.398 us, min = 787.398 us, total = 787.398 us
RayletLoadPulled - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeInfoGcsService.grpc_server.GetClusterId - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
GcsInMemoryStore.Get - 1 total (0 active), Execution time: mean = 31.932 us, total = 31.932 us, Queueing time: mean = 6.847 us, max = 6.847 us, min = 6.847 us, total = 6.847 us
[2025-01-15 18:14:57,305 I 515138 515138] (gcs_server) gcs_server.cc:847: task_io_context Event stats:
Global stats: 4 total (1 active)
Queueing time: mean = 103.746 us, max = 342.086 us, min = 20.975 us, total = 414.983 us
Execution time: mean = 113.502 us, total = 454.008 us
Event stats:
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 222.211 us, total = 444.422 us, Queueing time: mean = 197.004 us, max = 342.086 us, min = 51.922 us, total = 394.008 us
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 9.586 us, total = 9.586 us, Queueing time: mean = 20.975 us, max = 20.975 us, min = 20.975 us, total = 20.975 us
GcsTaskManager.GcJobSummary - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:14:57,305 I 515138 515138] (gcs_server) gcs_server.cc:847: pubsub_io_context Event stats:
Global stats: 4 total (1 active)
Queueing time: mean = 620.415 us, max = 2.442 ms, min = 18.115 us, total = 2.482 ms
Execution time: mean = 38.981 us, total = 155.923 us
Event stats:
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 74.652 us, total = 149.304 us, Queueing time: mean = 1.232 ms, max = 2.442 ms, min = 21.644 us, total = 2.464 ms
Publisher.CheckDeadSubscribers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 6.619 us, total = 6.619 us, Queueing time: mean = 18.115 us, max = 18.115 us, min = 18.115 us, total = 18.115 us
[2025-01-15 18:14:57,305 I 515138 515138] (gcs_server) gcs_server.cc:847: ray_syncer_io_context Event stats:
Global stats: 4 total (0 active)
Queueing time: mean = 505.683 us, max = 1.960 ms, min = 13.580 us, total = 2.023 ms
Execution time: mean = 20.581 us, total = 82.325 us
Event stats:
RaySyncerRegister - 2 total (0 active), Execution time: mean = 783.500 ns, total = 1.567 us, Queueing time: mean = 24.638 us, max = 29.563 us, min = 19.713 us, total = 49.276 us
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 40.379 us, total = 80.758 us, Queueing time: mean = 986.727 us, max = 1.960 ms, min = 13.580 us, total = 1.973 ms
[2025-01-15 18:14:59,416 I 515138 515138] (gcs_server) gcs_node_manager.cc:85: Registering node info, address = 192.168.0.2, node name = 192.168.0.2 node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:14:59,416 I 515138 515138] (gcs_server) gcs_node_manager.cc:91: Finished registering node info, address = 192.168.0.2, node name = 192.168.0.2, is_head_node = 1 node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:14:59,416 I 515138 515138] (gcs_server) gcs_placement_group_manager.cc:819: A new node: ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa registered, will try to reschedule all the infeasible placement groups.
[2025-01-15 18:14:59,425 I 515138 515223] (gcs_server) ray_syncer.cc:377: Get connection node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:00,221 I 515138 515138] (gcs_server) gcs_job_manager.cc:90: Adding job, job id = 01000000, driver pid = 515070
[2025-01-15 18:15:00,222 I 515138 515138] (gcs_server) gcs_job_manager.cc:111: Finished adding job, job id = 01000000, driver pid = 515070
[2025-01-15 18:15:02,286 I 515138 515138] (gcs_server) gcs_job_manager.cc:149: Finished marking job state, job id = 01000000
[2025-01-15 18:15:02,367 I 515138 515138] (gcs_server) gcs_node_manager.cc:366: Removing node, node name = 192.168.0.2, death reason = EXPECTED_TERMINATION, death message = received SIGTERM node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:02,368 I 515138 515138] (gcs_server) gcs_placement_group_manager.cc:789: Node failed, rescheduling the placement groups on the dead node. node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:02,368 I 515138 515138] (gcs_server) gcs_actor_manager.cc:1274: Node failed, reconstructing actors. node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:02,368 I 515138 515138] (gcs_server) gcs_job_manager.cc:454: Node failed, mark all jobs from this node as finished node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:02,620 I 515138 515187] (gcs_server) ray_syncer-inl.h:318: Failed to read the message from: ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:02,620 I 515138 515187] (gcs_server) ray_syncer.cc:373: Connection is broken. node_id=ec948722efaeeddc4be8139d4dfc4bcf5960bb2e8c96129f28aa92fa
[2025-01-15 18:15:02,631 I 515138 515138] (gcs_server) gcs_server_main.cc:130: GCS server received SIGTERM, shutting down...
[2025-01-15 18:15:02,632 I 515138 515138] (gcs_server) gcs_server.cc:267: Stopping GCS server.
[2025-01-15 18:15:02,720 I 515138 515138] (gcs_server) gcs_server.cc:284: GCS server stopped.
[2025-01-15 18:15:02,720 I 515138 515138] (gcs_server) io_service_pool.cc:47: IOServicePool is stopped.
[2025-01-15 18:15:02,786 I 515138 515138] (gcs_server) stats.h:120: Stats module has shutdown.