[2025-01-15 18:16:18,392 I 521975 521975] (gcs_server) gcs_server_main.cc:52: Ray cluster metadata ray_version=2.40.0 ray_commit=22541c38dbef25286cd6d19f1c151bf4fd62f2ed
[2025-01-15 18:16:18,393 I 521975 521975] (gcs_server) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-15 18:16:18,399 I 521975 521975] (gcs_server) event.cc:493: Ray Event initialized for GCS
[2025-01-15 18:16:18,400 I 521975 521975] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_NODE
[2025-01-15 18:16:18,400 I 521975 521975] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_ACTOR
[2025-01-15 18:16:18,400 I 521975 521975] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_DRIVER_JOB
[2025-01-15 18:16:18,400 I 521975 521975] (gcs_server) event.cc:324: Set ray event level to warning
[2025-01-15 18:16:18,408 I 521975 521975] (gcs_server) gcs_server.cc:73: GCS storage type is StorageType::IN_MEMORY
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:42: Loading job table data.
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:54: Loading node table data.
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:80: Loading actor table data.
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:93: Loading actor task spec table data.
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:66: Loading placement group table data.
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:46: Finished loading job table data, size = 0
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:58: Finished loading node table data, size = 0
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:84: Finished loading actor table data, size = 0
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:97: Finished loading actor task spec table data, size = 0
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_init_data.cc:71: Finished loading placement group table data, size = 0
[2025-01-15 18:16:18,410 I 521975 521975] (gcs_server) gcs_server.cc:162: No existing server cluster ID found. Generating new ID: fbcacd8c56c8a248301e2268894a7704d9f9a83b5100964a45abdcd8
[2025-01-15 18:16:18,411 I 521975 521975] (gcs_server) gcs_server.cc:644: Autoscaler V2 enabled: 0
[2025-01-15 18:16:18,414 I 521975 521975] (gcs_server) grpc_server.cc:134: GcsServer server started, listening on port 64035.
[2025-01-15 18:16:18,681 I 521975 521975] (gcs_server) gcs_server.cc:245: Gcs Debug state:
GcsNodeManager:
- RegisterNode request count: 0
- DrainNode request count: 0
- GetAllNodeInfo request count: 0
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_restart_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetAllAvailableResources request count: 0
- GetAllTotalResources request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
Publisher:
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GcsAutoscalerStateManager:
- last_seen_autoscaler_state_version_: 0
- last_cluster_resource_state_version_: 0
- pending demands:
[2025-01-15 18:16:18,681 I 521975 521975] (gcs_server) gcs_server.cc:843: Main service Event stats:
Global stats: 25 total (5 active)
Queueing time: mean = 96.306 ms, max = 266.430 ms, min = 4.812 us, total = 2.408 s
Execution time: mean = 10.818 ms, total = 270.458 ms
Event stats:
GcsInMemoryStore.Put - 9 total (0 active), Execution time: mean = 29.610 ms, total = 266.491 ms, Queueing time: mean = 206.223 ms, max = 265.446 ms, min = 4.812 us, total = 1.856 s
GcsInMemoryStore.GetAll - 5 total (0 active), Execution time: mean = 19.145 us, total = 95.723 us, Queueing time: mean = 122.109 us, max = 136.184 us, min = 109.748 us, total = 610.546 us
PeriodicalRunner.RunFnPeriodically - 4 total (2 active, 1 running), Execution time: mean = 2.837 us, total = 11.347 us, Queueing time: mean = 133.134 ms, max = 266.430 ms, min = 266.105 ms, total = 532.535 ms
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 16.512 us, total = 33.025 us, Queueing time: mean = 8.832 ms, max = 17.253 ms, min = 410.534 us, total = 17.663 ms
ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
GcsInMemoryStore.Get - 1 total (0 active), Execution time: mean = 28.642 us, total = 28.642 us, Queueing time: mean = 6.770 us, max = 6.770 us, min = 6.770 us, total = 6.770 us
RayletLoadPulled - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeInfoGcsService.grpc_server.GetClusterId - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeInfoGcsService.grpc_server.GetClusterId.HandleRequestImpl - 1 total (0 active), Execution time: mean = 3.798 ms, total = 3.798 ms, Queueing time: mean = 821.260 us, max = 821.260 us, min = 821.260 us, total = 821.260 us
[2025-01-15 18:16:18,681 I 521975 521975] (gcs_server) gcs_server.cc:847: task_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 195.899 us, max = 836.357 us, min = 12.790 us, total = 979.495 us
Execution time: mean = 691.599 us, total = 3.458 ms
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 1.149 ms, total = 3.446 ms, Queueing time: mean = 295.656 us, max = 836.357 us, min = 12.790 us, total = 886.967 us
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 11.991 us, total = 11.991 us, Queueing time: mean = 92.528 us, max = 92.528 us, min = 92.528 us, total = 92.528 us
GcsTaskManager.GcJobSummary - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:16:18,681 I 521975 521975] (gcs_server) gcs_server.cc:847: pubsub_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 490.313 us, max = 2.032 ms, min = 9.160 us, total = 2.452 ms
Execution time: mean = 778.608 us, total = 3.893 ms
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 1.288 ms, total = 3.865 ms, Queueing time: mean = 783.284 us, max = 2.032 ms, min = 9.160 us, total = 2.350 ms
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 27.793 us, total = 27.793 us, Queueing time: mean = 101.715 us, max = 101.715 us, min = 101.715 us, total = 101.715 us
Publisher.CheckDeadSubscribers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:16:18,682 I 521975 521975] (gcs_server) gcs_server.cc:847: ray_syncer_io_context Event stats:
Global stats: 5 total (0 active)
Queueing time: mean = 386.562 us, max = 1.135 ms, min = 5.272 us, total = 1.933 ms
Execution time: mean = 941.379 us, total = 4.707 ms
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 1.568 ms, total = 4.705 ms, Queueing time: mean = 552.885 us, max = 1.135 ms, min = 5.272 us, total = 1.659 ms
RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.046 us, total = 2.091 us, Queueing time: mean = 137.077 us, max = 137.620 us, min = 136.534 us, total = 274.154 us
[2025-01-15 18:16:20,986 I 521975 521975] (gcs_server) gcs_node_manager.cc:85: Registering node info, address = 192.168.0.2, node name = 192.168.0.2 node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:20,986 I 521975 521975] (gcs_server) gcs_node_manager.cc:91: Finished registering node info, address = 192.168.0.2, node name = 192.168.0.2, is_head_node = 1 node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:20,986 I 521975 521975] (gcs_server) gcs_placement_group_manager.cc:819: A new node: a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e registered, will try to reschedule all the infeasible placement groups.
[2025-01-15 18:16:20,992 I 521975 522058] (gcs_server) ray_syncer.cc:377: Get connection node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:21,928 I 521975 521975] (gcs_server) gcs_job_manager.cc:90: Adding job, job id = 01000000, driver pid = 521907
[2025-01-15 18:16:21,928 I 521975 521975] (gcs_server) gcs_job_manager.cc:111: Finished adding job, job id = 01000000, driver pid = 521907
[2025-01-15 18:16:28,230 I 521975 521975] (gcs_server) gcs_job_manager.cc:149: Finished marking job state, job id = 01000000
[2025-01-15 18:16:28,262 I 521975 521975] (gcs_server) gcs_node_manager.cc:366: Removing node, node name = 192.168.0.2, death reason = EXPECTED_TERMINATION, death message = received SIGTERM node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:28,262 I 521975 521975] (gcs_server) gcs_placement_group_manager.cc:789: Node failed, rescheduling the placement groups on the dead node. node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:28,262 I 521975 521975] (gcs_server) gcs_actor_manager.cc:1274: Node failed, reconstructing actors. node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:28,262 I 521975 521975] (gcs_server) gcs_job_manager.cc:454: Node failed, mark all jobs from this node as finished node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:28,307 I 521975 522024] (gcs_server) ray_syncer-inl.h:318: Failed to read the message from: a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:28,307 I 521975 522024] (gcs_server) ray_syncer.cc:373: Connection is broken. node_id=a15143d135fdf90e02f60919a4290e512d91060e1590c2c0978ed15e
[2025-01-15 18:16:28,325 I 521975 521975] (gcs_server) gcs_server_main.cc:130: GCS server received SIGTERM, shutting down...
[2025-01-15 18:16:28,326 I 521975 521975] (gcs_server) gcs_server.cc:267: Stopping GCS server.
[2025-01-15 18:16:28,409 I 521975 521975] (gcs_server) gcs_server.cc:284: GCS server stopped.
[2025-01-15 18:16:28,409 I 521975 521975] (gcs_server) io_service_pool.cc:47: IOServicePool is stopped.
[2025-01-15 18:16:28,513 I 521975 521975] (gcs_server) stats.h:120: Stats module has shutdown.