[2025-01-15 18:18:02,492 I 536396 536396] (gcs_server) gcs_server_main.cc:52: Ray cluster metadata ray_version=2.40.0 ray_commit=22541c38dbef25286cd6d19f1c151bf4fd62f2ed
[2025-01-15 18:18:02,493 I 536396 536396] (gcs_server) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-15 18:18:02,498 I 536396 536396] (gcs_server) event.cc:493: Ray Event initialized for GCS
[2025-01-15 18:18:02,498 I 536396 536396] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_NODE
[2025-01-15 18:18:02,498 I 536396 536396] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_ACTOR
[2025-01-15 18:18:02,498 I 536396 536396] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_DRIVER_JOB
[2025-01-15 18:18:02,498 I 536396 536396] (gcs_server) event.cc:324: Set ray event level to warning
[2025-01-15 18:18:02,503 I 536396 536396] (gcs_server) gcs_server.cc:73: GCS storage type is StorageType::IN_MEMORY
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:42: Loading job table data.
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:54: Loading node table data.
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:80: Loading actor table data.
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:93: Loading actor task spec table data.
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:66: Loading placement group table data.
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:46: Finished loading job table data, size = 0
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:58: Finished loading node table data, size = 0
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:84: Finished loading actor table data, size = 0
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:97: Finished loading actor task spec table data, size = 0
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_init_data.cc:71: Finished loading placement group table data, size = 0
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_server.cc:162: No existing server cluster ID found. Generating new ID: a461c5a60dd8717dbff63b0f8b8483a2c52b91cc9483133e8ee1f369
[2025-01-15 18:18:02,504 I 536396 536396] (gcs_server) gcs_server.cc:644: Autoscaler V2 enabled: 0
[2025-01-15 18:18:02,509 I 536396 536396] (gcs_server) grpc_server.cc:134: GcsServer server started, listening on port 40679.
[2025-01-15 18:18:02,774 I 536396 536396] (gcs_server) gcs_server.cc:245: Gcs Debug state:
GcsNodeManager:
- RegisterNode request count: 0
- DrainNode request count: 0
- GetAllNodeInfo request count: 0
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_restart_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetAllAvailableResources request count: 0
- GetAllTotalResources request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
Publisher:
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GcsAutoscalerStateManager:
- last_seen_autoscaler_state_version_: 0
- last_cluster_resource_state_version_: 0
- pending demands:
[2025-01-15 18:18:02,774 I 536396 536396] (gcs_server) gcs_server.cc:843: Main service Event stats:
Global stats: 25 total (5 active)
Queueing time: mean = 96.433 ms, max = 266.813 ms, min = 1.433 us, total = 2.411 s
Execution time: mean = 10.795 ms, total = 269.867 ms
Event stats:
GcsInMemoryStore.Put - 9 total (0 active), Execution time: mean = 29.650 ms, total = 266.846 ms, Queueing time: mean = 207.037 ms, max = 266.386 ms, min = 1.433 us, total = 1.863 s
GcsInMemoryStore.GetAll - 5 total (0 active), Execution time: mean = 6.883 us, total = 34.414 us, Queueing time: mean = 42.942 us, max = 49.679 us, min = 38.497 us, total = 214.711 us
PeriodicalRunner.RunFnPeriodically - 4 total (2 active, 1 running), Execution time: mean = 2.668 us, total = 10.673 us, Queueing time: mean = 133.380 ms, max = 266.813 ms, min = 266.708 ms, total = 533.522 ms
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 5.942 us, total = 11.884 us, Queueing time: mean = 5.562 ms, max = 10.904 ms, min = 219.635 us, total = 11.124 ms
NodeInfoGcsService.grpc_server.GetClusterId - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
RayletLoadPulled - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeInfoGcsService.grpc_server.GetClusterId.HandleRequestImpl - 1 total (0 active), Execution time: mean = 2.954 ms, total = 2.954 ms, Queueing time: mean = 2.621 ms, max = 2.621 ms, min = 2.621 ms, total = 2.621 ms
ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
GcsInMemoryStore.Get - 1 total (0 active), Execution time: mean = 9.902 us, total = 9.902 us, Queueing time: mean = 2.284 us, max = 2.284 us, min = 2.284 us, total = 2.284 us
[2025-01-15 18:18:02,774 I 536396 536396] (gcs_server) gcs_server.cc:847: task_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 1.308 ms, max = 6.314 ms, min = 8.996 us, total = 6.539 ms
Execution time: mean = 26.115 us, total = 130.574 us
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 29.966 us, total = 89.897 us, Queueing time: mean = 2.129 ms, max = 6.314 ms, min = 8.996 us, total = 6.386 ms
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 40.677 us, total = 40.677 us, Queueing time: mean = 153.348 us, max = 153.348 us, min = 153.348 us, total = 153.348 us
GcsTaskManager.GcJobSummary - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:18:02,774 I 536396 536396] (gcs_server) gcs_server.cc:847: pubsub_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 4.183 ms, max = 13.152 ms, min = 7.029 us, total = 20.914 ms
Execution time: mean = 36.127 us, total = 180.633 us
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 45.624 us, total = 136.873 us, Queueing time: mean = 4.413 ms, max = 13.152 ms, min = 7.029 us, total = 13.238 ms
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 43.760 us, total = 43.760 us, Queueing time: mean = 7.676 ms, max = 7.676 ms, min = 7.676 ms, total = 7.676 ms
Publisher.CheckDeadSubscribers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:18:02,774 I 536396 536396] (gcs_server) gcs_server.cc:847: ray_syncer_io_context Event stats:
Global stats: 5 total (0 active)
Queueing time: mean = 462.062 us, max = 1.155 ms, min = 13.994 us, total = 2.310 ms
Execution time: mean = 809.187 us, total = 4.046 ms
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 1.348 ms, total = 4.043 ms, Queueing time: mean = 751.871 us, max = 1.155 ms, min = 13.994 us, total = 2.256 ms
RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.229 us, total = 2.458 us, Queueing time: mean = 27.349 us, max = 27.643 us, min = 27.054 us, total = 54.697 us
[2025-01-15 18:18:05,051 I 536396 536396] (gcs_server) gcs_node_manager.cc:85: Registering node info, address = 192.168.0.2, node name = 192.168.0.2 node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:05,051 I 536396 536396] (gcs_server) gcs_node_manager.cc:91: Finished registering node info, address = 192.168.0.2, node name = 192.168.0.2, is_head_node = 1 node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:05,051 I 536396 536396] (gcs_server) gcs_placement_group_manager.cc:819: A new node: 49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360 registered, will try to reschedule all the infeasible placement groups.
[2025-01-15 18:18:05,058 I 536396 536474] (gcs_server) ray_syncer.cc:377: Get connection node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,074 I 536396 536396] (gcs_server) gcs_job_manager.cc:90: Adding job, job id = 01000000, driver pid = 536329
[2025-01-15 18:18:06,074 I 536396 536396] (gcs_server) gcs_job_manager.cc:111: Finished adding job, job id = 01000000, driver pid = 536329
[2025-01-15 18:18:06,177 I 536396 536396] (gcs_server) gcs_job_manager.cc:149: Finished marking job state, job id = 01000000
[2025-01-15 18:18:06,325 I 536396 536396] (gcs_server) gcs_node_manager.cc:366: Removing node, node name = 192.168.0.2, death reason = EXPECTED_TERMINATION, death message = received SIGTERM node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,326 I 536396 536396] (gcs_server) gcs_placement_group_manager.cc:789: Node failed, rescheduling the placement groups on the dead node. node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,326 I 536396 536396] (gcs_server) gcs_actor_manager.cc:1274: Node failed, reconstructing actors. node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,326 I 536396 536396] (gcs_server) gcs_job_manager.cc:454: Node failed, mark all jobs from this node as finished node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,568 I 536396 536445] (gcs_server) ray_syncer-inl.h:318: Failed to read the message from: 49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,568 I 536396 536445] (gcs_server) ray_syncer.cc:373: Connection is broken. node_id=49709ded25b009838cca283b77f0f8a63a6d0f1300f65be831971360
[2025-01-15 18:18:06,589 I 536396 536396] (gcs_server) gcs_server_main.cc:130: GCS server received SIGTERM, shutting down...
[2025-01-15 18:18:06,591 I 536396 536396] (gcs_server) gcs_server.cc:267: Stopping GCS server.
[2025-01-15 18:18:06,686 I 536396 536396] (gcs_server) gcs_server.cc:284: GCS server stopped.
[2025-01-15 18:18:06,686 I 536396 536396] (gcs_server) io_service_pool.cc:47: IOServicePool is stopped.
[2025-01-15 18:18:06,703 I 536396 536396] (gcs_server) stats.h:120: Stats module has shutdown.