[2025-01-15 18:16:33,157 I 524477 524477] (gcs_server) gcs_server_main.cc:52: Ray cluster metadata ray_version=2.40.0 ray_commit=22541c38dbef25286cd6d19f1c151bf4fd62f2ed
[2025-01-15 18:16:33,158 I 524477 524477] (gcs_server) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-15 18:16:33,165 I 524477 524477] (gcs_server) event.cc:493: Ray Event initialized for GCS
[2025-01-15 18:16:33,165 I 524477 524477] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_NODE
[2025-01-15 18:16:33,165 I 524477 524477] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_ACTOR
[2025-01-15 18:16:33,165 I 524477 524477] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_DRIVER_JOB
[2025-01-15 18:16:33,165 I 524477 524477] (gcs_server) event.cc:324: Set ray event level to warning
[2025-01-15 18:16:33,169 I 524477 524477] (gcs_server) gcs_server.cc:73: GCS storage type is StorageType::IN_MEMORY
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:42: Loading job table data.
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:54: Loading node table data.
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:80: Loading actor table data.
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:93: Loading actor task spec table data.
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:66: Loading placement group table data.
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:46: Finished loading job table data, size = 0
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:58: Finished loading node table data, size = 0
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:84: Finished loading actor table data, size = 0
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:97: Finished loading actor task spec table data, size = 0
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_init_data.cc:71: Finished loading placement group table data, size = 0
[2025-01-15 18:16:33,172 I 524477 524477] (gcs_server) gcs_server.cc:162: No existing server cluster ID found. Generating new ID: ab3f4b398b63931344ccd6485d119ddda8f66a4078dbcb143e8422fb
[2025-01-15 18:16:33,173 I 524477 524477] (gcs_server) gcs_server.cc:644: Autoscaler V2 enabled: 0
[2025-01-15 18:16:33,176 I 524477 524477] (gcs_server) grpc_server.cc:134: GcsServer server started, listening on port 60391.
[2025-01-15 18:16:33,423 I 524477 524477] (gcs_server) gcs_server.cc:245: Gcs Debug state:
GcsNodeManager:
- RegisterNode request count: 0
- DrainNode request count: 0
- GetAllNodeInfo request count: 0
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_restart_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetAllAvailableResources request count: 0
- GetAllTotalResources request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
Publisher:
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GcsAutoscalerStateManager:
- last_seen_autoscaler_state_version_: 0
- last_cluster_resource_state_version_: 0
- pending demands:
[2025-01-15 18:16:33,423 I 524477 524477] (gcs_server) gcs_server.cc:843: Main service Event stats:
Global stats: 25 total (5 active)
Queueing time: mean = 89.373 ms, max = 247.243 ms, min = 4.314 us, total = 2.234 s
Execution time: mean = 10.036 ms, total = 250.904 ms
Event stats:
GcsInMemoryStore.Put - 9 total (0 active), Execution time: mean = 27.478 ms, total = 247.302 ms, Queueing time: mean = 191.297 ms, max = 246.401 ms, min = 4.314 us, total = 1.722 s
GcsInMemoryStore.GetAll - 5 total (0 active), Execution time: mean = 17.860 us, total = 89.302 us, Queueing time: mean = 153.630 us, max = 162.900 us, min = 143.761 us, total = 768.150 us
PeriodicalRunner.RunFnPeriodically - 4 total (2 active, 1 running), Execution time: mean = 3.463 us, total = 13.852 us, Queueing time: mean = 123.572 ms, max = 247.243 ms, min = 247.044 ms, total = 494.288 ms
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 35.718 us, total = 71.435 us, Queueing time: mean = 7.272 ms, max = 14.191 ms, min = 352.666 us, total = 14.544 ms
NodeInfoGcsService.grpc_server.GetClusterId - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
GcsInMemoryStore.Get - 1 total (0 active), Execution time: mean = 26.104 us, total = 26.104 us, Queueing time: mean = 6.260 us, max = 6.260 us, min = 6.260 us, total = 6.260 us
NodeInfoGcsService.grpc_server.GetClusterId.HandleRequestImpl - 1 total (0 active), Execution time: mean = 3.401 ms, total = 3.401 ms, Queueing time: mean = 3.049 ms, max = 3.049 ms, min = 3.049 ms, total = 3.049 ms
ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
RayletLoadPulled - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:16:33,423 I 524477 524477] (gcs_server) gcs_server.cc:847: task_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 118.887 us, max = 466.435 us, min = 10.405 us, total = 594.433 us
Execution time: mean = 291.379 us, total = 1.457 ms
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 481.862 us, total = 1.446 ms, Queueing time: mean = 169.524 us, max = 466.435 us, min = 10.405 us, total = 508.572 us
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 11.307 us, total = 11.307 us, Queueing time: mean = 85.861 us, max = 85.861 us, min = 85.861 us, total = 85.861 us
GcsTaskManager.GcJobSummary - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:16:33,423 I 524477 524477] (gcs_server) gcs_server.cc:847: pubsub_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 1.368 ms, max = 6.613 ms, min = 8.946 us, total = 6.839 ms
Execution time: mean = 38.214 us, total = 191.071 us
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 47.538 us, total = 142.615 us, Queueing time: mean = 2.237 ms, max = 6.613 ms, min = 8.946 us, total = 6.710 ms
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 48.456 us, total = 48.456 us, Queueing time: mean = 129.759 us, max = 129.759 us, min = 129.759 us, total = 129.759 us
Publisher.CheckDeadSubscribers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 18:16:33,423 I 524477 524477] (gcs_server) gcs_server.cc:847: ray_syncer_io_context Event stats:
Global stats: 5 total (0 active)
Queueing time: mean = 1.176 ms, max = 5.645 ms, min = 8.756 us, total = 5.879 ms
Execution time: mean = 46.221 us, total = 231.103 us
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 76.322 us, total = 228.966 us, Queueing time: mean = 1.898 ms, max = 5.645 ms, min = 8.756 us, total = 5.694 ms
RaySyncerRegister - 2 total (0 active), Execution time: mean = 1.069 us, total = 2.137 us, Queueing time: mean = 92.701 us, max = 96.092 us, min = 89.311 us, total = 185.403 us
[2025-01-15 18:16:35,707 I 524477 524477] (gcs_server) gcs_node_manager.cc:85: Registering node info, address = 192.168.0.2, node name = 192.168.0.2 node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:35,707 I 524477 524477] (gcs_server) gcs_node_manager.cc:91: Finished registering node info, address = 192.168.0.2, node name = 192.168.0.2, is_head_node = 1 node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:35,707 I 524477 524477] (gcs_server) gcs_placement_group_manager.cc:819: A new node: b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866 registered, will try to reschedule all the infeasible placement groups.
[2025-01-15 18:16:35,714 I 524477 524553] (gcs_server) ray_syncer.cc:377: Get connection node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:36,663 I 524477 524477] (gcs_server) gcs_job_manager.cc:90: Adding job, job id = 01000000, driver pid = 524410
[2025-01-15 18:16:36,663 I 524477 524477] (gcs_server) gcs_job_manager.cc:111: Finished adding job, job id = 01000000, driver pid = 524410
[2025-01-15 18:16:36,759 I 524477 524477] (gcs_server) gcs_job_manager.cc:149: Finished marking job state, job id = 01000000
[2025-01-15 18:16:36,783 I 524477 524477] (gcs_server) gcs_node_manager.cc:366: Removing node, node name = 192.168.0.2, death reason = EXPECTED_TERMINATION, death message = received SIGTERM node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:36,783 I 524477 524477] (gcs_server) gcs_placement_group_manager.cc:789: Node failed, rescheduling the placement groups on the dead node. node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:36,784 I 524477 524477] (gcs_server) gcs_actor_manager.cc:1274: Node failed, reconstructing actors. node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:36,784 I 524477 524477] (gcs_server) gcs_job_manager.cc:454: Node failed, mark all jobs from this node as finished node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:37,027 I 524477 524526] (gcs_server) ray_syncer-inl.h:318: Failed to read the message from: b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:37,027 I 524477 524526] (gcs_server) ray_syncer.cc:373: Connection is broken. node_id=b7c872131a77b6a90b7f82e0d3613c1ee6e1bf132dbc42aab7cd8866
[2025-01-15 18:16:37,047 I 524477 524477] (gcs_server) gcs_server_main.cc:130: GCS server received SIGTERM, shutting down...
[2025-01-15 18:16:37,049 I 524477 524477] (gcs_server) gcs_server.cc:267: Stopping GCS server.
[2025-01-15 18:16:37,125 I 524477 524477] (gcs_server) gcs_server.cc:284: GCS server stopped.
[2025-01-15 18:16:37,125 I 524477 524477] (gcs_server) io_service_pool.cc:47: IOServicePool is stopped.
[2025-01-15 18:16:37,169 I 524477 524477] (gcs_server) stats.h:120: Stats module has shutdown.