JayKimDevolved/deepseek
NodeManager:
Node ID: 58d80ea2d267ecee5ffb7fe19b48e69ad93cbf27de2f37efe3848792
Node name: 192.168.0.2
InitialConfigResources: {memory: 853957464070000, CPU: 200000, GPU: 20000, accelerator_type:A40: 10000, node:192.168.0.2: 10000, object_store_memory: 21474836480000, node:__internal_head__: 10000}
ClusterTaskManager:
========== Node: 58d80ea2d267ecee5ffb7fe19b48e69ad93cbf27de2f37efe3848792 =================
Infeasible queue length: 0
Schedule queue length: 0
Dispatch queue length: 0
num_waiting_for_resource: 0
num_waiting_for_plasma_memory: 0
num_waiting_for_remote_node_resources: 0
num_worker_not_started_by_job_config_not_exist: 0
num_worker_not_started_by_registration_timeout: 0
num_tasks_waiting_for_workers: 0
num_cancelled_tasks: 0
cluster_resource_scheduler state:
Local id: 4176580051252218132 Local resources: {"total":{GPU: [10000, 10000], node:__internal_head__: [10000], memory: [853957464070000], object_store_memory: [21474836480000], node:192.168.0.2: [10000], accelerator_type:A40: [10000], CPU: [200000]}}, "available": {GPU: [10000, 10000], node:__internal_head__: [10000], memory: [853957464070000], object_store_memory: [21474836480000], node:192.168.0.2: [10000], accelerator_type:A40: [10000], CPU: [200000]}}, "labels":{"ray.io/node_id":"58d80ea2d267ecee5ffb7fe19b48e69ad93cbf27de2f37efe3848792",} is_draining: 0 is_idle: 1 Cluster resources: node id: 4176580051252218132{"total":{accelerator_type:A40: 10000, GPU: 20000, object_store_memory: 21474836480000, CPU: 200000, node:__internal_head__: 10000, memory: 853957464070000, node:192.168.0.2: 10000}}, "available": {accelerator_type:A40: 10000, GPU: 20000, object_store_memory: 21474836480000, CPU: 200000, node:__internal_head__: 10000, memory: 853957464070000, node:192.168.0.2: 10000}}, "labels":{"ray.io/node_id":"58d80ea2d267ecee5ffb7fe19b48e69ad93cbf27de2f37efe3848792",}, "is_draining": 0, "draining_deadline_timestamp_ms": -1} { "placment group locations": [], "node to bundles": []}
Waiting tasks size: 0
Number of executing tasks: 0
Number of pinned task arguments: 0
Number of total spilled tasks: 0
Number of spilled waiting tasks: 0
Number of spilled unschedulable tasks: 0
Resource usage {
}
Backlog Size per scheduling descriptor :{workerId: num backlogs}:
Running tasks by scheduling class:
==================================================
ClusterResources:
LocalObjectManager:
- num pinned objects: 0
- pinned objects size: 0
- num objects pending restore: 0
- num objects pending spill: 0
- num bytes pending spill: 0
- num bytes currently spilled: 0
- cumulative spill requests: 0
- cumulative restore requests: 0
- spilled objects pending delete: 0
ObjectManager:
- num local objects: 0
- num unfulfilled push requests: 0
- num object pull requests: 0
- num chunks received total: 0
- num chunks received failed (all): 0
- num chunks received failed / cancelled: 0
- num chunks received failed / plasma error: 0
Event stats:
Global stats: 0 total (0 active)
Queueing time: mean = -nan s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
Execution time: mean = -nan s, total = 0.000 s
Event stats:
PushManager:
- num pushes in flight: 0
- num chunks in flight: 0
- num chunks remaining: 0
- max chunks allowed: 409
OwnershipBasedObjectDirectory:
- num listeners: 0
- cumulative location updates: 0
- num location updates per second: 0.000
- num location lookups per second: 0.000
- num locations added per second: 0.000
- num locations removed per second: 0.000
BufferPool:
- create buffer state map size: 0
PullManager:
- num bytes available for pulled objects: 2147483648
- num bytes being pulled (all): 0
- num bytes being pulled / pinned: 0
- get request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable}
- wait request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable}
- task request bundles: BundlePullRequestQueue{0 total, 0 active, 0 inactive, 0 unpullable}
- first get request bundle: N/A
- first wait request bundle: N/A
- first task request bundle: N/A
- num objects queued: 0
- num objects actively pulled (all): 0
- num objects actively pulled / pinned: 0
- num bundles being pulled: 0
- num pull retries: 0
- max timeout seconds: 0
- max timeout request is already processed. No entry.
WorkerPool:
- registered jobs: 0
- process_failed_job_config_missing: 0
- process_failed_rate_limited: 0
- process_failed_pending_registration: 0
- process_failed_runtime_env_setup_failed: 0
- num PYTHON workers: 0
- num PYTHON drivers: 0
- num PYTHON pending start requests: 0
- num PYTHON pending registration requests: 0
- num object spill callbacks queued: 0
- num object restore queued: 0
- num util functions queued: 0
- num idle workers: 0
TaskDependencyManager:
- task deps map size: 0
- get req map size: 0
- wait req map size: 0
- local objects map size: 0
WaitManager:
- num active wait requests: 0
Subscriber:
Channel WORKER_OBJECT_EVICTION
- cumulative subscribe requests: 0
- cumulative unsubscribe requests: 0
- active subscribed publishers: 0
- cumulative published messages: 0
- cumulative processed messages: 0
Channel WORKER_REF_REMOVED_CHANNEL
- cumulative subscribe requests: 0
- cumulative unsubscribe requests: 0
- active subscribed publishers: 0
- cumulative published messages: 0
- cumulative processed messages: 0
Channel WORKER_OBJECT_LOCATIONS_CHANNEL
- cumulative subscribe requests: 0
- cumulative unsubscribe requests: 0
- active subscribed publishers: 0
- cumulative published messages: 0
- cumulative processed messages: 0
num async plasma notifications: 0
Remote node managers:
Event stats:
Global stats: 23 total (13 active)
Queueing time: mean = 1.369 ms, max = 9.572 ms, min = 28.030 us, total = 31.478 ms
Execution time: mean = 44.464 ms, total = 1.023 s
Event stats:
PeriodicalRunner.RunFnPeriodically - 11 total (6 active, 1 running), Execution time: mean = 26.069 us, total = 286.759 us, Queueing time: mean = 2.845 ms, max = 9.572 ms, min = 1.530 ms, total = 31.298 ms
MemoryMonitor.CheckIsMemoryUsageAboveThreshold - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeManager.GCTaskFailureReason - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode.OnReplyReceived - 1 total (0 active), Execution time: mean = 296.305 us, total = 296.305 us, Queueing time: mean = 34.539 us, max = 34.539 us, min = 34.539 us, total = 34.539 us
ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberPoll - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig - 1 total (0 active), Execution time: mean = 1.938 ms, total = 1.938 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ray::rpc::InternalKVGcsService.grpc_client.GetInternalConfig.OnReplyReceived - 1 total (0 active), Execution time: mean = 1.018 s, total = 1.018 s, Queueing time: mean = 118.007 us, max = 118.007 us, min = 118.007 us, total = 118.007 us
RayletWorkerPool.deadline_timer.kill_idle_workers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ObjectManager.UpdateAvailableMemory - 1 total (0 active), Execution time: mean = 3.894 us, total = 3.894 us, Queueing time: mean = 28.030 us, max = 28.030 us, min = 28.030 us, total = 28.030 us
ray::rpc::NodeInfoGcsService.grpc_client.RegisterNode - 1 total (0 active), Execution time: mean = 2.489 ms, total = 2.489 ms, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ray::rpc::InternalPubSubGcsService.grpc_client.GcsSubscriberCommandBatch - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeManager.ScheduleAndDispatchTasks - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
DebugString() time ms: 1
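
The raw numbers in the dump above are easier to read once decoded. The sketch below is not part of the raylet output. It assumes that resource quantities are printed as fixed-point integers scaled by 10000 (so `CPU: 200000` would mean 20 CPUs), and that a `min` of 9223372036.855 s is the largest signed 64-bit nanosecond value, i.e. a "no samples yet" placeholder rather than a real measurement. The helper names `parse_resources` and `is_unset_min` are hypothetical, chosen only for illustration.

```python
# Assumption: the raylet prints resource quantities as fixed-point integers
# scaled by 10000. This matches the dump, where object_store_memory decodes
# to 2147483648, the same value the PullManager reports as available bytes.
RESOURCE_SCALE = 10_000

# Largest signed 64-bit integer. Read as nanoseconds it is about
# 9223372036.855 s, the "min" value shown for events with no samples.
INT64_MAX_NS = 2**63 - 1


def parse_resources(line):
    """Decode a '{name: value, ...}' resource map into real quantities."""
    body = line[line.index("{") + 1 : line.rindex("}")]
    resources = {}
    for item in body.split(", "):
        # Resource names may contain ':' (e.g. 'node:192.168.0.2'),
        # so split on the last ': ' only.
        name, _, raw = item.rpartition(": ")
        resources[name] = int(raw) / RESOURCE_SCALE
    return resources


def is_unset_min(seconds):
    """True if a printed 'min' time is the int64-max placeholder."""
    return abs(seconds - INT64_MAX_NS / 1e9) < 1e-3


# The InitialConfigResources line, copied from the dump above.
line = (
    "InitialConfigResources: {memory: 853957464070000, CPU: 200000, "
    "GPU: 20000, accelerator_type:A40: 10000, node:192.168.0.2: 10000, "
    "object_store_memory: 21474836480000, node:__internal_head__: 10000}"
)

decoded = parse_resources(line)
for name, value in decoded.items():
    print(f"{name}: {value:g}")
print("min is placeholder:", is_unset_min(9223372036.855))
```

Under these assumptions the node advertises 20 CPUs, 2 GPUs (A40), 85395746407 bytes of memory, and a 2147483648-byte object store. The `-nan` means and placeholder `min` values appear only on counters that report zero completed events.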