Node Management
This section focuses on building a robust system to register, monitor, and manage nodes.
Node Management Workflow
Node Registration: Each node registers itself with the cluster when it starts.
Heartbeat Mechanism: Nodes send periodic health checks to the cluster manager or peers.
Failure Detection: Identify and mark failed nodes.
Metadata Management: Maintain a dynamic list of active nodes.
Rebalancing Trigger: Respond to node addition or removal.
Components to Implement (Current Scope):
Node Metadata Store:
Store information about each node (e.g., ID, IP, status, capacity).
Use an in-memory database (e.g., Redis) or a distributed store (e.g., etcd).
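As a rough illustration of what a metadata record and store could look like, here is a minimal sketch. The names `NodeMetadata`, `NodeStatus`, and `InMemoryMetadataStore` are hypothetical, and the plain dictionary only stands in for Redis or etcd; a real deployment would replace it with a client for the chosen backend.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class NodeStatus(str, Enum):
    ACTIVE = "active"
    FAILED = "failed"


@dataclass
class NodeMetadata:
    # Field names mirror the examples in the text (ID, IP, status, capacity).
    node_id: str
    ip: str
    status: NodeStatus = NodeStatus.ACTIVE
    capacity: int = 0  # unit is an assumption, e.g. number of task slots


class InMemoryMetadataStore:
    """Dict-backed stand-in for Redis or etcd (illustrative only)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, NodeMetadata] = {}

    def put(self, node: NodeMetadata) -> None:
        # Insert a new record or overwrite the existing one for this ID.
        self._nodes[node.node_id] = node

    def get(self, node_id: str) -> Optional[NodeMetadata]:
        return self._nodes.get(node_id)

    def delete(self, node_id: str) -> None:
        # Removing an unknown node is a no-op.
        self._nodes.pop(node_id, None)

    def active_nodes(self) -> List[NodeMetadata]:
        # The "dynamic list of active nodes" from the workflow.
        return [n for n in self._nodes.values() if n.status is NodeStatus.ACTIVE]
```

Keeping the store behind a small `put`/`get`/`delete` interface like this would make it easier to swap the backend later without touching the registration or monitoring code.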
Node Registration API:
Nodes register themselves with the cluster manager.
Store metadata in the metadata store.
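The registration step might be sketched as a single function that the cluster manager calls when a node announces itself. Everything here is an assumption for illustration: the function name `register_node`, the record fields, and the use of a plain dictionary in place of the real metadata store. Re-registering an existing ID is treated as an update, which is one possible policy and not necessarily the intended one.

```python
import time
from typing import Any, Dict


def register_node(
    store: Dict[str, Dict[str, Any]],
    node_id: str,
    ip: str,
    capacity: int,
) -> Dict[str, Any]:
    """Create or refresh the metadata record for a node and return it."""
    if not node_id:
        raise ValueError("node_id must be non-empty")

    now = time.time()
    existing = store.get(node_id)
    record = {
        "id": node_id,
        "ip": ip,
        "capacity": capacity,
        "status": "active",
        # Preserve the first registration time if the node restarts.
        "registered_at": existing["registered_at"] if existing else now,
        # Registration counts as the first heartbeat.
        "last_heartbeat": now,
    }
    store[node_id] = record
    return record


# Usage: a node calls this (directly or via an HTTP handler) at startup.
cluster_store: Dict[str, Dict[str, Any]] = {}
register_node(cluster_store, "node-1", "10.0.0.1", capacity=4)
```

In practice this function would sit behind whatever transport the cluster manager exposes (HTTP, gRPC, and so on); the transport is left out here because the text does not specify one.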
Health Monitoring:
Nodes periodically send heartbeats to report their status.
If a node misses several consecutive heartbeats, it is marked as failed.
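One simple way to turn "missed heartbeats" into a concrete rule is a timeout: a node is considered failed once more than `interval * max_missed` seconds have passed since its last heartbeat. The sketch below assumes that rule; the class name `HeartbeatMonitor` and the default values (5 seconds, 3 misses) are illustrative, not taken from the design. Timestamps are passed in explicitly so the logic can be exercised without waiting; a running service would typically pass `time.monotonic()`.

```python
from typing import Dict, List, Set


class HeartbeatMonitor:
    def __init__(self, interval: float = 5.0, max_missed: int = 3) -> None:
        self.interval = interval      # expected seconds between heartbeats
        self.max_missed = max_missed  # misses tolerated before marking failed
        self.last_seen: Dict[str, float] = {}
        self.failed: Set[str] = set()

    def record_heartbeat(self, node_id: str, now: float) -> None:
        self.last_seen[node_id] = now
        # A node that resumes sending heartbeats is treated as active again.
        self.failed.discard(node_id)

    def detect_failures(self, now: float) -> List[str]:
        """Mark overdue nodes as failed and return the newly failed ones."""
        deadline = self.interval * self.max_missed
        newly_failed = []
        for node_id, seen in self.last_seen.items():
            if node_id not in self.failed and now - seen > deadline:
                self.failed.add(node_id)
                newly_failed.append(node_id)
        return newly_failed


# Usage with explicit timestamps (seconds).
monitor = HeartbeatMonitor(interval=5.0, max_missed=3)
monitor.record_heartbeat("node-1", now=0.0)
monitor.record_heartbeat("node-2", now=0.0)
monitor.record_heartbeat("node-1", now=10.0)
overdue = monitor.detect_failures(now=16.0)  # only node-2 is past the 15 s deadline
```

Returning only the newly failed nodes makes it straightforward to fire a rebalancing trigger once per failure instead of on every check.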
Rebalancing:
Tasks/data are automatically redistributed when a node is added or removed.
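The text does not say how redistribution should work, so the following is only one candidate strategy: rendezvous (highest-random-weight) hashing. Each task key is assigned to the node with the highest hash score for that key, which means that removing a node moves only the keys that node owned, and adding a node moves keys only onto the new node. The function names `assign` and `moved_keys` are hypothetical, and node capacity is ignored for brevity.

```python
import hashlib
from typing import Dict, Iterable, List


def _score(node_id: str, key: str) -> int:
    # Deterministic per-(node, key) score derived from SHA-256.
    digest = hashlib.sha256(f"{node_id}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def assign(keys: Iterable[str], nodes: List[str]) -> Dict[str, str]:
    """Map each key to the node with the highest score for that key."""
    if not nodes:
        raise ValueError("no active nodes to assign to")
    return {k: max(nodes, key=lambda n: _score(n, k)) for k in keys}


def moved_keys(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Keys whose owner changed between two assignments."""
    return [k for k in before if before[k] != after.get(k)]


# Usage: recompute the assignment when membership changes.
tasks = [f"task-{i}" for i in range(100)]
before = assign(tasks, ["node-1", "node-2", "node-3"])
after = assign(tasks, ["node-1", "node-2"])  # node-3 removed
to_move = moved_keys(before, after)          # only keys previously on node-3
```

A capacity-aware variant would need weighted scores or a different scheme altogether; that choice is left open here.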
Future Scope:
Rebalancing: Implement data redistribution logic on node failure.
UI/CLI: Create a tool to visualize and manage nodes.