High-Risk Operations and Solutions

2025-12-18 07:11:30

This section introduces high-risk operations and solutions for the Cloud Container Engine.

When deploying services on container clusters, users may perform high-risk operations that can disrupt services to varying degrees. To help users anticipate and avoid these risks, this document lists high-risk operations at the cluster node level, their potential consequences, and the recommended solution for each.

Node Type	High-Risk Operation	Consequences	Solution
Master Node	Node expiration or deletion	Master node becomes unavailable. If it’s the only master, the entire cluster fails.	Irrecoverable
Master Node	Manually modifying master or etcd versions	May cause cluster failure.	Revert to original versions.
Master Node	Deleting or formatting core directories (e.g., /etc/kubernetes, /data/containerd)	Master node becomes unavailable. If it’s the only master, the entire cluster fails.	Irrecoverable
Master Node	Reinstalling the OS	Master components are deleted. If it’s the only master, the entire cluster fails.	Irrecoverable
Master Node	Removing critical kernel modules/files	Master node becomes unavailable. If it’s the only master, the entire cluster fails.	Irrecoverable
Master Node	Modifying OS configurations	May cause master node failure. If it’s the only master, the entire cluster fails.	Manually restore original configurations.
Master Node	Modifying core component parameters	May cause master node failure.	Restore default parameters.
Master Node	Modifying /etc/resolv.conf or other key configs	May cause network failures or image pull errors.	Manually restore original configurations.
Master Node	Manually replacing master/etcd certificates	May cause cluster failure.	Irrecoverable
Master Node	Changing the node IP	Master node becomes unavailable.	Revert to the original IP.
Master Node	High resource usage by workloads	May cause core component or node failure.	Clean up resources and set proper quotas.
Master Node	Changing the hostname	Master node becomes unavailable.	Revert to the original hostname.
Worker Node	Node deletion or expiration	Node becomes unavailable.	Irrecoverable
Worker Node	Reinstalling the OS	Node becomes unavailable.	Irrecoverable
Worker Node	Removing critical kernel modules/files	Node becomes unavailable.	Irrecoverable
Worker Node	Modifying OS configurations	May cause node failure.	Attempt to restore original configurations.
Worker Node	Modifying core component parameters	May cause node failure.	Restore default parameters.
Worker Node	Deleting/modifying critical data directories or disks	Node becomes unavailable.	Irrecoverable
Worker Node	Changing directory/container permissions	Permission errors.	Avoid modification. Restore original permissions if needed.
Worker Node	Changing the node IP	Node becomes unavailable.	Revert to the original IP.
Worker Node	High resource usage by workloads	May cause core component or node failure.	Clean up resources and set proper quotas.
Worker Node	Changing the hostname	Node becomes unavailable.	Revert to the original hostname.
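
Several rows above give "restore original configurations" as the solution, which is only possible if a copy of the original still exists. The following is a minimal sketch, assuming Python 3 is available on the node; the helper names `backup_file` and `restore_file` and the backup directory are hypothetical and not part of the product. It copies a file aside before it is edited and copies it back if the change causes problems.

```python
import shutil
import time
from pathlib import Path


def backup_file(path, backup_dir):
    """Copy `path` into `backup_dir` under a timestamped name; return the copy's path."""
    src = Path(path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.bak"
    # copy2 copies the contents plus permission bits and timestamps (not ownership).
    shutil.copy2(src, dest)
    return dest


def restore_file(backup_path, original_path):
    """Overwrite `original_path` with the previously saved copy."""
    shutil.copy2(backup_path, original_path)
```

For example, calling `backup_file("/etc/resolv.conf", "/root/config-backups")` before an edit returns the path of the saved copy, which can later be passed to `restore_file` together with `/etc/resolv.conf`. This does not help with the operations marked Irrecoverable.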

When nodes are activated in a container cluster, the system automatically creates security group rules that allow the nodes to communicate with each other; these rules are not visible to users. Do not modify these security groups unless you fully understand the impact.

Direction	Action	IP Version	Priority	Protocol	CIDR Block	Port Range	Solution
Inbound	Allow	IPv4	99	Any	VPC CIDR	All Ports	Do not modify this security group rule
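Inbound	Allow	IPv6	99	Any	VPC CIDR	All Ports	Do not modify this security group rule
Inbound	Allow	IPv6	99	Any	100::/16	All Ports	Do not modify this security group rule
Outbound	Allow	IPv4	99	Any	0.0.0.0/0 (All addresses)	All Ports	Do not modify this security group rule
Outbound	Allow	IPv6	99	Any	::/0 (All IPv6 addresses)	All Ports	Do not modify this security group rule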
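
To confirm that none of the rules in the table has been changed or removed, the expected set can be compared against the rules currently in effect. The sketch below is illustrative only: the `Rule` tuple and the function names are hypothetical, the way the current rules are obtained (console export, provider API) is not covered, and it assumes the "VPC CIDR" entries stand for the VPC's IPv4 and IPv6 ranges respectively.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    direction: str
    action: str
    ip_version: str
    priority: int
    protocol: str
    cidr: str
    ports: str


def expected_rules(vpc_cidr_v4, vpc_cidr_v6):
    """Return the default rules from the table, with the VPC CIDRs filled in."""
    return [
        Rule("Inbound", "Allow", "IPv4", 99, "Any", vpc_cidr_v4, "All Ports"),
        Rule("Inbound", "Allow", "IPv6", 99, "Any", vpc_cidr_v6, "All Ports"),
        Rule("Inbound", "Allow", "IPv6", 99, "Any", "100::/16", "All Ports"),
        Rule("Outbound", "Allow", "IPv4", 99, "Any", "0.0.0.0/0", "All Ports"),
        Rule("Outbound", "Allow", "IPv6", 99, "Any", "::/0", "All Ports"),
    ]


def missing_rules(actual, vpc_cidr_v4, vpc_cidr_v6):
    """List the expected rules that do not appear in `actual`."""
    present = set(actual)
    return [r for r in expected_rules(vpc_cidr_v4, vpc_cidr_v6) if r not in present]
```

An empty result from `missing_rules` means every default rule is still present; any returned rule has been altered or deleted and should be restored to the values in the table.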
