Cloud Container Service Engine (CCSE)

High-Risk Operations and Solutions

2025-12-18 07:11:30

This section describes high-risk operations on the Cloud Container Service Engine (CCSE) and the recommended solutions.

When deploying services on container clusters, users may perform high-risk operations that can disrupt business to varying degrees. To help users anticipate and avoid these risks, this document lists high-risk operations at the cluster node level, their potential consequences, and the recommended solutions.

 

| Node Type | High-Risk Operation | Consequences | Solution |
| --- | --- | --- | --- |
| Master Node | Node expiration or deletion | Master node becomes unavailable. If it’s the only master, the entire cluster fails. | Irrecoverable |
| Master Node | Manually modifying master or etcd versions | May cause cluster failure. | Revert to original versions. |
| Master Node | Deleting or formatting core directories (e.g., /etc/kubernetes/data/containerd) | Master node becomes unavailable. If it’s the only master, the entire cluster fails. | Irrecoverable |
| Master Node | Reinstalling the OS | Master components are deleted. If it’s the only master, the entire cluster fails. | Irrecoverable |
| Master Node | Removing critical kernel modules/files | Master node becomes unavailable. If it’s the only master, the entire cluster fails. | Irrecoverable |
| Master Node | Modifying OS configurations | May cause master node failure. If it’s the only master, the entire cluster fails. | Manually restore original configurations. |
| Master Node | Modifying core component parameters | May cause master node failure. | Restore default parameters. |
| Master Node | Modifying /etc/resolv.conf or other key configs | May cause network failures or image pull errors. | Manually restore original configurations. |
| Master Node | Manually replacing master/etcd certificates | May cause cluster failure. | Irrecoverable |
| Master Node | Changing the node IP | Master node becomes unavailable. | Revert to the original IP. |
| Master Node | High resource usage by workloads | May cause core component or node failure. | Clean up resources and set proper quotas. |
| Master Node | Changing the hostname | Master node becomes unavailable. | Revert to the original hostname. |
| Worker Node | Node deletion or expiration | Node becomes unavailable. | Irrecoverable |
| Worker Node | Reinstalling the OS | Node becomes unavailable. | Irrecoverable |
| Worker Node | Removing critical kernel modules/files | Node becomes unavailable. | Irrecoverable |
| Worker Node | Modifying OS configurations | May cause node failure. | Attempt to restore original configurations. |
| Worker Node | Modifying core component parameters | May cause node failure. | Restore default parameters. |
| Worker Node | Deleting/modifying critical data directories or disks | Node becomes unavailable. | Irrecoverable |
| Worker Node | Changing directory/container permissions | Permission errors. | Avoid modification. Restore original permissions if needed. |
| Worker Node | Changing the node IP | Node becomes unavailable. | Revert to the original IP. |
| Worker Node | High resource usage by workloads | May cause core component or node failure. | Clean up resources and set proper quotas. |
| Worker Node | Changing the hostname | Node becomes unavailable. | Revert to the original hostname. |
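
Several of the solutions above ("revert to the original hostname", "manually restore original configurations") only work if the original state is known. The following is a minimal sketch, not part of CCSE, of one way to record a node baseline and detect drift later. The tracked file list and the function names (`capture_state`, `diff_state`) are assumptions made for illustration; only /etc/resolv.conf is taken from this document. The node IP is left out because how to determine the primary IP depends on the environment.

```python
import hashlib
import json
import socket
from pathlib import Path

# Hypothetical list of files to track; /etc/resolv.conf is the one named above.
TRACKED_FILES = ["/etc/resolv.conf"]


def file_digest(path):
    """Return the SHA-256 hex digest of a file, or None if it cannot be read."""
    try:
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()
    except OSError:
        return None


def capture_state(files=TRACKED_FILES):
    """Collect the hostname and a digest of each tracked file."""
    return {
        "hostname": socket.gethostname(),
        "files": {path: file_digest(path) for path in files},
    }


def diff_state(baseline, current):
    """List human-readable differences between two captured states."""
    changes = []
    if baseline["hostname"] != current["hostname"]:
        changes.append(
            f"hostname changed: {baseline['hostname']} -> {current['hostname']}"
        )
    for path, digest in baseline["files"].items():
        if current["files"].get(path) != digest:
            changes.append(f"file changed: {path}")
    return changes


if __name__ == "__main__":
    # Print the current state; store this output somewhere off the node
    # so it can be compared against a later capture with diff_state().
    print(json.dumps(capture_state(), indent=2))
```

A baseline captured before any maintenance work can be compared with a later capture through `diff_state(baseline, capture_state())`; an empty list means that none of the tracked items changed.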

 

When nodes are provisioned in a container cluster, the service automatically creates security group rules that let the nodes communicate with each other. These rules are not visible to users. Do not modify these security groups unless you fully understand the impact.

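| Direction | Action | IP Version | Priority | Protocol | CIDR Block | Port Range | Solution |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Inbound | Allow | IPv4 | 99 | Any | VPC CIDR | All Ports | Do not modify this security group rule |
| Inbound | Allow | IPv6 | 99 | Any | VPC CIDR | All Ports | Do not modify this security group rule |
| Inbound | Allow | IPv6 | 99 | Any | 100::/16 | All Ports | Do not modify this security group rule |
| Outbound | Allow | IPv4 | 99 | Any | 0.0.0.0/0 (All addresses) | All Ports | Do not modify this security group rule |
| Outbound | Allow | IPv6 | 99 | Any | ::/0 (All IPv6 addresses) | All Ports | Do not modify this security group rule |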
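
The default rules above can also be written down as data, which makes it possible to check a rule set for accidental changes. The sketch below is illustrative only: the `Rule` type, `expected_rules`, and `missing_rules` are hypothetical names, no CCSE or security group API is called, and it assumes the current rules have already been obtained in some form. The two VPC CIDR arguments stand in for the "VPC CIDR" entries of the table.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rule:
    direction: str   # "inbound" or "outbound"
    action: str      # "allow"
    ip_version: int  # 4 or 6
    priority: int
    protocol: str    # "any"
    cidr: str
    ports: str       # "all"


def expected_rules(vpc_cidr_v4, vpc_cidr_v6):
    """Build the five default rules, filling in the VPC CIDR blocks."""
    return [
        Rule("inbound", "allow", 4, 99, "any", vpc_cidr_v4, "all"),
        Rule("inbound", "allow", 6, 99, "any", vpc_cidr_v6, "all"),
        Rule("inbound", "allow", 6, 99, "any", "100::/16", "all"),
        Rule("outbound", "allow", 4, 99, "any", "0.0.0.0/0", "all"),
        Rule("outbound", "allow", 6, 99, "any", "::/0", "all"),
    ]


def normalize(rule):
    """Rewrite the CIDR in canonical network form so equal ranges compare equal."""
    network = ipaddress.ip_network(rule.cidr, strict=False)
    return replace(rule, cidr=str(network))


def missing_rules(expected, actual):
    """Return the expected rules that do not appear in the actual rule list."""
    present = {normalize(rule) for rule in actual}
    return [rule for rule in expected if normalize(rule) not in present]


if __name__ == "__main__":
    # Example VPC ranges; replace them with the real values for the cluster.
    expected = expected_rules("10.0.0.0/16", "fd00::/56")
    actual = expected[:-1]  # simulate a rule set where one rule was removed
    for rule in missing_rules(expected, actual):
        print("missing:", rule)
```

Normalizing with `ipaddress.ip_network(..., strict=False)` means a rule written as 10.0.0.1/16 is treated the same as 10.0.0.0/16, so differences in notation are not reported as missing rules.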

 
