Deploy a Production Ready Kubernetes Cluster


If you have questions, check the documentation at kubespray.io and join us on the Kubernetes Slack, channel #kubespray. You can get your invite here.

  • Can be deployed on AWS, GCE, Azure, OpenStack, vSphere, Equinix Metal (bare metal), Oracle Cloud Infrastructure (experimental), or bare metal
  • Highly available cluster
  • Composable (choice of the network plugin, for instance)
  • Supports most popular Linux distributions
  • Continuous integration tests

Quick Start

Below are several ways to use Kubespray to deploy a Kubernetes cluster.

Ansible

Usage

Install Ansible according to the Ansible installation guide, then run the following steps:

# Copy ``inventory/sample`` as ``inventory/mycluster``
cp -rfp inventory/sample inventory/mycluster

# Update Ansible inventory file with inventory builder
declare -a IPS=(10.10.1.3 10.10.1.4 10.10.1.5)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}

# Review and change parameters under ``inventory/mycluster/group_vars``
cat inventory/mycluster/group_vars/all/all.yml
cat inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml

# Clean up old Kubernetes cluster with Ansible Playbook - run the playbook as root
# The option `--become` is required, as for example cleaning up SSL keys in /etc/,
# uninstalling old packages and interacting with various systemd daemons.
# Without --become the playbook will fail to run!
# Be aware that it will remove the current Kubernetes cluster (if one is running)!
ansible-playbook -i inventory/mycluster/hosts.yaml  --become --become-user=root reset.yml

# Deploy Kubespray with Ansible Playbook - run the playbook as root
# The option `--become` is required, as for example writing SSL keys in /etc/,
# installing packages and interacting with various systemd daemons.
# Without --become the playbook will fail to run!
ansible-playbook -i inventory/mycluster/hosts.yaml  --become --become-user=root cluster.yml

Note: When Ansible is already installed via system packages on the control node, Python packages installed via sudo pip install -r requirements.txt will go to a different directory tree (e.g. /usr/local/lib/python2.7/dist-packages on Ubuntu) from Ansible's (e.g. /usr/lib/python2.7/dist-packages/ansible still on Ubuntu). As a consequence, the ansible-playbook command will fail with:

ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.

This likely indicates that a task depends on a module present in requirements.txt.

One way of addressing this is to uninstall the system Ansible package and then reinstall Ansible via pip, but this is not always possible, and one must take care regarding package versions. A workaround is to set the ANSIBLE_LIBRARY and ANSIBLE_MODULE_UTILS environment variables before executing ansible-playbook. Point them, respectively, to the ansible/modules and ansible/module_utils subdirectories of the pip installation location, which is the Location shown by running pip show [package].
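The workaround can be sketched in shell roughly as follows. This is an illustration only: the helper names `pip_location` and `set_ansible_paths` are hypothetical, and the package name passed to `pip show` (here `ansible`) is an assumption that may differ on your system.

```shell
# Sketch only: these helper functions are hypothetical, not part of Kubespray.

# Print the "Location:" field from `pip show` output supplied on stdin.
pip_location() {
  awk -F': ' '/^Location:/ { print $2; exit }'
}

# Export the two variables relative to a pip installation location.
set_ansible_paths() {
  location="$1"
  export ANSIBLE_LIBRARY="${location}/ansible/modules"
  export ANSIBLE_MODULE_UTILS="${location}/ansible/module_utils"
}

# Usage, assuming the pip package is named "ansible":
#   set_ansible_paths "$(pip show ansible | pip_location)"
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
```

Both variables are exported in the current shell only, so they need to be set again in each new session before running ansible-playbook.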

A simple way to ensure you get the correct version of Ansible is to use the pre-built Docker image from Quay. You will then need to use bind mounts to access the inventory and SSH key in the container, like this:

git checkout v2.25.0
docker pull quay.io/kubespray/kubespray:v2.25.0
docker run --rm -it --mount type=bind,source="$(pwd)"/inventory/sample,dst=/inventory \
  --mount type=bind,source="${HOME}"/.ssh/id_rsa,dst=/root/.ssh/id_rsa \
  quay.io/kubespray/kubespray:v2.25.0 bash
# Inside the container you may now run the kubespray playbooks:
ansible-playbook -i /inventory/inventory.ini --private-key /root/.ssh/id_rsa cluster.yml

Collection

See here if you wish to use this repository as an Ansible collection

Vagrant

For Vagrant we need to install Python dependencies for provisioning tasks. Check that Python and pip are installed:

python -V && pip -V

If this returns the versions of both tools, you're good to go. If not, download and install Python from https://www.python.org/downloads/source/

Install Ansible according to the Ansible installation guide, then run the following step:

vagrant up

Documents

Supported Linux Distributions

  • Flatcar Container Linux by Kinvolk
  • Debian Bookworm, Bullseye, Buster
  • Ubuntu 20.04, 22.04
  • CentOS/RHEL 7, 8, 9
  • Fedora 37, 38
  • Fedora CoreOS (see fcos Note)
  • openSUSE Leap 15.x/Tumbleweed
  • Oracle Linux 7, 8, 9
  • Alma Linux 8, 9
  • Rocky Linux 8, 9
  • Kylin Linux Advanced Server V10 (experimental: see kylin linux notes)
  • Amazon Linux 2 (experimental: see amazon linux notes)
  • UOS Linux (experimental: see uos linux notes)
  • openEuler (experimental: see openEuler notes)

Note: Upstart/SysV init based OS types are not supported.

Supported Components

Container Runtime Notes

  • Supported Docker versions are 18.09, 19.03, 20.10, 23.0 and 24.0. The recommended Docker version is 24.0. Kubelet might break on Docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster, look into e.g. the YUM versionlock plugin or apt pin.
  • The cri-o version should be aligned with the respective Kubernetes version (e.g. kube_version=1.20.x, crio_version=1.20)
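The cri-o alignment rule above amounts to taking the major.minor part of the Kubernetes version. A minimal shell sketch, assuming a hypothetical helper named `crio_version_for` (Kubespray does not ship this function):

```shell
# Hypothetical helper: derive a cri-o "major.minor" version
# from a Kubernetes version string such as "v1.28.10" or "1.20.x".
crio_version_for() {
  version="${1#v}"          # drop an optional leading "v"
  major="${version%%.*}"    # text before the first dot
  rest="${version#*.}"      # text after the first dot
  minor="${rest%%.*}"       # text before the next dot
  printf '%s.%s\n' "$major" "$minor"
}

# Usage:
#   crio_version_for 1.20.x
```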

Requirements

  • Minimum required version of Kubernetes is v1.28
  • Ansible v2.14+, Jinja 2.11+ and python-netaddr are installed on the machine that will run Ansible commands
  • The target servers must have access to the Internet in order to pull docker images. Otherwise, additional configuration is required (See Offline Environment)
  • The target servers are configured to allow IPv4 forwarding.
  • If using IPv6 for pods and services, the target servers are configured to allow IPv6 forwarding.
  • Firewalls are not managed by Kubespray; you'll need to implement your own rules as you normally would. To avoid any issues during deployment, you should disable your firewall.
  • If Kubespray is run from a non-root user account, the correct privilege escalation method should be configured on the target servers. Then the ansible_become flag or the command parameters --become or -b should be specified.
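The forwarding requirements can be checked on a target server before deployment. The following is a minimal sketch, assuming the standard Linux procfs paths; the helper `forwarding_enabled` is hypothetical and not part of Kubespray.

```shell
# Hypothetical helper: succeed only if the given sysctl file contains "1".
forwarding_enabled() {
  [ "$(cat "$1" 2>/dev/null)" = "1" ]
}

# Usage on a target server:
#   forwarding_enabled /proc/sys/net/ipv4/ip_forward \
#     || echo "IPv4 forwarding is disabled"
#   forwarding_enabled /proc/sys/net/ipv6/conf/all/forwarding \
#     || echo "IPv6 forwarding is disabled"
```

Taking the file path as an argument keeps the check usable for both the IPv4 and IPv6 settings.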

Hardware: These limits are safeguarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the Building Large Clusters guide.

  • Master
    • Memory: 1500 MB
  • Node
    • Memory: 1024 MB
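To see ahead of time whether a host meets these memory limits, something like the following sketch could be run on each server. It is not how Kubespray itself performs the check; the helper `has_min_memory` is hypothetical, and it assumes a Linux `/proc/meminfo` file that reports `MemTotal` in kB.

```shell
# Hypothetical helper: compare MemTotal from a meminfo-style file
# against a minimum given in MB. Succeeds if the host has enough memory.
has_min_memory() {
  meminfo_file="$1"
  min_mb="$2"
  total_kb="$(awk '/^MemTotal:/ { print $2; exit }' "$meminfo_file")"
  [ -n "$total_kb" ] && [ "$((total_kb / 1024))" -ge "$min_mb" ]
}

# Usage:
#   has_min_memory /proc/meminfo 1500 || echo "too little memory for a master"
#   has_min_memory /proc/meminfo 1024 || echo "too little memory for a node"
```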

Network Plugins

You can choose among the following network plugins. (default: calico, except Vagrant uses flannel)

  • flannel: gre/vxlan (layer 2) networking.

  • Calico is a networking and network policy provider. Calico supports a flexible set of networking options designed to give you the most efficient networking across a range of situations, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio and Envoy) applications at the service mesh layer.

  • cilium: layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.

  • weave: Weave is a lightweight container overlay network that doesn't require an external K/V database cluster. (Please refer to weave troubleshooting documentation).

  • kube-ovn: Kube-OVN integrates the OVN-based Network Virtualization with Kubernetes. It offers an advanced Container Network Fabric for Enterprises.

  • kube-router: Kube-router is an L3 CNI for Kubernetes networking aiming to provide operational simplicity and high performance: it uses IPVS to provide Kube Services Proxy (if set up to replace kube-proxy), iptables for network policies, and BGP for pods L3 networking (optionally with BGP peering with out-of-cluster BGP peers). It can also optionally advertise routes to Kubernetes cluster Pods CIDRs, ClusterIPs, ExternalIPs and LoadBalancerIPs.

  • macvlan: Macvlan is a Linux network driver. Pods have their own unique MAC and IP addresses, connected directly to the physical (layer 2) network.

  • multus: Multus is a meta CNI plugin that provides multiple network interface support to pods. For each interface Multus delegates CNI calls to secondary CNI plugins such as Calico, macvlan, etc.

  • custom_cni: You can specify manifests that will be applied to the cluster to bring your own CNI and use ones not supported by Kubespray. See tests/files/custom_cni/README.md and tests/files/custom_cni/values.yaml for an example with a CNI provided by a Helm Chart.

The network plugin to use is defined by the variable kube_network_plugin. There is also an option to leverage built-in cloud provider networking instead. See also Network checker.
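As an illustration, the sketch below reads the current value of kube_network_plugin from an inventory file. The helper `current_network_plugin` is hypothetical, and the sketch assumes the variable is set as a plain `key: value` line in the k8s-cluster.yml file of your inventory.

```shell
# Hypothetical helper: print the value of kube_network_plugin
# from a group_vars YAML file containing a line like "kube_network_plugin: calico".
current_network_plugin() {
  awk -F': *' '/^kube_network_plugin:/ { print $2; exit }' "$1"
}

# Usage:
#   current_network_plugin inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
#
# The variable can also be overridden for a single run with Ansible's
# extra-vars flag, for example:
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become \
#     -e kube_network_plugin=cilium cluster.yml
```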

Ingress Plugins

  • nginx: the NGINX Ingress Controller.

  • metallb: the MetalLB bare-metal service LoadBalancer provider.

Community docs and resources

Tools and projects on top of Kubespray

CI Tests

Build graphs

CI/end-to-end tests sponsored by: CNCF, Equinix Metal, OVHcloud, ELASTX.

See the test matrix for details.