- 22 Jan, 2020 2 commits
-
Guillaume Abrioux authored
When upgrading to RHCS 4.0 from RHCS 3.x where ceph-metrics was deployed on a dedicated node, the playbook fails as follows:

```
fatal: [magna005]: FAILED! => changed=false
  gid: 0
  group: root
  mode: '0755'
  msg: 'chown failed: failed to look up user ceph'
  owner: root
  path: /etc/ceph
  secontext: unconfined_u:object_r:etc_t:s0
  size: 4096
  state: directory
  uid: 0
```

This happens because we try to run `ceph-config` on that node, which makes no sense there, so we should simply run this play on all groups except `[grafana-server]`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1793885 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit e5812fe4)
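As a rough, untested sketch of the idea (the play header below is an assumption, not the actual ceph-ansible play), excluding the metrics node can be expressed with a standard host-pattern exclusion:

```yaml
# Hypothetical sketch: run the configuration play everywhere except the
# grafana-server group, so ceph-config never touches the metrics node.
- name: configure ceph on every node except the metrics node
  hosts: "all,!grafana-server"
  gather_facts: false
  roles:
    - ceph-config
```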
-
Dimitri Savineau authored
When osd_auto_discovery is set, we need to refresh the ansible_devices fact after the filestore OSD purge, otherwise the devices fact won't be populated. Also remove the GPT header based on ceph_disk_osds_devices, because the devices list is empty at this point with osd_auto_discovery. Add the bool filter where needed. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit bb3eae0c)
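A minimal sketch of re-gathering only the device facts after the purge (the task placement and the osd_auto_discovery guard are assumptions based on this commit message):

```yaml
# Hypothetical sketch: re-run fact gathering limited to device facts after
# purging the filestore OSDs, so ansible_devices reflects the freed disks.
- name: refresh device facts after the filestore OSD purge
  setup:
    filter: ansible_devices
  when: osd_auto_discovery | default(false) | bool
```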
-
- 21 Jan, 2020 1 commit
-
Dimitri Savineau authored
We still need --destroy when using a raw device, otherwise we won't be able to recreate the lvm stack on that device with bluestore:

```
Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-bdc67a84-894a-4687-b43f-bcd76317580a /dev/sdd
 stderr: Physical volume '/dev/sdd' is already in volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  Unable to add physical volume '/dev/sdd' to volume group 'ceph-b7801d50-e827-4857-95ec-3291ad6f0151'
  /dev/sdd: physical volume not initialized.
--> Was unable to complete a new OSD, will rollback changes
```

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792227 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit f995b079)
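For illustration only, zapping a device with `--destroy` (which wipes the PV/VG/LV metadata so the LVM stack can be recreated) could be wrapped in a task like this; `/dev/sdd` is just an example path:

```yaml
# Hypothetical sketch: --destroy removes the existing LVM metadata so
# ceph-volume can rebuild the stack on the raw device with bluestore.
- name: zap a raw device before redeploying it with bluestore
  command: ceph-volume lvm zap --destroy /dev/sdd
  become: true
```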
-
- 20 Jan, 2020 3 commits
-
Dimitri Savineau authored
Because we need to manage legacy ceph-disk based OSDs with ceph-volume, we need a way to know the osd_objectstore inside the container. This was done previously with ceph-disk, so we should also do it with ceph-volume. Note that this won't have any impact on ceph-volume lvm based OSDs. Rename the docker_env_args fact to container_env_args and move the container condition to the include_tasks call. Remove the OSD_DMCRYPT env variable from the ceph-osd template since it's now included in the container_env_args variable. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792122 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit c9e1fe3d)
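A hedged sketch of what such a fact could look like; only OSD_DMCRYPT and container_env_args come from the commit message, the other environment variable names and conditions are assumptions:

```yaml
# Hypothetical sketch: expose the objectstore (and dmcrypt) to the
# containerized ceph-volume/ceph-disk handling through a single fact.
- name: build the container environment arguments
  set_fact:
    container_env_args: >-
      -e OSD_BLUESTORE={{ (osd_objectstore == 'bluestore') | ternary(1, 0) }}
      -e OSD_FILESTORE={{ (osd_objectstore == 'filestore') | ternary(1, 0) }}
      -e OSD_DMCRYPT={{ dmcrypt | default(false) | bool | ternary(1, 0) }}
  when: containerized_deployment | bool
```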
-
Benoît Knecht authored
In 3c31b19a, I fixed the `customize pool size` task by replacing `item.size` with `item.value.size`. However, I missed the same issue in the `when` condition. Signed-off-by:
Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit 3842aa1a)
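As a rough sketch of the corrected task (the loop over a pools dict and the task name come from this commit, the exact command and surrounding options are assumptions):

```yaml
# Hypothetical sketch: both the size argument and the `when` condition must
# use item.value.size when looping over the pools dictionary.
- name: customize pool size
  command: >
    ceph osd pool set {{ item.key }} size
    {{ item.value.size | default(osd_pool_default_size) }}
  loop: "{{ rgw_create_pools | dict2items }}"
  when: item.value.size | default(osd_pool_default_size) | int != osd_pool_default_size | int
```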
-
Guillaume Abrioux authored
When unsetting the noup flag, we must call container_exec_cmd from the delegated node (first mon member). Also add `run_once: true` because this task only needs to run once. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1792320 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 22865cde)
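A minimal sketch of the pattern described above; the group and fact names follow ceph-ansible conventions but the task itself is an assumption:

```yaml
# Hypothetical sketch: run the flag removal once, on the first monitor,
# using that delegated node's container_exec_cmd.
- name: unset the noup flag
  command: >
    {{ hostvars[groups[mon_group_name][0]]['container_exec_cmd'] | default('') }}
    ceph osd unset noup
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
```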
-
- 16 Jan, 2020 1 commit
-
Dmitriy Rabotyagov authored
Since commit [1] introduced running_mon, it can end up undefined, which results in a fatal error [2]. This patch defines the default value that was used before patch [1]. Signed-off-by:
Dmitriy Rabotyagov <drabotyagov@vexxhost.com> [1] https://github.com/ceph/ceph-ansible/commit/8dcbcecd713b0cd7769d3b4d04ef5c2f15881377 [2] https://zuul.opendev.org/t/openstack/build/c82a73aeabd64fd583694ed04b947731/log/job-output.txt#14011 (cherry picked from commit 2478a7b9)
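One possible shape of such a fallback, shown only as an illustration (the actual default chosen by the patch is not specified here, and the fallback value below is an assumption):

```yaml
# Hypothetical sketch: make sure running_mon is always defined by falling
# back to the first monitor of the inventory when detection did not set it.
- name: define a default value for running_mon
  set_fact:
    running_mon: "{{ running_mon | default(groups[mon_group_name][0]) }}"
```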
-
- 15 Jan, 2020 2 commits
-
Guillaume Abrioux authored
Iterating over all monitors in order to delegate a `{{ container_binary }}` command fails when collocating mgrs with mons, because ceph-facts resets `container_exec_cmd` to point to the first member of the monitor group. The idea is to force `container_exec_cmd` to be reset in ceph-mgr. This commit also removes the `container_exec_cmd_mgr` fact. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1791282 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 8dcbcecd)
-
Guillaume Abrioux authored
The new ceph status registered in `ceph_status` will report `fsmap.up = 0` when it's the last mds; given that it's gathered after we shrink the mds, the condition is wrong. Also add a condition so we don't try to delete the fs if a standby node is going to rejoin the cluster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787543 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 3d0898aa)
-
- 14 Jan, 2020 6 commits
-
Dimitri Savineau authored
The trusted_ip_list parameter for the rbd-target-api service doesn't support ipv6 addresses with brackets. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit bd87d691)
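Purely as an illustration of stripping the bracketed IPv6 form before writing trusted_ip_list (the variable names below are assumptions, not the actual ceph-ansible facts):

```yaml
# Hypothetical sketch: join the gateway addresses without brackets, since
# rbd-target-api rejects the bracketed IPv6 form.
- name: build trusted_ip_list without ipv6 brackets
  set_fact:
    trusted_ip_list: "{{ gateway_ip_list | map('replace', '[', '') | map('replace', ']', '') | join(',') }}"
```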
-
Dimitri Savineau authored
Before this patch, the lvm2 package installation was done during the ceph-osd role. However, we were running a ceph-volume command in the ceph-config role, before ceph-osd. If lvm2 wasn't installed, the ceph-volume command failed with: `error checking path "/run/lock/lvm": stat /run/lock/lvm: no such file or directory`. This wasn't visible before because lvm2 was automatically installed as a docker dependency, but that's not the case for podman on CentOS 8. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit de8f2a9f)
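A minimal sketch of installing lvm2 early enough; the exact placement and the group guard are assumptions:

```yaml
# Hypothetical sketch: make sure lvm2 is present before any ceph-volume
# call, since podman (unlike docker) does not pull it in as a dependency.
- name: install lvm2 on OSD hosts
  package:
    name: lvm2
    state: present
  when: inventory_hostname in groups.get(osd_group_name, [])
```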
-
Guillaume Abrioux authored
Since fd1718f3, we must use `_devices` when deploying with the lvm batch scenario. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 5558664f)
-
Guillaume Abrioux authored
There is no need to run this part of the playbook when upgrading the cluster. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit af687570)
-
Guillaume Abrioux authored
This commit replaces the playbook used for the add_osds job, following the add-osd.yml playbook removal. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit fef1cd4c)
-
Guillaume Abrioux authored
This commit leaves add-osd.yml in place but marks the playbook as deprecated. Scaling up OSDs is now possible using --limit. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 3496a0ef)
-
- 13 Jan, 2020 4 commits
-
Dimitri Savineau authored
We don't need to execute the grafana fact every time, only during the dashboard deployment, especially for the ceph-grafana, ceph-prometheus and ceph-dashboard roles. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790303 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit f940e695)
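A hedged sketch of gating the grafana fact on the dashboard actually being deployed; the fact name and the way the address is resolved below are assumptions:

```yaml
# Hypothetical sketch: only resolve the grafana-related fact when the
# dashboard is being deployed.
- name: set grafana_server_addr fact
  set_fact:
    grafana_server_addr: "{{ hostvars[groups['grafana-server'][0]]['ansible_default_ipv4']['address'] }}"
  when: dashboard_enabled | bool
```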
-
Guillaume Abrioux authored
d6da508a broke the osp/ceph external use case. We must skip these tasks when no monitor is present in the inventory. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790508 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 2592a1e1)
-
Guillaume Abrioux authored
To avoid confusion, let's change the default value from `0.0.0.0` to `x.x.x.x`. Users might think setting `0.0.0.0` will make the daemon bind on all interfaces. Fixes: #4827 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit fc02fc98)
-
Guillaume Abrioux authored
This commit refactors the condition in the loop of that task so that all potential osd ids found are properly started. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1790212 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 58e6bfed)
-
- 10 Jan, 2020 16 commits
-
Guillaume Abrioux authored
Monitor how long it takes to get all VMs up and running. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 16bcef4f)
-
Guillaume Abrioux authored
Add a script to retry several times to fire up VMs to avoid vagrant failures. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by:
Andrew Schoen <aschoen@redhat.com> (cherry picked from commit 1ecb3a93)
-
Guillaume Abrioux authored
This commit adds a new scenario in order to test docker-to-podman.yml migration playbook. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit dc672e86)
-
Guillaume Abrioux authored
play vars have lower precedence than role vars and `set_fact`. We must use a `set_fact` to reset these variables. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit b0c49180)
-
Guillaume Abrioux authored
This is needed after a change is made in systemd unit files. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 1c2ec9fb)
-
Guillaume Abrioux authored
This commit adds a package installation task in order to install podman during the docker-to-podman.yml migration playbook. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d746575f)
-
Guillaume Abrioux authored
There is no need to run these tasks n times from each monitor. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit c878e995)
-
Guillaume Abrioux authored
1. Set the noout and nodeep-scrub flags,
2. upgrade each OSD node, one by one, waiting for active+clean pgs,
3. after all OSD nodes are upgraded, unset the flags.

Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> Co-authored-by:
Rachana Patel <racpatel@redhat.com> (cherry picked from commit 548db78b)
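A simplified sketch of the flag handling around the OSD upgrade (the real rolling_update.yml is more involved, notably the wait for active+clean PGs between nodes, which is omitted here):

```yaml
# Hypothetical sketch: set the flags once before touching the OSD nodes...
- name: set osd flags before upgrading the OSD nodes
  command: "{{ container_exec_cmd | default('') }} ceph osd set {{ item }}"
  loop:
    - noout
    - nodeep-scrub
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true

# ...and unset them once, after every OSD node has been upgraded.
- name: unset osd flags after all OSD nodes are upgraded
  command: "{{ container_exec_cmd | default('') }} ceph osd unset {{ item }}"
  loop:
    - noout
    - nodeep-scrub
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
```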
-
Dimitri Savineau authored
When ceph_rbd_mirror_configure is set to true we need to ensure that the required variables aren't empty. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1760553 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 4a065ceb)
-
Dimitri Savineau authored
cf8c6a38 moved the 'wait for all osds' task from openstack_config to the main tasks list. But the openstack_config code was executed only on the last OSD node. We don't need to run this check on every OSD node, so we set run_once to true on that task. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 5bd1cf40)
-
Dimitri Savineau authored
When creating crush rules with the device class parameter, we need to be sure that all OSDs are up and running because the device class list is populated with this information. This is now enabled for all scenarios, not only openstack_config. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit cf8c6a38)
-
Dimitri Savineau authored
This adds device class support to crush rules when using the class key in the rule dict via the create-replicated sub command. If the class key isn't specified then we use the create-simple sub command for backward compatibility. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1636508 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit ef2cb99f)
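A rough sketch of choosing between the two sub commands based on the rule dict; the dict keys (name, root, type, class) and the loop variable are assumptions derived from this description:

```yaml
# Hypothetical sketch: use create-replicated when a class is given in the
# rule dict, otherwise fall back to create-simple for backward compatibility.
- name: create crush rules
  command: >
    {{ container_exec_cmd | default('') }} ceph osd crush rule
    {% if item.class is defined %}
    create-replicated {{ item.name }} {{ item.root }} {{ item.type }} {{ item.class }}
    {% else %}
    create-simple {{ item.name }} {{ item.root }} {{ item.type }}
    {% endif %}
  loop: "{{ crush_rules }}"
  delegate_to: "{{ groups[mon_group_name][0] }}"
  run_once: true
```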
-
Dimitri Savineau authored
If we want to create crush rules with the create-replicated sub command and a device class, then we need to have the OSDs created before the crush rules, otherwise the device classes won't exist. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit ed36a11e)
-
Dimitri Savineau authored
We only need to have the container_binary fact. Because we're not gathering the facts from all nodes, the purge fails when trying to get one of the grafana facts. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786686 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit a09d1c38)
-
Dimitri Savineau authored
There are some tasks in the rolling upgrade playbook that use the new container image and need the registry login to be executed first, otherwise the nodes won't be able to pull the container image:

```
Unable to find image 'xxx.io/foo/bar:latest' locally
Trying to pull repository xxx.io/foo/bar ...
/usr/bin/docker-current: Get https://xxx.io/v2/foo/bar/manifests/latest : unauthorized
```

Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 3f344fde)
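A hedged sketch of logging in to the registry before any image pull; the registry/credential variable names below are assumptions:

```yaml
# Hypothetical sketch: authenticate to the registry on every node before
# any rolling-upgrade task pulls the new container image.
- name: login to the container registry
  command: >
    {{ container_binary }} login -u {{ ceph_docker_registry_username }}
    -p {{ ceph_docker_registry_password }} {{ ceph_docker_registry }}
  when: ceph_docker_registry_auth | default(false) | bool
  changed_when: false
  no_log: true
```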
-
Guillaume Abrioux authored
We must exclude the devices already used and prepared by ceph-disk when doing the lvm batch report, otherwise it fails because ceph-volume complains about the GPT header. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1786682 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit fd1718f3)
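As a rough sketch, the report could be restricted to the devices that are not already owned by ceph-disk (`_devices` and `ceph_disk_osds_devices` are the names used by the surrounding commits; the task itself is an assumption):

```yaml
# Hypothetical sketch: only run the batch report on devices not already
# prepared by ceph-disk, so ceph-volume never sees their GPT headers.
- name: run ceph-volume lvm batch --report on the remaining devices
  command: >
    ceph-volume lvm batch --report --format=json
    {{ _devices | difference(ceph_disk_osds_devices | default([])) | join(' ') }}
  register: lvm_batch_report
  changed_when: false
```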
-
- 09 Jan, 2020 5 commits
-
Dimitri Savineau authored
We don't need to use the dev repository on stable branches. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com>
-
Dimitri Savineau authored
Instead of running the ceph roles against localhost, we should run them on the first mon. The ansible and inventory hostnames of the rgw nodes could be different. Ensure that the rgw instance to remove is present in the cluster. Fix the rgw service and directory path. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1677431 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 747555df)
-
Guillaume Abrioux authored
We must pick a mon which actually exists in ceph-facts in order to detect whether a cluster is already running. Otherwise, it will state that no cluster is running, which ends up deploying a new monitor isolated in a new quorum. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622688 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 86f3eeb7)
-
Dimitri Savineau authored
Only the ipv4 addresses from the nodes running the dashboard mgr module were added to the trusted_ip_list configuration file on the iscsigws nodes. This also adds the iscsi gateways with ipv6 configuration to the ceph dashboard. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1787531 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 70eba661)
-
Benoît Knecht authored
RadosGW pools can be created by setting, for instance:

```yaml
rgw_create_pools:
  .rgw.root:
    pg_num: 512
    size: 2
```

However, doing so would create pools of size `osd_pool_default_size` regardless of the `size` value. This was due to the fact that the Ansible task used `{{ item.size | default(osd_pool_default_size) }}` as the pool size value, but `item.size` is always undefined; the correct variable is `item.value.size`. Signed-off-by:
Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit 3c31b19a)
-