- 11 Dec, 2019 10 commits
-
Guillaume Abrioux authored
This commit adds a task to ensure device mappers are properly closed when the lvm batch scenario is used. Otherwise, OSDs can't be redeployed because their devices are rejected by ceph-volume as locked. The condition `devices | default([]) | length > 0` ensures these device mappers are removed only when using the lvm batch scenario. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 8e6ef818)
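For illustration, a minimal sketch of such a cleanup task, assuming a `dmsetup remove` per leftover mapper; the task name, loop variable and exact command are assumptions, not the commit's implementation:
```
# Close leftover device mappers so ceph-volume no longer sees the
# underlying devices as locked (hypothetical task).
- name: ensure device mappers are closed
  command: "dmsetup remove --force {{ item }}"
  loop: "{{ leftover_dm_devices | default([]) }}"
  when: devices | default([]) | length > 0
```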
-
Guillaume Abrioux authored
Otherwise, it can sometimes take a while for an OSD to be seen as down, which causes the `ceph osd purge` command to fail. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 51d60119)
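A hedged sketch of the kind of retry this implies; the command used for the check, the retry counts and the variable name are all assumptions:
```
# Retry until the cluster reports the OSD safe to act on, instead of
# racing `ceph osd purge` against the down detection.
- name: wait for the osd to be seen as down
  command: "ceph osd safe-to-destroy osd.{{ osd_id }}"
  register: result
  retries: 20
  delay: 3
  until: result.rc == 0
```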
-
Guillaume Abrioux authored
Do not use `--destroy` when zapping a device. Otherwise, it destroys VGs while they are still needed to redeploy the OSDs. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit e3305e6b)
-
Guillaume Abrioux authored
The zap action of the ceph_volume module always implies `--destroy`. This commit adds support for the destroy option so we can ask ceph-volume not to use `--destroy` when zapping a device. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 0dcacdbe)
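A hedged sketch of calling the module with the new option; everything besides `destroy` follows the module's usual parameter style but is an assumption here:
```
# Zap the data device but keep the VG/LV layout so the OSD can be
# redeployed on the same logical volumes.
- name: zap the device without destroying the underlying VG
  ceph_volume:
    cluster: ceph
    action: zap
    data: /dev/sdb
    destroy: false
```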
-
Guillaume Abrioux authored
This commit adds support for the non containerized context to the filestore-to-bluestore.yml infrastructure playbook. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1729267 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 4833b85e)
-
Guillaume Abrioux authored
This commit adds a new job in order to test the filestore-to-bluestore.yml infrastructure playbook. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 40de34fb)
-
Guillaume Abrioux authored
There's no need to enforce PreferredAuthentications by default. Users can still choose to override ansible.cfg with any additional parameter, like this one, to fit their infrastructure. Fixes: #4826 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d682412e)
-
Guillaume Abrioux authored
A recent change in ceph/ceph prevents the password from containing the username: `Error EINVAL: Password cannot contain username.` Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 0756fa46)
-
Guillaume Abrioux authored
In the containerized context, containers aren't stopped early in the sequence, which means they aren't restarted after the upgrade because the task only checks that the daemon status is started (e.g. `state: started`). This commit also removes the task which ensures services are started, since that is already done in the ceph-iscsigw role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit c7708eb4)
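A hedged sketch of the difference; the unit names are the usual iscsi gateway services, but the task shape is an assumption:
```
# `state: started` is a no-op for a container that never stopped, so the
# upgraded image is never picked up; an explicit restart avoids that.
- name: restart ceph iscsi gateway services
  systemd:
    name: "{{ item }}"
    state: restarted
  loop:
    - rbd-target-api
    - rbd-target-gw
    - tcmu-runner
```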
-
Guillaume Abrioux authored
When upgrading from RHCS 3, the dashboard has obviously never been deployed, which forces users to deploy it manually later. This commit adds the dashboard deployment as part of the upgrade to RHCS 4. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1779092 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 451c5ca9)
-
- 10 Dec, 2019 1 commit
-
Guillaume Abrioux authored
This commit isolates the variables that are not intended to be modified by the user and adds an explicit comment about them. Fixes: #4828 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit a234338e)
-
- 09 Dec, 2019 2 commits
-
Guillaume Abrioux authored
Typical error:
```
type=AVC msg=audit(1575367499.582:3210): avc: denied { search } for pid=26680 comm="node_exporter" name="1" dev="proc" ino=11528 scontext=system_u:system_r:container_t:s0:c100,c1014 tcontext=system_u:system_r:init_t:s0 tclass=dir permissive=0
```
node_exporter needs to run as privileged to avoid these AVC denials, since it gathers a lot of information from the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762168 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d245eb7e)
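A hedged sketch of the resulting container invocation; all flags other than `--privileged`, and the variable names, are assumptions:
```
# Run node_exporter privileged so SELinux lets it read host /proc and
# /sys from inside the container.
- name: start the node_exporter container
  command: >-
    {{ container_binary }} run --detach --privileged
    --name node-exporter
    -v /proc:/host/proc:ro -v /sys:/host/sys:ro
    {{ node_exporter_container_image }}
```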
-
Dimitri Savineau authored
The md devices (software RAID) aren't excluded from the devices list in the auto discovery scenario. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1764601 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 014f51c2)
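A hedged sketch of the kind of filter this implies during auto discovery; the fact-building task and the other conditions are assumptions:
```
# Skip md (software RAID) devices when building the devices list from
# the ansible_devices facts.
- name: build the devices list
  set_fact:
    devices: "{{ devices | default([]) + ['/dev/' + item.key] }}"
  with_dict: "{{ ansible_devices }}"
  when:
    - item.value.partitions | length == 0
    - item.key is not match('^md[0-9]*')
```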
-
- 05 Dec, 2019 1 commit
-
Guillaume Abrioux authored
When using `osd_auto_discovery`, `devices` is built multiple times due to multiple runs of the ceph-facts role. It ends up with duplicate instances of the same device in the list. Using the `unique` filter when building the list fixes this issue. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 23b1f438)
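A hedged sketch of the fix; the surrounding task is assumed, the relevant part is the `unique` filter closing the expression:
```
# Deduplicate so repeated ceph-facts runs can't append the same device
# twice.
- name: build the devices list
  set_fact:
    devices: "{{ (devices | default([]) + ['/dev/' + item.key]) | unique }}"
  with_dict: "{{ ansible_devices }}"
```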
-
- 04 Dec, 2019 4 commits
-
Dimitri Savineau authored
Podman support was added to the purge-container-cluster playbook, but containers are always used for the dashboard, even on non containerized deployments. This commit adds podman support for purging the dashboard resources in the purge-cluster playbook. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 89f6cc54)
-
Dimitri Savineau authored
Having a max_mds value equal to the number of mds nodes generates a warning in the ceph cluster status:
```
cluster:
  id: 6d3e49a4-ab4d-4e03-a7d6-58913b8ec00a
  health: HEALTH_WARN
          insufficient standby MDS daemons available
(...)
services:
  mds: cephfs:3 {0=mds1=up:active,1=mds0=up:active,2=mds2=up:active}
```
Let's use 2 active and 1 standby mds. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 4a6d19da)
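In ceph-ansible terms this presumably maps to something like the following; the variable name is an assumption:
```
# With 3 mds nodes, keep 2 active and leave 1 as standby to clear the
# HEALTH_WARN about missing standby daemons.
mds_max_mds: 2
```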
-
Guillaume Abrioux authored
Since we now support podman, let's rename the playbook so it's more generic. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 7bc7e366)
-
Dimitri Savineau authored
If the new mon/osd node doesn't have python installed, then we need to execute the tasks from raw_install_python.yml. Closes: #4368 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 34b03d18)
-
- 03 Dec, 2019 11 commits
-
Dimitri Savineau authored
The wait_for ansible module doesn't support brackets on IPv6 addresses, so we need to remove them. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769710 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 55adc10b)
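A hedged sketch of stripping the brackets before the check; the variable name and port are assumptions:
```
# wait_for expects a bare IPv6 address, so strip the surrounding brackets.
- name: wait for the monitor socket
  wait_for:
    host: "{{ monitor_address | regex_replace('[\\[\\]]', '') }}"
    port: 3300
```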
-
Dimitri Savineau authored
In addition to the grafana container tag change, we need to do the same for the prometheus container stack based on the release present in the OSE 4.1 container image.
```
$ docker run --rm openshift4/ose-prometheus-node-exporter:v4.1 --version
node_exporter, version 0.17.0
  build user:       root@67fee13ed48f
  build date:       20191023-14:38:12
  go version:       go1.11.13

$ docker run --rm openshift4/ose-prometheus-alertmanager:4.1 --version
alertmanager, version 0.16.2
  build user:       root@70b79a3f29b6
  build date:       20191023-14:57:30
  go version:       go1.11.13

$ docker run --rm openshift4/ose-prometheus:4.1 --version
prometheus, version 2.7.2
  build user:       root@12da054778a3
  build date:       20191023-14:39:36
  go version:       go1.11.13
```
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 3e29b8d5)
-
Dimitri Savineau authored
When a container is already running on a non containerized node, the umount ceph partition task is skipped. This is due to the container ps command, which always returns 0 even if the filter matches nothing. We should run the umount task when:
1. the container command is failing (not installed): rc != 0
2. the container command reports running ceph-osd containers: rc == 0
Also, we should not fail on the ceph directory listing. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 39cfe0aa)
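A hedged sketch of the resulting logic; the register name and mount point variable are assumptions, and the second condition checks the command output since `ps` exits 0 even with no matches:
```
# A failing container command (rc != 0) means no container engine is
# installed, so the node is not containerized: umount in that case too.
- name: list running ceph-osd containers
  command: "{{ container_binary }} ps -q --filter name=ceph-osd"
  register: osd_containers
  failed_when: false

- name: umount ceph osd partitions
  mount:
    path: "{{ item }}"
    state: unmounted
  loop: "{{ ceph_osd_mount_points | default([]) }}"
  when: osd_containers.rc != 0 or osd_containers.stdout | length > 0
```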
-
Guillaume Abrioux authored
If the container binary is podman, we shouldn't try to stop docker here. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit b18476a1)
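A hedged sketch of the guard; the task shape is an assumption:
```
# Only stop the docker daemon when docker is actually the container
# engine in use.
- name: stop the docker daemon
  service:
    name: docker
    state: stopped
  when: container_binary == 'docker'
```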
-
Guillaume Abrioux authored
This makes it possible to call `container_binary` without having to run the whole ceph-facts role. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit fe5ffe58)
-
Guillaume Abrioux authored
All containers are removed when systemd stops them, so there is no need to call this module in the purge container playbook. This commit also removes all docker_image tasks and removes all container images in the final cleanup play. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1776736 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d23383a8)
-
Guillaume Abrioux authored
When using the shortname, the URL for an active alert is built with the short hostname and fails to connect to the server. This commit changes the template to use the fqdn instead. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1765485 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit a8d76d72)
-
Guillaume Abrioux authored
This commit makes the ceph-dashboard role print the ceph-dashboard URL only for the nodes present in the grafana-server group. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1762163 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit cc0c1ce3)
-
Guillaume Abrioux authored
This is needed to avoid the following error:
```
ERROR! The requested handler 'restart ceph mons' was not found in either the main handlers list nor in the listening handlers list
```
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit a43a8721)
-
Guillaume Abrioux authored
Let's use `client_group_name` instead of hardcoding the group name. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 7fe0d55e)
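A hedged before/after sketch; the play header is an assumption:
```
# Before, the group was hardcoded:
#   - hosts: clients
# After, the user-configurable group name is honored:
- hosts: "{{ client_group_name | default('clients') }}"
```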
-
Guillaume Abrioux authored
We must import this role in the first play, otherwise the first call to `client_group_name` fails. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1777829 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 6526a25a)
-
- 25 Nov, 2019 1 commit
-
Guillaume Abrioux authored
This commit reverts the following change: https://github.com/ceph/ceph-ansible/pull/4510/commits/fcf181342a70b78a355d1c985699028012326b5f#diff-23b6f443c01ea2efcb4f36eedfea9089R7-R14 It is causing CI failures, so this commit is intended to unlock the CI. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 5353ab8a)
-
- 20 Nov, 2019 1 commit
-
VasishtaShastry authored
Configuring cephfs on an existing cluster with --limit used to fail at two different tasks when running site-docker.yml. This commit addresses both of those tasks. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1773489 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> (cherry picked from commit 72c43cc5)
-
- 18 Nov, 2019 2 commits
-
Dimitri Savineau authored
If we execute the site-container.yml playbook with specific tags (like ceph_update_config), then we need to be sure to gather the facts, otherwise we will see errors like: `The task includes an option with an undefined variable. The error was: 'ansible_hostname' is undefined` This commit also adds the missing `gather_facts: false` to the mons plays. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit d7fd769b)
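A hedged sketch of the pattern; the play layout is an assumption:
```
# Skip automatic fact gathering, then gather explicitly with an 'always'
# tag so tag-limited runs still populate facts like ansible_hostname.
- hosts: mons
  gather_facts: false
  tasks:
    - name: gather facts
      setup:
      tags: always
```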
-
VasishtaShastry authored
This will prevent site-docker.yml from failing with the configurations described in the documentation. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1769760 Signed-off-by: VasishtaShastry <vipin.indiasmg@gmail.com> Co-Authored-By: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 9a1f1626)
-
- 15 Nov, 2019 1 commit
-
Guillaume Abrioux authored
When `import_key` is enabled and the key already exists, it is only fetched using the ceph cli. If the mode specified in the `ceph_key` task differs from what the ceph cli applies, the mode isn't restored, because we don't call `module.set_fs_attributes_if_different()` before `module.exit_json(**result)`. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1734513 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit b717b5f7)
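A hedged sketch of a task hitting this code path; the caps and the exact parameter set are assumptions:
```
# Even when the key already exists and is merely fetched, the requested
# mode should be re-applied to the keyring file.
- name: create or fetch a client key
  ceph_key:
    name: client.test
    state: present
    caps:
      mon: "allow r"
    mode: "0600"
    import_key: true
```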
-
- 14 Nov, 2019 2 commits
-
Guillaume Abrioux authored
This commit adds a playbook to be played before the purge playbook: it first creates an rbd image, then maps an rbd device on client0, so the purge playbook will try to unmap it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit db77fbda)
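A hedged sketch of such a setup play; the image name, size and pool are assumptions:
```
# Create and map an rbd image so the purge playbook has a mapped device
# to unmap.
- name: create a test rbd image
  command: rbd create test_image --size 128 --pool rbd

- name: map the rbd image on the client
  command: rbd map test_image --pool rbd
```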
-
Guillaume Abrioux authored
In the containerized context, using the binary provided by the atomic os won't work because it's an old version provided by ceph-common, based on 10.2.5. Using a container could be an option, but for a large cluster with hundreds of client nodes that would require pulling the image on each of them just to unmap the rbd devices. Let's use the sysfs method instead, in order to avoid any issue related to the ceph version shipped on the host. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1766064 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 3cfcc7a1)
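A hedged sketch of the sysfs method; deriving the device id from the device path is an assumption about the surrounding task:
```
# Unmap each /dev/rbdX by writing its id to the rbd bus in sysfs, which
# requires no rbd binary on the host at all.
- name: unmap rbd devices via sysfs
  shell: echo "{{ item | regex_replace('^/dev/rbd', '') }}" > /sys/bus/rbd/remove_single_major
  loop: "{{ rbd_mapped_devices | default([]) }}"
```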
-
- 07 Nov, 2019 2 commits
-
Guillaume Abrioux authored
This commit removes the mergify config on stable-4.0. At the moment there is no need to have a mergify config on this branch, given that we don't use it. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
-
Dimitri Savineau authored
[1] introduced a regression on the fs.aio-max-nr sysctl value condition. The enable key isn't a boolean but a string, because the expression isn't evaluated. This string output "(osd_objectstore == 'bluestore')" is always truthy, because the item.enable condition only matches non-empty strings, so the sysctl value was applied for both the filestore and bluestore backends. [2] added the bool filter to the condition, but the filter always returns false on a string, so the sysctl wasn't applied at all. This commit fixes the enable key by evaluating the expression instead of using the literal string. [1] https://github.com/ceph/ceph-ansible/commit/08a2b58 [2] https://github.com/ceph/ceph-ansible/commit/ab54fe2 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit ece46d33)
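A hedged sketch of the corrected entry; the surrounding variable layout is assumed:
```
# Wrap the expression in Jinja so it is evaluated to a real boolean
# instead of being passed around as a literal string.
os_tuning_params:
  - name: fs.aio-max-nr
    value: 1048576
    enable: "{{ osd_objectstore == 'bluestore' }}"
```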
-
- 04 Nov, 2019 1 commit
-
Dimitri Savineau authored
Starting with testinfra 3.1, the ansible ssh connections use the ssh backend instead of paramiko, and persistent connections too. pytest 4.6 is the latest release to be supported by python 2. Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 02df2ab5)
-
- 31 Oct, 2019 1 commit
-
Dimitri Savineau authored
The latest grafana container tag is using a grafana 6.x release, which could cause issues with the ceph dashboard integration. Considering that the grafana container in RHCS 3 is based on 5.x, we should use the same version.
```
$ docker run --rm rhceph/rhceph-3-dashboard-rhel7:3 -v
Version 5.2.4 (commit: unknown-dev)
```
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 2037fb87)
-