- 30 Nov, 2020 1 commit
-
-
Guillaume Abrioux authored
`ceph.target` should only be disabled, not stopped. Otherwise, in a collocation scenario, the OSD play stops the other collocated services, which isn't what we want. Each daemon has its own play for managing the transition to containers. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1901865 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 0b056205)
-
- 27 Nov, 2020 1 commit
-
-
Dimitri Savineau authored
Set the owner/group on alertmanager and prometheus directories and files to nobody and nogroup (uid and gid 65534) to avoid permission issues. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1901543 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit eb452d35)
-
- 26 Nov, 2020 2 commits
-
-
Guillaume Abrioux authored
Adding a monitor is no longer possible because we generate a new mon keyring each time the playbook is run. Fixes: #5864 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 970c6a4e)
-
Guillaume Abrioux authored
We can achieve this task using `copy` module. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 5ff2ca27)
-
- 25 Nov, 2020 1 commit
-
-
Dimitri Savineau authored
When using a custom pool for the iSCSI gateway, we need to set the pool name in the configuration; otherwise the default rbd pool name will be used. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 40a87c4b)
-
- 24 Nov, 2020 11 commits
-
-
Guillaume Abrioux authored
Let's use a github workflow instead of travis for this. With this commit we can get rid of Travis. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 94c37b9d)
-
Guillaume Abrioux authored
Ignore ansible-lint 302, 303 and 505 errors: [302] Using command rather than an argument to e.g. file; [303] Using command rather than module; [505] referenced files must exist. They aren't relevant on these tasks. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 195d88fc)
-
Guillaume Abrioux authored
Fix ansible-lint 504 error: [504] Do not use 'local_action', use 'delegate_to: localhost' Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit c948b668)
-
Guillaume Abrioux authored
Fix ansible-lint 201 error: [201] Trailing whitespace Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit dfc7e6e4)
-
Guillaume Abrioux authored
Fix ansible-lint 502 error: [502] All tasks should be named Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 97dd9218)
-
Guillaume Abrioux authored
Fix ansible-lint 305 error: [305] Use shell only when shell functionality is required Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 11b4bf50)
-
Guillaume Abrioux authored
Fix ansible lint 601 error: [601] Don't compare to literal True/False Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 2011e4db)
-
Guillaume Abrioux authored
Fix ansible lint 206 error: [206] Variables should have spaces before and after: {{ var_name }} Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 9fba6eec)
-
Guillaume Abrioux authored
Fix ansible lint 301 error: [301] Commands should not change things if nothing needs doing Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 5450de58)
-
Guillaume Abrioux authored
Fix ansible lint 306 error: [306] Shells that use pipes should set the pipefail option Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 1879c26e)
-
Guillaume Abrioux authored
let's use github workflow instead of travis. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d4400f91)
-
- 19 Nov, 2020 1 commit
-
-
Guillaume Abrioux authored
This commit ensures that `/var/lib/ceph/osd/{{ cluster }}-{{ osd_id }}` is present before starting OSDs. This is needed specifically when redeploying an OSD after an OS upgrade failure. Since the Ceph data is still present on its devices, the node can be redeployed; however, those directories aren't present since they are initially created by ceph-volume. We could recreate them manually, but for a better user experience we can ask ceph-ansible to recreate them. NOTE: this only works for OSDs that were deployed with ceph-volume. ceph-disk deployed OSDs would have to get those directories recreated manually. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1898486 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 873fc8ec)
-
- 18 Nov, 2020 3 commits
-
-
Dimitri Savineau authored
We don't need to use run_once on that task when there are running monitors; otherwise the read task could be skipped and the set task will fail: The conditional check 'crush_rule_variable.rc == 0' failed. The error was: error while evaluating conditional (crush_rule_variable.rc == 0): 'dict object' has no attribute 'rc' Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1898856 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit e150df78)
-
Dimitri Savineau authored
Move the pytest testing from TravisCI to Github workflow. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 3e79f032)
-
Guillaume Abrioux authored
This commit pins the installed pytest-rerunfailures to <9.0. This is to avoid the following error: ``` ERROR: pytest-rerunfailures 9.0 has requirement pytest>=5.0, but you'll have pytest 4.6.11 which is incompatible. ``` The latest version of pytest-rerunfailures isn't compatible with the version of pytest we are using. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 19097026)
-
- 17 Nov, 2020 1 commit
-
-
Guillaume Abrioux authored
This commit changes the bind mount option for the mount point `/var/lib/ceph` in the systemd template for mon and mgr containers. This is needed in case of collocating mon/mgr with osds using dmcrypt scenario. Once mon/mgr got converted to containers, the dmcrypt layer sub mount is still seen in `/var/lib/ceph`. For some reason it makes the corresponding devices busy so any other container can't open/close it. As a result, it prevents osds from starting properly. Since it only happens on the nodes converted before the OSD play, the idea is to bind mount `/var/lib/ceph` on mon and mgr with the `rshared` option so once the sub mount is unmounted, it is propagated inside the container so it doesn't see that mount point. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896392 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit f5ba6d9b)
-
- 16 Nov, 2020 2 commits
-
-
Guillaume Abrioux authored
This is a workaround to avoid errors like the following: ``` Error: error creating container storage: the container name "ceph-mgr-magna022" is already in use by "4a5f674e113f837a0cc561dea5d2cd55d16ca159a647b7794ab06c4c276ef701" ``` It doesn't seem to be 100% reproducible, but it shows up after a reboot. The only workaround we have come up with so far is to run `podman rm --storage <container>` before starting it. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1887716 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 5ba7824c)
-
Dimitri Savineau authored
fa2bb3af only fixed the symlink owner/group issue in the OSD play. If the OSDs are collocated with other services like MONs and MGRs then the chown command will fail. $ find /var/lib/ceph/osd/ceph-0 -not -user 167 -execdir chown 167:167 {} + chown: cannot dereference './block': Permission denied Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1896448 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 35ed9977)
-
- 13 Nov, 2020 2 commits
-
-
Benoît Knecht authored
The `osd_pool_default_crush_rule` is set based on `crush_rule_variable`, which is the output of a `grep` command. However, two consecutive tasks can set that variable, and if the second task is skipped, it still overwrites the `crush_rule_variable`, leading the `osd_pool_default_crush_rule` to be set to `ceph_osd_pool_default_crush_rule` instead of the output of the first task. This commit ensures that the fact is set right after the `crush_rule_variable` is assigned, before it can be overwritten. Closes #5912 Signed-off-by:
Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit c5f7343a)
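The overwrite described in this commit can be illustrated with a small Python simulation. This is only a sketch, not the playbook code: `register`, `parse_rule`, `TASKS` and the sample `stdout` string are hypothetical stand-ins. It relies on one Ansible behaviour, namely that a skipped task still overwrites its registered variable with a result that has no `rc`.

```python
# Hypothetical stand-in for Ansible's `register` semantics: a skipped task
# still overwrites the registered variable, with a result that has no `rc`.
SKIPPED = {"changed": False, "skipped": True}


def register(task_vars, name, when, result):
    """Store the task result, or a 'skipped' placeholder if the task did not run."""
    task_vars[name] = result if when else dict(SKIPPED)


def parse_rule(result, default_rule):
    """Return the rule id from grep-like output, or the default if grep did not match."""
    if result.get("rc") == 0:
        return int(result["stdout"].split("=")[1])
    return default_rule


# (condition, result) pairs for two consecutive tasks; the second one is skipped.
TASKS = [
    (True, {"rc": 0, "stdout": "osd_pool_default_crush_rule = 2"}),
    (False, None),
]


def set_fact_after_both(default_rule):
    """Old ordering: the fact is computed once, after both tasks."""
    task_vars = {}
    for when, result in TASKS:
        register(task_vars, "crush_rule_variable", when, result)
    # Too late: only the skipped result of the second task is left.
    return parse_rule(task_vars["crush_rule_variable"], default_rule)


def set_fact_after_each(default_rule):
    """New ordering: the fact is computed right after each assignment."""
    task_vars = {}
    rule = default_rule
    for when, result in TASKS:
        register(task_vars, "crush_rule_variable", when, result)
        if not task_vars["crush_rule_variable"].get("skipped"):
            rule = parse_rule(task_vars["crush_rule_variable"], default_rule)
    return rule
```

With a default of `-1`, `set_fact_after_both(-1)` falls back to the default, while `set_fact_after_each(-1)` keeps the value found by the first task.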
-
Gaudenz Steinlin authored
The osd_memory_target variable was only used if it was higher than the calculated value based on the number of OSDs. This is changed to always use the value if it is set in the configuration. This allows this value to be intentionally set lower so that it does not have to be changed when more OSDs are added later. Signed-off-by:
Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch> (cherry picked from commit 4d1fdd2b)
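The change in selection logic can be sketched as follows. This is an illustration only: the function names are hypothetical, and representing "not set in the configuration" as `None` is an assumption, since the commit message does not say how the playbook detects it.

```python
GIB = 1024 ** 3


def old_osd_memory_target(calculated, configured):
    """Before: the configured value only wins if it is higher than the calculated one."""
    return configured if configured > calculated else calculated


def new_osd_memory_target(calculated, configured=None):
    """After: a configured value always wins (None stands for 'not set', an assumption)."""
    return calculated if configured is None else configured
```

For example, with a calculated value of 6 GiB and a configured value of 2 GiB, the old logic yields 6 GiB and the new logic yields 2 GiB, so the lower value can be kept when more OSDs are added later.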
-
- 12 Nov, 2020 3 commits
-
-
Dimitri Savineau authored
When deploying Ceph OSDs from packages, the ceph-osd@.service unit is configured as enabled-runtime. This means that each ceph-osd service inherits that state. The enabled-runtime systemd state doesn't survive a reboot. For non-containerized deployments the OSDs still start after a reboot because the ceph-volume@.service and/or ceph-osd.target units do the job. $ systemctl list-unit-files|egrep '^ceph-(volume|osd)'|column -t ceph-osd@.service enabled-runtime ceph-volume@.service enabled ceph-osd.target enabled When switching to a containerized deployment we stop/disable ceph-osd@XX.service, ceph-volume and ceph.target and then remove the systemd unit files. But the new systemd units for the containerized ceph-osd service still inherit from the ceph-osd@.service unit file. As a consequence, if an OSD host reboots after the playbook execution, the ceph-osd services won't come back because they aren't enabled at boot. This patch also adds a reboot and testinfra run after running the switch to container playbook. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1881288 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit fa2bb3af)
-
Guillaume Abrioux authored
This tag can be set at the play level. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 2fa17520)
-
Francesco Pantano authored
There are some use cases where there's a need to skip the execution of the ceph-ansible client role even though the client section of the inventory isn't empty. This can happen in contexts where the services are colocated or when an all-in-one deployment is performed. The purpose of this change is to add a 'ceph_client' tag, which avoids altering the ceph-ansible execution flow while still making it possible to include or exclude a set of tasks using this tag. Signed-off-by:
Francesco Pantano <fpantano@redhat.com> (cherry picked from commit fafd5f87)
-
- 04 Nov, 2020 2 commits
-
-
Guillaume Abrioux authored
This sets the `dashboard_grafana_api_no_ssl_verify` default value according to the length of `dashboard_crt` and `dashboard_key`. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 5cadfea4)
-
Guillaume Abrioux authored
see linked bz for details Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1889426 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 767d3c89)
-
- 03 Nov, 2020 9 commits
-
-
Gaudenz Steinlin authored
If some OSDs are to be created and others already exist the calculation only counted the to be created OSDs. This changes the calculation to take all OSDs into account. Signed-off-by:
Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch> (cherry picked from commit 15044da0)
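A minimal sketch of the corrected count, assuming OSDs are identified by device path (the function name and the inputs are hypothetical, not taken from the playbook):

```python
def count_osds(existing, to_create):
    """Count every OSD on the host, not only the ones about to be created.

    A set union is used so that a device listed in both inputs is counted once.
    """
    return len(set(existing) | set(to_create))
```

With two existing OSDs and one to be created, this returns 3, whereas counting only the OSDs to be created would give 1.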
-
Dimitri Savineau authored
cec994b9 introduced a regression when a mgr is collocated with a mon. During the mon upgrade, the mgr service is masked so that it isn't restarted on package updates. The start mgr task then fails because the service is still masked. Instead we should unmask it. Fixes: #5983 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 3d3ce263)
-
Dimitri Savineau authored
bd611a78 introduced the new ceph_fs module but missed some tasks in rolling_update and shrink-mds playbooks. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 16afe908)
-
Dimitri Savineau authored
The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the cluster health, we're using the health structure in the ceph status output. To optimize this, we could use the ceph health command which contains the same needed information. $ ceph status -f json | wc -c 2001 $ ceph health -f json | wc -c 46 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit acddf4fb)
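As a rough illustration of why the smaller command is sufficient, the sketch below reads the health state from both outputs. The JSON samples are hypothetical, heavily trimmed payloads; real output contains more fields and varies between Ceph releases.

```python
import json

# Hypothetical, trimmed samples; real payloads are larger and release-dependent.
STATUS_SAMPLE = json.dumps({
    "fsid": "00000000-0000-0000-0000-000000000000",
    "health": {"status": "HEALTH_OK", "checks": {}},
    "quorum_names": ["mon0", "mon1", "mon2"],
    "osdmap": {"num_osds": 3, "num_up_osds": 3, "num_in_osds": 3},
})
HEALTH_SAMPLE = json.dumps({"status": "HEALTH_OK", "checks": {}})


def health_from_status(raw):
    """Health state as nested in `ceph status -f json` output."""
    return json.loads(raw)["health"]["status"]


def health_from_health(raw):
    """Same information, read from the much smaller `ceph health -f json` output."""
    return json.loads(raw)["status"]
```

Both functions return the same state for these samples, while the health payload is a fraction of the size of the status payload.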
-
Dimitri Savineau authored
The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the rgw/rbdmirror services status, we're only using the servicemap structure in the ceph status output. To optimize this, we could use the ceph service dump command which contains the same needed information. This command returns less information and is slightly faster than the ceph status command. $ ceph status -f json | wc -c 2001 $ ceph service dump -f json | wc -c 1105 $ time ceph status -f json > /dev/null real 0m0.557s user 0m0.516s sys 0m0.040s $ time ceph service dump -f json > /dev/null real 0m0.454s user 0m0.434s sys 0m0.020s Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 3f908193)
-
Dimitri Savineau authored
The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the quorum status, we're only using the quorum_names structure in the ceph status output. To optimize this, we could use the ceph quorum_status command which contains the same needed information. This command returns less information. $ ceph status -f json | wc -c 2001 $ ceph quorum_status -f json | wc -c 957 $ time ceph status -f json > /dev/null real 0m0.577s user 0m0.538s sys 0m0.029s $ time ceph quorum_status -f json > /dev/null real 0m0.544s user 0m0.527s sys 0m0.016s Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 88f91d8c)
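A quorum check based on the `quorum_names` list could look like the sketch below. The sample payload and monitor names are hypothetical and trimmed; only `quorum_names` is taken from the commit message.

```python
import json

# Hypothetical, trimmed sample of `ceph quorum_status -f json` output.
QUORUM_SAMPLE = json.dumps({
    "election_epoch": 8,
    "quorum": [0, 1, 2],
    "quorum_names": ["mon0", "mon1", "mon2"],
    "quorum_leader_name": "mon0",
})


def in_quorum(raw, mon_name):
    """Return True if the given monitor is listed in `quorum_names`."""
    return mon_name in json.loads(raw)["quorum_names"]
```

For the sample above, `in_quorum(QUORUM_SAMPLE, "mon1")` is true and `in_quorum(QUORUM_SAMPLE, "mon9")` is false.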
-
Dimitri Savineau authored
The ceph status command returns a lot of information stored in variables and/or facts which could consume resources for nothing. When checking the pgs state, we're using the pgmap structure in the ceph status output. To optimize this, we could use the ceph pg stat command which contains the same needed information. This command returns less information (only about pgs) and is slightly faster than the ceph status command. $ ceph status -f json | wc -c 2000 $ ceph pg stat -f json | wc -c 240 $ time ceph status -f json > /dev/null real 0m0.529s user 0m0.503s sys 0m0.024s $ time ceph pg stat -f json > /dev/null real 0m0.426s user 0m0.409s sys 0m0.016s The data returned by the ceph status is even bigger when using the nautilus release. $ ceph status -f json | wc -c 35005 $ ceph pg stat -f json | wc -c 240 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit ee505885)
-
wangxiaotong authored
Improve the way OSD creation is checked. This replaces the ceph status command with the ceph osd stat command. The osdmap structure isn't needed anymore. $ ceph status -f json | wc -c 2001 $ ceph osd stat -f json | wc -c 132 $ time ceph status -f json > /dev/null real 0m0.563s user 0m0.526s sys 0m0.036s $ time ceph osd stat -f json > /dev/null real 0m0.457s user 0m0.411s sys 0m0.045s Signed-off-by:
wangxiaotong <wangxiaotong@fiberhome.com> (cherry picked from commit b9cb0f12)
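The kind of check this enables can be sketched as below. The sample is a hypothetical, trimmed payload, and the field layout is an assumption: the counters are taken to sit at the top level of the `ceph osd stat -f json` output, with a fallback for output that nests them under an `osdmap` key.

```python
import json

# Hypothetical, trimmed sample of `ceph osd stat -f json` output.
OSD_STAT_SAMPLE = json.dumps(
    {"epoch": 42, "num_osds": 3, "num_up_osds": 3, "num_in_osds": 3}
)


def all_osds_up(raw):
    """Return True if at least one OSD exists and every OSD is up."""
    stat = json.loads(raw)
    # Assumption: tolerate output that wraps the counters in an "osdmap" key.
    stat = stat.get("osdmap", stat)
    return stat["num_osds"] > 0 and stat["num_osds"] == stat["num_up_osds"]
```

A caller would typically retry this check until it returns true or a timeout is reached.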
-
Guillaume Abrioux authored
In addition to f7e2b2c608eef4bbba47586f1e24d6ade1572758 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 371d854a)
-
- 02 Nov, 2020 1 commit
-
-
Gaudenz Steinlin authored
Otherwise this task fails if no permission is set on the item. Previously the code omitted the mode parameter if it was not set, but this was lost with commit ab370b6a. Signed-off-by:
Gaudenz Steinlin <gaudenz.steinlin@cloudscale.ch> (cherry picked from commit 79ff79c4)
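The "omit the parameter when it is not set" pattern can be sketched in Python as follows. This is an analogy rather than the playbook code: the `item` structure, the `state` value and the function name are hypothetical.

```python
def file_args(item):
    """Build module arguments, leaving out `mode` when the item does not define it."""
    args = {"path": item["path"], "state": "directory"}
    mode = item.get("mode")
    if mode is not None:
        # Only pass `mode` through when it was explicitly set on the item.
        args["mode"] = mode
    return args
```

An item without a `mode` key produces arguments with no `mode` entry at all, instead of passing an undefined value to the task.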
-