- 16 Oct, 2019 8 commits
-
-
Guillaume Abrioux authored
This task is a leftover and no longer needed. It even causes a bug when collocating nfs with mon. Closes: #4609 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit b63bd130)
-
Mike Christie authored
When using python3 the name of the rtslib rpm is python3-rtslib. The packages that use rtslib already have code that detects the python version and distro deps, so drop it from the ceph iscsi gw task list and let the ceph-iscsi rpm dependency handle it. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1760930 Signed-off-by:
Mike Christie <mchristi@redhat.com> (cherry picked from commit ba141298)
-
Dimitri Savineau authored
Due to the 'failed_when: false' statement present in the peer task, the playbook continued to run even when the peer task failed (e.g. with an incorrect remote peer format: "stderr": "rbd: invalid spec 'admin@cluster1'"). This patch adds a task to list the existing peers and adds the peer only if it isn't already present. With this we don't need the failed_when statement anymore. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1665877 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 0b1e9c07)
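A minimal sketch of the idempotent pattern described above; the task names, `pool_name`, and `peer_spec` variables are illustrative, not the actual ceph-ansible tasks:

```yaml
- name: list existing rbd mirror peers
  command: "rbd mirror pool info {{ pool_name }} --format json"
  register: mirror_pool_info
  changed_when: false

- name: add rbd mirror peer only if it is not already configured
  command: "rbd mirror pool peer add {{ pool_name }} {{ peer_spec }}"
  # no failed_when needed: the add only runs when the peer is absent
  when: peer_spec not in mirror_pool_info.stdout
```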
-
Guillaume Abrioux authored
Refactor the mds cluster upgrade code in order to follow the documented recommendation. See: https://github.com/ceph/ceph/blob/master/doc/cephfs/upgrading.rst Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1569689 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 71cebf80)
-
Dimitri Savineau authored
When the iscsi gateway or the ceph configuration file changes, we need to notify the rbd-target-api/gw services to be restarted. This patch also merges the rbd-target-api and rbd-target-gw handlers into a single file using a shared listen topic. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit bc701860)
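A hedged sketch of the merged handler file: both restart handlers subscribe to one listen topic, so a task only has to notify a single name (the topic string here is illustrative, not necessarily the one used in ceph-ansible):

```yaml
# handlers: both services react to the same notification topic
- name: restart rbd-target-api
  service:
    name: rbd-target-api
    state: restarted
  listen: "restart ceph rbd target services"

- name: restart rbd-target-gw
  service:
    name: rbd-target-gw
    state: restarted
  listen: "restart ceph rbd target services"
```

A config or gateway task then uses `notify: restart ceph rbd target services` and both services are restarted.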
-
Dimitri Savineau authored
The common roles don't need to be executed again on each group play (like mons, osds, etc.). We only need to execute them during the first play. That way, we apply the changes on all nodes in parallel instead of doing it once per group. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 68a3dac7)
-
Dimitri Savineau authored
The is_atomic and container_binary facts are already defined in the ceph-facts role, so we don't need dedicated tasks for that before the ceph-facts role execution. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 643b50bd)
-
Guillaume Abrioux authored
There is no need to loop over all mgr nodes to set this fact; it even breaks deployments because it tries to copy all mgr keyrings to all mgrs. Closes: #4602 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit cb802317)
-
- 15 Oct, 2019 4 commits
-
-
Dimitri Savineau authored
We are using multiple listen topics with the handlers, which means we notify four tasks for each handler. Instead we can group the listen on an include_tasks based on the group condition.

Before:

NOTIFIED HANDLER ceph-handler : set _mon_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mon restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mon daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mon_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _osd_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy osd restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph osds daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _osd_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _mds_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mds restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mds daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mds_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _rgw_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy rgw restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph rgw daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _rgw_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _mgr_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy mgr restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph mgr daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _mgr_handler_called after restart for mon0
NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called before restart for mon0
NOTIFIED HANDLER ceph-handler : copy rbd mirror restart script for mon0
NOTIFIED HANDLER ceph-handler : restart ceph rbd mirror daemon(s) for mon0
NOTIFIED HANDLER ceph-handler : set _rbdmirror_handler_called after restart for mon0

After:

NOTIFIED HANDLER ceph-handler : mons handler for mon0
NOTIFIED HANDLER ceph-handler : osds handler for mon0
NOTIFIED HANDLER ceph-handler : mdss handler for mon0
NOTIFIED HANDLER ceph-handler : rgws handler for mon0
NOTIFIED HANDLER ceph-handler : mgrs handler for mon0
NOTIFIED HANDLER ceph-handler : rbdmirrors handler for mon0

Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit fe9c5b8c)
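The grouping above can be sketched as one `include_tasks` handler per daemon type; the file name and listen topic below are illustrative, not necessarily the exact ones in ceph-handler:

```yaml
# handlers/main.yml: one entry point per daemon type instead of
# four separate notified tasks (set before / copy script / restart / set after)
- name: mons handler
  include_tasks: handler_mons.yml
  when: inventory_hostname in groups.get(mon_group_name, [])
  listen: "restart ceph mons"
```

The four per-daemon steps then live inside `handler_mons.yml` and run as regular tasks, so the notification list stays short.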
-
Guillaume Abrioux authored
This commit adds some missing `| bool` filters. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit ccc11cfc)
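For context on why these filters matter: variables passed via extra-vars or the inventory can arrive as strings, and a non-empty string like "false" is truthy in a bare `when` condition. A minimal illustrative example:

```yaml
- name: only run when the feature is really enabled
  debug:
    msg: "feature enabled"
  # without "| bool", a string value such as "false" would still
  # evaluate as truthy and the task would run anyway
  when: my_feature_enabled | bool
```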
-
Guillaume Abrioux authored
This commit merges the two restart tasks into a single one; this way there is one less task to notify. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 411bd07d)
-
Dimitri Savineau authored
The current ceph-validate role uses both validate action and fail module tasks to validate the ceph configuration. The validate action is based on the notario python library. When one of the notario validations fails, a python stack trace is reported to the ansible task; this output isn't understandable by users. This patch removes the validate action and the notario dependency. The validation is now done using only the fail ansible module. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1654790 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 0f978d96)
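A sketch of the fail-module style of validation described above (the specific check is illustrative, not copied from the role); the error message is plain text the user can act on, instead of a notario stack trace:

```yaml
- name: validate osd_objectstore
  fail:
    msg: "osd_objectstore must be either 'bluestore' or 'filestore'"
  when: osd_objectstore not in ['bluestore', 'filestore']
```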
-
- 14 Oct, 2019 1 commit
-
-
Dimitri Savineau authored
This is already done in the main playbooks but absent from the dashboard playbook. The facts are already gathered during the first play of the main playbooks, so we don't need to do it twice. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 5ae7304a)
-
- 11 Oct, 2019 2 commits
-
-
Guillaume Abrioux authored
Delegating to the remote node isn't necessary here since we are already iterating over the right nodes. Closes: #4518 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 16117052)
-
Guillaume Abrioux authored
This commit adds a validation task to prevent from installing an OSD on the same disk as the OS. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1623580 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 80e2d00b)
-
- 10 Oct, 2019 1 commit
-
-
Guillaume Abrioux authored
This commit removes some legacy tasks. These tasks aren't needed; they cause the playbook to fail when collocating daemons. Closes: #4553 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 27341318)
-
- 09 Oct, 2019 2 commits
-
-
Guillaume Abrioux authored
If there is no host available, let's just skip these plays. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1759917 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 0b245bd0)
-
Dimitri Savineau authored
If the mgr dashboard doesn't restart fast enough, the inject dashboard task fails with an HTTP 400 error:

Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 914, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/dashboard/module.py", line 450, in handle_command
    push_local_dashboards()
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 132, in push_local_dashboards
    retry()
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 89, in call
    result = self.func(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 127, in push
    grafana.push_dashboard(body)
  File "/usr/share/ceph/mgr/dashboard/grafana.py", line 54, in push_dashboard
    response.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 834, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
HTTPError: 400 Client Error: Bad Request

Instead we can trigger this task before the module restart. Closes: #4565 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 3f6ff240)
-
- 08 Oct, 2019 2 commits
-
-
Guillaume Abrioux authored
This commit reflects the recent changes in ceph/ceph-build#1406 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit bcaf8ced)
-
Dimitri Savineau authored
When switching from a baremetal deployment to a containerized deployment, we only unmount the OSD data partition. If the OSD is encrypted (dmcrypt: true) then there's an additional partition (part number 5) used for the lockbox and mounted in the /var/lib/ceph/osd-lockbox/ directory. Because this partition isn't unmounted, the containerized OSDs aren't able to start: the partition is still mounted by the host and can't be remounted from the container. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1616159 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 19edf707)
-
- 07 Oct, 2019 11 commits
-
-
Guillaume Abrioux authored
This commit moves this task so the nfs server service is stopped regardless of the deployment type (containerized or non-containerized). Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 6c6a512a)
-
Guillaume Abrioux authored
The syntax here wasn't working; this refactor fixes the task. Also, this removes the `ignore_errors: true` which was hiding the failure. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1508506 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 47034eff)
-
Dimitri Savineau authored
We don't need dedicated variables for the RGW integration into the Ceph Dashboard that have to be manually filled. Instead we can use the current values from the RGW nodes, taking the IP and port of the first RGW instance of the first RGW node via the radosgw_address and radosgw_frontend_port variables. We don't need to specify all RGW nodes; this is done automatically with one node. The RGW API scheme uses the radosgw_frontend_ssl_certificate variable to determine whether the value is http or https. This variable is also reused as a condition for the ssl verify task. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit b9e93ad7)
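The scheme derivation described above can be sketched as a single `set_fact`; the fact name is illustrative, only `radosgw_frontend_ssl_certificate` comes from the commit message:

```yaml
- name: derive the RGW API scheme from the SSL certificate variable
  set_fact:
    # https when a frontend SSL certificate is configured, http otherwise
    dashboard_rgw_api_scheme: "{{ 'https' if radosgw_frontend_ssl_certificate | default('') | length > 0 else 'http' }}"
```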
-
Guillaume Abrioux authored
This commit refactors the way we set the `ceph_uid` fact in `ceph-facts` and removes all `set_fact` tasks for `ceph_uid` in the switch-to-containers playbook to avoid duplicated code. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit fa9b42e9)
-
Guillaume Abrioux authored
As per https://github.com/ceph/ceph-ansible/pull/4323#issuecomment-538420164, using the `find` command should be faster. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1757400 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> Co-Authored-by:
Giulio Fidente <gfidente@redhat.com> (cherry picked from commit c5d0c90b)
-
Dimitri Savineau authored
This patch moves the https dashboard configuration into a dedicated block to avoid multiple occurrences of the dashboard_protocol condition. It also fixes the handling of the dashboard certificate and key variables in the condition introduced by ab54fe20. Those variables aren't booleans but strings, so we can test them via the length filter. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 24976404)
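A sketch of the block-plus-length-filter pattern this patch describes (the destination path is an assumption for illustration; `dashboard_protocol`, `dashboard_crt` and `dashboard_key` are the variables discussed above):

```yaml
- name: configure dashboard TLS from user-provided files
  block:
    - name: copy dashboard certificate
      copy:
        src: "{{ dashboard_crt }}"
        dest: /etc/ceph/ceph-dashboard.crt   # illustrative path
  when:
    - dashboard_protocol == "https"
    # the variables are strings, so test for non-empty rather than truthiness
    - dashboard_crt | length > 0
    - dashboard_key | length > 0
```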
-
Guillaume Abrioux authored
Typical error:

```
fatal: [mon0]: FAILED! =>
  msg: |-
    The conditional check 'not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])' failed.
    The error was: error while evaluating conditional (not delegate_facts_host | bool or inventory_hostname in groups.get(client_group_name, [])): 'client_group_name' is undefined
```

Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 8138d419)
-
Guillaume Abrioux authored
These dependencies aren't needed anymore on recent releases of Fedora. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 7fdf8b62)
-
Guillaume Abrioux authored
This commit excludes client nodes from facts gathering; they are not needed there, and excluding them speeds up this task. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 865d2eac)
-
Guillaume Abrioux authored
This commit adds some missing `| bool` filters. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit ccc11cfc)
-
Guillaume Abrioux authored
Add a missing tag on the ceph-handler role call. Otherwise, we can't use `--tags='ceph_update_config'` to update the ceph configuration file. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1754432 Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit f59dad62)
-
- 04 Oct, 2019 5 commits
-
-
Dimitri Savineau authored
The secondary vagrant variables didn't have the grafana vm variable set, which creates a Vagrant error:

There was an error loading a Vagrantfile. The file being loaded and the error message are shown below. This is usually caused by an invalid or undefined variable.

This patch also changes the ssh-extra-args parameter to ssh-common-args to get the same values for ssh/sftp/scp. Otherwise we can see warnings from ansible and some tasks fail:

[WARNING]: sftp transfer mechanism failed on [mon0]. Use ANSIBLE_DEBUG=1 to see detailed information

It also updates the ssh-common-args value for the rgw-multisite scenario to reflect the ANSIBLE_SSH_ARGS environment variable value. Finally, it changes the IP addresses due to the Vagrant refactor done in commit 778c51a0. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 010158ff)
-
Dimitri Savineau authored
The ceph dashboard tasks didn't use the cluster option if the cluster name isn't the default value. Closes: #4529 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit dd526cfe)
-
Dimitri Savineau authored
The block sections were used with the dashboard_enabled condition when the code was included in the main playbooks. Because this condition isn't present in the dashboard playbook anymore, we can remove the block sections. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit cf47594b)
-
Guillaume Abrioux authored
Because of the current IP address assignment, it's not possible to deploy more than 9 nodes per daemon type. This commit refactors this a bit and allows us to get around this limitation. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 778c51a0)
-
Dimitri Savineau authored
When using the ansible --limit option on one or a few OSD nodes, if the handler is triggered we restart the OSD service on all OSD nodes instead of only the hosts matched by the limit value. Even when the play is limited by --limit, we were using all OSD nodes from the OSD group: with_items: '{{ groups[osd_group_name] }}' Instead we should iterate only on the nodes present in both the OSD group and the limit list. Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 0346871f)
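A sketch of the intersection described above, assuming the magic variable `ansible_play_batch` (the hosts actually selected for the current play, which honors --limit) is used as the limit list; the restart command path is illustrative:

```yaml
- name: restart ceph osds daemon(s)
  command: /usr/bin/env bash /tmp/restart_osd_daemon.sh   # illustrative script path
  delegate_to: "{{ item }}"
  # only restart OSD nodes that are both in the OSD group and
  # in the set of hosts selected by --limit
  with_items: "{{ groups[osd_group_name] | intersect(ansible_play_batch) }}"
```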
-
- 03 Oct, 2019 1 commit
-
-
Dimitri Savineau authored
e695efca introduced a regression in the _radosgw_address fact when using the radosgw_address_block variable. There's no `item` there because we don't use the items lookup; that is only used for _monitor_address with monitor_address_block. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1758099 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit 780cf36a)
-
- 02 Oct, 2019 2 commits
-
-
Guillaume Abrioux authored
There is no need to fetch the different keyrings once per node (n times in total). Adding a `run_once: true` here avoids running a ceph command too many times, which could impact large cluster deployments. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 9bad239d)
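A sketch of the `run_once` pattern this commit applies; the key name and delegation target are illustrative assumptions, not the exact ceph-ansible task:

```yaml
- name: fetch the keyring once instead of once per node
  command: "ceph auth get client.admin"   # illustrative key name
  register: _admin_keyring
  # run a single time on the first monitor rather than on every play host
  run_once: true
  delegate_to: "{{ groups[mon_group_name][0] }}"
  changed_when: false
```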
-
Dimitri Savineau authored
During the rolling_update scenario, the fsid value is retrieved from the current ceph cluster configuration via the ceph daemon config command. This command first tries to resolve the admin socket path via the ceph-conf command. Unfortunately this won't work if there's a duplicate key in the ceph configuration, even though it only produces a warning; as a result the task fails:

Can't get admin socket path: unable to get conf option admin_socket for mon.xxx: warning: line 13: 'osd_memory_target' in section 'osd' redefined

Instead of using ceph daemon, we can use the --admin-daemon option, because we already know the admin socket path based on the ceph cluster name and mon hostname values. Closes: #4492 Signed-off-by:
Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit ec3b687d)
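The fix above can be sketched as a task that builds the socket path itself instead of letting `ceph daemon` resolve it; `cluster` is the ceph-ansible cluster-name variable, and the socket path layout assumes the default /var/run/ceph location:

```yaml
- name: get the current fsid via the admin socket directly
  # --admin-daemon bypasses ceph-conf path resolution, which fails
  # when the ceph configuration contains a duplicate key
  command: >
    ceph --admin-daemon
    /var/run/ceph/{{ cluster }}-mon.{{ ansible_hostname }}.asok
    config get fsid
  register: _current_fsid
  changed_when: false
```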
-
- 01 Oct, 2019 1 commit
-
-
Guillaume Abrioux authored
Check for gpt header when osd scenario is lvm or lvm batch. Signed-off-by:
Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 272d16e1)
-