
redis-server does not start on Canonical Kubernetes #90

Closed
vmpjdc opened this issue Jul 2, 2024 · 6 comments

@vmpjdc

vmpjdc commented Jul 2, 2024

Hi,

I noticed that redis-k8s does not work on Canonical Kubernetes (channel 1.30/beta, revision 64).

Model           Controller                      Cloud/Region            Version  SLA          Timestamp         
stg-netbox-k8s  juju-controller-34-staging-ps6  stg-netbox-k8s/default  3.4.2    unsupported  21:13:30Z         
                                                                                                                        
App        Version  Status   Scale  Charm      Channel        Rev  Address         Exposed  Message             
redis-k8s           waiting      1  redis-k8s  latest/edge     32  10.152.183.100  no       installing agent    
                                                                                                                        
Unit          Workload  Agent  Address     Ports  Message                                                               
redis-k8s/0*  error     idle   10.1.1.63          hook failed: "storage-attached"                                       

Digging into the deployment, I found this:

$ kubectl exec -t -n stg-netbox-k8s redis-k8s-0  -c redis -- head /var/log/redis/redis-server.log
13:C 01 Jul 2024 05:15:06.233 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
13:C 01 Jul 2024 05:15:06.233 * Redis version=7.2.5, bits=64, commit=00000000, modified=0, pid=13, just started
13:C 01 Jul 2024 05:15:06.233 * Configuration loaded
13:M 01 Jul 2024 05:15:06.233 * Increased maximum number of open files to 10032 (it was originally set to 1024).
13:M 01 Jul 2024 05:15:06.233 * monotonic clock: POSIX clock_gettime
13:M 01 Jul 2024 05:15:06.234 * Running mode=standalone, port=6379.
13:M 01 Jul 2024 05:15:06.234 * Server initialized
13:M 01 Jul 2024 05:15:06.240 # Can't open or create append-only dir appendonlydir: Permission denied
18:C 01 Jul 2024 05:15:06.839 * Redis version=7.2.5, bits=64, commit=00000000, modified=0, pid=18, just started
$ _

When the Juju storage is provisioned on a Canonical Kubernetes cluster, the permissions of /var/lib/redis are as follows:

root@redis-k8s-0:/# ls -ld /var/lib/redis/
drwxr-xr-x 4 root root 4096 Jul  2 01:58 /var/lib/redis/
root@redis-k8s-0:/# _

This does not work, because the Pebble plan runs redis-server as the redis user.
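For context, the charm's service definition is along these lines (a hypothetical sketch of a Pebble layer, not the charm's actual code; field names follow the Pebble layer specification). The `user`/`group` fields are why /var/lib/redis must be writable by the redis user:

```python
# Hypothetical sketch of a Pebble layer like the one the charm applies.
# Because "user" is set to "redis", redis-server runs unprivileged and
# cannot create appendonlydir inside a root-owned /var/lib/redis.
layer = {
    "summary": "redis layer",
    "description": "pebble config layer for redis-server",
    "services": {
        "redis": {
            "override": "replace",
            "summary": "redis server",
            "command": "redis-server /etc/redis/redis.conf",
            "user": "redis",
            "group": "redis",
            "startup": "enabled",
        }
    },
}
```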

When deploying on microk8s, the permissions of the provisioned storage are as follows:

root@redis-k8s-0:/# ls -ld /var/lib/redis/
drwxrwxrwx 3 root root 4096 Jul  2 01:36 /var/lib/redis/
root@redis-k8s-0:/# _

and so redis-server is able to create the files and directories it needs:

root@redis-k8s-0:/# ls -l /var/lib/redis/
total 16
drwxr-xr-x 2 redis redis 4096 Jul  2 01:36 appendonlydir
-rw------- 1 redis redis 1895 Jul  2 01:36 ca.crt
-rw------- 1 redis redis 1407 Jul  2 01:36 redis.crt
-rw------- 1 redis redis 1679 Jul  2 01:36 redis.key
root@redis-k8s-0:/# _

This difference most likely arises because microk8s uses microk8s.io/hostpath-provisioner, whereas Canonical Kubernetes uses rawfile.csi.openebs.io.

The latter's defaults are more sensible, and probably closer to what other provisioners (e.g. OpenStack Cinder) do.

I think it would make sense for the charm to ensure that /var/lib/redis is owned by the correct user and group before it attempts to start redis-server.
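The fix could be as simple as a chown of the storage mount before the service is first started. A minimal sketch (`ensure_owned_by` is a hypothetical helper name; in the real charm the equivalent would have to run inside the workload container, e.g. via a Pebble exec of `chown`):

```python
import grp
import os
import pwd


def ensure_owned_by(path: str, user: str = "redis", group: str = "redis") -> bool:
    """Chown *path* to *user*:*group* if needed; return True if changed.

    Hypothetical helper: a charm could apply this to /var/lib/redis
    before first starting redis-server, so the unprivileged redis user
    can create appendonlydir on the freshly provisioned volume.
    """
    uid = pwd.getpwnam(user).pw_uid
    gid = grp.getgrnam(group).gr_gid
    st = os.stat(path)
    if (st.st_uid, st.st_gid) == (uid, gid):
        return False  # already correctly owned; nothing to do
    os.chown(path, uid, gid)
    return True
```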


@reneradoi
Contributor

Hi @faebd7, thank you for reporting the issue.

I have adjusted the user and directory setup in the rock that is used by this charm (see canonical/charmed-redis-rock#7), so hopefully this should work now. Revision 33 of the redis-k8s-operator includes the fix.

Please let us know if other issues arise.

@vmpjdc
Author

vmpjdc commented Jul 4, 2024

The problem is still present in revision 33.

The error below from redis-k8s/0 appears to be due to a separate problem that I am currently investigating. The new unit redis-k8s/1 was created after I ran juju refresh redis-k8s.

stg-netbox@is-bastion-ps6:~$ juju status -m admin/stg-netbox-k8s redis-k8s
Model           Controller                      Cloud/Region            Version  SLA          Timestamp
stg-netbox-k8s  juju-controller-34-staging-ps6  stg-netbox-k8s/default  3.4.2    unsupported  22:01:26Z

App        Version  Status   Scale  Charm      Channel      Rev  Address         Exposed  Message
redis-k8s           waiting      2  redis-k8s  latest/edge   33  10.152.183.100  no       waiting for units to settle down

Unit          Workload  Agent  Address     Ports  Message
redis-k8s/0*  error     idle   10.1.1.164         hook failed: "config-changed"
redis-k8s/1   error     idle   10.1.0.224         hook failed: "config-changed"
stg-netbox@is-bastion-ps6:~$ _
stg-netbox@is-bastion-ps6:~$ juju debug-log -m admin/stg-netbox-k8s  --include redis-k8s/1
unit-redis-k8s-1: 21:57:35 ERROR juju.worker.uniter.operation hook "config-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-redis-k8s-1: 21:57:35 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
unit-redis-k8s-1: 22:02:25 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
unit-redis-k8s-1: 22:02:35 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
unit-redis-k8s-1: 22:02:35 WARNING unit.redis-k8s/1.juju-log 2 containers are present in metadata.yaml and refresh_event was not specified. Defaulting to update_status. Metrics IP may not be set in a timely fashion.
unit-redis-k8s-1: 22:02:35 WARNING unit.redis-k8s/1.juju-log DEPRECATION WARNING - password off, this will be removed on later versions
unit-redis-k8s-1: 22:02:35 INFO unit.redis-k8s/1.juju-log Added updated layer 'redis' to Pebble plan
unit-redis-k8s-1: 22:02:35 ERROR unit.redis-k8s/1.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/./src/charm.py", line 725, in <module>
    main(RedisK8sCharm)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/./src/charm.py", line 211, in _config_changed
    self._update_layer()
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/./src/charm.py", line 354, in _update_layer
    container.restart("redis", "redis_exporter")
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/model.py", line 1893, in restart
    self._pebble.restart_services(service_names)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/pebble.py", line 1638, in restart_services
    return self._services_action('restart', services, timeout, delay)
  File "/var/lib/juju/agents/unit-redis-k8s-1/charm/venv/ops/pebble.py", line 1659, in _services_action
    raise ChangeError(change.err, change)
ops.pebble.ChangeError: cannot perform the following tasks:
- Start service "redis" (cannot start service: exited quickly with code 1)
----- Logs from task 0 -----
2024-07-04T22:02:35Z INFO Service "redis" has never been started.
----- Logs from task 1 -----
2024-07-04T22:02:35Z INFO Service "redis_exporter" has never been started.
----- Logs from task 2 -----
2024-07-04T22:02:35Z INFO Most recent service output:
    
2024-07-04T22:02:35Z ERROR cannot start service: exited quickly with code 1
-----
unit-redis-k8s-1: 22:02:35 ERROR juju.worker.uniter.operation hook "config-changed" (via hook dispatching script: dispatch) failed: exit status 1
unit-redis-k8s-1: 22:02:35 INFO juju.worker.uniter awaiting error resolution for "config-changed" hook
^C
stg-netbox@is-bastion-ps6:~$ _
root@redis-k8s-1:/# head /var/log/redis/redis-server.log 
12:C 04 Jul 2024 21:52:11.069 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
12:C 04 Jul 2024 21:52:11.069 * Redis version=7.2.5, bits=64, commit=00000000, modified=0, pid=12, just started
12:C 04 Jul 2024 21:52:11.069 * Configuration loaded
12:S 04 Jul 2024 21:52:11.070 * Increased maximum number of open files to 10032 (it was originally set to 1024).
12:S 04 Jul 2024 21:52:11.070 * monotonic clock: POSIX clock_gettime
12:S 04 Jul 2024 21:52:11.070 * Running mode=standalone, port=6379.
12:S 04 Jul 2024 21:52:11.070 * Server initialized
12:S 04 Jul 2024 21:52:11.070 # Can't open or create append-only dir appendonlydir: Permission denied
17:C 04 Jul 2024 21:52:11.904 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
17:C 04 Jul 2024 21:52:11.904 * Redis version=7.2.5, bits=64, commit=00000000, modified=0, pid=17, just started
root@redis-k8s-1:/# ls -ld /var/lib/redis/
drwxr-xr-x 3 root root 4096 Jul  4 21:52 /var/lib/redis/
root@redis-k8s-1:/# ls -l /var/lib/redis/
total 28
-rw------- 1 redis redis  1895 Jul  4 21:52 ca.crt
drwx------ 2 root  root  16384 Jul  4 21:51 lost+found
-rw------- 1 redis redis  1407 Jul  4 21:52 redis.crt
-rw------- 1 redis redis  1679 Jul  4 21:52 redis.key
root@redis-k8s-1:/# _

The Juju-created storage volume is mounted on top of the image's own /var/lib/redis, and so the chown during image build has no effect on the final state of the deployment.

@mthaddon
Contributor

mthaddon commented Jul 5, 2024

Reopening the issue, per the above comment.

@reneradoi
Contributor

reneradoi commented Jul 5, 2024

Hi @faebd7, thank you for the feedback. I've now changed the charm itself to make sure the required directories are there, so hopefully it works this time. The new revision, 34, has been published to Charmhub. Please let me know if it works.

@vmpjdc
Author

vmpjdc commented Jul 7, 2024

@reneradoi Looks good in my testing, thank you!
