Common production tasks/issues

David edited this page Dec 20, 2016 · 4 revisions

This page lists common production tasks and issues.

Issue: Group membership or content/discussion libraries are not up to date

Users' libraries, such as content, discussion and group memberships, are built from denormalised data that is optimised for the various visibility permutations and for high throughput. In very unlikely circumstances these libraries can get out of sync.

For example, if a user cannot see the groups of which they are a member in their “My groups” list, their memberships library has probably been corrupted.

This can be fixed by completely wiping the user's group memberships library. The next time the library is retrieved, it will be rebuilt on the fly.

Run the following in CQLSH, replacing u:OAE:foobar with the affected user's id:

delete from "LibraryIndex" where "bucketKey" = 'principals:memberships#u:OAE:foobar#public';
delete from "LibraryIndex" where "bucketKey" = 'principals:memberships#u:OAE:foobar#loggedin';
delete from "LibraryIndex" where "bucketKey" = 'principals:memberships#u:OAE:foobar#private';
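
The three statements differ only in the visibility suffix, so they can be generated for any user. The following is a minimal sketch, assuming the user id has the same shape as u:OAE:foobar above; the function name membership_wipe_cql is hypothetical. It only prints the statements, so they can be reviewed before being pasted into CQLSH.

```shell
# Print the CQL that wipes a user's memberships library for all three
# visibility buckets. membership_wipe_cql is a hypothetical helper name.
membership_wipe_cql() {
    principal_id=$1   # e.g. u:OAE:foobar
    for visibility in public loggedin private; do
        printf 'delete from "LibraryIndex" where "bucketKey" = '\''principals:memberships#%s#%s'\'';\n' \
            "$principal_id" "$visibility"
    done
}
```

For example, `membership_wipe_cql u:OAE:foobar` prints the three statements shown above.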

Change a tenancy to use a custom hostname

For example, the University of Fun has the OAE hostname uof.unity.ac but wants to use unity.uof.ac. In the example commands below, unity.uof.ac is a placeholder for the tenancy's new hostname.

1. The client needs to update DNS for their new hostname, like so:

unity.uof.ac CNAME b3f35dd8ae0a0414116b0f178b09cf9a.unity.ac

2. Generate an SSL key and CSR.

# cd /path/to/somewhere/safe/for/keys
# hostname=unity.uof.ac
# openssl genrsa -out ${hostname}.key 2048
# openssl req -new -sha256 -key ${hostname}.key -out ${hostname}.csr
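
Before the CSR leaves our hands it can be worth confirming that it is well formed and carries the intended hostname. The following is a small sketch using standard openssl req options; the function name csr_check is hypothetical, and the exact layout of the subject line varies between OpenSSL versions.

```shell
# Verify the CSR's self-signature and print its subject so the common name
# can be checked by eye. csr_check is a hypothetical helper name.
csr_check() {
    csr_file=$1
    openssl req -in "$csr_file" -noout -verify -subject
}
```

For example, `csr_check ${hostname}.csr` should report a successful verification and a subject containing unity.uof.ac.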

Send the CSR to the client. They use it to buy an SSL certificate, which they then send to us, often as a zip file containing both the certificate and any required intermediate certificates.

# cd /path/to/somewhere/safe/for/keys
# unzip 244306.zip
Archive:  244306.zip
  inflating: unity_uof_ac.crt    
  inflating: RootCertificates/IntermediateCertificate.crt  
  inflating: RootCertificates/RootCertificate.crt  

The root certificate is not required, but the intermediates need to be appended to the final certificate file, after the server certificate.

# cat unity_uof_ac.crt RootCertificates/IntermediateCertificate.crt > server.crt
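
A quick sanity check on the concatenated file can catch two common slips: a missing intermediate, or an input file without a trailing newline, which makes cat fuse an END line and a BEGIN line together. The following is a minimal sketch using only grep; both function names are hypothetical.

```shell
# Count the lines that open a PEM certificate in a bundle file.
count_certs() {
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Succeed if an END marker and a BEGIN marker ended up on the same line,
# which happens when a concatenated file lacked a trailing newline.
has_fused_boundary() {
    grep -q -e '-----END CERTIFICATE----------BEGIN CERTIFICATE-----' "$1"
}
```

For the example above, `count_certs server.crt` should print 2 (the server certificate plus one intermediate), and `has_fused_boundary server.crt` should fail.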

At this point, make certain that the certificate matches the key, as mistakes are sometimes made. If the following command prints more than one line of output, something is wrong.

# (openssl x509 -in server.crt -noout -modulus | openssl md5 ; openssl rsa -in ${hostname}.key -noout -modulus | openssl md5) | uniq
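
The same comparison can be wrapped in a function that reports the result through its exit status instead of relying on counting output lines. This is a sketch of that idea, assuming an unencrypted RSA key as generated in step 2; the function name cert_matches_key is hypothetical.

```shell
# Succeed (exit 0) when the certificate and the RSA key share a modulus,
# fail with 1 on a mismatch and with 2 if either file cannot be read.
# Only the first certificate in the file (the server certificate) is read.
cert_matches_key() {
    cert_file=$1
    key_file=$2
    cert_mod=$(openssl x509 -in "$cert_file" -noout -modulus) || return 2
    key_mod=$(openssl rsa -in "$key_file" -noout -modulus) || return 2
    [ "$cert_mod" = "$key_mod" ]
}
```

For example: `cert_matches_key server.crt ${hostname}.key && echo match || echo MISMATCH`.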

3. Place the certificate and key files into the puppetmaster config (on the puppet server itself, not the puppet-hilary repo) and rename the key to server.key.

# ls -la /etc/puppet/puppet-hilary/environments/production/modules/localconfig/files/ssl/unity.uof.ac/
drwxr-xr-x  2 root root 4096 Dec 15 16:19 ./
drwxr-xr-x 11 root root 4096 Dec 20 16:34 ../
-r--r--r--  1 root root 3659 Dec 15 16:19 server.crt
-r--r--r--  1 root root 1675 Dec 15 16:19 server.key

4. Now update puppet-hilary.

You'll likely want to redirect the old hostname to the new one, so edit modules/nginx/files/redirect_map.conf. It should be obvious how to make this edit.

Update "web_domains_external" in environments/production/hiera/common.json to include the new hostname in the array.

5. Once the PR has been merged into master, stop the puppet service on the live web node.

service puppet stop

Pull the updated puppet-hilary repo on to the puppet server.

cd /etc/puppet/puppet-hilary
git pull

6. Ensure that puppet applies the changes to the backup web node, then restart nginx, checking for errors.

# service nginx restart
 * Stopping Nginx Server...                                                                 [ OK ] 
 * Starting Nginx Server...                                                                 [ OK ] 
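
One way to check for errors before they take the web node down is to have nginx validate its configuration first and only restart if that passes. The following is a minimal sketch using nginx's built-in -t configuration test; the function name safe_nginx_restart is hypothetical.

```shell
# Restart nginx only if the configuration on disk parses cleanly.
# safe_nginx_restart is a hypothetical helper name.
safe_nginx_restart() {
    if nginx -t; then
        service nginx restart
    else
        echo "nginx config test failed; not restarting" >&2
        return 1
    fi
}
```

If the new certificate or key file is missing or malformed, nginx -t reports it and the running server is left untouched.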

Sadly, you can't test the hostname change without updating the hostname within the OAE admin console, which changes it for live users too.

Even so, you can test the redirect and SSL by editing your own hosts file to point the new and old hostnames at the backup server. DO NOT forget to remove the hosts file changes right after the test.
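
As an alternative to editing the hosts file, curl can be told to resolve a hostname to a specific address for a single request, which leaves nothing to clean up afterwards. This is a sketch of that different technique, not the procedure described above; the function name check_via_node is hypothetical, and the backup node's address has to be supplied by you.

```shell
# Request https://<host>/ from a specific web node, bypassing DNS for this
# one call, and print the HTTP status code and any redirect target.
# curl verifies the certificate chain and hostname by default, so a bad or
# incomplete certificate makes the call fail.
# check_via_node is a hypothetical helper name.
check_via_node() {
    host=$1
    node_ip=$2
    curl -sS -o /dev/null \
        --resolve "$host:443:$node_ip" \
        --resolve "$host:80:$node_ip" \
        -w '%{http_code} %{redirect_url}\n' \
        "https://$host/"
}
```

For example, `check_via_node unity.uof.ac <backup node IP>` exercises the new certificate, and running it with the old hostname should show a redirect status and the new hostname as the target, assuming the redirect is served over HTTPS.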

7. Update the tenant's hostname using the admin UI.

Then fail over to the backup web node. On the puppet server, use the dyn_failover tool:

# dyn_failover -s
Applying change: good 37.153.97.142
Live webserver: web0
Live response:  not checked
statelastcheck: ok
statetimestamp: 164401

Wait until DNS is updated, which should take less than 30s:

# dig +short admin.unity.ac
b3f35dd8ae0a0414116b0f178b09cf9a.unity.ac.
37.153.97.142

The IP returned by dig should match the one reported by dyn_failover. Once they match, DNS is ready.
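
Rather than re-running dig by hand, the wait can be scripted. The following is a minimal sketch that polls dig until the expected address appears; the function names, the default of 12 attempts and the 5 second pause are all assumptions, not part of the existing tooling.

```shell
# Succeed if stdin (dig +short output) contains the given IP as a whole line.
answer_has_ip() {
    grep -qxF -- "$1"
}

# Poll DNS until <name> resolves to <ip>, giving up after <tries> attempts.
# wait_for_dns is a hypothetical helper name.
wait_for_dns() {
    name=$1
    ip=$2
    tries=${3:-12}
    while [ "$tries" -gt 0 ]; do
        if dig +short "$name" | answer_has_ip "$ip"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}
```

For example, `wait_for_dns admin.unity.ac 37.153.97.142` returns as soon as dig reports the address that dyn_failover applied.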

8. Test to confirm that the redirect, the new hostname and its SSL certificate are correct.

9. Now start the puppet agent to apply the changes to the live web node. Wait for it to apply the config, then restart nginx, checking for errors.

# service puppet start
 * Starting puppet agent                                                                    [ OK ] 
# ls -l /etc/nginx/oae.conf.d/unity.uof.ac.conf
-rw-r----- 1 nginx nginx 12620 Dec 20 16:31 /etc/nginx/oae.conf.d/unity.uof.ac.conf
# service nginx restart
 * Stopping Nginx Server...                                                                 [ OK ] 
 * Starting Nginx Server...                                                                 [ OK ] 

10. Finally, fail back over to the live web node. On the puppet server:

# dyn_failover -s
Applying change: good 37.153.99.196
Live webserver: web1
Live response:  not checked
statelastcheck: ok
statetimestamp: 164701

That should be it.