Replication failure after 389 ds upgrade from 2.0.14 to 2.2.8 #6371

Open
bhr83 opened this issue Oct 20, 2024 · 1 comment
Labels: needs triage (the issue will be triaged during scrum)

Comments


bhr83 commented Oct 20, 2024

The server was initially deployed on SLES SP4 with 389-ds 2.0.17, and replication was created successfully. After upgrading the host OS to SLES SP5 with 389-ds 2.2.8, replication broke.

Before upgrade:
sudo zypper info 389-ds
Information for package 389-ds:

Repository : hostos_sp4
Name : 389-ds
Version : 2.0.17git91.37da5ec-150400.3.34.1
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Support Level : unknown
Installed Size : 15.1 MiB
Installed : Yes
Status : up-to-date
Source package : 389-ds-2.0.17git91.37da5ec-150400.3.34.1.src
Upstream URL : https://pagure.io/389-ds-base
Summary : 389 Directory Server

After upgrade:
zypper info 389-ds
Information for package 389-ds:

Repository : hostos_sp5
Name : 389-ds
Version : 2.2.8git65.347aae6-150500.3.17.1
Arch : x86_64
Vendor : SUSE LLC https://www.suse.com/
Support Level : unknown
Installed Size : 13.3 MiB
Installed : Yes
Status : up-to-date
Source package : 389-ds-2.2.8git65.347aae6-150500.3.17.1.src
Upstream URL : https://pagure.io/389-ds-base
Summary : 389 Directory Server

After the upgrade, when I run:
sudo dsconf -Z -D cn=admin -w ***************** infra1 repl-agmt get --suffix=ou=,o=******** agreement_with_infra3_*********

I am seeing the following error intermittently:

nsds5replicaLastUpdateStatus: Error (10) Problem connecting to replica - LDAP error: Referral (connection error)
nsds5replicaLastUpdateStatusJSON: {"state": "red", "ldap_rc": "10", "ldap_rc_text": "Referral", "repl_rc": "16", "repl_rc_text": "connection error", "date": "2024-10-20T03:19:40Z", "message": "Error (10) Problem connecting to replica - LDAP error: Referral (connection error)"}
nsds5replicaUpdateInProgress: FALSE
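
As a side note, I was thinking the "connection error" part of that status could be narrowed down with a plain root-DSE search from the supplier to the consumer, independent of replication. A minimal sketch, assuming the agreement points at infra3.k2.****:3389 with StartTLS (hostname and port taken from my setup, not from any dsconf output):

# A base-scope search of the root DSE avoids the suffix referral entirely;
# if this hangs or fails, the problem is connectivity/TLS rather than the
# replica state on infra3.
ldapsearch -x -H ldap://infra3.k2.****:3389 -ZZ -s base -b "" "(objectclass=*)" vendorVersion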

The steps followed post-upgrade are:
We run repl-agmt init for the agreements agreement_with_infra2_k2_**** and agreement_with_infra3_k2_****:
dsconf -Z -D cn=admin -w ******** infra1 repl-agmt init --suffix=ou=cee,o=**** agreement_with_infra2_k2_****
dsconf -Z -D cn=admin -w ******** infra1 repl-agmt init --suffix=ou=cee,o=**** agreement_with_infra3_k2_****

After that, we execute the poke commands as below:
infra3 node
dsconf -Z -D cn=admin -w ******** infra3 repl-agmt poke --suffix=ou=cee,o=**** agreement_with_infra2_k2_****
infra1 node
dsconf -Z -D cn=admin -w ******** infra1 repl-agmt poke --suffix=ou=cee,o=**** agreement_with_infra2_k2_****

infra1 node
dsconf -Z -D cn=admin -w ******** infra1 repl-agmt poke --suffix=ou=cee,o=**** agreement_with_infra3_k2_****
infra2 node
dsconf -Z -D cn=admin -w ******** infra2 repl-agmt poke --suffix=ou=cee,o=**** agreement_with_infra3_k2_****

The last poke (infra2 repl-agmt poke) failed with: Error (10) Problem connecting to replica - LDAP error: Referral (connection error)
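
One thing I am not sure about is whether the online init had fully finished before the pokes. If the installed dsconf supports it (the subcommand name here is assumed from the lib389 CLI, not something I have run on these nodes), the initialization state of an agreement can be queried directly:

# Reports whether the total update for this agreement is still running,
# finished successfully, or failed on the supplier side.
dsconf -Z -D cn=admin -w ******** infra1 repl-agmt init-status --suffix=ou=cee,o=**** agreement_with_infra3_k2_****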

After those pokes, ldapsearch works fine on infra1 and infra2, but it fails on infra3:

sudo ldapsearch -x -H ldap://infra3.k2.****:3389 -D "cn=admin" -w *************** -ZZ
Warning: Permanently added 'infra3,193.168.2.27' (ED25519) to the list of known hosts.

Attention! Prototype system. For sure, you are not authorized to login to this system.

# extended LDIF
#
# LDAPv3
# base <> (default) with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# search result
search: 3
result: 10 Referral
matchedDN: ou=cee,o=****
ref: ldap://infra1.k2.:3389/ou%3Dcee%2Co%3D
ref: ldap://infra2.k2.:3389/ou%3Dcee%2Co%3D

# numResponses: 1
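
The whole suffix answering with a referral makes me suspect the backend on infra3 is not in its normal "backend" state. A way to check that (the mapping-tree DN below is assumed from the suffix, and the quoting may need adjusting) would be:

# nsslapd-state should normally be "backend"; a value of "referral" here would
# explain why searches on infra3 get redirected to infra1/infra2.
sudo ldapsearch -x -H ldap://infra3.k2.****:3389 -D "cn=admin" -w *************** -ZZ -b 'cn="ou=cee,o=****",cn=mapping tree,cn=config' -s base nsslapd-state nsslapd-referral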

bhr83 added the needs triage label on Oct 20, 2024

bhr83 commented Oct 20, 2024

After each poke command, we run repl-agmt get to check the replication status; if it is green, we continue with the next one:
dsconf -Z -D cn=admin -w ******** infra3 repl-agmt poke --suffix=ou=cee,o=**** agreement_with_infra2_k2_****

dsconf -Z -D cn=admin -w ******** infra3 repl-agmt get --suffix=ou=cee,o=******* agreement_with_infra2_k2_****** | grep nsds5replicaLastUpdateStatusJSON
The output looks like:
nsds5replicaLastUpdateStatusJSON: {"state": "green", "ldap_rc": "0", "ldap_rc_text": "Success", "repl_rc": "0", "repl_rc_text": "replica acquired", "date": "2024-10-19T08:42:06Z", "message": "Error (0) Replica acquired successfully: agreement disabled

But when I run the same repl-agmt get command again shortly afterwards, I get:
nsds5replicaLastUpdateStatusJSON: {"state": "red", "ldap_rc": "-5", "ldap_rc_text": "Timed out", "repl_rc": "16", "repl_rc_text": "connection error", "date": "2024-10-19T08:43:54Z", "message": "Error (-5) Problem connecting to replica - LDAP error: Timed out (connection error)"

After roughly 1 minute 45 seconds the replication status became red: initially it was shown as green, then it turned red.
Is this the correct approach to verify replication status?
Is there a more effective way to verify replication status?
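
For reference, the alternative I am considering, in case it is more reliable than grepping a single attribute (the monitor subcommand is assumed to exist in this lib389 version; I have not verified it on these nodes):

# Summarizes replica and agreement state across the topology in one call.
dsconf -Z -D cn=admin -w ******** infra1 replication monitor

# Or poll just the JSON "state" field of a given agreement for ~2 minutes:
for i in $(seq 1 12); do
  dsconf -Z -D cn=admin -w ******** infra3 repl-agmt get --suffix=ou=cee,o=**** agreement_with_infra2_k2_**** | grep -o '"state": "[a-z]*"'
  sleep 10
done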
