With EnableMD5 set to true, upload during HA (FOFB) fails with "Error: mismatch part etag" #8510

Open
rkomandu opened this issue Nov 6, 2024 · 0 comments
rkomandu commented Nov 6, 2024

Environment info

  • NooBaa Version: VERSION
  • Platform: Kubernetes 1.14.1 | minikube 1.1.1 | OpenShift 4.1 | other: specify

NooBaa d/s RPM: noobaa-core-5.17.1-20241104.el9 (standalone NooBaa)

Actual behavior

Uploaded an object with enablemd5 set to true. During the upload, an HA event moved the CES IP from one node to another. I/O continued while the HA process was in progress, but at the end the upload failed with the error shown below:

"upload failed: ./file_50G to s3://newbucket-ha-reg/file_50G-obj An error occurred (InternalError) when calling the CompleteMultipartUpload operation (reached max retries: 4): We encountered an internal error. Please try again."

However, in noobaa.logs on the node, after all the parts were uploaded, the following etag mismatch was logged:


Nov  5 03:31:57 node-gui0 [3516845]: [nsfs/3516845]    [L0] core.endpoint.s3.ops.s3_put_object_uploadId:: PUT OBJECT PART newbucket-ha-reg file_50G-obj 5098
Nov  5 03:31:58 node-gui0 [3516845]: [nsfs/3516845]    [L0] core.endpoint.s3.ops.s3_put_object_uploadId:: PUT OBJECT PART newbucket-ha-reg file_50G-obj 5108
Nov  5 03:33:07 node-gui0 [3516845]: [nsfs/3516845]    [L0] core.endpoint.s3.ops.s3_put_object_uploadId:: PUT OBJECT PART newbucket-ha-reg file_50G-obj 6390
Nov  5 03:33:07 node-gui0 [3516845]: [nsfs/3516845]    [L0] core.endpoint.s3.ops.s3_put_object_uploadId:: PUT OBJECT PART newbucket-ha-reg file_50G-obj 6400
...
Nov  5 03:33:08 node-gui0 [3516845]: [nsfs/3516845] [ERROR] core.sdk.namespace_fs::  Error: mismatch part etag: {  num: 164,  etag: '96995b58d4cbf6aaa9041b4f00c7f6ae',  md_part_path: '/ibm/fvt_fs/s3user-17001-dir/newbucket-ha-reg/.noobaa-nsfs_6729d23a52a14216974196a6/multipart-uploads/27421b2a-f3ba-4755-9f8e-32cb11d85e85/part-164',  md_part_stat: { dev: 45, ino: 1273543, mode: 33200, nlink: 1, uid: 17001, gid: 17000, rdev: 0, size: 0, blksize: 4194304, blocks: 0, atimeMs: 1730795197969.724, ctimeMs: 1730795197969.724, mtimeMs: 1730795197969.724, birthtimeMs: 1730795197969.724, atime: 2024-11-05T08:26:37.970Z, mtime: 2024-11-05T08:26:37.970Z, ctime: 2024-11-05T08:26:37.970Z, birthtime: 2024-11-05T08:26:37.970Z, atimeNsBigint: 1730795197969723904n, ctimeNsBigint: 1730795197969723904n, mtimeNsBigint: 1730795197969723904n, xattr: { 'security.selinux': 'system_u:object_r:unlabeled_t:s0\x00' } },  params: {    obj_id: '27421b2a-f3ba-4755-9f8e-32cb11d85e85',    bucket: 'newbucket-ha-reg',    key: 'file_50G-obj',    md_conditions: undefined,    multiparts: [      { num: 1, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 2, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 3, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 4, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 5, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 6, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 7, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 8, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 9, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 10, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 11, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 12, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 13, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 14, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 15, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 16, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 17, 
etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 18, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 19, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 20, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 21, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 22, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 23, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 24, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 25, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 26, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 27, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 28, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 29, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 30, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 31, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 32, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 33, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 34, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 35, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 36, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 37, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 38, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 39, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 40, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 41, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 42, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 43, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 44, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 45, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 46, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 47, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 48, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 49, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 50, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 51, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, 
{ num: 52, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 53, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 54, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 55, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 56, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 57, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 58, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 59, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 60, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 61, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 62, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 63, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 64, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 65, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 66, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 67, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 68, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },
...
m: 84, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 85, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 86, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 87, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 88
Error</Code><Message>We encountered an internal error. Please try again.</Message><Resource>/newbucket-ha-reg/file_50G-obj?uploadId=27421b2a-f3ba-4755-9f8e-32cb11d85e85</Resource><RequestId>m347079p-2anugj-1c5s</RequestId></Error> POST /newbucket-ha-reg/file_50G-obj?uploadId=27421b2a-f3ba-4755-9f8e-32cb11d85e85 {"host":"gpfs-p10-s3-ces.rtp.raleigh.ibm.com:6443","accept-encoding":"identity","user-agent":"aws-cli/1.29.62 md/Botocore#1.31.62 ua/2.0 os/linux#5.14.0-427.42.1.el9_4.ppc64le md/arch#ppc64le lang/python#3.9.18 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.31.62","x-amz-date":"20241105T083308Z","x-amz-content-sha256":"b5dd75d3efc4e6519e5cb8f1964de4384b98e6c525d13f90b534b46c45420530","authorization":"AWS4-HMAC-SHA256 Credential=KCxP4AN9937kVqoCrNIs/20241105/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=0aea3ca2daf5c83d13dda43f3894be499a63e300117e67239cb291aa670dd79a","amz-sdk-invocation-id":"65e60f6d-eeb5-4ad9-87ab-999b44c7acd4","amz-sdk-request":"attempt=1","content-length":"568592"} Error: mismatch part etag:
{  num: 164,  etag: '96995b58d4cbf6aaa9041b4f00c7f6ae',  md_part_path: '/ibm/fvt_fs/s3user-17001-dir/newbucket-ha-reg/.noobaa-nsfs_6729d23a52a14216974196a6/multipart-uploads/27421b2a-f3ba-4755-9f8e-32cb11d85e85/part-164',  md_part_stat: { dev: 45, ino: 1273543, mode: 33200, nlink: 1, uid: 17001, gid: 17000, rdev: 0, size: 0, blksize: 4194304, blocks: 0, atimeMs: 1730795197969.724, ctimeMs: 1730795197969.724, mtimeMs: 1730795197969.724, birthtimeMs: 1730795197969.724, atime: 2024-11-05T08:26:37.970Z, mtime: 2024-11-05T08:26:37.970Z, ctime: 2024-11-05T08:26:37.970Z, birthtime: 2024-11-05T08:26:37.970Z, atimeNsBigint: 1730795197969723904n, ctimeNsBigint: 1730795197969723904n, mtimeNsBigint: 1730795197969723904n, xattr: { 'security.selinux': 'system_u:object_r:unlabeled_t:s0\x00' } },  params: {    obj_id: '27421b2a-f3ba-4755-9f8e-32cb11d85e85',    bucket: 'newbucket-ha-reg',    key: 'file_50G-obj',    md_conditions: undefined,    multiparts: [      { num: 1, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 2, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 3, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 4, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 5, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 6, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 7, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 8, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 9, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },  { num: 10, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 11, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 12, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 13, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 14, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 15, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 16, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 17, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 18, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 
19, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 20, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },      { num: 21, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' }, { num: 22, etag: '96995b58d4cbf6aaa9041b4f00c7f6ae' },
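
To make the mismatch entries above easier to triage, here is a minimal sketch that pulls the part number, etag, part path, size, and mtime out of a "mismatch part etag" message. The function name `parse_mismatch` and the regular expressions are illustrative assumptions based only on the log format shown in this issue; they are not part of NooBaa.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Patterns derived from the log text in this issue (assumed format, not a NooBaa API).
_NUM = re.compile(r"mismatch part etag:\s*\{\s*num:\s*(\d+)")
_ETAG = re.compile(r"etag:\s*'([0-9a-f]+)'")
_PATH = re.compile(r"md_part_path:\s*'([^']+)'")
_SIZE = re.compile(r"\bsize:\s*(\d+)")  # \b keeps this from matching "blksize:"
_MTIME = re.compile(r"\bmtimeMs:\s*([\d.]+)")


def _first(pattern, text):
    """Return the first captured group of pattern in text, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None


def parse_mismatch(text: str) -> Optional[dict]:
    """Extract the failing part's details from a 'mismatch part etag' message.

    Pass the whole message as one string: in some entries the '{ num: ...'
    object starts on the line after 'Error: mismatch part etag:'.
    """
    m = _NUM.search(text)
    if m is None:
        return None
    rest = text[m.start():]  # only look at fields that follow the error marker
    size = _first(_SIZE, rest)
    mtime = _first(_MTIME, rest)
    return {
        "num": int(m.group(1)),
        "etag": _first(_ETAG, rest),
        "md_part_path": _first(_PATH, rest),
        "size": int(size) if size is not None else None,
        "mtime_utc": (
            datetime.fromtimestamp(float(mtime) / 1000, tz=timezone.utc)
            if mtime is not None
            else None
        ),
    }
```

Applied to the entry above, this would report part 164 with `size` 0 and an mtime of 08:26:37 UTC, several minutes before the CompleteMultipartUpload request at 08:33:08 UTC (`x-amz-date`). The logged stat for that part file also shows only the `security.selinux` xattr. Whether the empty part file is a result of the failover is not established by these logs.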

Expected behavior

What is the reason for the etag mismatch? The system uses an RR DNS setup, so I/O continues through the HA failover mechanism; the upload should not fail with this error.

Steps to reproduce

1. Upload a large object.
2. While the upload is in progress, generate an assert for the GPFS daemon (this stops all services, starts the GPFS daemon, and then starts the services again).
3. The upload should be successful.

More information - Screenshots / Logs / Other output

I am posting the NooBaa logs from the two protocol nodes, along with the GPFS logs.

https://ibm.ent.box.com/folder/292622193321

@rkomandu rkomandu added the NS-FS label Nov 6, 2024
@naveenpaul1 naveenpaul1 self-assigned this Nov 12, 2024