Bulk release requests are not working if using relative path #7635

Open
ageorget opened this issue Aug 12, 2024 · 4 comments

@ageorget

Hi,

I found that the release process does not work when the release request uses a relative path (without the prefix). This could explain why our ATLAS staging buffer is full most of the time.

To reproduce it, I send a staging request for this file: /atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1

cat stageAtlas.json
{
"files": [
{"path": "/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1","diskLifetime":"PT1H"}
]
}

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/stage" -H  "accept: application/json" -H  "content-type: application/json" -d @stageAtlas.json
{
  "requestId" : "1b72f21e-d66a-4af7-a784-6178a3c3a35c"
}

level=INFO ts=2024-08-12T16:07:39.771+0200 event=org.dcache.frontend.request request.method=POST request.url=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/stage response.code=201 response.reason=Created location=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/stage/1b72f21e-d66a-4af7-a784-6178a3c3a35c socket.remote=[2001:660:5009:84:134:158:239:7]:35504 user-agent=curl/7.29.0 user.dn="CN=1855496286,CN=GEORGET Adrien [email protected],O=Centre national de la recherche scientifique,C=FR,DC=tcs,DC=terena,DC=org" user.mapped=3327:124 request.entity="{\"files\":[{\"path\"[...]fetime\":\"PT1H\"}]}" response.entity="{\n  \"requestId\" : \"1b72f21e-d66a-4a[...]" duration=15
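
The request body shown above can also be generated rather than written by hand. The following is a minimal sketch that only reproduces the JSON layout of `stageAtlas.json` from this report; the function name `stage_body` is hypothetical and not part of dCache.

```python
import json


def stage_body(paths, disk_lifetime="PT1H"):
    """Build a stage request body in the layout of stageAtlas.json.

    `disk_lifetime` is an ISO 8601 duration; "PT1H" is the value used
    in this report.
    """
    files = [{"path": p, "diskLifetime": disk_lifetime} for p in paths]
    return json.dumps({"files": files}, indent=2)


if __name__ == "__main__":
    # Hypothetical short path, used only to keep the example readable.
    print(stage_body(["/atlasmctape/example.pool.root.1"]))
```

The output can be saved to a file and passed to `curl` with `-d @file`, as in the command above.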

Staging succeeds and the file is pinned on the disk cache:

\s pool-atlas-read-li425a rep sticky ls 000098FBFE5589274CABB284DA5BBB379C4B
self : expires 8/12/24, 4:12 PM
PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130 : expires 8/14/24, 4:37 PM

Then I release the file using its relative path:

cat archiveinfo.json
{
"paths": ["/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1"]
}

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c" -H  "accept: application/json" -H  "content-type: application/json" -d @archiveinfo.json

level=INFO ts=2024-08-12T16:10:47.568+0200 event=org.dcache.frontend.request request.method=POST request.url=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c response.code=200 response.reason=OK socket.remote=[2001:660:5009:84:134:158:239:7]:35512 user-agent=curl/7.29.0 user.dn="CN=1855496286,CN=GEORGET Adrien [email protected],O=Centre national de la recherche scientifique,C=FR,DC=tcs,DC=terena,DC=org" user.mapped=3327:124 request.entity="{\"paths\":[\"/atlas[...]68.pool.root.1\"]}" duration=11

After 30 minutes, the pin is still active:

\s pool-atlas-read-li425a rep sticky ls 000098FBFE5589274CABB284DA5BBB379C4B
PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130 : expires 8/14/24, 4:37 PM
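
A check like this one can be scripted when many files need to be verified. The sketch below parses the text printed by `rep sticky ls` and reports whether a PinManager pin remains. It assumes the `<owner> : expires <date>` line format seen in the two outputs in this report, which may not cover every case; the function names are hypothetical.

```python
def sticky_owners(output: str) -> list[str]:
    """Return the owner of each sticky flag in `rep sticky ls` output."""
    owners = []
    for line in output.splitlines():
        # Assumed format: "<owner> : expires <date>".
        owner, sep, _expiry = line.partition(" : ")
        if sep and owner.strip():
            owners.append(owner.strip())
    return owners


def is_pinned(output: str) -> bool:
    """True if any sticky flag is owned by a PinManager pin."""
    return any(o.startswith("PinManager-") for o in sticky_owners(output))


if __name__ == "__main__":
    after_release = (
        "PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130 : "
        "expires 8/14/24, 4:37 PM"
    )
    print(is_pinned(after_release))  # True: the pin survived the release
    print(is_pinned(""))             # False: no sticky flags left
```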

If I instead release the file using its full path, it is unpinned from disk immediately:

cat archiveinfo.json
{
"paths": ["/pnfs/in2p3.fr/data/atlas/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1"]
}

[16:17]:curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c" -H  "accept: application/json" -H  "content-type: application/json" -d @archiveinfo.json

level=INFO ts=2024-08-12T16:17:19.146+0200 event=org.dcache.frontend.request request.method=POST request.url=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c response.code=200 response.reason=OK socket.remote=[2001:660:5009:84:134:158:239:7]:35522 user-agent=curl/7.29.0 user.dn="CN=1855496286,CN=GEORGET Adrien [email protected],O=Centre national de la recherche scientifique,C=FR,DC=tcs,DC=terena,DC=org" user.mapped=3327:124 request.entity="{\"paths\":[\"/pnfs/[...]68.pool.root.1\"]}" duration=27

In the PinManager log:
Aug 12 16:17:20 ccdcamcli08 dcache@PinManagerDomain[129736]: 12 Aug 2024 16:17:20 (PinManager) [BackgroundUnpinner-201460] Unpining [955776409] 000098FBFE5589274CABB284DA5BBB379C4B (1b72f21e-d66a-4af7-a784-6178a3c3a35c) by 3327:124 2024-08-12 16:07:39 to 2024-08-14 16:07:45 is READY_TO_UNPIN on pool-atlas-read-li425a:PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130

[ccdcamcli06] (bulk@bulkDomain) ageorget > \s pool-atlas-read-li425a rep sticky ls 000098FBFE5589274CABB284DA5BBB379C4B
[ccdcamcli06] (bulk@bulkDomain) ageorget > 
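
Since releasing by full path works, one client-side workaround on an unpatched frontend is to prepend the prefix before building the release body. The sketch below is only an illustration of that idea: the prefix `/pnfs/in2p3.fr/data/atlas` is inferred from the two paths in this report, and the function names are hypothetical, not part of dCache.

```python
import json
import posixpath

# Inferred from the full path used in this report; site-specific.
PREFIX = "/pnfs/in2p3.fr/data/atlas"


def to_full_path(path: str, prefix: str = PREFIX) -> str:
    """Prepend `prefix` unless `path` already lies under it."""
    prefix = posixpath.normpath(prefix)
    path = posixpath.normpath(path)
    if path == prefix or path.startswith(prefix + "/"):
        return path
    return posixpath.join(prefix, path.lstrip("/"))


def release_body(paths, prefix: str = PREFIX) -> str:
    """Build a release body in the layout of archiveinfo.json."""
    return json.dumps({"paths": [to_full_path(p, prefix) for p in paths]})


if __name__ == "__main__":
    relative = (
        "/atlasmctape/mc16_13TeV/HITS/e8351_s3126/"
        "mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS."
        "e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1"
    )
    print(release_body([relative]))
```

Applying `to_full_path` to a path that already carries the prefix returns it unchanged, so the helper is safe to call on mixed input.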

The bulk service also does not report when a release request has not been carried out.
Could you check this, please?

Adrien

@DmitryLitvintsev
Member

The same patch I made for staging likely needs to be applied to release.

DmitryLitvintsev self-assigned this Aug 12, 2024
@DmitryLitvintsev
Member

OK. Like last time, I have built an RPM with a patch:

https://drive.google.com/file/d/1mgXibWbUUnqM0WsRclAKh-K8x3awIBkx/view?usp=sharing

Could you deploy it on your frontend door? Before doing so, make sure you have tried it on your test system.

@ageorget
Author

ageorget commented Aug 13, 2024

Thank you, Dmitry, for the quick fix.
I just copied the frontend jar from the RPM, like last time, and unpinning seems to work now:

Aug 13 10:03:04 ccdcamcli08 dcache@PinManagerDomain[129736]: 13 Aug 2024 10:03:04 (PinManager) [bulk PinManagerUnpin] Unpinned 0000D63CB52D45B0404F9AF60A8F8F8DDDE9 (955788379)
Aug 13 10:03:06 ccdcamcli08 dcache@PinManagerDomain[129736]: 13 Aug 2024 10:03:06 (PinManager) [bulk PinManagerUnpin] Unpinned 0000235E8344B2FA40B1A56C2E5F002231C4 (955788682)
Aug 13 10:03:06 ccdcamcli08 dcache@PinManagerDomain[129736]: 13 Aug 2024 10:03:06 (PinManager) [bulk PinManagerUnpin] Unpinned 0000A9031930F46E4E8D8540F22030C07F62 (955789107)
Aug 13 10:03:06 ccdcamcli08 dcache@PinManagerDomain[129736]: 13 Aug 2024 10:03:06 (PinManager) [bulk PinManagerUnpin] Unpinned 000040FBBED815A44A759CE710C56EF80CA3 (955789383)
Aug 13 10:03:07 ccdcamcli08 dcache@PinManagerDomain[129736]: 13 Aug 2024 10:03:07 (PinManager) [bulk PinManagerUnpin] Unpinned 0000536A5798A7474E109E7D02D2DD9D8683 (955788506)
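
To confirm from the log which files were actually unpinned, the `Unpinned` lines can be extracted with a small script. This is a sketch based only on the five log lines above; the assumption that the number in parentheses is the pin ID is mine and is not stated in the log, and the function name is hypothetical.

```python
import re

# Matches e.g. "Unpinned 0000D63CB52D45B0404F9AF60A8F8F8DDDE9 (955788379)".
UNPINNED = re.compile(
    r"\bUnpinned\s+(?P<pnfsid>[0-9A-Fa-f]+)\s+\((?P<pin_id>\d+)\)"
)


def unpinned_entries(log_text: str) -> list[tuple[str, int]]:
    """Return (pnfsid, pin_id) for every 'Unpinned' line in the log text.

    `pin_id` is an assumed interpretation of the parenthesised number.
    """
    return [
        (m["pnfsid"], int(m["pin_id"])) for m in UNPINNED.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Aug 13 10:03:04 ccdcamcli08 dcache@PinManagerDomain[129736]: "
        "13 Aug 2024 10:03:04 (PinManager) [bulk PinManagerUnpin] "
        "Unpinned 0000D63CB52D45B0404F9AF60A8F8F8DDDE9 (955788379)"
    )
    for pnfsid, pin_id in unpinned_entries(sample):
        print(pnfsid, pin_id)
```

Comparing the extracted PNFS IDs with the files named in the release request gives a rough check that every release was carried out.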

@DmitryLitvintsev
Member

Yes. Sorry for all this; it should have been fixed in one go.

DmitryLitvintsev added a commit that referenced this issue Aug 13, 2024
Motivation:
Commit 922ea44 was incomplete: it did not fix the release API.

Modification:
Apply a similar patch to the release resource.

Result:
Release by relative path works.
Issue #7635 addressed.

Target: trunk
Request: 10.x
Request: 9.x
lemora pushed a commit that referenced this issue Aug 14, 2024
Motivation:
Commit 922ea44 was incomplete: it did not fix the release API.

Modification:
Apply a similar patch to the release resource.

Result:
Release by relative path works.
Issue #7635 addressed.

Target: trunk
Request: 10.x
Request: 9.x