RMA WG 04 12 2018
- Collect opens, assign note taker
- Dave O.: Test/Wait Some Proposal Review
- No roll this meeting
- Follow-up from 3/1: Get feedback/input from users on shmem_put_signal w.r.t. the type of the signal word (e.g., size_t, uint64_t, something else?) and the type of the signal operation (e.g., atomic write, atomic add).
- [Nick] Unfortunately, I'm unable to join the WG this week, but I wanted to provide feedback for the WG's consideration:
For the signal word, I think uint64_t is the preferred type. For the signal operation, I think an atomic write is the primary form of interest, but there is also interest in atomic add. If the WG is open to considering addition of both operations but wants to minimize API expansion, we could consider using a signal_op argument to specify the operation; e.g.,
void shmem_put_signal(TYPE *dest, const TYPE *source, size_t nelems, int signal_op, uint64_t *signal_word, int pe);
where signal_op may be SHMEM_SIGNAL_WRITE or SHMEM_SIGNAL_ADD.
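To make the signal_op idea concrete, here is a minimal single-process sketch in C11. It is not an OpenSHMEM implementation: mock_put_signal is a hypothetical name, dest and signal_word are ordinary local memory rather than symmetric memory on a remote PE, the pe argument is dropped, and the signal_value parameter is an assumption (the signature quoted above does not say what value is written or added). The constant values are arbitrary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Names taken from the proposal; the numeric values are arbitrary (assumption). */
enum { SHMEM_SIGNAL_WRITE = 0, SHMEM_SIGNAL_ADD = 1 };

/*
 * Hypothetical local model of put-with-signal: copy the payload, then update
 * the signal word atomically. Returns 0 on success, -1 for an unknown
 * signal_op (in which case neither dest nor the signal word is modified).
 */
int mock_put_signal(void *dest, const void *source, size_t nbytes,
                    int signal_op, uint64_t signal_value,
                    _Atomic uint64_t *signal_word)
{
    assert(dest != NULL && source != NULL && signal_word != NULL);

    /* Reject unknown operations before touching any memory. */
    if (signal_op != SHMEM_SIGNAL_WRITE && signal_op != SHMEM_SIGNAL_ADD)
        return -1;

    /* Payload first, so the signal update can be ordered after it. */
    memcpy(dest, source, nbytes);

    /* The operation is selected at run time by signal_op. */
    if (signal_op == SHMEM_SIGNAL_WRITE)
        atomic_store_explicit(signal_word, signal_value, memory_order_release);
    else
        atomic_fetch_add_explicit(signal_word, signal_value, memory_order_release);

    return 0;
}
```

With SHMEM_SIGNAL_WRITE the signal word ends up equal to signal_value; with SHMEM_SIGNAL_ADD it acts as a counter, which is the "put with atomic increment" case.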
- The operation of greatest interest is put with signal, i.e., a put followed by an atomic write. Keeping the same API but adding an argument for the operation would allow for put with atomic increment, which would also be very useful.
Adding the operation argument would add a branching instruction in the put call. Is this acceptable overhead?
- Since put is slow and unlikely to be called in a tight loop, the branch overhead should be acceptable
- The total number of supportable operations will be limited based on network support
Open Question: Since OpenSHMEM defines atomic operations to be atomic only in relation to other atomic operations, do I need to use atomic get on the recv side to get the signal word?
- On one hand, a regular read of the signal word may return garbage if the signal is updated with an atomic operation but read without one.
- On the other hand, put with signal itself is not defined as an atomic operation, so a regular read should potentially be fine.
Discussion did not reach a satisfactory answer, so this will be taken up again at the next meeting.
Table 7 in section 9.9 indicates both size_t and ptrdiff_t to be supported types for wait_until*. Does anyone implement this? Should we drop support?
Cons: size_t and ptrdiff_t are more loosely defined types and might be high overhead to implement.
Pros: In reality though, these are probably limited to 64-bit or unsigned 64-bit, and people do tend to use these (?)
Cray does not currently implement these as supported types in wait_until, but willing to add if others also add
Do we want an array of values to replace the single cmp_value so that each value in ivars can be tested against a separate condition?
Cons: memory consumption, more complicated user code to set up the array.
Pros: API may not be useful without separate values if users seldom need to test many values with a single comparison.
How can we consider the tradeoff between looping over many compare values vs. fencing between multiple calls to wait_until?
- In-cache vs. out-of-cache will matter, and this depends on actual use cases
- Need feedback about the use cases for wait_until to determine if the API needs multiple compare values and if so, what kind of performance hit will it take to add them
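For reference while collecting that feedback, the two shapes under discussion can be sketched as follows. This is only an illustration: the function names and the sketch_cmp codes are hypothetical (OpenSHMEM has its own SHMEM_CMP_* constants), the functions do a single non-blocking pass over local memory instead of waiting, and returning nelems for "nothing satisfied" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical comparison codes for this sketch only. */
enum sketch_cmp { SKETCH_CMP_EQ, SKETCH_CMP_NE, SKETCH_CMP_GT,
                  SKETCH_CMP_GE, SKETCH_CMP_LT, SKETCH_CMP_LE };

/* Evaluate "v <cmp> ref"; returns 1 if the condition holds, else 0. */
static int sketch_compare(uint64_t v, enum sketch_cmp cmp, uint64_t ref)
{
    switch (cmp) {
    case SKETCH_CMP_EQ: return v == ref;
    case SKETCH_CMP_NE: return v != ref;
    case SKETCH_CMP_GT: return v >  ref;
    case SKETCH_CMP_GE: return v >= ref;
    case SKETCH_CMP_LT: return v <  ref;
    case SKETCH_CMP_LE: return v <= ref;
    }
    return 0;
}

/* Single cmp_value: index of the first ivar that satisfies the comparison,
 * or nelems if none does. */
size_t sketch_test_any(const uint64_t *ivars, size_t nelems,
                       enum sketch_cmp cmp, uint64_t cmp_value)
{
    assert(ivars != NULL);
    for (size_t i = 0; i < nelems; i++)
        if (sketch_compare(ivars[i], cmp, cmp_value))
            return i;
    return nelems;
}

/* Array of compare values: ivars[i] is tested against cmp_values[i].
 * Costs one extra array read per element compared with the version above. */
size_t sketch_test_any_vector(const uint64_t *ivars, size_t nelems,
                              enum sketch_cmp cmp, const uint64_t *cmp_values)
{
    assert(ivars != NULL && cmp_values != NULL);
    for (size_t i = 0; i < nelems; i++)
        if (sketch_compare(ivars[i], cmp, cmp_values[i]))
            return i;
    return nelems;
}
```

The only difference between the two loops is the additional cmp_values[i] load, which is where the in-cache vs. out-of-cache question comes in.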
What should be done about the odd initialization and output rules for indices?
- Currently, indices must be initialized to anything > nelems
- Out value is not strictly defined: is it left as the in value? Set to nelems exactly?
General consensus seems to be to change this to a mask array with values of 0/1. The original rationale for the in/out values no longer applies, so this can be changed.
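One possible reading of the mask-array idea is sketched below. The notes do not fix the details, so everything here is an assumption: the function name is hypothetical, mask[i] == 1 is taken to mean "test this element" (the opposite polarity is equally plausible), the result is reported in a separate 0/1 array rather than in place, and the comparison is hard-wired to equality to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical single-pass test over ivars using a 0/1 mask.
 *   mask[i] == 1 : ivars[i] is tested against cmp_value (assumed polarity)
 *   mask[i] == 0 : ivars[i] is skipped
 * done[i] is set to 1 if ivars[i] was tested and equals cmp_value, else 0.
 * Returns the number of elements that satisfied the test.
 */
size_t sketch_test_some_masked(const uint64_t *ivars, size_t nelems,
                               const int *mask, uint64_t cmp_value,
                               int *done)
{
    assert(ivars != NULL && mask != NULL && done != NULL);

    size_t count = 0;
    for (size_t i = 0; i < nelems; i++) {
        /* Every output entry is written, so no special initialization
         * of done[] is required from the caller. */
        done[i] = (mask[i] != 0) && (ivars[i] == cmp_value);
        count += (size_t)done[i];
    }
    return count;
}
```

Because each entry of the output is always written as 0 or 1, the caller does not need any "greater than nelems" style initialization, and the output value is defined for every element.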