
RMA WG 09 27 2018

James Dinan edited this page Sep 28, 2018 · 3 revisions

Agenda

  1. Memory model update (Anshuman)
  2. Committee meeting follow-up

Attendees

  • David Ozog, Jim Dinan (Intel)
  • Anshuman Goswami (Nvidia)
  • Naveen, Bob (Cray)
  • Shamis, Pavel (Arm)
  • Gorentla Venkata, Manjunath (Mellanox)
  • Min Si, Huansong Fu (ANL)
  • Grossman, Max (Rice U)

Notes

Dave’s proposal on wait-until-some and test-some APIs

  • (Jim) We could do a practice reading in the next RMA WG meeting.
  • (Min) Sharing PDF before the meeting would be great.

Naveen’s proposal on put-with-signal APIs

  • (Manju) We should not use a union of int64_t and uint64_t.
  • (Pasha) There is no need to change to int64_t if increment is not used, and a union complicates things.
  • (Naveen) Plans to remove the restrict qualifier. In the future, when we want to describe how the dest and src buffers can overlap, we can remove the current description for put-signal and add a generic one.
  • (Jim) We could do a reading about this in the next RMA WG meeting.
  • (Manju) Do not see the upside of combining the two current proposals as Naveen asks.
  • (Jim) We can do a special ballot or read the proposal again if the two are combined.

Anshuman’s ticket on correct pt-to-pt synchronization

  • (Anshuman introduces the background of the ticket) The goal of the ticket is to agree on the list of APIs that are allowed to signal to wait-until and test for p2p synchronization.
  • (Pasha) Why is the MPI operation excluded?
  • (Pasha) We should better distinguish atomic operations from single-copy puts.
  • (Pasha) Different memory fabrics have different single-copy atomicity support.
  • (Anshuman) Should we allow A and C (see issue #248) to conflict?
  • (Pasha) Agree that B should be excluded.
  • (Manju) A needs single-copy atomicity.
  • (Pasha) More homework is needed on the network spec before putting single-copy guarantee on A.
  • (Jim) If other operations are mixed with AMOs on the signal memory location, the behavior should be undefined.
  • (Jim) We do not actually want to pick A/B/C/D for simultaneous use, since the spec already states that the behavior is undefined, which removes the need to support it.
  • (Min) Why not let the implementation choose to optimize the current atomic_set if it wants single-put atomicity?
  • (Anshuman) Then atomic set may no longer be read-modify-write.
  • (Pasha) Single-copy atomicity is not really atomic.
  • (Jim) We should enumerate which combinations have well-defined behavior and see whether shmem_wait can handle each of them.
  • (Jim) A primary concern with B is that the network might write to the same buffer again before the operation actually completes, as in the case of retransmission when a CRC check fails.
  • (Jim) An argument to keep A in the list is that A has been used for signaling for a long time.
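
For readers less familiar with the pattern under discussion, the following is a minimal shared-memory analogy of "deliver data, then signal, then wait-until", written with C11 atomics and POSIX threads rather than OpenSHMEM. It is a sketch under stated assumptions: the names payload, signal_word, wait_until_eq, and run_signal_demo are hypothetical, the release/acquire pair stands in for whatever ordering the spec ends up requiring, and it does not correspond to any particular option (A/B/C/D) in issue #248.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of this is OpenSHMEM API. */
static int64_t payload;               /* data delivered before the signal */
static _Atomic uint64_t signal_word;  /* location the consumer waits on */

/* Producer: write the payload, then update the signal with a single
 * atomic store. The release ordering publishes the payload write. */
static void *producer(void *arg) {
    (void)arg;
    payload = 42;
    atomic_store_explicit(&signal_word, 1, memory_order_release);
    return NULL;
}

/* Consumer side: a local stand-in for a wait-until on the signal word.
 * It only ever reads the location with atomic loads. */
static void wait_until_eq(_Atomic uint64_t *sig, uint64_t value) {
    while (atomic_load_explicit(sig, memory_order_acquire) != value) {
        /* spin until the signal arrives */
    }
}

/* Runs one producer/consumer exchange and returns the payload value
 * the consumer observed after the wait completed. */
int64_t run_signal_demo(void) {
    pthread_t t;
    payload = 0;
    atomic_store(&signal_word, 0);

    int rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);

    wait_until_eq(&signal_word, 1);
    int64_t seen = payload;  /* ordered after the acquire load above */

    pthread_join(t, NULL);
    return seen;
}
```

Calling run_signal_demo() from a main function (built with -pthread where the toolchain requires it) returns the payload value seen after the wait. In this sketch the signal location is touched only by atomic operations, which mirrors the point above that mixing other operations with AMOs on the signal location should be left undefined.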