RMA WG 05 10 2018

Agenda

Collect opens, assign note taker
Updates from proposal leads

Attendees

Did not take roll

Non-blocking AMOs (Naveen):

    * Naveen: Still have no performance numbers - maybe could get some by next meeting
              Do we need performance numbers? We've removed non-fetching already, but would users benefit from having non-blocking fetching AMOs?
      Jim: The argument is pretty good without supporting data.

    * Manju questions: 
         Do you have a handle with each NB AMO?  No
         How does fetching work?  Separate API, context argument, needs a quiet
         Where is value stored?  Similar to NB get with implicit handle, destination argument has been added
         Hardware support?  Aries has some support, 8 bytes only.

    * Advantages:
        Naveen: Overlap is achievable with NB AMOs, API is good for chaining multiple AMOs together
        Jim: higher issue rate, better pipelining

    * Manju: Might not need NB AMOs, don't need to change current API much if there's no real hardware support
      Jim: Everything in libfabric is implicitly non-blocking; for a blocking op we wait right after issuing the call, and the result is returned to a location on the stack.
      Manju: Could move written value to arguments instead of return value
      Jim: User does supply arg value, need to complete with quiet before user can read
      Manju: In NB case, can't reuse buffer, need to wait for quiet
      Jim: NBE stuff could work too, but not with 1.4 spec.

    * Manju: Not convinced there will be a performance difference with non-blocking AMOs... internally doing the same thing.
      Jim: Definitely expect it in our implementation
      Manju: Where is the performance benefit vs current API?
      Jim: Issuing all NB AMOs in a loop then waiting would be far better than waiting on each round-trip
      Manju: Doing a copy, in NB, cannot reuse the buffer until quiet, has to wait anyway
      Jim: A blocking fetch AMO depends on round-trip latency; an NB one doesn't have to wait for it. Blocking ops are locally complete upon return (may not be visible at target). A simple experiment could show the benefit.
             
    * Plan for this PR:
      Naveen: DMAPP has a few difficulties - for instance, passing by reference / by value
              will look at libfabric as well
              could do informal reading next RMA WG (May 24th)
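
Jim's point about issuing all NB AMOs in a loop and then waiting once can be illustrated with a back-of-the-envelope cost model. This is only a sketch, not a measurement: the function names and the parameters (per-op issue overhead, round-trip time) are hypothetical, and it assumes perfect pipelining with no limit on outstanding operations.

```c
#include <stdint.h>

/* Hypothetical cost model, not a benchmark.
 * Blocking fetch AMOs: each op pays its issue overhead plus a full
 * round trip before the next op can be issued. */
uint64_t blocking_cost_ns(uint64_t n_ops, uint64_t issue_ns, uint64_t rtt_ns)
{
    return n_ops * (issue_ns + rtt_ns);
}

/* Non-blocking fetch AMOs followed by one quiet: ops are issued back to
 * back, and only the last one's round trip is exposed (assumes ideal
 * pipelining and unlimited outstanding operations). */
uint64_t nonblocking_cost_ns(uint64_t n_ops, uint64_t issue_ns, uint64_t rtt_ns)
{
    if (n_ops == 0) {
        return 0;
    }
    return n_ops * issue_ns + rtt_ns;
}
```

With made-up values of 100 ops, 50 ns issue overhead and 1000 ns round trip, the model gives 105000 ns for the blocking loop and 6000 ns for the non-blocking one. Real hardware will land somewhere in between, which is why the experiment Jim suggests is still needed.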

Wait/testsome (Dave):

    * Need to make updates to the semantics of the return value in current draft
    * Have new example which does all-to-all task processing
    * completion "status" argument array will be 0/1 of type _Bool
    * Will email Manju about informal reading for the 21st meeting
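
To make the 0/1 status array concrete, here is a minimal local sketch of a testsome-style check. It is not the draft API: the function name, the equality-only comparison, and the choice to return the number of satisfied entries are all assumptions, and the notes above say the return-value semantics are still being revised.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch of a testsome-style check (not the draft API).
 * For each of the n variables, set status[i] to 1 if ivars[i] equals
 * cmp_value and to 0 otherwise. Returns how many entries were satisfied;
 * that return convention is an assumption made for this example. */
size_t test_some_eq_sketch(const long *ivars, size_t n, long cmp_value,
                           bool *status)
{
    size_t nsatisfied = 0;
    for (size_t i = 0; i < n; i++) {
        status[i] = (ivars[i] == cmp_value);
        if (status[i]) {
            nsatisfied++;
        }
    }
    return nsatisfied;
}
```

A caller doing task processing could loop over the status array and handle only the entries marked 1, then call the check again for the rest.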

Wait then set (Nick):

    * Still in the concept stage, need Nick to know more

Put with signal (Naveen/Bob):

    * Naveen: Bob is working on it, planning to have a reading soon in WG
              Put with signal as well as put with increment requested?

      Jim: Put with the signal as an atomic write
           Put with inc could be added to proposal in the future? 
    * Pass signal operator as an input argument?
    * Signal is an atomic op, and wait should block on that.  Does spec allow that?  Yes, in Jim's proposal.

    * Manju: Does wait need flush every read?
      Jim: No, do need a read fence before returning from wait
      Manju: Potential problem operating in 2 different atomic domains
      Jim: One place is consumer, one place is processor (cpu/nic)
      Manju: consistency between different domains could be an issue
      Jim: Not a coherence problem but a consistency issue (write needs to be visible to read).  If the atomics cache is not write-through, shmem_wait needs to do something to update the NIC
      Manju: Specification's memory model may not specify this.  There's also an ordering assumption - put *then* signal.
      Jim: Worst case: put/fence/blocking AMO.  Better case: register both operations simultaneously, makes put_with_signal non-blocking

    * Naveen: Do we need the new shmem_wait semantics on atomics?
      Jim: Hopefully what we need is posted in proposal #204
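
The ordering assumption discussed above (put *then* signal, with a read fence before wait returns) has a close analogue in C11 atomics. The sketch below is a shared-memory stand-in only: the function names are hypothetical, it is not the OpenSHMEM API or the proposal's interface, and it says nothing about how a NIC would implement the same ordering.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared-memory analogy, not the OpenSHMEM API.
 * Copy the payload, then update the signal with release ordering so the
 * payload writes become visible no later than the signal value. */
void put_with_signal_sketch(void *dest, const void *src, size_t nbytes,
                            _Atomic long *sig, long sig_value)
{
    memcpy(dest, src, nbytes);                    /* the "put" */
    atomic_store_explicit(sig, sig_value,         /* the "signal" */
                          memory_order_release);
}

/* Spin until the signal holds sig_value. The acquire load plays the role
 * of the read fence before returning from wait: once it observes the
 * signal, later reads of the payload see the data written before it. */
void wait_for_signal_sketch(_Atomic long *sig, long sig_value)
{
    while (atomic_load_explicit(sig, memory_order_acquire) != sig_value) {
        /* busy-wait */
    }
}
```

In this analogy the consumer does not need a flush on every poll, only acquire semantics on the load that finally observes the signal, which matches Jim's answer to Manju's question.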
