Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-incorporate the teams example/tests from OpenSHMEM specification #907

Merged
merged 10 commits into from
Dec 12, 2019
1 change: 1 addition & 0 deletions configure.ac
Original file line number Diff line number Diff line change
Expand Up @@ -821,6 +821,7 @@ AC_CONFIG_FILES([Makefile
test/Makefile
test/unit/Makefile
test/shmemx/Makefile
test/spec-example/Makefile
test/include/Makefile
test/performance/Makefile
test/performance/shmem_perf_suite/Makefile
Expand Down
4 changes: 2 additions & 2 deletions mpp/shmemx.h4
Original file line number Diff line number Diff line change
Expand Up @@ -44,8 +44,8 @@ include(shmemx_c_func.h4)dnl
} /* extern "C" */

define(`SHMEM_CXX_WAIT_UNTIL_ALL',
`static inline void shmemx_wait_until_all($2 *ivars, size_t nelems, int cmp, $2 cmp_value) {
shmemx_$1_wait_until_all(ivars, nelems, cmp, cmp_value);
`static inline void shmemx_wait_until_all($2 *ivars, size_t nelems, int *status, int cmp, $2 cmp_value) {
shmemx_$1_wait_until_all(ivars, nelems, status, cmp, cmp_value);
}')dnl
SHMEM_CXX_DEFINE_FOR_SYNC(`SHMEM_CXX_WAIT_UNTIL_ALL')

Expand Down
2 changes: 1 addition & 1 deletion mpp/shmemx_c_func.h4
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ SHMEM_FUNCTION_ATTRIBUTES void SHPRE()shmemx_pcntr_get_all(shmem_ctx_t ctx, shme

/* Wait_until and test any/all/some point-to-point API, proposed for OpenSHMEM 1.5 */
define(`SHMEM_C_WAIT_UNTIL_ALL',
`SHMEM_FUNCTION_ATTRIBUTES void SHPRE()shmemx_$1_wait_until_all($2 *vars, size_t nelems, int cmp, $2 value);')dnl
`SHMEM_FUNCTION_ATTRIBUTES void SHPRE()shmemx_$1_wait_until_all($2 *vars, size_t nelems, int *status, int cmp, $2 value);')dnl
SHMEM_BIND_C_SYNC(`SHMEM_C_WAIT_UNTIL_ALL')

define(`SHMEM_C_WAIT_UNTIL_ANY',
Expand Down
2 changes: 2 additions & 0 deletions sandia-openshmem.spec.in
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ install -d %{testdir}/unit
install test/unit/.libs/* %{testdir}/unit
install -d %{testdir}/shmemx
install test/shmemx/.libs/* %{testdir}/shmemx
install -d %{testdir}/spec-example
install test/spec-example/.libs/* %{testdir}/spec-example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am impressed that you remembered to update the spec file! 💯

install -d %{testdir}/apps
install test/apps/.libs/* %{testdir}/apps
install -d %{testdir}/performance/shmem_perf_suite
Expand Down
18 changes: 14 additions & 4 deletions src/synchronization_c.c4
Original file line number Diff line number Diff line change
Expand Up @@ -183,20 +183,30 @@ SHMEM_BIND_C_SYNC(`SHMEM_DEF_WAIT_UNTIL')

#define SHMEM_DEF_WAIT_UNTIL_ALL(STYPE,TYPE) \
void SHMEM_FUNCTION_ATTRIBUTES \
shmemx_##STYPE##_wait_until_all(TYPE *vars, size_t nelems, int cond, TYPE value) \
shmemx_##STYPE##_wait_until_all(TYPE *vars, size_t nelems, \
int *status, int cond, TYPE value) \
{ \
SHMEM_ERR_CHECK_INITIALIZED(); \
SHMEM_ERR_CHECK_SYMMETRIC(vars, sizeof(TYPE)); \
SHMEM_ERR_CHECK_CMP_OP(cond); \
\
size_t i; \
size_t i = 0, num_ignored = 0; \
\
if (nelems == 0) { \
if (status) { \
for (i = 0; i < nelems; i++) { \
if (status[i]) num_ignored++; \
} \
} \
if (nelems == 0 || num_ignored == nelems) { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a unit test that exercises the num_ignored == nelems case?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I don't think so.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In such a unit test, we should just check whether wait_until_all returns, right? There is no need to check the completion of the AMO since the wait set is empty.

Looking at the spec, there is one sentence at the end of the API description for wait_until_all which may not hold true in these cases.

Implementations must ensure that shmem_wait_until_all does not return before the update of the memory indicated by ivars is fully complete.

Do we have to add an unless clause indicating if the status array does not include all PEs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this sentence cover the issue?

If all elements in status are set to 1 or nelems is 0, the wait set is empty and this routine returns immediately.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does. But the last sentence in the API description is not covering the special case. I guess it was confusing to me since that line is written as a separate paragraph. It might be easier to merge this sentence to the end of the first paragraph.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see the confusion. It might be correct, since there is no "update of the memory" in this special case right? But yeah, it could use some clarification - if I understand correctly, your suggestion could be a doc edit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely doc edit should be sufficient. Similar edits may go to any/some/vector APIs as well.

shmem_transport_probe(); \
return; \
} \
\
for (i = 0; i < nelems; i++) SHMEM_INTERNAL_WAIT_UNTIL(&vars[i], cond, value); \
for (i = 0; i < nelems; i++) { \
if (status == NULL || !status[i]) { \
SHMEM_INTERNAL_WAIT_UNTIL(&vars[i], cond, value); \
} \
} \
\
shmem_internal_membar_load(); \
shmem_transport_syncmem(); \
Expand Down
2 changes: 1 addition & 1 deletion test/Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,4 @@
# information, see the LICENSE file in the top level directory of the
# distribution.

SUBDIRS = unit shmemx performance apps include
SUBDIRS = unit shmemx performance apps spec-example include
13 changes: 0 additions & 13 deletions test/shmemx/Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -24,30 +24,17 @@ check_PROGRAMS += \
endif

if SHMEMX_TESTS

check_PROGRAMS += \
perf_counter \
fadd_nbi \
atomic_nbi \
put_signal \
put_signal_nbi \
shmemx_wait_until_all \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shmemx_wait_until_all looks like it's a little different than the spec example (it uses shmem_p instead of shmem_atomic_set). Do we want to update that in this PR or separately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it looks like all the wait/test any/all/some routines have this problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was updated upstream in the spec (wait/test clarification proposal). One motivation for separating out the spec tests is that this should make it easier for us to sync with the upstream sources.

shmemx_wait_until_any \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shmemx_wait_until_any also looks like it uses shmem_p instead of shmem_atomic_set...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixed upstream in the OpenSHMEM spec. One reason for this change is to make it easier for us to pull in updates from the spec.

shmemx_wait_until_some \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shmemx_wait_until_some also looks like it uses shmem_p instead of shmem_atomic_set...

shmemx_test_all \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe shmemx_test_all is actually a part of the spec, if that matters...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it looks like shmemx_test_all is another case of using shmem_p instead of shmem_atomic_set.

shmemx_test_any \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shmemx_test_any is another case of using shmem_p instead of shmem_atomic_set...

shmemx_test_some \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shmemx_test_some is another case of using shmem_p instead of shmem_atomic_set.

(Yup, they all have that problem - my bad).

c11_test_shmemx_wait_until \
c11_test_shmemx_test \
shmemx_team_split_strided \
shmemx_team_translate \
shmemx_team_reuse_teams \
shmemx_team_sync \
shmemx_team_broadcast \
shmemx_team_collect \
shmemx_team_collect_active_set \
shmemx_team_alltoall \
shmemx_team_alltoalls \
shmemx_team_shared \
shmemx_team_b2b_collectives \
c11_shmemx_team_collective_types \
Expand Down
3 changes: 2 additions & 1 deletion test/shmemx/c11_test_shmemx_wait_until.c
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,9 @@
static TYPE remote = 0; \
const int mype = shmem_my_pe(); \
const int npes = shmem_n_pes(); \
int status = 0; \
shmem_p(&remote, (TYPE)mype+1, (mype + 1) % npes); \
shmemx_wait_until_all(&remote, 1, SHMEM_CMP_NE, 0); \
shmemx_wait_until_all(&remote, 1, &status, SHMEM_CMP_NE, 0); \
if (remote != (TYPE)((mype + npes - 1) % npes)+1) { \
printf("PE %i received incorrect value with " \
"TEST_SHMEM_WAIT_UNTIL_ALL(%s)\n", mype, #TYPE); \
Expand Down
68 changes: 68 additions & 0 deletions test/spec-example/Makefile.am
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
# -*- Makefile -*-
#
# Copyright 2011 Sandia Corporation. Under the terms of Contract
# DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
# retains certain rights in this software.
#
# Copyright (c) 2017 Intel Corporation. All rights reserved.
# This software is available to you under the BSD license.
#
# This file is part of the Sandia OpenSHMEM software package. For license
# information, see the LICENSE file in the top level directory of the
# distribution.

check_PROGRAMS = \
shmem_ctx_pipelined_reduce

if HAVE_OPENMP
check_PROGRAMS += \
shmem_ctx
endif

if SHMEMX_TESTS
check_PROGRAMS += \
shmemx_wait_until_all \
shmemx_wait_until_any \
shmemx_wait_until_some \
shmemx_test_any \
shmemx_test_some \
shmemx_team_split_strided \
shmemx_team_split_2D \
shmemx_team_translate \
shmemx_team_context \
shmemx_team_sync \
shmemx_team_broadcast \
shmemx_team_collect \
shmemx_reduce \
shmemx_team_alltoall \
shmemx_team_alltoalls

endif SHMEMX_TESTS

TESTS = $(check_PROGRAMS)

NPROCS ?= 2
LOG_COMPILER = $(TEST_RUNNER)

AM_LDFLAGS = $(LIBTOOL_WRAPPER_LDFLAGS)

if EXTERNAL_TESTS
bin_PROGRAMS = $(check_PROGRAMS)
AM_CPPFLAGS = -I$(top_srcdir)/test/include
AM_FCFLAGS =
LDADD =
else
AM_CPPFLAGS = -I$(top_builddir)/mpp -I$(top_srcdir)/mpp -I$(top_srcdir)/test/include
AM_FCFLAGS = -I$(top_builddir)/mpp
LDADD = $(top_builddir)/src/libsma.la
endif

if USE_PMI_SIMPLE
LDADD += $(top_builddir)/pmi-simple/libpmi_simple.la
endif

if SHMEMX_TESTS
AM_CPPFLAGS += -DENABLE_SHMEMX_TESTS
endif

shmem_ctx_CFLAGS = $(AM_OPENMP_CFLAGS)
File renamed without changes.
64 changes: 64 additions & 0 deletions test/spec-example/shmemx_reduce.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
/*
* This test program is derived from an example program in the
* OpenSHMEM specification.
*/

#include <stdio.h>
#include <stdlib.h>
#include <shmem.h>
#include <shmemx.h>

long recv_a_value(unsigned seed, int npes);
unsigned char is_valid(long value, int npes);

/* As if we receive some value from external source */
long recv_a_value(unsigned seed, int npes) {
srand(seed);
return rand() % npes;
}

/* Validate the value we recieved */
unsigned char is_valid(long value, int npes) {
if (value > (npes-1))
return 0;
return 1;
}

int main(void)
{

shmem_init();
int me = shmem_my_pe();
int npes = shmem_n_pes();
size_t num = 32;

long *values = shmem_malloc(num * sizeof(long));
long *sums = shmem_malloc(num * sizeof(long));

unsigned char *valid_me = shmem_malloc(num * sizeof(unsigned char));
unsigned char *valid_all = shmem_malloc(num * sizeof(unsigned char));

values[0] = recv_a_value((unsigned)me, npes);
valid_me[0] = is_valid(values[0], npes);

for (size_t i=1; i < num; i++) {
values[i] = recv_a_value((unsigned)values[i-1], npes);
valid_me[i] = is_valid(values[i], npes);
}

/* Wait for all PEs to initialize reductions arrays */
shmemx_sync(SHMEMX_TEAM_WORLD);

shmemx_uchar_and_reduce(SHMEMX_TEAM_WORLD, valid_all, valid_me, num);
shmemx_long_sum_reduce(SHMEMX_TEAM_WORLD, sums, values, num);

for (size_t i=0; i < num; i++) {
if (valid_all[i]) {
printf ("[%zu] = %ld\n", i, sums[i]);
}
else {
printf ("[%zu] = invalid on one or more pe\n", i);
}
}
shmem_finalize();
}
132 changes: 132 additions & 0 deletions test/spec-example/shmemx_team_context.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
/*
* This test program is derived from an example program in the
* OpenSHMEM specification.
*/

#include <shmem.h>
#include <shmemx.h>
#include <stdio.h>

int isum, ival2, ival3;

int my_ctx_translate_pe(shmem_ctx_t, int, shmem_ctx_t);
shmem_ctx_t my_team_create_ctx(shmemx_team_t);
void my_send_to_neighbor(shmem_ctx_t, int *);

int my_ctx_translate_pe(shmem_ctx_t src_ctx, int src_pe, shmem_ctx_t dest_ctx)
{
if (src_ctx == SHMEMX_CTX_INVALID) {
return -1;
}
if (dest_ctx == SHMEMX_CTX_INVALID) {
return -1;
}

shmemx_team_t src_team, dest_team;
shmemx_ctx_get_team(src_ctx, &src_team);
shmemx_ctx_get_team(dest_ctx, &dest_team);
return shmemx_team_translate_pe(src_team, src_pe, dest_team);
}

shmem_ctx_t my_team_create_ctx(shmemx_team_t team) {
if (team == SHMEMX_TEAM_INVALID) {
return SHMEMX_CTX_INVALID;
}

shmem_ctx_t ctx;
if (shmemx_team_create_ctx(team, 0, &ctx) != 0) {
fprintf (stderr, "Failed to create context for PE team\n");
return SHMEMX_CTX_INVALID;
}
return ctx;
}

void my_send_to_neighbor(shmem_ctx_t ctx, int *val)
{
if (ctx == SHMEMX_CTX_INVALID) {
fprintf (stderr, "Send to neighbor fail due to invalid context\n");
return;
}

shmemx_team_t team;
shmemx_ctx_get_team(ctx, &team);
int pe = shmemx_team_my_pe(team);
int npes = shmemx_team_n_pes(team);
int rpe = (pe + 1) % npes;

// put my pe number in the buffer on my right hand neighbor
shmem_ctx_int_put(ctx, val, &pe, 1, rpe);
}



int main(int argc, char** argv)
{
shmem_init();

int npes = shmem_n_pes();
isum = 0;

shmemx_team_t team_2s, team_3s;
shmem_ctx_t ctx_2s, ctx_3s;
shmemx_team_config_t conf;
conf.num_contexts = 1;
long cmask = SHMEMX_TEAM_NUM_CONTEXTS;

if (npes < 4) {
fprintf(stderr, "ERR - Requires at least 4 PEs\n");
shmem_finalize();
return 0;
}

// Create team with PEs numbered 0, 2, 4, ...
shmemx_team_split_strided(SHMEMX_TEAM_WORLD, 0, 2, npes / 2, &conf, cmask, &team_2s);
// Sync between splits from same parent team into teams with overlapping membership
shmemx_team_sync(SHMEMX_TEAM_WORLD);
// Create team with PEs numbered 0, 3, 6, ...
shmemx_team_split_strided(SHMEMX_TEAM_WORLD, 0, 3, npes / 3, &conf, cmask, &team_3s);

ctx_2s = my_team_create_ctx(team_2s);
ctx_3s = my_team_create_ctx(team_3s);

// Send some values using the two team contexts contexts
ival2 = 2;
ival3 = 3;
my_send_to_neighbor(ctx_2s, &ival2);
my_send_to_neighbor(ctx_3s, &ival3);

// Quiet all contexts and synchronize all PEs to complete the data transfers
shmem_ctx_quiet(ctx_2s);
shmem_ctx_quiet(ctx_3s);
shmemx_team_sync(SHMEMX_TEAM_WORLD);

// We will add up some results on pe 4 of team_3s using ctx_2s
if ((team_3s != SHMEMX_TEAM_INVALID) && (team_2s != SHMEMX_TEAM_INVALID)) {
int _pe4_of_3s_in_2s = my_ctx_translate_pe(ctx_3s, 4, ctx_2s);

if (_pe4_of_3s_in_2s < 0) {
fprintf (stderr, "Fail to translate pe 4 from 3s context to 2s context\n");
}
else {
// Add up the results on pe 4 of the 3s team, using the 2s team context
shmem_ctx_int_atomic_add(ctx_2s, &isum, ival2 + ival3, _pe4_of_3s_in_2s);
}
}

// Quiet the context and synchronize PEs to complete the operation
shmem_ctx_quiet(ctx_2s);
shmemx_team_sync(SHMEMX_TEAM_WORLD);

if (shmemx_team_my_pe(team_3s) == 4) {
printf ("The total value on PE 4 of the 3s team is %d\n", isum);
}

// Destroy contexts before teams
shmem_ctx_destroy(ctx_2s);
shmemx_team_destroy(team_2s);

shmem_ctx_destroy(ctx_3s);
shmemx_team_destroy(team_3s);

shmem_finalize();
}
Loading