Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should the compiler stop supporting type constructors for types whose fields have generic type? #21993

Closed
bradcray opened this issue Mar 28, 2023 · 12 comments

Comments

@bradcray
Copy link
Member

bradcray commented Mar 28, 2023

[This is spun off from discussion on #16508 ] (Lydia edit: to fix the link display)

Currently, when declaring a type:

record R {
  var x;
}

the compiler generates a type constructor with the signature:

proc type R.init(type x) {
  this.x.type == x;  // this is fairly bogus, but hopefully you take my meaning
}

and we have found this type constructor signature to be problematic/confusing/odd in some cases (TODO: link to previous issues). This issue proposes that the compiler essentially not create a well-formed type constructor for such types (or: create an un-callable type constructor) such that the only legal type constraints for such a type are R(?) and R (see #21988 for related discussion of the meaning of these two expressions).

Do this not just for untyped fields like x here, but for all fields whose declared types are generic.

The goal in doing so is to avoid getting into difficult questions like "What should the compiler-generated type initializer be for types like:

record R {
  var c: MyClassType;  // this has a generic management style — what are the implications for the type constructor?
  var a: [] real;  // this has a generic rank, idxType, and domain map — what are the implications for the type constructor?
}

while also serving to improve the user's ability to reason about type signatures that has led to issues like #19120, #21404, and #21989.

Open question: For a type like:

record R {
  type t;
  var x;
}

should the only legal type constructor call be R(?) or should R([t=]int, ?) also be supported? What about:

record R {
  type t;
  var x;
  param p: int;
}

should R(int, ?, 3) be supported? R(int, 3, ?)? R(int, p=3, ?)?

That is, in these examples, should the compiler-generated type constructor accept arguments for all well-formed generic fields and just skip variable fields with generic types? Or perhaps only support the fields up to the point where the first variable field with generic type is encountered?

@bradcray
Copy link
Member Author

Note that we could consider such types to generate a warning or requirement for users to supply a type constructor as proposed in #21992. Or we could take the approach suggested here and just say that there is not usable / complete type constructor unless the user does (which is not the worst thing in the world if one leans on type inference heavily and doesn't want to declare the types of things [precisely]).

@mppf
Copy link
Member

mppf commented Mar 29, 2023

I think this approach is interesting, especially if it applies to both fields like var x; and also fields like var y: myGenericType;. I'd like to understand more about the implications of doing this, though (e.g. by trying to implement it or by studying some examples).

Also, I think that we will need to either apply this strategy — or outright ban fields declared like var x; or var y: myGenericType; — in order to be able to offer language stability until a broader solution is ready (for example #21992).

@mppf
Copy link
Member

mppf commented Jul 28, 2023

I've been looking at not allowing type construction for records like record R { var x; }. This idea appeals to me in theory, but when implementing it, I'm seeing some patterns that give me pause.

Details of patch:

diff --git a/compiler/AST/AggregateType.cpp b/compiler/AST/AggregateType.cpp
index 66a3c5000f..f0bf91353b 100644
--- a/compiler/AST/AggregateType.cpp
+++ b/compiler/AST/AggregateType.cpp
@@ -1280,6 +1280,19 @@ static void checkTypesForInstantiation(AggregateType* at, CallExpr* call, const
     USR_PRINT(call, "type specifier did not match: %s", typeSignature);
     USR_PRINT(call, "generic field '%s' must be instantiated with a type-expression", field->name);
     USR_STOP();
+  } else {
+    // we know field is generic (otherwise this code should not be run)
+    // the conditions above cover 'type' and 'param' fields,
+    // so all this is left here is a field like 'var x;' or 'var y: integral;'
+    if (at->symbol->hasFlag(FLAG_REF)) {
+      // OK, ignore this implementation detail
+    } else {
+      USR_FATAL_CONT(call, "'%s' does not support type construction",
+                     toString(at));
+      USR_PRINT(field, "field '%s' is generic and not a 'type' or 'param'",
+                field->name);
+      USR_STOP();
+    }
   }
 
   if (Type* fieldType = resolveFieldTypeForInstantiation(field, call, callString)) {
diff --git a/modules/packages/LinearAlgebra.chpl b/modules/packages/LinearAlgebra.chpl
index 9615fac6ef..6fa7912f31 100644
--- a/modules/packages/LinearAlgebra.chpl
+++ b/modules/packages/LinearAlgebra.chpl
@@ -1226,7 +1226,7 @@ record _wrap {
 
 @chpldoc.nodoc
 /* Exponentiate by squaring recursively */
-private proc _expBySquaring(x: ?t, n): _wrap(t) {
+private proc _expBySquaring(x: ?t, n): _wrap(?) {
   // TODO -- _expBySquaring(pinv(x), -n);
   if n < 0  then halt("Negative powers not yet supported");
   else if n == 0  then return new _wrap(eye(x.domain, x.eltType));
diff --git a/modules/packages/Sort.chpl b/modules/packages/Sort.chpl
index 7142b95ce4..b89c5a01d9 100644
--- a/modules/packages/Sort.chpl
+++ b/modules/packages/Sort.chpl
@@ -273,7 +273,7 @@ defaultComparator = new DefaultComparator();
    default sort order.
 
  */
-const reverseComparator: ReverseComparator(DefaultComparator);
+const reverseComparator: ReverseComparator(?);
 reverseComparator = new ReverseComparator();
 
 /* Private methods */
diff --git a/test/release/examples/primers/genericClasses.chpl b/test/release/examples/primers/genericClasses.chpl
index 30cc71f316..f1ee40f151 100644
--- a/test/release/examples/primers/genericClasses.chpl
+++ b/test/release/examples/primers/genericClasses.chpl
@@ -47,13 +47,15 @@ writeln("uf = ", uf, ", uf2 = ", uf2);
 //
 // To specify a generic class type (without creating an instance),
 // don't use the ``new`` keyword and just specify the generic arguments.
-// For fields that have no types, specify a type for that field,
-// instead of a value.
+// For value fields that have no type, or a generic type, type construction
+// is not currently supported.
 //
 var taf3: borrowed TypeAliasField(real)?;
 var pf3: borrowed ParamField(3)?;
-var uf3: UntypedField(complex);
 
+// var uf3: UntypedField(complex); -- currently an error
+var uf3: uf.type; // use the same type as uf
+                  // note: var uf3: UntypedField(complex) doesn't compile
 taf3 = taf;
 pf3 = pf;
 uf3 = uf;

In particular, this pattern came up in ChapelHashtable:

  record chpl_TableEntry {
    var status: chpl__hash_status = chpl__hash_status.empty;
    var key;
    var val;
  }
  
  record chpl__hashtable {
    type keyType;
    type valType;

    ...

    var tableSize: int;
    var table: _ddata(chpl_TableEntry(keyType, valType)); // 0..<tableSize

    ...
    
    proc init(type keyType, type valType, ...) { ... }
  }

Now, if we go with the proposal, we can no longer write _ddata(chpl_TableEntry(keyType, valType)), even though writing that type does not imply default initialization of the chpl_TableEntry (and in fact, we avoid default initialization of it, to support non-nilable owned etc).

I thought about several ways to adapt the code, but I found each of them pretty unsatisfying:

  • We could add type keyType; type valType; to record chpl_TableEntry. This is the most straightforward option. However, I am concerned that doing so will add runtime overhead when keyType or valType is an array type, because these have a runtime component. In particular, the table entries will now require space to store the runtime type for each hashtable slot, which seems wasteful.
  • We could seek to avoid the type construction and use a generic type instead -- e.g. with var table: _ddata(chpl_TableEntry(?)). However, that leaves us with chpl__hashtable(int, int) being a generic type, which I find really unsatisfying, since it causes an implementation detail (the type of the _ddata) to leak to users of the chpl__hashtable type.
  • I thought about creating a function to serve as the type constructor for chpl_TableEntry. For example, proc foo(type keyType, type valType) { var key: keyType; var val: valType; return (new chpl_TableEntry(key, val)).type; }. The problem with this is that it assumes that keyType and valType are default-initializable, which is not the case.

I am thinking about alternative solutions to #16508 and I will post about them there.

@bradcray
Copy link
Member Author

bradcray commented Aug 7, 2023

I am thinking about alternative solutions to #16508 and I will post about them there.

I'm writing this comment before moving on to those solutions, just to think about the problem in this context in isolation.

This is what I was imagining we'd do:

We could add type keyType; type valType; to record chpl_TableEntry. This is the most straightforward option.

But I didn't anticipate this:

However, I am concerned that doing so will add runtime overhead when keyType or valType is an array type, because these have a runtime component. In particular, the table entries will now require space to store the runtime type for each hashtable slot, which seems wasteful.

Huh, interesting... I'd always assumed that when we saw record R { var x; }, we were essentially converting it to record R { type _x_type; var x: _x_type; } within the compiler such that any overhead with storing a runtime type using an explicit type field would be there as well when using the shorthand. But doing a quick test, I see you're correct:

record R {
  var x;
}

results in:

typedef struct chpl_R__array_DefaultRectangularArr_1_int64_t_one__real64_int64_t_chpl_s {
  _array_DefaultRectangularArr_1_int64_t_one__real64_int64_t_chpl x;
} R__array_DefaultRectangularArr_1_int64_t_one__real64_int64_t_chpl;

while:

record R {
  type t;
  var x: t;
}

results in:

typedef struct chpl_R__array_DefaultRectangularArr_1_int64_t_one__real64_int64_t_chpl_s {
  chpl___RuntimeTypeInfo3 t;
  _array_DefaultRectangularArr_1_int64_t_one__real64_int64_t_chpl x;
} R__array_DefaultRectangularArr_1_int64_t_one__real64_int64_t_chpl;

Thinking about ways to proceed with the proposal here, could we consider this to be motivation for being able to write a user-defined type initializer for chpl__TableEntry as it exists today? E.g., under dyno, if we had some way of writing:

record R {
  var x;
}

proc type R.init(type x) {
  x.type = x;
}

// or `proc type R.init(type x.type) { }` using John's idea

then would we be back on firm ground? The obvious downside is that, until we supported such type initializers, you'd have to either live with or squash the warning; or adopt the less-space-efficient approach as a means of shoving it aside.

One other observation here that seems important: If I'm writing a hash table over 3-element Chapel arrays (say), I probably don't want to make it take arrays of a given/shared domain (what the runtime type suggests to me), but to take any 3-element array—particularly if we're copying the array value into the map. In which case, the dynamic type of the array isn't really what we want to be characterizing this type by anyway, it's the static type. That brings to mind past conversations about distinguishing between static vs. dynamic type queries using myVar.stype and myVar.dtype and/or potentially having stype keyType; stype valType; type aliases within the record?

[edit: The point being that the type author who prefers documenting their parameterized types directly probably should have a way to opt out of storing these fields as well. Or, put another way, I don't think we should be relying on typeless fields as a way to avoid runtime type storage overheads, or defining the language in a way that requires writing it in that pattern to avoid the overhead].

@mppf
Copy link
Member

mppf commented Aug 8, 2023

The point being that the type author who prefers documenting their parameterized types directly probably should have a way to opt out of storing these fields as well. Or, put another way, I don't think we should be relying on typeless fields as a way to avoid runtime type storage overheads, or defining the language in a way that requires writing it in that pattern to avoid the overhead

Agreed. However I'm not so sure that we need to resolve this for the fall release. The changes you are describing would be additive / non-breaking. And, hypothetically speaking, we could use some kind of pragma or something to opt back in to the type constructor specifically for record chpl_TableEntry in the near term, to avoid impact on hashtable performance.

Thinking about ways to proceed with the proposal here, could we consider this to be motivation for being able to write a user-defined type initializer for chpl__TableEntry as it exists today?

I think that #16508 is better solved by preventing default initialization with the compiler-generated default initializer. (see this comment). Is there some other reason that it would be beneficial to limit type constructors in cases like record R { var x; } ?

The other thing I can imagine is that we'd like to limit type constructors in this way so that record RR { var x: MyGenericType; } or record RRR { var x: MyClassType; } require a user-defined type constructor, so that these types are more obviously generic. Is that the real motivation here, and #16508 is a side issue? I thought that by asking people to write record RR { var x: MyGenericType(?); } starting in the Fall release, we were solving that problem well enough, and then other changes can be additive. Note that, at this point, for record RRR { var x: MyClassType; }, we could start to require a user-defined type constructor as a non-breaking change (because this pattern already emits a warning & will indicate it is unstable after PR #22877).

@bradcray
Copy link
Member Author

bradcray commented Aug 8, 2023

Is there some other reason that it would be beneficial to limit type constructors in cases like record R { var x; } ?

For me, the other main reason is that (over time) it has just seemed a bit odd to support R(x=int) as a valid type signature for record R when it has no type field named x. It feels a bit more like a hack / workaround in the language than something principled.

I'm slightly less offended by R(int) as a type signature if I think of it as matching an anonymous and unnameable type field, as in: record R { type _youCantNameMe; var x: _youCantNameMe; }. But as you point out above, this isn't actually how such cases are translated/interpreted.

So I think my concern beyond #16508 in opening this issue was just that the compiler-generated type constructors for such cases have increasingly seemed pretty weird and funky all around, such that maybe we should try to avoid considering anything that relies upon them as being stable until user-defined type constructors come on-line (where I still see value in supporting typeless fields and calls that create new instances of them like var myR = new R(x=0);... it's just the type constructors that seem particularly odd to me).

@mppf
Copy link
Member

mppf commented Aug 8, 2023

One issue is that the compiler is currently allowing passing named arguments to type constructors even to non-generic fields. For example:

record S {
  type t1;
  type t2;
  var x: (t1, t2);
}

type t = S(x=(int, real));
writeln(t:string);

Getting an error in this case would be good.

@bradcray

This comment was marked as resolved.

@bradcray
Copy link
Member Author

@mppf opened https://github.com/Cray/chapel-private/issues/5228 to explore closing the buggy behavior he mentioned above.

Back on the OP, after discussions with Michael, we thought that maybe the best way forward on this issue was to continue to support positional argument in type constructors/signatures for fields whose types were not complete (such that one could still write R(int) for a record record R { var; } ) but to not accept named arguments for such cases, at least for the time being (e.g., R(x=int); would be illegal). The reasons being: there isn't a member of R named x that takes a type argument and there's too much potential for confusion giving the insinuation that there's both a type and value argument named x. I wondered whether we're relying on this shared name in some way that would break things if we made that change, but Michael's guess was that it would probably only remove existing bugs that we have.

Ultimately, I think having some way of passing types using some form of named argument in cases like this would be attractive, but that could be done post-2.0. One idea I had was to write R(x.type=int) which on one hand feels like the best/clearest way to say what we're trying to say, but on the other hand is a little bit of a syntactic mismash (in that we don't usually permit dot expressions in formal argument lists like this).

@mppf
Copy link
Member

mppf commented Aug 24, 2023

I've created a PR that rules out named argument passing to var / const fields. It passes testing & there were no surprises in testing.

Please see #23067

if we can merge that PR, I think we can close this issue.

mppf added a commit that referenced this issue Aug 28, 2023
…param (#23067)

For #21993 
Resolves Cray/chapel-private#5228

This PR adds a compilation error when a type construction call uses
named argument passing to set the type of a field that is not a `type`
or `param` field. For example:

``` chapel
record R {
  var x;
  var y: integral;
}

type t1 = R(x=real, y=int); // error
```

The reason for this error is twofold:
1. We have observed some nonsensical programs being able to compile, and
this avoids bugs in this area. See below for an example.
2. We are uncertain if the type of `x` should be named `x` in the type
constructor. Perhaps it should have a different name.

Of course, this error can be relaxed in the future as a non-breaking
change.

Here is an example program that this PR causes to fail with an error but
(nonsensically) compiles and runs before this PR:

``` chapel
record S {
  type t1;
  type t2;
  var x: (t1, t2);
}

type t = S(x=(int, real));
writeln(t:string);
```

Reviewed by @vasslitvinov - thanks!

- [x] full comm=none testing
@lydia-duncan
Copy link
Member

The PR mentioned in Michael's most recent comment has been merged. Can we close this issue?

@bradcray
Copy link
Member Author

bradcray commented Apr 3, 2024

I think so, thanks!

@bradcray bradcray closed this as completed Apr 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants