Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What should happen when casting a c_ptr to string? #22901

Open
arezaii opened this issue Aug 7, 2023 · 8 comments
Open

What should happen when casting a c_ptr to string? #22901

arezaii opened this issue Aug 7, 2023 · 8 comments

Comments

@arezaii
Copy link
Contributor

arezaii commented Aug 7, 2023

Specifically, when casting a c_ptr or c_ptrConst of element type c_char, c_uchar, or c_schar.

Current behavior for c_string is to convert to a string, however in the past we've tried to deprecate casting c_string to string[https://github.com//pull/13942], and recently deprecated the universal casting of anything to string[https://github.com//pull/22068].

In recent work to deprecate c_string[https://github.com//pull/22622], the question once again came up as to what the behavior should be. What should the result of casting a c_ptr or c_ptrConst of element type c_char, c_uchar, or c_schar to string be?

  1. an error
  2. a warning but will print a memory address (they probably meant to create a string, not print the address)
  3. no warning, prints a memory address (what we do today)
  4. converting to a chapel string (what would happen for c_string in the past)

example program:

// foo.chpl
use CTypes;
var x = c"hello";
writeln(x:string);
var y = c_ptrToConst("goodbye");
writeln(y:string);
chpl -s cPtrToLogicalValue=true foo.chpl
$ ./foo
hello
0x7f2c08b46ac8

Where I am leaning towards options 2 or 3. Option 1 would be a breaking change and seems extreme. Option 4 is obviously my least favorite and unlikely to be very popular based on #21052

@mppf
Copy link
Member

mppf commented Aug 7, 2023

I'd lean towards it being a compilation error, but how feasible that is depends on what tests fail when we try it.

@bradcray
Copy link
Member

bradcray commented Aug 8, 2023

If we made it an error (option 1), is the thing a user would need to do be to use one of the string.create*() factory functions to say who owns the memory buffer?

If we supported it (option 4), is the idea that we would make a deep copy of the buffer as we create the string?

From a productivity standpoint, option 1 seems appealing, but from a safety/sanity standpoint, option 4 seems like the better thing to do, at least in the short-term (assuming my guess above is correct).

@arezaii
Copy link
Contributor Author

arezaii commented Aug 8, 2023

If we made it an error (option 4), is the thing a user would need to do be to use one of the string.create*() factory functions to say who owns the memory buffer?

I think you have the option order reversed here, maybe? I think you swapped 1 and 4 based on the descriptions of the behavior.

If we made it an error (option 1), then assuming the user meant to print a string they would use one of the string.create*ingBuffer() methods. If they actually meant to print the memory address we could allow them to cast it to a c_ptr(void) first, and then cast that to a string.

If we supported it (option 1), is the idea that we would make a deep copy of the buffer as we create the string?

Speaking to option 4 (allowing the cast), yes, the cast would basically just be sugar for string.createCopyingBuffer().

However, in the issue I mentioned in the OP #21052, @e-kayrakli expressed a pretty strong aversion to allowing the cast to string to succeed, but maybe I read too much into the comments there.

@bradcray
Copy link
Member

bradcray commented Aug 8, 2023

I think you have the option order reversed here, maybe?

Oops, yes, I edited it to fix.

assuming the user meant to print a string

This issue is just focused on what should happen with the cast itself, though, isn't it? (this was also confusing to me in options 2 and 3 which talk about printing rather than the cast).

In terms of casting, personally, I don't think there's a world in which casting a c pointer to a Chapel string should give you a string containing the pointer's address.

Are there compelling cases in the test/module code that would suggest taking one approach or the other, and how widespread are they?

With respect to printing, note that we could potentially support write()ing a c_ptr*(c_char)) independently of requiring a cast for it. I.e.,

const myCharStar = getCharStarFromC();
writeln(myCharStar);

which might be a nice productivity boost (no need to convert it to a string manually). OTOH, it might be a little weird to do this so specially for char*s...

@arezaii
Copy link
Contributor Author

arezaii commented Aug 9, 2023

This issue is just focused on what should happen with the cast itself, though, isn't it? (this was also confusing to me in options 2 and 3 which talk about printing rather than the cast).

Yes, sorry for mixing the printing in with it, that was just the example where it tends to come up the most... I have this thing, I want to print it to the console, I'll cast it to a string! The result of that cast is what we're really talking about here.

Are there compelling cases in the test/module code that would suggest taking one approach or the other, and how widespread are they?

The majority have been replaced already with the PR to deprecate c_string, because the cast from c_string to string gives a deprecation warning about c_string. The replacement is typically a combination of using string.create*ingBuffer() while casting the c_string to a c_ptrConst(c_char) to avoid a deprecation warning from string.createCopyingBuffer, so from myCstring:string to string.createCopyingBuffer(myCstring:c_ptrConst(c_char)).

I think it would be convenient if we could write a c_ptr[Const] directly like that, but I'm not sure if anyone else agrees that the convenience outweighs the weird specialization.

@mppf
Copy link
Member

mppf commented Aug 9, 2023

If we made it an error (option 1), then assuming the user meant to print a string they would use one of the string.create*ingBuffer() methods. If they actually meant to print the memory address we could allow them to cast it to a c_ptr(void) first, and then cast that to a string.

I don't think myVoidPtr:string should create a string containing the address. I think it should result in compilation error. If you want a string containing the address, IMO you should do something like "%?".format(myVoidPtr).

@bradcray
Copy link
Member

bradcray commented Aug 9, 2023

I'm not sure if anyone else agrees that the convenience outweighs the weird specialization.

I wouldn't describe it as a weird specialization per se. When defining a new type in Chapel, a type author can define how it's printed out by writeln() (formerly by defining .writeThis() for it; now by using the ser/deser framework), and arguably they should to preserve the "anything can be printed out with writeln()" property that Chapel strives for. So arguably the CTypes module, by defining types, should define how they're printed out.

Most of CType's types (c_int, etc.) have Chapel equivalents that Chapel knows about, so automatically get this behavior. c_array and c_ptr don't, so it's reasonable to have them define writeThis overloads (or equivalents in the new world, which I haven't learned yet). And I think it's also reasonable to have some c_ptr types print differently than others (or not support printing at all) since that seems similar to C's special treatment of char* (so natural for a module that's all about making C types easy to use within Chapel).

In saying all this, I'm not saying you need to own it, just saying that I think it's a logical and reasonable thing for CTypes and c_ptr(c_char) to do.

@dlongnecke-cray
Copy link
Contributor

dlongnecke-cray commented Sep 15, 2023

I'm fine with the notion of casting a pointer to a string and having it yield an address. Whatever we choose in that department, I don't think c_ptr(c_char)/c_ptrConst(c_char) should be treated differently than any other type of C pointer (it should still print the address / do whatever else we decide, and not create a string with the contents of the buffer).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants