Unable to use torch.Generator on CUDA #1194

K024 · 2023-12-20T03:29:58Z

Minimal reproduction:

    var generator = new torch.Generator(42, torch.device("cuda"));
    Console.WriteLine(generator.device);
    Console.WriteLine(generator.get_state());
    var distribution = torch.tensor(new float[] {0.1f, 0.2f, 0.3f, 0.4f}, device: torch.device("cuda"));
    var output = torch.multinomial(distribution, num_samples: 1, generator: generator);
    Console.WriteLine(output.ToString(true));

Output:

cuda
[5056], type = Byte, device = cpu
Unhandled exception. System.Runtime.InteropServices.ExternalException (0x80004005): Expected a 'cuda' device type for generator but found 'cpu'
Exception raised from check_generator at /opt/conda/conda-bld/pytorch_1695392067780/work/aten/src/ATen/core/Generator.h:156 (most recent call first):
...(call stack omitted)

This also won't work:

    var generator = new torch.Generator(42, torch.device("cuda"));
    generator.manual_seed(42);
    Console.WriteLine(generator.device);
    Console.WriteLine(generator.get_state());
    generator.set_state(generator.get_state().cuda());

Output:

cuda
[5056], type = Byte, device = cpu
terminate called after throwing an instance of 'c10::TypeError'
  what():  RNG state must be a torch.ByteTensor
Exception raised from check_rng_state at /opt/conda/conda-bld/pytorch_1695392067780/work/aten/src/ATen/core/Generator.h:181 (most recent call first):
...(call stack omitted)

TorchSharp: 0.101.4
libtorch loaded from conda: pytorch 2.1.0 py3.10_cuda12.1_cudnn8.9.2_0

Update:

This issue may be more complicated. The equivalent code works in python/pytorch, and the device of state tensor is exactly cpu with shape [16]. A rolling offset is also used in pytorch.

The text was updated successfully, but these errors were encountered:

NiklasGustafsson · 2023-12-20T16:18:19Z

Okay, thank you for the issue! I'll be taking the rest of the year off after today, and there's no chance of getting a fix into a release before January. A temporary workaround may be to generate random values on CPU and then move the resulting tensor to the CUDA device.

NiklasGustafsson · 2024-01-04T17:45:31Z

@K024:

The bug is pretty obvious -- this was a TODO in the C++ code. I hadn't discovered how to create a CUDA generator, but I believe I know how to, now.

That said, it's going to be more involved than I had hoped. Here's why:

When building the TorchSharp packages, LibTorchSharp (the native / .NET interop layer) is included in the TorchSharp package, not the backend packages, so it has only the APIs that are cross-backend available. The native interop layer links only against torch.dll and torch_cpu.dll (and the corresponding .so and .dylibs), which are available for all backends. There is a certain amount of device generality in those libraries, but most CUDA-specific APIs are not available.

So, for example, the general APIs will allow us to test whether CUDA is available, and it will allow usto get the default CUDA RNG, but not create new ones. There are other CUDA-specific APIs we would like to get to, as well.

In order to address this, LibTorchSharp will have to be built separately for each device type (CPU, CUDA, AMD in the future) and bundled with the backend packages, instead. It is certainly something we can do, but it will take time and effort.

In the meantime, we can have the Generator constructor hook everything up to the default CUDA generator, but that will share state between all such generators. The alternative is what I outlined above: create random tensors on CPU with a custom CPU generator and then move the output to GPU.

K024 changed the title ~~Unable to create a torch.Generator on CUDA~~ Unable to use torch.Generator on CUDA Dec 20, 2023

NiklasGustafsson mentioned this issue Feb 12, 2024

rng_state on cuda #1230

Closed

yueyinqiu mentioned this issue Feb 27, 2024

missing stuffs in torch.nn.utils #1243

Open

16 tasks

yueyinqiu mentioned this issue Jun 9, 2024

The function torch.cuda.empty_cache() is missing. #1327

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unable to use torch.Generator on CUDA #1194

Unable to use torch.Generator on CUDA #1194

K024 commented Dec 20, 2023 •

edited

Loading

NiklasGustafsson commented Dec 20, 2023

NiklasGustafsson commented Jan 4, 2024 •

edited

Loading

Unable to use torch.Generator on CUDA #1194

Unable to use torch.Generator on CUDA #1194

Comments

K024 commented Dec 20, 2023 • edited Loading

NiklasGustafsson commented Dec 20, 2023

NiklasGustafsson commented Jan 4, 2024 • edited Loading

K024 commented Dec 20, 2023 •

edited

Loading

NiklasGustafsson commented Jan 4, 2024 •

edited

Loading