Exception not thrown with CPU Accelerator #965

Open
pavlexander opened this issue Mar 13, 2023 · 2 comments

@pavlexander

pavlexander commented Mar 13, 2023

description

An exception does not seem to be thrown when executing the code on the CPU accelerator.

For comparison, here is what it looks like when the error occurs on the CPU accelerator:

image

Note that no error message is written to the console, nor is there any indication that an exception was thrown at all. From the user's perspective it looks like a breakpoint was hit.

And here it is with the CUDA accelerator:

image

The error is reported both on the console and by the Synchronize method.

The only way I have found to see the exception on the CPU accelerator is to disable "Just my code" debugging in the Visual Studio configuration:

image

I normally keep "Just my code" enabled, because turning it off loads a lot of symbols and makes debugging take much longer. With it turned off, the result is:

image

Note that there is a local variable that contains an error message. However, the code runs and finishes just fine, with no errors in the console.

suggestion

I would like to suggest two improvements and ask one question:

  1. Make the CPU accelerator's error reporting independent of the "Just my code" debug option. I would like to be able to see the exception without having to change this option.
  2. Write the error message to the console on the CPU accelerator as well, for consistency. Since it is written to the console when the code runs on the GPU, shouldn't the CPU behave the same way?
  3. This third item is not really a suggestion but a question: is it possible to execute the CPU code in parallel on multiple threads, similar to how Parallel.For lets you set MaxDegreeOfParallelism, i.e. new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount - 1 }? I have not found a way to do this in ILGPU. Then again, I guess it is not really required, since the CPU version of the code is for debugging only... or maybe not.

code

using ILGPU;
using ILGPU.Runtime;
using ILGPU.Runtime.CPU;
using ILGPU.Runtime.Cuda;
using System;

namespace TestingILGpuException
{
    internal class Program
    {
        static void Kernel(Index1D i, ArrayView<int> data, ArrayView<int> output)
        {
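            // Deliberately out of bounds for the last index (i + 1 == output.Length),
            // to trigger the error discussed in this issue.
            output[i + 1] = data[i % data.Length];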
        }

        static void Main(string[] args)
        {
            Console.WriteLine("Starting");

            // Initialize ILGPU.
            Context context = Context.CreateDefault();
            //Accelerator accelerator = context.CreateCudaAccelerator(0);
            Accelerator accelerator = context.CreateCPUAccelerator(0);

            // Load the data.
            MemoryBuffer1D<int, Stride1D.Dense> deviceData = accelerator.Allocate1D(new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 });
            MemoryBuffer1D<int, Stride1D.Dense> deviceOutput = accelerator.Allocate1D<int>(10_000);

            // load / precompile the kernel
            Action<Index1D, ArrayView<int>, ArrayView<int>> loadedKernel =
                accelerator.LoadAutoGroupedStreamKernel<Index1D, ArrayView<int>, ArrayView<int>>(Kernel);

            // finish compiling and tell the accelerator to start computing the kernel
            loadedKernel((int)deviceOutput.Length, deviceData.View, deviceOutput.View);

            // wait for the accelerator to be finished with whatever it's doing
            // in this case it just waits for the kernel to finish.
            accelerator.Synchronize();

            // moved output data from the GPU to the CPU for output to console
            int[] hostOutput = deviceOutput.GetAsArray1D();

            for (int i = 0; i < 50; i++)
            {
                Console.Write(hostOutput[i]);
                Console.Write(" ");
            }

            accelerator.Dispose();
            context.Dispose();

            Console.WriteLine();
            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
}

environment

Microsoft Visual Studio Community 2022
Version 17.5.1
VisualStudio.17.Release/17.5.1+33424.131
Microsoft .NET Framework
Version 4.8.04084

Installed Version: Community

Visual C++ 2022   00482-90000-00000-AA961
Microsoft Visual C++ 2022

ASP.NET and Web Tools   17.5.317.37931
ASP.NET and Web Tools

AvaloniaPackage Extension   1.0
AvaloniaPackage Visual Studio Extension Detailed Info

Azure App Service Tools v3.0.0   17.5.317.37931
Azure App Service Tools v3.0.0

Azure Functions and Web Jobs Tools   17.5.317.37931
Azure Functions and Web Jobs Tools

C# Tools   4.5.0-6.23123.11+271ccd71554f7d28d2f90551aafd0bdeb5d327aa
C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Common Azure Tools   1.10
Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.

Edit Project   1.7.72
An open source Visual Studio extension to add the context menu for editing project/solution file.

File Icons   2.7
Adds icons for files that are not recognized by Solution Explorer

Microsoft JVM Debugger   1.0
Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines

NuGet Package Manager   6.5.0
NuGet Package Manager in Visual Studio. For more information about NuGet, visit https://docs.nuget.org/

Razor (ASP.NET Core)   17.5.2.2307502+8b3141d86b738daf2ca3ed9c15b12513071fc676
Provides languages services for ASP.NET Core Razor.

SQL Server Data Tools   17.2.40118.0
Microsoft SQL Server Data Tools

Test Adapter for Boost.Test   1.0
Enables Visual Studio's testing tools with unit tests written for Boost.Test.  The use terms and Third Party Notices are available in the extension installation directory.

Test Adapter for Google Test   1.0
Enables Visual Studio's testing tools with unit tests written for Google Test.  The use terms and Third Party Notices are available in the extension installation directory.

TypeScript Tools   17.0.20105.2003
TypeScript Tools for Microsoft Visual Studio

Visual Basic Tools   4.5.0-6.23123.11+271ccd71554f7d28d2f90551aafd0bdeb5d327aa
Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Visual F# Tools   17.5.0-beta.23053.5+794b7c259d9646a7eb685dad865aa27da7940a21
Microsoft Visual F# Tools

Visual Studio IntelliCode   2.2
AI-assisted development for Visual Studio.

Visual Studio Tools for Unity   17.5.1.0
Visual Studio Tools for Unity

The project is a .NET 7 console application.

@MoFtZ added the question label Mar 14, 2023
@MoFtZ changed the title from "Exception now thrown with CPU Accelerator" to "Exception not thrown with CPU Accelerator" Mar 14, 2023
@MoFtZ
Collaborator

MoFtZ commented Mar 14, 2023

Hi @pavlexander. The behavior you are seeing occurs because ILGPU uses Trace.Assert and Debug.Assert rather than throw new IndexOutOfRangeException().

ILGPU converts C# code to the instructions that will run on the GPU, in this case PTX instructions for CUDA. This applies to user code as well as to large parts of ILGPU's own code. GPU kernels do not support throwing exceptions. In fact, if ILGPU encounters a throw instruction, it will fail to convert the code.

For the CPU accelerator, you should see the assertion failure written to the Debug window of Visual Studio.
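
As a rough illustration of the mechanism described above, the sketch below shows how a failed Trace.Assert is routed to whatever trace listeners are registered, using only standard .NET APIs. It does not call ILGPU at all; the class name AssertDemo and the helper CaptureFailure are hypothetical, and whether ILGPU's own checks end up in a listener registered this way is an assumption to verify against your ILGPU version.

```csharp
#define TRACE // Trace.Assert is compiled out unless TRACE is defined.

using System;
using System.Diagnostics;
using System.IO;

internal static class AssertDemo
{
    // Hypothetical helper: triggers a failing Trace.Assert and returns
    // the text that the registered listener recorded for it.
    public static string CaptureFailure(string message)
    {
        var buffer = new StringWriter();
        var listener = new TextWriterTraceListener(buffer);

        // Remove the default listener for this demo. With it in place, a
        // failed assertion may break into the debugger or stop the process.
        Trace.Listeners.Clear();
        Trace.Listeners.Add(listener);
        try
        {
            Trace.Assert(false, message);
            listener.Flush();
        }
        finally
        {
            Trace.Listeners.Remove(listener);
        }

        return buffer.ToString();
    }

    private static void Main()
    {
        // Echo the captured assertion text to the console.
        Console.Write(CaptureFailure("Index out of range"));
    }
}
```

In a real program, registering a ConsoleTraceListener once at startup (instead of capturing into a StringWriter) would be the more natural way to get assertion messages onto the console.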

@MoFtZ
Collaborator

MoFtZ commented Mar 14, 2023

Regarding your other question about running the CPU accelerator in parallel, the CreateCPUAccelerator method can take an additional parameter of CPUAcceleratorMode.

By default, ILGPU will use CPUAcceleratorMode.Auto. If a debugger is attached, ILGPU will run in sequential mode to make debugging easier. Otherwise, the CPU accelerator will run in parallel.

The number of threads is determined by the GPU device that the CPU accelerator is trying to emulate. This can be changed when configuring the Context.
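
A minimal sketch of requesting parallel execution explicitly, based on the description above. It assumes a two-argument CreateCPUAccelerator overload that takes the device index followed by a CPUAcceleratorMode value, and it requires the ILGPU NuGet package; the class name CpuModeDemo is hypothetical, and the overload should be checked against the ILGPU version in use. Configuring the emulated device (and hence the thread count) on the Context is not shown.

```csharp
using System;
using ILGPU;
using ILGPU.Runtime;
using ILGPU.Runtime.CPU;

internal static class CpuModeDemo
{
    private static void Main()
    {
        using Context context = Context.CreateDefault();

        // Assumed overload: (device index, mode). Parallel asks for
        // multi-threaded execution instead of letting Auto decide based
        // on whether a debugger is attached.
        using Accelerator accelerator =
            context.CreateCPUAccelerator(0, CPUAcceleratorMode.Parallel);

        Console.WriteLine(accelerator.Name);
    }
}
```

Passing CPUAcceleratorMode.Sequential instead should give the single-threaded behavior that is easier to step through in a debugger.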
