@skallweitNV skallweitNV released this 23 Oct 06:06
· 8 commits to master since this release
95b5163

Overview

This release of Falcor provides the following significant improvements and new features:

  • Falcor as a Python extension, allowing Falcor to be used directly from Python.
  • CUDA interop, including sharing buffers and synchronization between Falcor and CUDA.
  • PyTorch interop, allowing Falcor to be used for implementing PyTorch functions.
  • Differentiable Slang and various examples including a BSDF optimizer and a differentiable path tracer.
  • Support for Shader Execution Reordering (SER) in the path tracer.

Dependencies

  • Update slang to version 2023.3.20.
  • Update nvapi to version R535.
  • Update DLSS to version 3.5.

Build System

  • Enable more MSVC warnings.
  • Add FALCOR_ENABLE_ASSERTS CMake option.
  • Remove FALCOR_REPORT_EXCEPTION_AS_ERROR CMake option.

Assets

  • Fix media files to use relative paths.
  • Move volume test scenes to media package.
  • Move grey_and_white_room scene to falcor_media.
  • Replace Cerberus with CesiumMan.

Examples

  • Add a simple example showing interop between Falcor and PyTorch, learning to represent an image from a set of Gaussians.
  • Add Slang based BC7 compressor.

DiffSlang

  • Add a BSDFOptimizer that can run simple inverse rendering for solving material parameters without any path tracing.
  • Add a differentiable evalAD function to IMaterialInstance and a differentiable setupDiffMaterialInstance to IMaterial.
  • Implement evalAD and setupDiffMaterialInstance for PBRTDiffuseMaterial. Other types of materials have naive placeholders for now.
  • Add a helper class GradientIOWrapper to set up the dataflow of scene gradients during backpropagation.
  • Mark ShadingData and ShadingFrame as IDifferentiable.
  • Add a unit test to check evalAD of PBRTDiffuseMaterial.
  • Add helper classes for differentiable rendering:
    • SceneManager: get/set scene parameters (only applies to the albedo value of PBRTDiffuse for now).
    • SceneGradients: store scene gradients from backpropagation.

Python API

  • Add interface for passing in an already created Device to Testbed.
  • Add Python bindings for accessing Buffer as pytorch tensor.
  • Add Python bindings for CUDA/Falcor sync.
  • Add Python bindings for Program, ProgramDesc and ComputePass.
  • Add a simple example using compute shaders from Python.
  • Fix EnvMap Python constructor (constructors can't return null, which we did when the file wasn't found).
  • Improve Testbed:
    • Add shader reloading on F5
    • Fix profiler toggling and render profiler even if UI is disabled
    • Fix show_ui
    • Add render_texture for setting a texture to be rendered on the window
    • Add should_close and close methods for handling shutdown
    • Cleanup + comments
  • Add Python bindings for program reflection types.
  • Add Python bindings for ShaderVar.
  • Add Python bindings to Texture for reading/writing subresources as numpy arrays.
  • Add Python bindings to Buffer class:
    • Buffer properties
    • to_numpy/from_numpy to convert to/from numpy arrays
  • Add Python bindings to Device class:
    • create_buffer, create_typed_buffer and create_structured_buffer
  • Add initial Python bindings to CopyContext class.
  • Add Python unit tests for buffer creation, writing and reading from/to numpy arrays.
  • Add pybind11::falcor_enum helper to allow binding enums that already have string infos (using FALCOR_ENUM_INFO).
  • Include pybind11/functional.h to get correct typing information on std::function types.
  • Add Python bindings for Sampler class.
  • Add [] accessor to RenderGraph Python bindings, allowing access to a pass through g["PathTracer"], for example.
  • Add basic ImGUI wrapper with Python bindings in falcor.ui submodule.
  • Only render UI for render graph / scene when one is loaded.
  • Add implicit conversion from python lists to vector types (allow assigning [10, 20, 30] instead of float3(10, 20, 30)).
  • Add Python bindings for creating profiler events.
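
The implicit list-to-vector conversion listed above can be pictured with a small pure-Python model. This is only a sketch of the behavior, not the pybind11 binding code: `Float3Sketch`, `to_float3` and `CameraSketch` are hypothetical names invented for illustration and are not part of the Falcor Python API.

```python
from dataclasses import dataclass


@dataclass
class Float3Sketch:
    """Hypothetical stand-in for a three-component vector type."""
    x: float
    y: float
    z: float


def to_float3(value):
    """Accept an existing vector or any length-3 sequence of numbers."""
    if isinstance(value, Float3Sketch):
        return value
    components = [float(v) for v in value]
    if len(components) != 3:
        raise ValueError(f"expected 3 components, got {len(components)}")
    return Float3Sketch(*components)


class CameraSketch:
    """Hypothetical object exposing a vector-typed property."""

    def __init__(self):
        self._position = Float3Sketch(0.0, 0.0, 0.0)

    @property
    def position(self):
        return self._position

    @position.setter
    def position(self, value):
        # Conversion happens at assignment time, so callers may pass a list.
        self._position = to_float3(value)


camera = CameraSketch()
camera.position = [10, 20, 30]  # instead of Float3Sketch(10, 20, 30)
print(camera.position)
```

In this model the conversion lives in one helper, so every vector-typed property accepts either form without repeating the check.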

Cleanup

  • Cleanup use of $for.. loops in shaders.
  • Add comment about usage of SlangCompilationFlags::DumpIntermediates.
  • Rename computeNewRayOrigin() to computeRayOrigin().
  • Rename setShaderData to bindShaderData.
  • Add FALCOR_EXPORT_D3D12_AGILITY_SDK to all sample applications.
  • Apply clang-format on most render passes.
  • Remove PythonDictionary class.
  • Rename InternalDictionary to Dictionary.
  • Disable clang-format argument bin packing.

Error Handling

  • Fix fstd::source_location::current() on MSVC.
  • Take std::string_view as message on Exception and descendants. Simplify the exception classes as they don't need to do string formatting anymore.
  • FALCOR_GFX_CALL (gfxReportError handler) now handles NVIDIA Aftermath crash dumps before calling reportFatalErrorAndTerminate.
  • msgBox creates a window that is always top-most (otherwise we got hidden windows if the main window was not open / already closed).
  • Fix getStackTrace (need to create the TraceResolver first).
  • Assertions now throw AssertionError exceptions which translate to Python.
  • Add support for string message (and formatting) with FALCOR_ASSERT.
  • Add ErrorDiagnosticFlags to control how errors are reported:
    • BreakOnThrow enables breaking into attached debugger when calling FALCOR_THROW (break on call-site).
    • BreakOnAssert enables breaking into attached debugger when calling FALCOR_ASSERT (break on call-site).
    • AppendStackTrace enables appending a stack trace to the exception message when calling FALCOR_THROW and FALCOR_ASSERT.
    • ShowMessageBoxOnError enables showing a message box when calling the reportError family of functions.
  • Change reportError into reportErrorAndContinue to better describe what it is doing. This function no longer terminates the application.
  • Change reportErrorAndAllowRetry to return true if user clicked Retry. This function no longer terminates the application.
  • Change reportFatalError into reportFatalErrorAndTerminate to better describe what it is doing.
  • Add catchAndReportAllExceptions that is now used in all sample applications to globally catch errors.
  • Remove the local exception catching in SampleApp::run.
  • Fix places that previously relied on application termination when using reportErrorAndAllowRetry. We now throw an exception.
  • Replace calls to reportError with one of the following: exceptions, logging, or a message box.
  • Consolidate Core/Assert.h, Core/Errors.h and Core/ErrorHandling.h into Core/Error.h.
  • Simplify error handling conventions in Falcor:
    • Use static_assert and FALCOR_ASSERT for assertions.
    • Use FALCOR_THROW and FALCOR_CHECK to throw RuntimeError exception.
    • Remove ArgumentError.
    • Remove checkArgument and checkInvariant and just use FALCOR_CHECK.
    • Replace all use of the FALCOR_CHECK_ARG_XXX macros with just FALCOR_CHECK.
    • Make FALCOR_UNIMPLEMENTED and FALCOR_UNREACHABLE throw exceptions instead of being assertions.
  • Adjust conventions in error-handling.md.
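
The simplified conventions above (checks raise a runtime error, assertions raise an assertion error, both with optional message formatting) can be modeled in a few lines of Python. This is a sketch of the behavior only: `falcor_check`, `falcor_assert`, `raised` and `set_sample_count` are hypothetical helpers written for this example, and mapping the C++ exceptions onto Python's built-in `RuntimeError` and `AssertionError` is an assumption, not something the notes above confirm.

```python
def falcor_check(condition, fmt, *args):
    """Model of FALCOR_CHECK: raise RuntimeError with a formatted message on failure."""
    if not condition:
        raise RuntimeError(fmt.format(*args))


def falcor_assert(condition, fmt="assertion failed", *args):
    """Model of FALCOR_ASSERT with an optional formatted message."""
    if not condition:
        raise AssertionError(fmt.format(*args))


def raised(exc_type, fn, *args):
    """Return True if calling fn(*args) raises exc_type, False if it returns normally."""
    try:
        fn(*args)
    except exc_type:
        return True
    return False


def set_sample_count(count):
    """Hypothetical API function that validates its argument with a check."""
    falcor_check(count > 0, "sample count must be positive, got {}", count)
    return count


print(set_sample_count(4))
print(raised(RuntimeError, set_sample_count, 0))
```

With a single check helper there is no separate argument-error path to choose from, which mirrors the removal of ArgumentError and the FALCOR_CHECK_ARG_XXX macros.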

Core

  • Rename RenderContext::flush to RenderContext::submit.
  • Rename Device::flushAndSync to Device::wait.
  • Move state object creation to Device and simplify description structs.
  • Remove profiler calls around Swapchain::present() (these calls crash on Vulkan and we probably don't need them so let's just get rid of them).
  • Remove ProgramDesc::languagePrelude and Program::setLanguagePrelude.
  • Add error checks during texture resource view creation to verify that the right bind flags are set.
  • Rename TextureManager::TextureHandle to TextureManager::CpuTextureHandle to avoid name clash with GPU-side TextureHandle.
  • Add convenience functions to convert between CPU and GPU texture handles.
  • Update TextureManager for safe resolve of UDIMs.
  • Disable logging to a file when using Falcor from Python. Without this we litter the runtime folder with python.exe.X.log files.
  • Add native AdapterLUID and AdapterInfo classes in Device.h.
  • Rename GpuFence to Fence.
  • Introduce FenceDesc and Fence::getDesc.
  • Add CopyContext::signal and CopyContext::wait to handle fence signaling and waiting on the command queue.
  • Remove GpuFence::syncGpu, instead clients need to call CopyContext::wait.
  • Rename GpuFence::syncCpu to Fence::wait.
  • Introduce signaled value which represents the last signaled value of the fence (this replaces the CPU value, which was always the last signaled value + 1).
  • Introduce Fence::kAuto which replaces the std::optional we used before to differentiate between signaling specific fence values or auto-incrementing.
  • Add Fence::updateSignaledValue which any signaler (host, device, external) can use to update the signaled value and/or get the auto-incremented value to signal.
  • Introduce timeout when waiting on fence on host.
  • Improve error handling for setting variables in ParameterBlock and through ShaderVar.
    • Throw exception when trying to bind a resource to a variable of a different type.
    • Throw exception when trying to bind a resource to a SRV that is not created with the ShaderResource flag.
    • Throw exception when trying to bind a resource to a UAV that is not created with the UnorderedAccess flag.
    • Throw exception when trying to get a resource from a variable of a different type.
    • Throw exception when trying to set a uniform variable with a different size/type.
  • Reduce reference counting overhead for type information.
    • Lifetime is tied to the ParameterBlockReflection object. The TypedShaderVarOffset has a non-owning pointer to the type.
    • ShaderVar does not own either the pointer to ParameterBlock (same as before) or the type information.
  • Add FALCOR_GFX_CALL checks to GFX dispatch calls.
  • Use std::string_view for shader variable and reflection lookups, avoiding a lot of heap allocations for constructing temporary std::string objects.
  • Falcor uses SM6_6 by default now, so the explicit calls for 6_5 are no longer required.
  • Add ShaderDesc::fromFile and ShaderDesc::fromString to reduce temporary copies (and improve readability).
  • Cleanup ShaderModule constructors and usage.
  • Rename downstreamCompilerArgs back to compilerArguments as this is the better name (these are command-line options passed to slang, not to the downstream compiler; we just most often use them for that).
  • Remove ProgramDesc::addShaderSource.
  • Remove ProgramDesc::getMaxTraceRecursionDepth.
  • Use Shader Model 6.6 by default, or most recent supported one by the device.
  • Add checks in render passes that need minimum shader model.
  • Throw in Program constructor if the requested shader model is not supported.
  • Create Types.h for common graphics types.
  • Move Device::ShaderModel to ShaderModel in Types.h.
  • Move ShaderType to Types.h and remove ShaderType.h.
  • Use ShaderModel in ProgramDesc.
  • Remove RtProgram (merge functionality into Program).
    • Add raytracing pipeline properties to ProgramDesc: maxTraceRecursionDepth, maxPayloadSize, maxAttributeSize, rtPipelineFlags
  • Move remaining code in RtProgram onto Program for now (getRtso)
    • Will be refactored later
  • Replace Program::CompilerFlags with SlangCompilerFlags.
  • Replace Program::Desc with ProgramDesc.
  • Replace Program::ShaderModule with ProgramDesc::ShaderModule.
  • Replace Program::ShaderModuleList with ProgramDesc::ShaderModuleList.
  • Replace Program::TypeConformanceList with TypeConformanceList.
  • Remove ComputeProgram and GraphicsProgram and use Program instead.
    • Add Program::createCompute and Program::createGraphics helpers.
  • Remove ComputeVars and GraphicsVars and use ProgramVars instead.
  • Refactor Program::Desc into ProgramDesc that has a more reasonable structure:
    • A ShaderModule is a list of sources and a module name that gets compiled into a separate translation unit (using same terminology as slang).
    • Get rid of the createNewTranslationUnit flag, which was just a weird way to split the global source list into multiple modules.
    • An EntryPointGroup is a list of entry points in a specific shader module. Before, the Program::Desc had a flat list of entry points and various other places (sources, groups) pointing to it.
    • All fields on the ProgramDesc are public. In theory one could create a description directly, without using the builder helper functions.
    • Get rid of internal state for building (active group index, etc.).
  • Move various Texture::create functions to Device::createTexture functions.
  • Move various Buffer::create functions to Device::createBuffer functions.
  • Remove createStructured function that takes a pProgram.
  • Add Device::createStructuredBuffer that takes a ReflectionType.
  • Set global defines in ProgramManager constructor.
  • Rename Buffer::CpuAccess to MemoryType:
    • CpuAccess::None -> MemoryType::DeviceLocal
    • CpuAccess::Write -> MemoryType::Upload
    • CpuAccess::Read -> MemoryType::ReadBack
  • Cleanup GpuMemoryHeap to also use MemoryType.
  • Add a readback memory heap to Device.
  • Make CpuAccess enum have different semantics:
    • CpuAccess::None means DeviceLocal type memory (no access from CPU).
    • CpuAccess::Write means Upload type memory (write access from CPU).
    • CpuAccess::Read means Readback type memory (read access from CPU).
    • These will be renamed to MemoryType::DeviceLocal, MemoryType::Upload and MemoryType::ReadBack in a later MR.
    • The Buffer class now represents a fixed piece of memory and not some potentially transient piece of memory on a heap.
  • Buffer::map now only works on buffers that have the correct memory type. MapType::WriteDiscard is deprecated.
  • Add Buffer::getBlob to read back memory from a buffer.
  • Add Buffer::getElement<T> and Buffer::getElements<T> helper functions.
  • Replace lots of map/unmap code on device local buffers (which is now illegal) with Buffer::getElement(s).
  • Use rotating vertex buffers / vaos on TextRenderer and Gui to avoid stalling on writes.
  • Implement manual constant buffer handling in NRD.
    • Add D3D12ConstantBufferView constructor taking a memory address + size.
  • Rework BufferAccessTests to test the new semantics.
  • Replace Sampler::create with Device::createSampler.
  • Move Sampler::Filter to TextureFilteringMode.
  • Move Sampler::AddressMode to TextureAddressingMode.
  • Move Sampler::ReductionMode to TextureReductionMode.
  • Remove Sampler::ComparisonMode and use ComparisonFunc instead.
  • Rename Sampler::Desc::comparisonMode to Sampler::Desc::comparisonFunc.
  • Remove DepthStencilState::Func alias and use ComparisonFunc instead.
  • Deprecate Resource::BindFlags and use ResourceBindFlags instead.
  • Refactor asset path resolution:
    • Add AssetResolver class for resolving asset paths.
    • Remove global data file directories in OS.h/OS.cpp.
    • Remove path resolution in all of the createFromFile functions.
    • Move asset path resolution to the Python bindings that are used in .pyscene files.
  • Use absolute paths for loading data files in both application code and unit tests.
  • Report downstream shader compilation time.
  • Add getProjectDirectory() that returns the absolute path to the root of the project directory.
  • Rename _PROJECT_DIR_ to FALCOR_PROJECT_DIR and use CMAKE_SOURCE_DIR.
  • Cleanup getInitialShaderDirectories() and getInitialDataDirectories().
  • Add FALCOR_ENABLE_PROFILER CMake option.
  • Remove FalcorConfig.h.
  • Remove FALCOR_ENABLE_LOGGER configuration option.
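
The Fence changes listed above (a signaled value that holds the last signaled value, Fence::kAuto for auto-incrementing, and Fence::updateSignaledValue) can be modeled as follows. This is a simplified sketch, not the Falcor implementation: `FenceModel`, `K_AUTO`, `device_reaches` and `is_reached` are hypothetical names, the sentinel value chosen for `K_AUTO` is an assumption, and no real GPU work, blocking or timeout is simulated.

```python
K_AUTO = 2**64 - 1  # hypothetical sentinel standing in for Fence::kAuto


class FenceModel:
    """Toy model of a fence that tracks the last signaled value."""

    def __init__(self, initial_value=0):
        self.signaled_value = initial_value   # last value requested to be signaled
        self.completed_value = initial_value  # value the simulated device has reached

    def update_signaled_value(self, value=K_AUTO):
        """Record a new signaled value; K_AUTO means 'previous value + 1'."""
        if value == K_AUTO:
            self.signaled_value += 1
        else:
            self.signaled_value = value
        return self.signaled_value

    def device_reaches(self, value):
        """Simulate the device completing work up to the given fence value."""
        self.completed_value = max(self.completed_value, value)

    def is_reached(self, value=K_AUTO):
        """Non-blocking check; K_AUTO means 'the last signaled value'."""
        target = self.signaled_value if value == K_AUTO else value
        return self.completed_value >= target


fence = FenceModel()
first = fence.update_signaled_value()       # auto-increment
second = fence.update_signaled_value()      # auto-increment again
explicit = fence.update_signaled_value(10)  # signal a specific value
print(first, second, explicit, fence.is_reached())
```

Because the stored value is the last signaled one rather than the next one to signal, waiting on "the latest" needs no off-by-one adjustment, which is the point of replacing the old CPU value.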

CUDA

  • Add CUDA shared memory holder to Buffer.
  • Make cuda_utils::ExternalBuffer and cuda_utils::ExternalSemaphore keep non-owning pointer to the Falcor resource/fence. In the future we should replace that with weak_ref.
  • Add CopyContext::waitForCuda and CopyContext::waitForFalcor to synchronize CUDA to Falcor and vice-versa.
  • Use new CUDA synchronization in OptixDenoiser (improves perf with the denoised WavefrontPathTracer from 110fps to 130fps).
  • Use new CUDA synchronization in TestPyTorchPass and make sure test_pytorch.py still succeeds.
  • GFX doesn't support shared fences with Vulkan yet, so this new synchronization method currently only works with D3D12.
  • Add support for importing Vulkan buffers.
  • Remove cuda_utils::initCuda and cuda_utils::setCudaContext.
  • Add cuda_utils::CudaDevice class for creating a CUDA device sharing the same adapter as the graphics device.
  • Add Device::initCudaDevice() and Device::getCudaDevice() functions for initializing/getting a CUDA device sharing the same adapter as the graphics device.
  • Switch to using initCudaDevice for all code that currently requires a CUDA device.
  • Minor cleanup in CudaUtils.h/CudaUtils.cpp.
  • Add CudaRuntime.h wrapper for fixing the vector type name clashes.
  • Refactor CudaUtils.h and put them into cuda_utils namespace.
  • Move CudaBuffer helper class to OptixDenoiser, which is the only client (the buffer class isn't great, so let's not encourage its use in other places).
  • Consolidate the two FalcorCUDA modules with CudaUtils and use that.

Utilities

  • Refactor PixelDebug:
    • Use ParameterBlock for keeping PixelDebug data.
    • Use RWByteAddressBuffer to manage buffer counters (works in Vulkan).
    • Use combined readback buffer for all data (counters + record buffer).
    • General cleanup.

Scene

  • Remove old nullTracePass workaround.
  • Replace getParameterBlock with setShaderData.
  • Add a fallback tangent generation to ensure that valid degenerate inputs (collapsed triangles without NaNs) always produce valid outputs (no NaNs).
  • Fix a bug where the base mesh was assumed to have triangleCount faces rather than its actual count.
  • Parallelize UV mesh tile creation.
  • Fix a regression in skinning normal computation.
  • Change setCacheMeshes to add meshes incrementally, allowing more than one source of CachedMeshes.
  • Fix Loop subdiv safeguards incorrectly checking for all-triangle meshes.
  • Remove pybind11 dependency in importers.
  • Split general USD utilities from USDImporter into their own libraries so they can be reused.
  • Fix enabling/disabling animations.
  • Use upload heap for writing instance descs for TLAS build/update.
  • Add debug prints of SceneBuilder content, used to compare whether various importers loaded the same data.
  • Add small efficiencies in ingesting cached data (allowing it to be ingested incrementally).
  • Change skeletal matrices from doubles to floats.
  • Fix a bug where curves of length 2 could access out of bounds memory.
  • Fix a bug where the curve frame would accumulate error in normal and binormal, causing the frame to stop being orthonormal.
  • Add checks to make sure the curve frame is unit length.
  • Update Scene to recreate its parameter block when scene defines change.
  • Update rendering code to always bind the latest scene block.
  • Fix a bug where curves that taper to 0 width would break if converted to polygons. We now clamp the width to min float16 value.
  • Fix a bug where all USD curves always had animation, even if they had just one keyframe.

Materials

  • Add material parameter serialization and reflection (used for inverse rendering).
  • Make StandardMaterial differentiable.
  • Make PBRTConductor differentiable.
  • Introduce EmissiveMaterialsChanged material update flag to notify when emissive materials change.
  • Recreate LightCollection if emissive materials change.
  • Refactor MaterialSystem::update() to cleanup how updates are tracked and handled for some types of changes.
  • Add support for dynamic materials and call update() unconditionally each frame for them.
  • Update path tracers to recompile shaders less often on material changes.
  • Add point-of-entry subsurface for StandardMaterial.
  • Fix desc count in MaterialSystem.
  • Update material system to handle replacing materials at runtime.
  • Add Material::getMaterialLayout().

Render Passes

  • Implement setProperties for PathTracer render pass.
  • Propagate options to subsystems when calling setProperties.
  • Add reset methods to PathTracer for resetting frame counter.
  • Adjust image test scripts to make use of reset and set_properties.
  • Disable warning 30056 when compiling NRD shaders (short-circuit ? being deprecated).
  • Update render passes and modules to handle Scene::UpdateFlags::RecompileNeeded.

Pathtracer

  • Extend PathTracer with SER support.
  • Fix detection of total internal reflection.

Testing

  • Enable more image tests for Vulkan.
  • Make run_unit_tests and run_image_tests work if called from any directory other than tests.
  • Add --run-only option to run_image_tests for slang testing.
  • Cleanup StructuredBufferMatrix unit test.
  • Break debugger on failure if in debug mode.
  • Add EXPECT_THROW and EXPECT_THROW_AS checks.
  • Allow running test scripts outside of a git clone.