Frame looping fix #6

Merged
olehkuznetsov merged 5 commits into android-graphics:dev from olehkuznetsov:frame-looping-fix
Jun 25, 2026

Conversation

@olehkuznetsov

Various fixes to support more games

    This CL introduces a robust state-restoration mechanism for Vulkan frame
    looping to resolve deadlocks caused by fence state drift between loop
    iterations (e.g., in Fortnite).

    To keep the core framework clean and avoid intrusive changes to the
    generic FileProcessor class, a "dual-capture" architecture is introduced:
    1. For N = 1 (looping Frame 1): Fence states are captured at the end of
       the setup phase via the consumer's ProcessStateEndMarker().
    2. For N > 1 (looping Frame N): Fence states are captured immediately at
       the Frame N boundary via the decoder's OnLoopStart() event.
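
The choice between the two capture points can be sketched as follows. This is only an illustration of the rule stated above: `CapturePoint` and `SelectFenceCapturePoint` are hypothetical names, not identifiers from the change itself.

```cpp
#include <cassert>

// Hypothetical enum naming the two capture points described in the list above.
enum class CapturePoint
{
    kStateEndMarker, // end of the setup phase (consumer's ProcessStateEndMarker())
    kLoopStart       // Frame N boundary (decoder's OnLoopStart() event)
};

// Returns where fence states would be captured for a loop that starts at
// the given 1-based frame number.
CapturePoint SelectFenceCapturePoint(unsigned loop_frame)
{
    assert(loop_frame >= 1); // frames are numbered from 1 in this sketch
    return (loop_frame == 1) ? CapturePoint::kStateEndMarker : CapturePoint::kLoopStart;
}
```

In this sketch, `SelectFenceCapturePoint(1)` selects the state end marker and any larger frame number selects the loop-start event.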

    To ensure stability, the consumer waits for all Vulkan devices to be idle
    (reusing the inherited WaitDevicesIdle() helper) before querying and
    storing the initial fence states.

    At the loop boundary, we perform a spec-compliant, highly optimized fixup:
    - Fences that drifted from UNSIGNALED -> SIGNALED are collected in
      'fences_to_reset' and reset back to UNSIGNALED via vkResetFences.
    - Fences that drifted from SIGNALED -> UNSIGNALED are collected in
      'fences_to_signal' and synthetically signaled via vkQueueSubmit (without
      any redundant reset calls).
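
A minimal sketch of how the two lists might be derived, assuming the captured and current fence states are available as maps. `FenceHandle`, `FenceState`, `FenceFixup`, and `ComputeFenceFixup` are hypothetical stand-ins; the real consumer works with `VkFence` handles and issues the actual `vkResetFences` and `vkQueueSubmit` calls, which are not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for VkFence handles and their status.
using FenceHandle = std::uint64_t;
enum class FenceState { kUnsignaled, kSignaled };

struct FenceFixup
{
    std::vector<FenceHandle> fences_to_reset;  // drifted UNSIGNALED -> SIGNALED
    std::vector<FenceHandle> fences_to_signal; // drifted SIGNALED -> UNSIGNALED
};

// Compares the states captured before the loop with the states observed at
// the loop boundary and sorts every drifted fence into one of the two lists.
FenceFixup ComputeFenceFixup(const std::map<FenceHandle, FenceState>& initial,
                             const std::map<FenceHandle, FenceState>& current)
{
    FenceFixup fixup;
    for (const auto& entry : initial)
    {
        const auto it = current.find(entry.first);
        if (it == current.end() || it->second == entry.second)
        {
            continue; // fence is gone or did not drift; nothing to restore
        }
        if (entry.second == FenceState::kUnsignaled)
        {
            fixup.fences_to_reset.push_back(entry.first);
        }
        else
        {
            fixup.fences_to_signal.push_back(entry.first);
        }
    }
    // Sanity check: a fence can land in at most one of the two lists.
    assert(fixup.fences_to_reset.size() + fixup.fences_to_signal.size() <= initial.size());
    return fixup;
}
```

Fences whose state did not change appear in neither list, so an iteration without drift results in no reset or submit work at the boundary.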

    All redundant logging overrides in the frame loop consumer have been
    purged to keep the implementation minimal and efficient.

    This CL resolves a SIGSEGV crash occurring during the final iteration of
    Vulkan frame looping, particularly on traces with frequent descriptor pool
    resets.

    Problem: Asymmetric Skipping of Resource Lifecycle
    - To optimize repetitions, the frame loop consumer skips resource
      creations/allocations (e.g., descriptor sets). However, it previously
      did NOT skip destructions/resets (like vkResetDescriptorPool) during the
      final iteration, in an attempt to clean up.
    - This created a fatal asymmetry: at the start of the final iteration,
      replayed pool resets would execute and destroy the sets, but the
      subsequent allocation calls in the frame would be skipped.
    - This resulted in invalid/null descriptor set handles being passed to draw
      calls during the final iteration, causing a driver-level SIGSEGV.

    Fix: Symmetric Skipping
    - Removed the `!IsFinalIteration()` constraint from descriptor pool
      and set destructions/resets.
    - We now symmetrically skip destructions/resets throughout the entire
      looping range, including the final iteration.
    - This keeps the reused resources alive and valid on the GPU until the
      end of the replay. Since the replay process exits shortly after, the
      OS/driver safely cleans up all leaked resources.
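
The asymmetry can be reproduced with a small model. This is a sketch under simplifying assumptions: `PoolModel` and `ReplayLoop` are hypothetical, a single descriptor set (id 7) is allocated once before the loop, and each repetition replays "reset pool, allocate set, draw" with the allocation always skipped.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of one descriptor pool; not the real consumer API.
struct PoolModel
{
    std::set<int> live_sets;
    void Allocate(int set) { live_sets.insert(set); }
    void Reset() { live_sets.clear(); } // models vkResetDescriptorPool
    bool IsValid(int set) const { return live_sets.count(set) != 0; }
};

// Returns true if every draw in every repetition saw a valid descriptor set.
// skip_resets_on_final == false models the old behavior (resets replayed on
// the final iteration); true models the symmetric fix.
bool ReplayLoop(int iterations, bool skip_resets_on_final)
{
    assert(iterations >= 1);
    PoolModel pool;
    pool.Allocate(7); // allocation executed once, before the repetitions
    bool all_draws_valid = true;
    for (int i = 1; i <= iterations; ++i)
    {
        const bool is_final   = (i == iterations);
        const bool skip_reset = !is_final || skip_resets_on_final;
        if (!skip_reset)
        {
            pool.Reset(); // destroys set 7
        }
        // The allocation call is skipped on every repetition, so nothing
        // recreates set 7 before the draw below.
        all_draws_valid = all_draws_valid && pool.IsValid(7);
    }
    return all_draws_valid;
}
```

With two iterations, `ReplayLoop(2, false)` reports an invalid draw on the final iteration, while `ReplayLoop(2, true)` keeps the set valid throughout.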

    Tested:
    - Verified that traces with frequent pool resets now loop successfully
      without crashing on the second (final) iteration.

Add vkCreateCommandPool to the generated frame-looping overrides so that
command pools created within the loop range are correctly tracked and
subsequently destroyed on the final iteration of the loop.

Also update the manual VulkanReplayFrameLoopConsumer override to delegate
to the generated base class version to ensure proper handle tracking,
while maintaining the necessary command buffer reset flag modifications.

TAG=agy
CONV=a0864ce4-0126-408e-ad0b-5302cffe1ff3
…ooping

Override Process_vkCmdWriteTimestamp in VulkanReplayFrameLoopConsumer to
skip writing timestamps on loop repetitions. This avoids "query not reset"
validation errors when replaying traces that do not reset queries within
the looped frame.

TAG=agy
CONV=a0864ce4-0126-408e-ad0b-5302cffe1ff3
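
To illustrate why skipping the timestamp write avoids the "query not reset" error, here is a small model of a query pool. It is a sketch only: `QueryPoolModel` and `CountTimestampErrors` are hypothetical, and the model assumes the trace writes one timestamp per frame and never resets the query inside the looped frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class QueryState { kReset, kWritten };

// Hypothetical model: writing to a query that was already written without an
// intervening reset counts as one validation error.
struct QueryPoolModel
{
    std::vector<QueryState> queries;
    int                     validation_errors = 0;

    explicit QueryPoolModel(std::size_t count) : queries(count, QueryState::kReset) {}

    void WriteTimestamp(std::size_t index) // models vkCmdWriteTimestamp
    {
        assert(index < queries.size());
        if (queries[index] == QueryState::kWritten)
        {
            ++validation_errors; // "query not reset"
        }
        queries[index] = QueryState::kWritten;
    }
};

// Replays the frame once normally, then `repetitions` more times, optionally
// skipping the timestamp write on repetitions. Returns the error count.
int CountTimestampErrors(int repetitions, bool skip_on_repetition)
{
    QueryPoolModel pool(1);
    pool.WriteTimestamp(0); // first pass through the frame
    for (int i = 0; i < repetitions; ++i)
    {
        if (!skip_on_repetition)
        {
            pool.WriteTimestamp(0);
        }
    }
    return pool.validation_errors;
}
```

In this model every unskipped repetition produces one error, and skipping the write on repetitions produces none.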

Apply official clang-format guidelines to all hand-written files modified in our frame-looping fixes:
- api_decoder.h
- preload_file_processor.cpp
- vulkan_decoder_base.h
- vulkan_replay_frame_loop_consumer.cpp
- vulkan_replay_frame_loop_consumer.h

TAG=agy
CONV=a0864ce4-0126-408e-ad0b-5302cffe1ff3
Comment thread framework/decode/vulkan_replay_frame_loop_consumer.cpp
Comment thread framework/decode/vulkan_replay_frame_loop_consumer.cpp
@olehkuznetsov olehkuznetsov merged commit 7ba24d0 into android-graphics:dev Jun 25, 2026
16 checks passed
2 participants