Skip to content

Commit

Permalink
Fix stack overflow for large projects
Browse files Browse the repository at this point in the history
When large projects have a lot of missing objects in the first pass of parsing Ruby files we would run into stack overflows. This happens because, when there was a missing object in a parsed Ruby file, we would recursively call `#parse_remaining_files`. When this happens a lot the stack would get huge.

We fix this by instead keeping a list of files that we want to retry and re-parse them in another pass. When we can no longer resolve any more files we break the loop.
  • Loading branch information
Andrew Szczepanski authored and Andrew Szczepanski committed Jul 26, 2023
1 parent 698b22c commit 8a14e13
Show file tree
Hide file tree
Showing 4 changed files with 33 additions and 42 deletions.
11 changes: 3 additions & 8 deletions lib/yard/handlers/base.rb
Original file line number Diff line number Diff line change
Expand Up @@ -562,14 +562,9 @@ def ensure_loaded!(object, max_retries = 1)
return if object.root?
return object unless object.is_a?(Proxy)

retries = 0
while object.is_a?(Proxy)
raise NamespaceMissingError, object if retries > max_retries
log.debug "Missing object #{object} in file `#{parser.file}', moving it to the back of the line."
parser.parse_remaining_files
retries += 1
end
object
log.debug "Missing object #{object} in file `#{parser.file}', moving it to the back of the line."
globals.ordered_parser.files_to_retry << parser.file if globals.ordered_parser
raise NamespaceMissingError, object
end

# @group Macro Support
Expand Down
17 changes: 7 additions & 10 deletions lib/yard/handlers/c/base.rb
Original file line number Diff line number Diff line change
Expand Up @@ -75,20 +75,17 @@ def namespace_for_variable(var)
end

def ensure_variable_defined!(var, max_retries = 1)
retries = 0
object = nil
object = namespace_for_variable(var)
return object unless object.is_a?(Proxy)

loop do
object = namespace_for_variable(var)
break unless object.is_a?(Proxy)
log.debug "Missing object #{object} in file `#{parser.file}', moving it to the back of the line."

raise NamespaceMissingError, object if retries > max_retries
log.debug "Missing namespace variable #{var} in file `#{parser.file}', moving it to the back of the line."
parser.parse_remaining_files
retries += 1
if globals.ordered_parser
retryable_file = parser.file == "(stdin)" ? StringIO.new("void Init_Foo() { #{statement.source} }") : parser.file
globals.ordered_parser.files_to_retry << retryable_file
end

object
raise NamespaceMissingError, object
end

def namespaces
Expand Down
13 changes: 0 additions & 13 deletions lib/yard/handlers/processor.rb
Original file line number Diff line number Diff line change
Expand Up @@ -131,19 +131,6 @@ def process(statements)
end
end

# Continue parsing the remainder of the files in the +globals.ordered_parser+
# object. After the remainder of files are parsed, processing will continue
# on the current file.
#
# @return [void]
# @see Parser::OrderedParser
def parse_remaining_files
if globals.ordered_parser
globals.ordered_parser.parse
log.debug("Re-processing #{@file}...")
end
end

# Searches for all handlers in {Base.subclasses} that match the +statement+
#
# @param statement the statement object to match.
Expand Down
34 changes: 23 additions & 11 deletions lib/yard/parser/source_parser.rb
Original file line number Diff line number Diff line change
Expand Up @@ -12,16 +12,14 @@ class UndocumentableError < RuntimeError; end
# Raised when the parser sees a Ruby syntax error
class ParserSyntaxError < UndocumentableError; end

# Responsible for parsing a list of files in order. The
# {#parse} method of this class can be called from the
# {SourceParser#globals} globals state list to re-enter
# parsing for the remainder of files in the list recursively.
#
# @see Processor#parse_remaining_files
# Responsible for parsing a list of files in order.
class OrderedParser
# @return [Array<String>] the list of remaining files to parse
attr_accessor :files

# @return [Array<String>] files to parse again after our first pass
attr_accessor :files_to_retry

# Creates a new OrderedParser with the global state and a list
# of files to parse.
#
Expand All @@ -33,19 +31,33 @@ class OrderedParser
def initialize(global_state, files)
@global_state = global_state
@files = files.dup
@files_to_retry = []
@global_state.ordered_parser = self
end

# Parses the remainder of the {#files} list.
#
# @see Processor#parse_remaining_files
# Parses the remainder of the {#files} list. Any files that had issues
# parsing will be done in multiple passes until the size of the files
# remaining does not change.
def parse
until files.empty?
file = files.shift
files.each do |file|
log.capture("Parsing #{file}") do
SourceParser.new(SourceParser.parser_type, @global_state).parse(file)
end
end

loop do
prev_length = @files_to_retry.length
files_to_parse_now = @files_to_retry.dup
@files_to_retry = []

files_to_parse_now.each do |file|
log.capture("Re-processing #{file}") do
SourceParser.new(SourceParser.parser_type, @global_state).parse(file)
end
end

break if @files_to_retry.length >= prev_length
end
end
end

Expand Down

0 comments on commit 8a14e13

Please sign in to comment.