Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement iterator API for functional methods (.each/filter/map) to improve performance #61

Open
Xuerian opened this issue Nov 16, 2016 · 6 comments

Comments

@Xuerian
Copy link

Xuerian commented Nov 16, 2016

Here: https://github.com/Afforess/Factorio-Stdlib/wiki#avoiding-iteration-and-general-stdlib-tools

Which is an excellent suggestion for low-weight code, but have you compared them in hot paths?

General consensus in WoW modding is that it's worth avoiding as much overhead as possible in such areas, even things like for i/# vs ipairs. Maybe worth adding a note there, if the same is true here?

@Choumiko
Copy link
Contributor

That's difficult to measure i think since in the case of table.each you're mostly trying to meassure the cost of 1 function call (+1 if statement and passing varargs). I think it's more benefical if one tries to follow broader optimisations, like e.g. here: https://springrts.com/wiki/Lua_Performance#TEST_9:_for-loops
I've run some speed tests involving stdlib before, where i applied some of these: #59

What concerns me most in the code you linked is the fact that it's titled avoiding iteration but actually the improved code ends up (mostly) doubling the iterations without it being obvious (table.filter and table.each both iterate over a possibly large table). Guess something like table.each_filtered() would be handy.

@Xuerian
Copy link
Author

Xuerian commented Nov 20, 2016

The page you link says this:

Don't use pairs() or ipairs() in critical code! Try to save the table-size somewhere and use for i=1,x do!

Using a fori loop when possible (Which can include keeping two arrays instead of a hash table, and doing the fori that way) is a pretty clear gain in critical code.

I'm all for using the functional approach when it's not a critical path, but the evidence is pretty clear that even the overhead of using pairs/ipairs is big, not counting having the user abstract the body of the loop into yet another call.

Really it seems like I'm trying to make a big deal out of it, but even just putting a link to the page you linked here below that section in the docs it is the kind of thing I'm suggesting.

@Afforess
Copy link
Owner

Afforess commented Nov 20, 2016

I've delayed responding not because I didn't notice the issue but because I wanted to think about reply a bit.

The core complaint you've raised is that functions / iterations can be expensive, and stdlib recommends practices which may exacerbate this problem. I think this is an entirely fair criticism. The standard library is first and foremost concerned with encouraging quality code, precision, and utility. I think if we ever reach the day where many modders, having adopted the stdlib, find that certain stdlib recommendations hurt their performance, then I will be happy in succeeding in my earlier goals and having found a new challenge to face.

In regards to @Choumiko I think what you are noticing is that the stdlib table functions are all independent operators, unaware of the existence of iteration. What you are really asking for is a set of functions around iteration. For example, consider the example from the wiki:

require 'stdlib/table'
require 'stdlib/game'
require 'stdlib/event/event'
require 'stdlib/area/position'

Event.register(defines.events.on_player_created, function(event)
    local player = game.players[event.player_index].character
    local force = player.force
    local valid_enemies = table.filter(player.surface.find_entities_filtered(area = Position.expand_to_area(player.position, 25), type = 'unit', force = 'enemy'}), Game.VALID_FILTER)
    table.each(valid_enemies, function(entity)
         entity.damage(100, force)
    end)
end)

You're right, it does iterate the list of biters twice. If there were an iterator API, it might look like this:

require 'stdlib/iterator'
require 'stdlib/game'
require 'stdlib/event/event'
require 'stdlib/area/position'

Event.register(defines.events.on_player_created, function(event)
    local player = game.players[event.player_index].character
    local force = player.force
    local iter = iterator.new(player.surface.find_entities_filtered(area = Position.expand_to_area(player.position, 25), type = 'unit', force = 'enemy'}))
    iter:filter(Game.VALID_FILTER)
    iter:each(function(entity) entity.damage(100, force) end)
    iter:apply()
end)

iterator.new would create an iterator object, holding an array of objects. Then, filter/each/mapping operations could be added to the iterator, but adding these would not cause the iterator to execute until iterator.apply was called. So in this example, the iteration stream was given the list of biters, then told how to filter it, and then told to execute a function for each, and then told to execute all the operations in order. This stream of operations would only result in iterating the initial list once, satisfying your performance concerns. Other languages have this sort of API (Ruby Blocks, Python Generators, Java Streams, etc), and there is no reason it can't be given to lua as well. (Actually, Lua has coroutines but the API for coroutines is so obtuse as to be unreadable.)

@Xuerian
Copy link
Author

Xuerian commented Nov 20, 2016

The core complaint you've raised is that functions / iterations can be expensive, and stdlib recommends practices which may exacerbate this problem.

I didn't mean to come across as complaining about it, I was just hoping to make sure users are aware of potential pitfalls to the mostly-preferable stdlib approach if they're either iterating large sets of data or at a high frequencies, which is certainly not going to be the case most of the time. @Choumiko's link seems pretty good, but even then you're likely right and it's unlikely to be the issue.

I think if we ever reach the day where many modders, having adopted the stdlib, find that certain stdlib recommendations hurt their performance, then I will be happy in succeeding in my earlier goals and having found a new challenge to face.

Hah, that's fair.

@Afforess Afforess changed the title .each/Mapping suggested over iteration in documentation Implement iterator API for functional methods (.each/filter/map) to improve performance Nov 20, 2016
@Afforess
Copy link
Owner

Retitled with @Xuerian's approval.

@Afforess
Copy link
Owner

Afforess commented Nov 21, 2016

I wrote an early proof-of-concept but its performance is actually worse than straight tables (see the busted tests for a benchmark) and won't make the cut as is.

https://github.com/Afforess/Factorio-Stdlib/blob/95bf8434f7b6a13c40ce3196eda3034b0977ef64/stdlib/iterator.lua

@Afforess Afforess added this to the Future Direction milestone May 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants