Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a splitString funtion to textutils #1768

Open
wants to merge 10 commits into
base: mc-1.20.x
Choose a base branch
from

Conversation

tkbstudios
Copy link

textutils.splitString(string_to_split, pattern)

I suggested this PR because I think that implementing a native way of splitting strings in ComputerCraft could be helpful instead of needing to add an extra function to a program or creating an extra library.

I think I've edited the documentation I could find in the codebase to add documentation for textutils.splitString

I don't really know what I could explain here because it's kinda self-explaining 😅

@tkbstudios
Copy link
Author

On it to fix the linting

Copy link
Contributor

@fatboychummy fatboychummy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There also exists tests for this module, located at projects/core/src/test/resources/test-rom/spec/apis/textutils_spec.lua. Having tests for split would ensure it can continue working in the future!

@Lupus590
Copy link
Contributor

Should we avoid adding to APIs and instead use the simular in purpose module that exists cc.strings?

@fatboychummy
Copy link
Contributor

Should we avoid adding to APIs and instead use the simular in purpose module that exists cc.strings?

I would agree that it'd likely be better put in the newer required cc.strings module.

@tkbstudios
Copy link
Author

Thanks for all the reviews :) I'll start modifying it later when I can.

Copy link
Member

@SquidDev SquidDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! As Lupus says, would you be able to move this to the cc.strings module? I think that's probably a better place for new string functions — textutils is a bit of a hodge-podge.

I think it's probably worth adding a "plain" argument, to treat the supplied delimiter as a string rather than a pattern. It might also be nice to have a parameter to limit the number of splits that occur — metis's split function has both of these, so feel free to look at that for inspiration.

If you feel comfortable, would you be able to add some tests for this function (see the contributing guide)? There's always some funky edge cases for split functions (e.g. empty strings, empty delimiters) which I think it's worth capturing here.

@SquidDev SquidDev added enhancement An extension of a feature or a new feature. area-CraftOS This affects CraftOS, or any other Lua portions of the mod. labels Mar 29, 2024
@tkbstudios
Copy link
Author

Thanks for your interest @SquidDev ! I'll look forward to adding and moving those to the correct places. I'll let you guys know when I'm done (I don't think I might be able to do it today :/ )

@tkbstudios
Copy link
Author

splitting test:

local expect = require("cc.expect").expect
os.loadAPI("logging")
logging.init(0, true, "split_logs.txt")

local function split_str(inputstr, sep, split_count)
    expect(1, inputstr, "string")
    expect(2, sep, "string", "nil")
    expect(3, split_count, "number", "nil")
    if sep == nil then
        -- If the separator is nil, we default to whitespace
        sep = "%s"
    end
    if split_count == nil then
        split_count = -1
    end
    local splitted_table = {}
    local splitted_amount = 0
    for str in string.gmatch(inputstr, "([^" .. sep .. "]+)") do
        if splitted_amount == split_count then
            break
        else
            table.insert(splitted_table, str)
            splitted_amount = splitted_amount + 1
        end
    end
    return splitted_table
end



-- Testing split_str
logging.debug("===== Testing split_str =====")

-- Testing split_str with default separator (whitespace)
logging.debug("Input: \"Hello World\", split_count: -1")
for _, str in ipairs(split_str("Hello World", nil, -1)) do
    logging.debug(str)
end

-- Testing split_str with default separator (whitespace) and with specified split_count
logging.debug("Input: \"Hello World\", split_count: 1")
for _, str in ipairs(split_str("Hello World", nil, 1)) do
    logging.debug(str)
end

-- Testing split_str with custom separator and with specified split_count
logging.debug("Input: \"Hello/World/Foo/Bar\", sep: \"/\", split_count: 2")
for _, str in ipairs(split_str("Hello/World/Foo/Bar", "/", 2)) do
    logging.debug(str)
end

-- Testing split_str with empty string
logging.debug("Input: \"\", sep: \"/\", split_count: -1")
for _, str in ipairs(split_str("", "/", -1)) do
    logging.debug(str)
end

-- Testing split_str with nil separator
logging.debug("Input: \"Hello World\", sep: nil, split_count: -1")
for _, str in ipairs(split_str("Hello World", nil, -1)) do
    logging.debug(str)
end

-- Testing split_str with negative split_count
logging.debug("Input: \"Hello World\", sep: \"/\", split_count: -1")
for _, str in ipairs(split_str("Hello World", "/", -1)) do
    logging.debug(str)
end

-- Testing split_str with split_count of 0
logging.debug("Input: \"Hello World\", sep: \"/\", split_count: 0")
for _, str in ipairs(split_str("Hello World", "/", 0)) do
    logging.debug(str)
end

-- Testing split_str with empty string and nil separator
logging.debug("Input: \"\", sep: nil, split_count: -1")
for _, str in ipairs(split_str("", nil, -1)) do
    logging.debug(str)
end


-- Testing split_str without separator and max splits
logging.debug("Input: \"Hello World\"")
for _, str in ipairs(split_str("Hello World")) do
    logging.debug(str)
end

-- Testing split_str without max splits
logging.debug("Input: \"Hello/World/Foo/Bar\", sep: \"/\"")
for _, str in ipairs(split_str("Hello/World/Foo/Bar", "/")) do
    logging.debug(str)
end

-- Testing split_str without inputting max splits
logging.debug("Input: \"Hello/World/Foo/Bar\"")
for _, str in ipairs(split_str("Hello/World/Foo/Bar", "/", nil)) do
    logging.debug(str)
end

splitting test result:

[2024-03-29 15:23:50]: [DEBUG] ===== Testing split_str =====
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello World", split_count: -1
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] World
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello World", split_count: 1
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello/World/Foo/Bar", sep: "/", split_count: 2
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] World
[2024-03-29 15:23:50]: [DEBUG] Input: "", sep: "/", split_count: -1
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello World", sep: nil, split_count: -1
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] World
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello World", sep: "/", split_count: -1
[2024-03-29 15:23:50]: [DEBUG] Hello World
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello World", sep: "/", split_count: 0
[2024-03-29 15:23:50]: [DEBUG] Input: "", sep: nil, split_count: -1
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello World"
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] World
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello/World/Foo/Bar", sep: "/"
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] World
[2024-03-29 15:23:50]: [DEBUG] Foo
[2024-03-29 15:23:50]: [DEBUG] Bar
[2024-03-29 15:23:50]: [DEBUG] Input: "Hello/World/Foo/Bar"
[2024-03-29 15:23:50]: [DEBUG] Hello
[2024-03-29 15:23:50]: [DEBUG] World
[2024-03-29 15:23:50]: [DEBUG] Foo
[2024-03-29 15:23:50]: [DEBUG] Bar

@tkbstudios
Copy link
Author

😭 all checks have passed is beautiful

@tkbstudios
Copy link
Author

Seems to work fine

image
image

@SquidDev
Copy link
Member

SquidDev commented Apr 7, 2024

Thank you for looking at this!

Could you change split_count to return the remainder of the string as the last value, rather than the last "split" item. For instance:

split("Hello/World/Foo/Bar", ",", 2) should return { "Hello", "World/Foo/Bar" }, rather than { "Hello", "World" }.

I think it's probably worth adding a "plain" argument, to treat the supplied delimiter as a string rather than a pattern

Did you manage to look at this at all?

For the tests, you can check tables are equal with same (rather than the tableEq you were using).

@tkbstudios
Copy link
Author

tkbstudios commented Apr 7, 2024

Thank you for looking at this!

Could you change split_count to return the remainder of the string as the last value, rather than the last "split" item. For instance:

split("Hello/World/Foo/Bar", ",", 2) should return { "Hello", "World/Foo/Bar" }, rather than { "Hello", "World" }.

I think it's probably worth adding a "plain" argument, to treat the supplied delimiter as a string rather than a pattern

Did you manage to look at this at all?

For the tests, you can check tables are equal with same (rather than the tableEq you were using).

I don't think this is a great idea, in python it splits the string on the amount asked and doesn't return a remainder.

For example if you want to check the amount of args provided that it doesn't give an extra item.

@SquidDev
Copy link
Member

SquidDev commented Apr 7, 2024

So there's slightly different semantics of what the "limit" argument should mean. Some languages (e.g. Python), it specifies the number of splits that occur:

>>> "foo,bar,baz,qux".split(",", maxsplit=2)
['foo', 'bar', 'baz,qux']

Other languages (Rust, Java), it specifies the number of values that should be returned (so $n - 1$ splits should occur):

> "foo,bar,baz,qux".split(",", 3)
==> String[3] { "foo", "bar", "baz,qux" }

I personally prefer the second of these (I think in terms of "split this string into three pieces", rather than "split this string three times"), but definitely happy to be convinced either way!

@tkbstudios
Copy link
Author

So, should I change it then?

@SquidDev
Copy link
Member

SquidDev commented Apr 7, 2024

If that's okay, yes please!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CraftOS This affects CraftOS, or any other Lua portions of the mod. enhancement An extension of a feature or a new feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants