Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vector to string conversion adds an extra character #486

Open
Caid11 opened this issue Apr 6, 2024 · 1 comment
Open

Vector to string conversion adds an extra character #486

Caid11 opened this issue Apr 6, 2024 · 1 comment

Comments

@Caid11
Copy link

Caid11 commented Apr 6, 2024

I think there's an off-by-one error in kk_string_from_chars. When I convert a vector of chars to a string, a space is added to the end of the resulting string.

For example, this snippet:

fun main()
  val some-string = "hello"
  println("some-string = [" ++ some-string ++ "]")
  println("some-string.vector.string = [" ++ some-string.vector.string ++ "]")

Produces this output in Koka v3.1.1:

some-string = [hello]
some-string.vector.string = [hello ]

Looking at kk_string_from_chars, I see Koka adds 1 to the length of the vector's elements:

kk_string_t kk_string_from_chars(kk_vector_t v, kk_context_t* ctx) {
  kk_ssize_t n;
  kk_box_t* cs = kk_vector_buf_borrow(v, &n, ctx);
  kk_ssize_t len = 0;
  for (kk_ssize_t i = 0; i < n; i++) {
    len += kk_utf8_len(kk_char_unbox(cs[i], KK_BORROWED, ctx));
  }
  uint8_t* p;
  kk_string_t s = kk_unsafe_string_alloc_buf(len + 1, &p, ctx); // <= NOTE THE ADDED 1
  for (kk_ssize_t i = 0; i < n; i++) {
    kk_ssize_t count;
    kk_utf8_write(kk_char_unbox(cs[i], KK_BORROWED, ctx), p, &count);
    p += count;
  }
  kk_assert_internal(kk_string_buf_borrow(s, NULL, ctx) + n == p);
  kk_vector_drop(v, ctx);
  return s;
}

I suspect the intent is to add a terminating character, but perhaps Koka's string implementation adds that implicitly? If I remove the added 1, it seems to produce the correct output.

Here's a complete reproducer:

vec-str-bug-repro.kk:

extern import
  c file "inline.c"

extern custom-string(v : vector<char>) : string
  c  "kk_custom_string_from_chars"

fun main()
  val some-string = "hello"
  println("some-string = [" ++ some-string ++ "]")
  println("some-string.vector.string = [" ++ some-string.vector.string ++ "]")
  println("some-string.vector.custom-string = [" ++ some-string.vector.custom-string ++ "]")

inline.c:

kk_string_t kk_custom_string_from_chars(kk_vector_t v, kk_context_t* ctx) {
  kk_ssize_t n;
  kk_box_t* cs = kk_vector_buf_borrow(v, &n, ctx);
  kk_ssize_t len = 0;
  for (kk_ssize_t i = 0; i < n; i++) {
    len += kk_utf8_len(kk_char_unbox(cs[i], KK_BORROWED, ctx));
  }
  uint8_t* p;
  // kk_string_t s = kk_unsafe_string_alloc_buf(len + 1, &p, ctx);
  kk_string_t s = kk_unsafe_string_alloc_buf(len, &p, ctx);
  for (kk_ssize_t i = 0; i < n; i++) {
    kk_ssize_t count;
    kk_utf8_write(kk_char_unbox(cs[i], KK_BORROWED, ctx), p, &count);
    p += count;
  }
  kk_assert_internal(kk_string_buf_borrow(s, NULL, ctx) + n == p);
  kk_vector_drop(v, ctx);
  return s;
}

Resulting output:

some-string = [hello]
some-string.vector.string = [hello ]
some-string.vector.custom-string = [hello]
@TimWhiting
Copy link
Collaborator

Yes, it looks like bytes.c already handles the len+1 and terminating byte for all internal variants of strings.

kk_decl_export kk_decl_noinline kk_bytes_t kk_bytes_alloc_len(kk_ssize_t len, kk_ssize_t plen, const uint8_t* p, uint8_t** buf, kk_context_t* ctx) {

I'll commit a quick fix on dev.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants