Add char #290

Open
wants to merge 5 commits into
base: master

puripuri2100
Contributor

@puripuri2100 puripuri2100 commented Sep 23, 2021

Close #407

I added a char type and some functions.

The char type is implemented with OCaml's Uchar.t type.

ref: https://github.com/gfngfn/SATySFi/projects/1#card-54377937

List of added functions:

  • char-to-string : char -> string
  • char-to-unicode-point : char -> int
  • char-of-unicode-point : int -> char
  • char-same : char -> char -> bool

See the tests/char.saty file for how to use them.
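
As a rough illustration of what these four primitives amount to on top of Uchar.t, here is a minimal OCaml sketch. The OCaml names are hypothetical and are not taken from the PR. Returning an option for invalid integers is also an assumption of this sketch, since the listed signature is int -> char.

```ocaml
(* Hypothetical OCaml counterparts of the four primitives; not the PR's code. *)

(* char-to-string: encode one scalar value as a UTF-8 string. *)
let char_to_string (c : Uchar.t) : string =
  let buf = Buffer.create 4 in
  Buffer.add_utf_8_uchar buf c;
  Buffer.contents buf

(* char-to-unicode-point: the scalar value as an integer. *)
let char_to_scalar_value (c : Uchar.t) : int = Uchar.to_int c

(* char-of-unicode-point: None for integers that are not scalar values. *)
let char_of_scalar_value (n : int) : Uchar.t option =
  if Uchar.is_valid n then Some (Uchar.of_int n) else None

(* char-same: equality on scalar values. *)
let char_same (a : Uchar.t) (b : Uchar.t) : bool = Uchar.equal a b
```

Uchar.of_int raises Invalid_argument on integers outside the scalar value range, which is why the sketch checks Uchar.is_valid first.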

@puripuri2100
Contributor Author

🤔🤔🤔

### output ###
# Error: Unable to set up the cache root directory:
# Unix.Unix_error(Unix.EEXIST, "mkdir",
# "/Users/runner/.cache/dune/db/files/v4")

https://github.com/gfngfn/SATySFi/runs/3682988795?check_suite_focus=true#step:5:5926

@na4zagin3
Contributor

na4zagin3 commented Sep 23, 2021

  1. *-unicode-point should be renamed to *-unicode-scalar-value, because “Unicode point” is not an established term (see “Unicode scalar value”). Or did you mean “Unicode code point”?
  2. Is *-same conventional? I would expect char-same to be called char-equal or similar, since it is the equality defined on char. (I'd love to hear @gfngfn’s opinion.)
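
To make the terminology concrete: code points cover U+0000 to U+10FFFF, while scalar values are the same range without the surrogates U+D800 to U+DFFF. The following OCaml sketch shows the difference with Uchar.is_valid. The two helper names are hypothetical.

```ocaml
(* Code points span U+0000..U+10FFFF; scalar values are that range minus
   the surrogates U+D800..U+DFFF. Uchar.is_valid tests the latter. *)
let is_code_point (n : int) : bool = 0 <= n && n <= 0x10FFFF
let is_scalar_value (n : int) : bool = Uchar.is_valid n

(* Print both classifications for a few boundary values. *)
let () =
  List.iter
    (fun n ->
      Printf.printf "U+%04X code point: %b, scalar value: %b\n"
        n (is_code_point n) (is_scalar_value n))
    [ 0x41; 0xD800; 0xDFFF; 0xE000; 0x10FFFF; 0x110000 ]
```

Since Uchar.t can only hold scalar values, a name based on "scalar value" describes the type more precisely than one based on "code point".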

@na4zagin3
Contributor

na4zagin3 commented Sep 23, 2021

This is just my two cents.

Can this be implemented as a macro like ~(char @`2`) with char : input-position * string -> char?

Otherwise, is it possible to introduce more generic literal syntax (like Scala’s String Interpolation, C++’s User defined literals, Lisp-families’ read macros, SRFI-10) instead of one specific to char?

For example, define a new syntax @⟨ident_tag⟩⟨string-literal⟩ (e.g., @char`a`) which will be parsed as ~(⟨ident_tag⟩ @⟨string-literal⟩) where ⟨ident_tag⟩ should be a function with type input-position * string -> char. If the string is not valid, the function will call abort-with-message. It may be better to introduce another name space for tags that are defined with a new syntax let-literal @⟨ident_tag⟩ ⟨args⟩ = ⟨expr⟩.
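
A tag function of this shape could look roughly like the OCaml sketch below. It assumes OCaml 4.14 or later for String.get_utf_8_uchar. The position record, the Abort exception, and the name char_tag are hypothetical stand-ins for input-position, abort-with-message, and the char tag.

```ocaml
(* Hypothetical tag function for a literal such as @char`a`.
   Requires OCaml >= 4.14 for String.get_utf_8_uchar. *)
type position = { line : int; column : int }

(* Stand-in for abort-with-message. *)
exception Abort of string

let char_tag ((pos, s) : position * string) : Uchar.t =
  let fail msg =
    raise (Abort (Printf.sprintf "%d:%d: %s" pos.line pos.column msg))
  in
  if s = "" then fail "empty char literal"
  else
    (* Decode the first scalar value and check that it spans the whole string. *)
    let d = String.get_utf_8_uchar s 0 in
    if not (Uchar.utf_decode_is_valid d) then fail "invalid UTF-8"
    else if Uchar.utf_decode_length d <> String.length s then
      fail "char literal holds more than one scalar value"
    else Uchar.utf_decode_uchar d
```

The validation runs when the tag function is evaluated, so a malformed literal is reported with its source position rather than producing a bad value later.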

@na4zagin3
Contributor

na4zagin3 commented Sep 23, 2021

Here are two more cents. Another option would be not to introduce a literal syntax for char at all. I believe casual users shouldn’t have to care what a Unicode scalar value is. I'm not a big fan of user-facing APIs that require char arguments; they are unlikely to account for Unicode equivalence.
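
The Unicode equivalence concern can be illustrated with a small OCaml sketch. The letter "é" can be written as the single scalar value U+00E9 or as U+0065 followed by the combining acute accent U+0301. The two forms are canonically equivalent, yet a comparison on scalar values treats them as different. The helper name is hypothetical.

```ocaml
(* Precomposed U+00E9 versus U+0065 followed by combining U+0301:
   canonically equivalent text, but different scalar value sequences. *)
let precomposed = [ Uchar.of_int 0x00E9 ]
let decomposed = [ Uchar.of_int 0x0065; Uchar.of_int 0x0301 ]

(* Element-wise comparison, the kind of equality a char-level API offers. *)
let same_scalar_values (a : Uchar.t list) (b : Uchar.t list) : bool =
  List.length a = List.length b && List.for_all2 Uchar.equal a b

let () =
  Printf.printf "equal as scalar value sequences: %b\n"
    (same_scalar_values precomposed decomposed)
```

Treating the two forms as equal would need Unicode normalization, which the OCaml standard library does not provide.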

@y-yu y-yu mentioned this pull request Oct 19, 2021
@puripuri2100
Contributor Author

Thank you for pointing out the naming of the primitive functions.

Let me explain why I introduced the literal syntax for the char type.
I would like to use the char type when parsing strings, so I expect it to have two properties:

  • Guaranteed to be exactly one character
  • Pattern matching is possible

Currently, I have to pattern match on Unicode scalar values:
https://github.com/puripuri2100/SATySFi-json/blob/master/src/json.satyg#L78
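
The workaround has roughly the following shape, shown here as an OCaml sketch rather than the SATySFi code behind the link. The token type and the classify function are hypothetical.

```ocaml
(* Sketch of the workaround: classify a character by matching on its
   integer scalar value. The token names are hypothetical. *)
type token = Quote | Backslash | Slash | Other of Uchar.t

let classify (c : Uchar.t) : token =
  match Uchar.to_int c with
  | 0x22 -> Quote      (* U+0022 QUOTATION MARK *)
  | 0x5C -> Backslash  (* U+005C REVERSE SOLIDUS *)
  | 0x2F -> Slash      (* U+002F SOLIDUS *)
  | _ -> Other c
```

The numeric patterns are only readable with the comments next to them, which is the motivation for a char literal that can appear directly in a pattern.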

@na4zagin3
Contributor

Hmm, then can we introduce a macro function char : string -> int and make it available in match clauses?

val f x =
  match x with
  | ~(char `/`) -> lex-string (str-stack^`"`) line (column + 2) ys

or with a new macro syntax @⟨ident_tag⟩⟨string-literal⟩,

val f x =
  match x with
  | @char`/` -> lex-string (str-stack^`"`) line (column + 2) ys

Otherwise, we can extend matching with view patterns

val char c =
  match string-length c with
  | 0 -> ``
  | 1 -> string-sub c 0 1
  | _ -> ``
  end

val f x =
  match x with
  | (char -> `\`) -> lex-string (str-stack^`"`) line (column + 2) ys

or extractors

val match Char c =
  match string-length c with
  | 0 -> None
  | 1 -> Some (string-sub c 0 1)
  | _ -> None
  end

val f x =
  match x with
  | Char(`\`) -> lex-string (str-stack^`"`) line (column + 2) ys
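
For comparison, OCaml has neither view patterns nor extractors, so the same idea is usually written by applying the view function in the scrutinee. A minimal sketch, assuming that length is counted in bytes and using hypothetical names:

```ocaml
(* Hypothetical helper mirroring the Char extractor above: Some s when s is
   exactly one byte long, None otherwise. Length is counted in bytes here. *)
let single_char (s : string) : string option =
  if String.length s = 1 then Some s else None

(* The view function is applied in the scrutinee, then matched on. *)
let is_backslash (s : string) : bool =
  match single_char s with
  | Some "\\" -> true
  | _ -> false
```

A view pattern or extractor would move the call to single_char into the pattern itself, so that several different views can be tried within one match.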

@gfngfn gfngfn added this to the v0.1.0 milestone Apr 3, 2024
Successfully merging this pull request may close these issues.

Introduce base type char for Unicode code points
3 participants