Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API consistency review #20402

Closed
3 of 19 tasks
StefanKarpinski opened this issue Feb 2, 2017 · 131 comments
Closed
3 of 19 tasks

API consistency review #20402

StefanKarpinski opened this issue Feb 2, 2017 · 131 comments
Assignees
Labels
deprecation This change introduces or involves a deprecation
Milestone

Comments

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Feb 2, 2017

I'm starting this as a place to leave notes about things to make sure to consider when checking for API consistency in Julia 1.0.

  • Convention prioritization. Listing and prioritizing our what-comes-first conventions in terms of function arguments for do-blocks, IO arguments for functions that print, outputs for in-place functions, etc (official conventional argument precedence #19150).

  • Positional vs keyword arguments. Long ago we didn't have keyword arguments. They're still sometimes avoided for performance considerations. We should make this choice based on what makes the best API, not on that kind of historical baggage (keyword performance issues should also be addressed so that this is no longer a consideration).

  • Metaprogramming tools. We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx. These should behave consistently: similar signatures, if there are functions with similar signatures, make sure they have similar macro versions. Ideally, they should all return values, rather than some returning values and others printing results, although that might be hard for things like LLVM code and assembly code.

  • IO <=> file name equivalence. We generally allow file names as strings to be passed in place of IO objects and the standard behavior is to open the file in the appropriate mode, pass the resulting IO object to the same function with the same arguments, and then ensure that the IO object is closed afterwards. Verify that all appropriate IO-accepting functions follow this pattern.

  • Reducers APIs. Make sure reducers have consistent behaviors – all take a map function before reduction; congruent dimension arguments, etc.

  • Dimension arguments. Consistent treatment of "calculate across this [these] dimension[s]" input arguments, what types are allowed etc, consider whether doing these as keyword args might be desired.

  • Mutating/non-mutating pairs. Check that non-mutating functions are paired with mutating functions where it makes sense and vice versa.

  • Tuple vs. vararg. Check that there is general consistency between whether functions take a tuple as the last argument or a vararg.

  • Unions vs. nullables vs. errors. Consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.).

  • Support generators as widely as possible. Make sure any function that could sensibly work with generators does so. We're pretty good about this already, but I'm guessing we've missed a few.

  • Output type selection. Be consistent about whether "output type" API's should be in terms of element type or overall container type (ref Functions that return arrays with eltype as input should use container type instead? #11557 and RFC/WIP: Support array type as input for functions returning AbstractArray instance #16740).

  • Pick a name. There are a few functions/operators with aliases. I think this is fine in cases where one of the names is non-ASCII and the ASCII version is provided so people can still write pure-ASCII code, but there are also cases like <: which is an alias for issubtype where both names are ASCII. We should pick one and deprecated the other. We deprecated is in favor of === and should do similarly here.

  • Consistency with DataStructures. It's somewhat beyond the scope of Base Julia, but we should make sure that all of collections in DataStructures have consistent APIs with those provided by Base. The connection in the other direction is that some of those types may inform how we end up designing the APIs in Base since we want them to extend smoothly and consistently.

  • NaNs vs. DomainErrors. See NaN vs wild (or, what's a DomainError, really?) #5234 – have a policy for when to do which and make sure it is followed consistently.

  • Collection <=> generator. Sometimes you want a collection, sometimes you want a generator. We should go through all our APIs and make sure there's an option for both where it makes sense. Once upon a time, there was a convention to use an uppercase name for the generator version and a lowercase name for the version that's eager and returns a new collection. But no one ever paid any attention to that, so maybe we need a new convention.

  • Higher order functions on associatives. Currently some higher order functions iterate over associative collections with signature (k,v) – e.g. map, filter. Others iterate over pairs, i.e. with signature kv, requiring the body to explicitly destructure the pair into k and v – e.g. all, any. This should be reviewed and made consistent.

  • Convert vs. construct. Allow conversion where appropriate. E.g. there have been multiple issues/questions about convert(String, 'x'). In general, conversion is appropriate when there is a single canonical transformation. Conversion of strings into numbers in general isn't appropriate because there are many textual ways to represent numbers, so we need to parse instead, with options. There's a single canonical way to represent version numbers as strings, however, so we may convert those. We should apply this logic carefully and universally.

  • Review completeness of collections API. We should look at the standard library functions for collections provided by other languages and make sure we have a way of expressing the common operations they have. For example, we don't have a flatten function or a concat function. We probably should.

  • Underscore audit.

@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Feb 2, 2017
@StefanKarpinski StefanKarpinski self-assigned this Feb 2, 2017
@ararslan
Copy link
Member

ararslan commented Feb 2, 2017

Apologies if this isn't the appropriate place to mention this, but it would be nice to be more consistent with underscores in function names going forward.

@StefanKarpinski
Copy link
Sponsor Member Author

No, this is a good place for that. And yes, we should strive to eliminate all names where underscores are necessary :)

@tkelman
Copy link
Contributor

tkelman commented Feb 2, 2017

  • consistent treatment of "calculate across this [these] dimension[s]" input arguments, what types are allowed etc, consider whether doing these as keyword args might be desired
  • listing and prioritizing our what-comes-first conventions in terms of function arguments for do-blocks, IO arguments for functions that print, outputs for in-place functions, etc (edit: thought there might already be one open for this)

@ararslan
Copy link
Member

ararslan commented Feb 2, 2017

For @tkelman's second point, see #19150

@ararslan
Copy link
Member

ararslan commented Feb 2, 2017

There was also a recent Julep regarding the API for find and related functions: https://github.com/JuliaLang/Juleps/blob/master/Find.md

@shashi
Copy link
Contributor

shashi commented Feb 3, 2017

Should we deprecate put! and take! on channels (and maybe do the same for futures) since we have push! and shift! on them? Just suggesting removing 2 redundant words in the API.

I am suspicious of shift! being user friendly. A candidate is fetch! we already have fetch which is the non-mutating version of take!

ref #13538 #12469

@amitmurthy @malmaud

Edit: It would even make sense to reuse send and recv on channels. (I'm surprised that these are only used for UDPSockets at the moment)

@amitmurthy
Copy link
Contributor

+1 for replacing put!/take! with push!/fetch!

@nalimilan
Copy link
Member

I'll add renaming @inferred to @test_inferred.

@martinholters
Copy link
Member

Double-check that specializations are consistent with the more generic functions, i.e. not something like #20233.

@dpsanders
Copy link
Contributor

Review all exported functions to check if any can be eliminated by replacing them with multiple dispatch, e.g. print_with_color

@StefanKarpinski
Copy link
Sponsor Member Author

The typical pairing is push! and shift! when working with a queue-like data structure.

@StefanKarpinski
Copy link
Sponsor Member Author

StefanKarpinski commented Feb 3, 2017

If we're not going to use the typical name pairing for this kind of data structure because we're worried that the operation entails communication overhead that isn't adequately conveyed by those names, then I don't think push! makes sense either. send and recv really might be better.

@malmaud
Copy link
Contributor

malmaud commented Feb 3, 2017

Maybe double-check that there is general consistency between whether functions take a tuple as the last argument or a vararg.

@simonbyrne
Copy link
Contributor

simonbyrne commented Feb 3, 2017

Perhaps too big for this issue, but it would be good to have consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.)

@StefanKarpinski
Copy link
Sponsor Member Author

No issue too big, @simonbyrne – this is the laundry list.

@StefanKarpinski
Copy link
Sponsor Member Author

Btw: this isn't really for specific changes (e.g. renaming specific functions) – it's more about kinds of things we can review. For specific proposed changes, just open an issue proposing that change.

@bramtayl
Copy link
Contributor

bramtayl commented Feb 4, 2017

We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx

Not sure if this is what you're talking about, but see CreateMacrosFrom.jl

@tkelman
Copy link
Contributor

tkelman commented Feb 4, 2017

@dpsanders
Copy link
Contributor

  • Document all exported functions (including doctests)

@pkofod
Copy link
Contributor

pkofod commented Feb 7, 2017

Document all exported functions (including doctests)

if this is part of this, then maybe also: remember to label your tests with the issue/pr number. It makes it a lot easier to understand why that test is there. I know how git blame works, but when adding testsets (just to give an example) it's sometimes a bit of a mystery what is being tested, and it would be great if the issue/pr number was always there.

@stevengj
Copy link
Member

stevengj commented Feb 10, 2017

@dpsanders: and exported macros! e.g. @fastmath has no docstring.

@amellnik
Copy link
Contributor

amellnik commented Apr 18, 2017

This is very minor, but the string and Symbol functions do almost the same thing and have different capitalization. I think symbol would make more sense.

@ararslan
Copy link
Member

@amellnik The difference is that Symbol is a type constructor and string is a regular function. IIRC we used to have symbol but it was deprecated in favor of the type constructor. I'm not convinced a change is necessary for this, but if anything I think we should use the String constructor in place of string.

@yuyichao
Copy link
Contributor

if anything I think we should use the String constructor in place of string.

No, they are different functions and shouldn't be merged

julia> String(UInt8[])
""

julia> string(UInt8[])
"UInt8[]"

@jrevels
Copy link
Member

jrevels commented Apr 18, 2017

No, they are different functions and shouldn't be merged

This looks like a situation where string(args...) should just be deprecated in favor of sprint(print, args...), then - having both string and String is confusing. We could specialize on sprint(::typeof(print), args...) to recover any lost performance. Along these lines, it might also make sense to deprecate repr(x) for sprint(showall, args...).

@yuyichao
Copy link
Contributor

That sounds ok although calling string to turn something into a string seems pretty standard....

@ararslan
Copy link
Member

calling string to turn something into a string seems pretty standard

Yes, but that's where the disconnect between String and string comes in.

@StefanKarpinski
Copy link
Sponsor Member Author

Please take a look! I haven't gotten a chance to go through these issues systematically.

@nalimilan
Copy link
Member

BTW, I've noted an inconsistency in the naming of traits: we have iteratorsize, iteratoreltype, but IndexStyle, TypeRangeStep, TypeArithmetic and TypeOrder. Looks like the CamelCase variants are more numerous and more recent, so maybe we should adopt that convention everywhere?

@StefanKarpinski
Copy link
Sponsor Member Author

Those should definitely be made consistent. Do you want to make a PR?

@nalimilan
Copy link
Member

nalimilan commented Jan 2, 2018

I think this should be fixed as part of #25356.

EDIT: see also #25440

@JeffBezanson JeffBezanson added triage This should be discussed on a triage call and removed triage This should be discussed on a triage call labels Jan 10, 2018
@StefanKarpinski
Copy link
Sponsor Member Author

This is mostly done or can be done in 1.x releases. I can update the checkboxes, but we just went through them on the triage call and everything but #25395 and the underscore audit are done.

@ararslan
Copy link
Member

ararslan commented Jan 17, 2018

Underscore audit

The following is an analysis of all symbols exported from Base which contain underscores, are not deprecated, and are not string macros. The main thing to note here is that these are exported names only; this does not include unexported names that we tell people to call qualified.

I've separated things out by category. Hopefully that's more useful than it is annoying.

Reflection

We have the following macros with corresponding functions:

  • @code_llvm, code_llvm
  • @code_lowered, code_lowered
  • @code_native, code_native
  • @code_typed, code_typed
  • @code_warntype, code_warntype

Whatever change is applied to the macros, if any, should be similarly applied to the functions.

pointer_from_objref could perhaps do with a more descriptive name, maybe something like address?

Aliases for C interop

The type aliases containing underscores are C_NULL, Cintmax_t, Cptrdiff_t, Csize_t, Cssize_t, Cuintmax_t, and Cwchar_t. Those that end in _t should stay, as they're named to be consistent with their corresponding C types.

C_NULL is the odd one out here, being the only C alias containing an underscore that isn't mirrored in C (since in C this is just NULL). We could consider calling this CNULL.

  • C_NULL

Bit counting

  • count_ones
  • count_zeros
  • trailing_ones
  • trailing_zeros
  • leading_ones
  • leading_zeros

For a discussion of renaming these, see #23531. I very much favor removing the underscores for these, as well as some of proposed replacements in that PR. I think it should be reconsidered.

Unsafe operations

  • unsafe_copyto!
  • unsafe_load
  • unsafe_pointer_to_objref
  • unsafe_read
  • unsafe_store!
  • unsafe_string
  • unsafe_trunc
  • unsafe_wrap
  • unsafe_write

It's probably okay to keep these as-is; the ugliness of the underscore further underscores their unsafety.

Indexing

  • broadcast_getindex
  • broadcast_setindex!
  • to_indices

Apparently broadcast_getindex and broadcast_setindex! exist. I don't understand what they do. Perhaps they could use a more descriptive name?

Interestingly, the single index version of to_indices, Base.to_index, is not exported.

Traces

Presumably these are the catch block equivalents of backtrace and stacktrace, respectively.

Tasks, processes, and signals

  • current_task
  • task_local_storage
  • disable_sigint
  • reenable_sigint
  • process_exited
  • process_running

Streams

It would be nice to have a more general IO -> IO redirection function into which all of these could be combined, e.g. redirect(STDOUT, io), thereby removing both underscores and exports.

Promotion

  • promote_rule
  • promote_shape
  • promote_type

See #23999 for a relevant discussion regarding promote_rule.

Printing

escape_string and unescape_string are a little odd in that they can print to a stream or return a string. See #25620 for a proposal to move/rename these.

Code loading

  • include_dependency
  • include_string

include_dependency. Is this even used outside of Base? I can't think of a situation where you would want this instead of include in any typical scenario.

include_string. Isn't this just an officially sanctioned version of eval(parse())?

Things I didn't bother categorizing

get_zero_subnormals and set_zero_subnormals could do with more descriptive names. Do they need to be exported?

@JeffBezanson
Copy link
Sponsor Member

+1 for method_exists => methodexists and object_id => objectid. It's also kind of silly that catch_stacktrace even exists. It can be deprecated to its definition, stacktrace(catch_backtrace()).

@JeffBezanson
Copy link
Sponsor Member

How do we feel about de-underscoring C_NULL? I've gotten pretty used to it, but I also buy the argument that none of the other C* names have an underscore.

@iamed2
Copy link
Contributor

iamed2 commented Jan 22, 2018

The other C names are types, while C_NULL is a constant. I think it's good how it is and follows naming guidelines.

@ararslan
Copy link
Member

and follows naming guidelines.

How so?

@StefanKarpinski
Copy link
Sponsor Member Author

Constants are often all caps with underscores – C_NULL follows that. As @iamed2 said, it's a value, not a type, so the Cfoo naming convention doesn't necessarily apply.

@iamed2
Copy link
Contributor

iamed2 commented Jan 22, 2018

I mistakenly thought https://github.com/JuliaLang/julia/blob/master/doc/src/manual/variables.md#stylistic-conventions referenced constants but it doesn't. It probably should.

@juliohm
Copy link
Sponsor Contributor

juliohm commented Jan 25, 2018

I suggest a consistent, mathematically sound, interface for general Hilbert spaces in which vectors are not Julia Arrays. Function names like vecdot, vecnorm, etc. could well be replaced by the general concepts of inner and norm as discussed in #25565.

@StefanKarpinski
Copy link
Sponsor Member Author

As I've said a few times, this is not a catchall issue for things one wants to change.

@JeffBezanson
Copy link
Sponsor Member

I believe the only items remaining under this umbrella for 1.0 are #25501 and #25717.

@ararslan
Copy link
Member

I'd like to do something with (get|set)_zero_subnormals but maybe the best short-term solution is to just unexport them.

@nalimilan
Copy link
Member

Something which should probably review is how numbers are treated in the context of collection operations like map and collect. It was pointed that the former returns a scalar but the latter returns a 0D array.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deprecation This change introduces or involves a deprecation
Projects
None yet
Development

No branches or pull requests