API consistency review #20402

StefanKarpinski · 2017-02-02T17:02:16Z

I'm starting this as a place to leave notes about things to make sure to consider when checking for API consistency in Julia 1.0.

Convention prioritization. Listing and prioritizing our what-comes-first conventions in terms of function arguments for do-blocks, IO arguments for functions that print, outputs for in-place functions, etc (official conventional argument precedence #19150).
Positional vs keyword arguments. Long ago we didn't have keyword arguments. They're still sometimes avoided for performance considerations. We should make this choice based on what makes the best API, not on that kind of historical baggage (keyword performance issues should also be addressed so that this is no longer a consideration).
Metaprogramming tools. We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx. These should behave consistently: paired macros and functions should have similar signatures, and functions with similar signatures should have similar macro versions. Ideally, they should all return values, rather than some returning values and others printing results, although that might be hard for things like LLVM code and assembly code.
IO <=> file name equivalence. We generally allow file names as strings to be passed in place of IO objects and the standard behavior is to open the file in the appropriate mode, pass the resulting IO object to the same function with the same arguments, and then ensure that the IO object is closed afterwards. Verify that all appropriate IO-accepting functions follow this pattern.
Reducers APIs. Make sure reducers have consistent behaviors – all take a map function before reduction; congruent dimension arguments, etc.
Dimension arguments. Consistent treatment of "calculate across this [these] dimension[s]" input arguments, what types are allowed etc, consider whether doing these as keyword args might be desired.
Mutating/non-mutating pairs. Check that non-mutating functions are paired with mutating functions where it makes sense and vice versa.
Tuple vs. vararg. Check that there is general consistency between whether functions take a tuple as the last argument or a vararg.
Unions vs. nullables vs. errors. Consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.).
Support generators as widely as possible. Make sure any function that could sensibly work with generators does so. We're pretty good about this already, but I'm guessing we've missed a few.
Output type selection. Be consistent about whether "output type" API's should be in terms of element type or overall container type (ref Functions that return arrays with eltype as input should use container type instead? #11557 and RFC/WIP: Support array type as input for functions returning AbstractArray instance #16740).
Pick a name. There are a few functions/operators with aliases. I think this is fine in cases where one of the names is non-ASCII and the ASCII version is provided so people can still write pure-ASCII code, but there are also cases like <: which is an alias for issubtype where both names are ASCII. We should pick one and deprecate the other. We deprecated is in favor of === and should do similarly here.
Consistency with DataStructures. It's somewhat beyond the scope of Base Julia, but we should make sure that all of the collections in DataStructures have consistent APIs with those provided by Base. The connection in the other direction is that some of those types may inform how we end up designing the APIs in Base since we want them to extend smoothly and consistently.
NaNs vs. DomainErrors. See NaN vs wild (or, what's a DomainError, really?) #5234 – have a policy for when to do which and make sure it is followed consistently.
Collection <=> generator. Sometimes you want a collection, sometimes you want a generator. We should go through all our APIs and make sure there's an option for both where it makes sense. Once upon a time, there was a convention to use an uppercase name for the generator version and a lowercase name for the version that's eager and returns a new collection. But no one ever paid any attention to that, so maybe we need a new convention.
Higher order functions on associatives. Currently some higher order functions iterate over associative collections with signature (k,v) – e.g. map, filter. Others iterate over pairs, i.e. with signature kv, requiring the body to explicitly destructure the pair into k and v – e.g. all, any. This should be reviewed and made consistent.
Convert vs. construct. Allow conversion where appropriate. E.g. there have been multiple issues/questions about convert(String, 'x'). In general, conversion is appropriate when there is a single canonical transformation. Conversion of strings into numbers in general isn't appropriate because there are many textual ways to represent numbers, so we need to parse instead, with options. There's a single canonical way to represent version numbers as strings, however, so we may convert those. We should apply this logic carefully and universally.
Review completeness of collections API. We should look at the standard library functions for collections provided by other languages and make sure we have a way of expressing the common operations they have. For example, we don't have a flatten function or a concat function. We probably should.
Underscore audit.
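
To make the "IO <=> file name equivalence" item concrete, here is a minimal sketch of the pattern it describes: one method takes an IO object, and a second method takes a file name, opens the file, forwards to the first method, and makes sure the file is closed afterwards. The function `count_nonblank` is hypothetical and only serves as an illustration; it is not part of Base.

```julia
# Hypothetical helper, not part of Base: counts the non-blank lines read from `io`.
function count_nonblank(io::IO)
    n = 0
    for line in eachline(io)
        isempty(strip(line)) || (n += 1)
    end
    return n
end

# File-name method: open the file for reading, forward to the IO method,
# and rely on `open(f, ...)` to close the handle even if `f` throws.
count_nonblank(path::AbstractString) = open(count_nonblank, path, "r")

# Usage: write a temporary file, then call both methods on the same content.
path, io = mktemp()
write(io, "a\n\nb\n")
close(io)
n_from_path = count_nonblank(path)
n_from_io = count_nonblank(IOBuffer("a\n\nb\n"))
```

Both calls should agree, which is the property the item asks us to verify across IO-accepting functions.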


ararslan · 2017-02-02T18:15:06Z

Apologies if this isn't the appropriate place to mention this, but it would be nice to be more consistent with underscores in function names going forward.

StefanKarpinski · 2017-02-02T20:50:48Z

No, this is a good place for that. And yes, we should strive to eliminate all names where underscores are necessary :)

tkelman · 2017-02-02T21:26:44Z

consistent treatment of "calculate across this [these] dimension[s]" input arguments, what types are allowed etc, consider whether doing these as keyword args might be desired
listing and prioritizing our what-comes-first conventions in terms of function arguments for do-blocks, IO arguments for functions that print, outputs for in-place functions, etc (edit: thought there might already be one open for this)
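
For the first point, a small sketch of what dimension arguments look like when passed as keywords. This uses the `dims` keyword spelling from Julia 1.x, which postdates this comment, so treat it as an illustration of the idea rather than the API under discussion at the time.

```julia
A = [1 2; 3 4]

col_sums = sum(A, dims=1)        # reduce along dimension 1, giving a 1×2 result
row_sums = sum(A, dims=2)        # reduce along dimension 2, giving a 2×1 result
total    = sum(A, dims=(1, 2))   # several dimensions at once, giving a 1×1 result
sq_cols  = sum(abs2, A, dims=1)  # the map function comes first, the keyword last
```

The last line also touches the reducer point: the mapping function is the first argument, and the dimension argument has the same form as in the plain reduction.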

ararslan · 2017-02-02T21:34:04Z

For @tkelman's second point, see #19150

ararslan · 2017-02-02T21:38:35Z

There was also a recent Julep regarding the API for find and related functions: https://github.com/JuliaLang/Juleps/blob/master/Find.md

shashi · 2017-02-03T07:20:46Z

Should we deprecate put! and take! on channels (and maybe do the same for futures) since we have push! and shift! on them? Just suggesting removing 2 redundant words in the API.

I am suspicious of shift! being user friendly. A candidate is fetch!, since we already have fetch, which is the non-mutating version of take!.
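
For reference, a short sketch of the existing channel vocabulary being discussed, assuming a buffered channel (as far as I know `fetch` is not supported on unbuffered ones):

```julia
ch = Channel{Int}(2)       # buffered channel with room for two items
put!(ch, 10)
put!(ch, 20)

peeked = fetch(ch)         # looks at the first item without removing it
first_taken = take!(ch)    # removes and returns the first item
second_taken = take!(ch)   # removes and returns the next item
still_ready = isready(ch)  # nothing is left in the buffer
close(ch)
```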

ref #13538 #12469

@amitmurthy @malmaud

Edit: It would even make sense to reuse send and recv on channels. (I'm surprised that these are only used for UDPSockets at the moment)

amitmurthy · 2017-02-03T07:27:09Z

+1 for replacing put!/take! with push!/fetch!

nalimilan · 2017-02-03T09:14:14Z

I'll add renaming @inferred to @test_inferred.

martinholters · 2017-02-03T10:32:42Z

Double-check that specializations are consistent with the more generic functions, i.e. not something like #20233.

dpsanders · 2017-02-03T12:10:46Z

Review all exported functions to check if any can be eliminated by replacing them with multiple dispatch, e.g. print_with_color

StefanKarpinski · 2017-02-03T14:54:13Z

The typical pairing is push! and shift! when working with a queue-like data structure.
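
A sketch of that pairing on a plain vector used as a queue. Note that `shift!` was later renamed; the code below uses `popfirst!`, the Julia 1.x spelling, so that it runs on current versions.

```julia
queue = Int[]
push!(queue, 1)          # enqueue at the back
push!(queue, 2)
push!(queue, 3)
head = popfirst!(queue)  # dequeue from the front (this was `shift!` at the time)
```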

StefanKarpinski · 2017-02-03T14:56:01Z

If we're not going to use the typical name pairing for this kind of data structure because we're worried that the operation entails communication overhead that isn't adequately conveyed by those names, then I don't think push! makes sense either. send and recv really might be better.

malmaud · 2017-02-03T15:12:38Z

Maybe double-check that there is general consistency between whether functions take a tuple as the last argument or a vararg.
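
A sketch of the two styles side by side. `cellcount` is a hypothetical function written only for this illustration; `zeros` is shown because Base accepts both forms for it.

```julia
# Hypothetical function, for illustration only: accepts the dimensions as a tuple...
cellcount(dims::Tuple{Vararg{Int}}) = prod(dims)
# ...and as varargs, forwarding to the tuple method.
cellcount(dims::Int...) = cellcount(dims)

from_varargs = cellcount(2, 3)
from_tuple = cellcount((2, 3))

# Base functions such as `zeros` accept both spellings.
same_shape = size(zeros(2, 3)) == size(zeros((2, 3)))
```

Supporting both is cheap when one method forwards to the other; the consistency question is which functions should offer both and which only one.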

simonbyrne · 2017-02-03T16:07:28Z

Perhaps too big for this issue, but it would be good to have consistent rules on when functions should throw errors, and when they should return Nullables or Unions (e.g. parse/tryparse, match, etc.)
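
A sketch of the distinction using the functions named above. This shows Julia 1.x behavior, where the "no result" case is `nothing` rather than a `Nullable`, so it reflects where things ended up rather than the state at the time of this comment.

```julia
ok = tryparse(Int, "42")        # returns the parsed value
bad = tryparse(Int, "abc")      # returns `nothing` instead of throwing
nomatch = match(r"\d+", "abc")  # also signals "no result" with `nothing`

# `parse` is the throwing counterpart of `tryparse`.
threw = try
    parse(Int, "abc")
    false
catch err
    err isa ArgumentError
end
```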

StefanKarpinski · 2017-02-03T21:37:37Z

No issue too big, @simonbyrne – this is the laundry list.

StefanKarpinski · 2017-02-03T21:40:47Z

Btw: this isn't really for specific changes (e.g. renaming specific functions) – it's more about kinds of things we can review. For specific proposed changes, just open an issue proposing that change.

bramtayl · 2017-02-04T00:25:53Z

> We have a lot of tools like @code_xxx that are paired with underlying functions like code_xxx

Not sure if this is what you're talking about, but see CreateMacrosFrom.jl

tkelman · 2017-02-04T10:21:41Z

Whether "output type" API's should be in terms of element type or overall container type (ref Functions that return arrays with eltype as input should use container type instead? #11557 and RFC/WIP: Support array type as input for functions returning AbstractArray instance #16740)
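
To illustrate the two styles, a sketch using calls that exist in Julia 1.x; which style any given function should prefer is exactly the open question here.

```julia
# Element type as input: the caller names the eltype, the function picks the container.
by_eltype = collect(Float64, 1:3)
z = zeros(Float32, 2, 2)

# Container type as input: the caller names the whole array type.
by_container = Vector{Float64}(1:3)
s = similar(Matrix{Float32}, (2, 2))  # contents are uninitialized
```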

dpsanders · 2017-02-07T13:13:29Z

Document all exported functions (including doctests)

pkofod · 2017-02-07T22:04:29Z

> Document all exported functions (including doctests)

If this falls under this issue, then maybe also: remember to label your tests with the issue/PR number. It makes it a lot easier to understand why that test is there. I know how git blame works, but when adding testsets (just to give an example) it's sometimes a bit of a mystery what is being tested, and it would be great if the issue/PR number was always there.
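
A sketch of what such a label could look like. The issue number `#00000` is a placeholder, not a real reference, and the test itself is only an example.

```julia
using Test

# "#00000" is a placeholder issue number, used here only to show the labeling style.
@testset "sum of an empty range is zero (issue #00000)" begin
    @test sum(1:0) == 0
end
```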

stevengj · 2017-02-10T02:23:07Z

@dpsanders: and exported macros! e.g. @fastmath has no docstring.

amellnik · 2017-04-18T18:16:18Z

This is very minor, but the string and Symbol functions do almost the same thing and have different capitalization. ~~I think symbol would make more sense.~~

ararslan · 2017-04-18T18:24:06Z

@amellnik The difference is that Symbol is a type constructor and string is a regular function. IIRC we used to have symbol but it was deprecated in favor of the type constructor. I'm not convinced a change is necessary for this, but if anything I think we should use the String constructor in place of string.

yuyichao · 2017-04-18T18:36:05Z

> if anything I think we should use the String constructor in place of string.

No, they are different functions and shouldn't be merged

julia> String(UInt8[])
""

julia> string(UInt8[])
"UInt8[]"

jrevels · 2017-04-18T18:59:47Z

> No, they are different functions and shouldn't be merged

This looks like a situation where string(args...) should just be deprecated in favor of sprint(print, args...), then, since having both string and String is confusing. We could specialize on sprint(::typeof(print), args...) to recover any lost performance. Along these lines, it might also make sense to deprecate repr(x) for sprint(showall, x).
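
A sketch of the equivalences being proposed, written against Julia 1.x. It sticks to `show` for the `repr` comparison rather than `showall`, since I'm not sure `showall` is available in later versions.

```julia
via_string = string(1, 'a', "b")
via_sprint = sprint(print, 1, 'a', "b")  # same text, built through `print`

bytes_as_text = string(UInt8[])  # prints the array, as in the example above
bytes_as_data = String(UInt8[])  # interprets the bytes as string data

quoted = repr("hi")              # compare with sprint(show, "hi")
```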

yuyichao · 2017-04-18T19:07:17Z

That sounds ok although calling string to turn something into a string seems pretty standard....

ararslan · 2017-04-18T19:22:55Z

> calling string to turn something into a string seems pretty standard

Yes, but that's where the disconnect between String and string comes in.

StefanKarpinski · 2017-12-16T02:59:39Z

Please take a look! I haven't gotten a chance to go through these issues systematically.

nalimilan · 2018-01-02T13:45:50Z

BTW, I've noted an inconsistency in the naming of traits: we have iteratorsize, iteratoreltype, but IndexStyle, TypeRangeStep, TypeArithmetic and TypeOrder. Looks like the CamelCase variants are more numerous and more recent, so maybe we should adopt that convention everywhere?
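
For context, a sketch of how trait queries look with CamelCase names. As far as I know, the two lowercase traits ended up as `Base.IteratorSize` and `Base.IteratorEltype` in Julia 1.x, which is what the code below assumes; it does not reflect the names in use when this comment was written.

```julia
range_size = Base.IteratorSize(1:3)                              # a range has a known shape
filter_size = Base.IteratorSize(Iterators.filter(iseven, 1:3))   # a filter does not
range_eltype = Base.IteratorEltype(1:3)                          # the element type is known
index_style = IndexStyle([1, 2])                                 # arrays use linear indexing
```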

StefanKarpinski · 2018-01-02T15:12:22Z

Those should definitely be made consistent. Do you want to make a PR?

nalimilan · 2018-01-02T15:14:01Z

I think this should be fixed as part of #25356.

EDIT: see also #25440

StefanKarpinski · 2018-01-11T20:24:17Z

This is mostly done or can be done in 1.x releases. I can update the checkboxes, but we just went through them on the triage call and everything but #25395 and the underscore audit are done.

ararslan · 2018-01-17T01:01:02Z

Underscore audit

The following is an analysis of all symbols exported from Base which contain underscores, are not deprecated, and are not string macros. The main thing to note here is that these are exported names only; this does not include unexported names that we tell people to call qualified.

I've separated things out by category. Hopefully that's more useful than it is annoying.

Reflection

We have the following macros with corresponding functions:

Whatever change is applied to the macros, if any, should be similarly applied to the functions.

module_name -> nameof (Deprecate module_name in favor of nameof(::Module) #25622)
module_parent -> parentmodule (Deprecate module querying functions to parentmodule methods #25629, see [RFC] rename module_parent to enclosingmodule #25436 for a previous attempt at renaming)
method_exists -> hasmethod (remove some underscores #25615)
object_id -> objectid (remove some underscores #25615)
pointer_from_objref

pointer_from_objref could perhaps do with a more descriptive name, maybe something like address?
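
A sketch of the renamed functions from the list above, assuming the Julia 1.x names (`nameof`, `parentmodule`, `hasmethod`, `objectid`) that the linked PRs introduce:

```julia
modname = nameof(Base.Iterators)                 # was module_name
parent_mod = parentmodule(Base.Iterators)        # was module_parent
has_len = hasmethod(length, Tuple{Vector{Int}})  # was method_exists
id = objectid("abc")                             # was object_id

# pointer_from_objref keeps its name; it only works on mutable objects.
r = Ref(1)
p = pointer_from_objref(r)
same_object = unsafe_pointer_to_objref(p) === r  # valid while `r` is still referenced
```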

Aliases for C interop

The type aliases containing underscores are C_NULL, Cintmax_t, Cptrdiff_t, Csize_t, Cssize_t, Cuintmax_t, and Cwchar_t. Those that end in _t should stay, as they're named to be consistent with their corresponding C types.

C_NULL is the odd one out here, being the only C alias containing an underscore that isn't mirrored in C (since in C this is just NULL). We could consider calling this CNULL.

C_NULL

Bit counting

For a discussion of renaming these, see #23531. I very much favor removing the underscores for these, as well as some of the proposed replacements in that PR. I think it should be reconsidered.

Unsafe operations

It's probably okay to keep these as-is; the ugliness of the underscore further underscores their unsafety.

Indexing

broadcast_getindex
broadcast_setindex!
to_indices

Apparently broadcast_getindex and broadcast_setindex! exist. I don't understand what they do. Perhaps they could use a more descriptive name?

Interestingly, the single index version of to_indices, Base.to_index, is not exported.

Traces

catch_backtrace
catch_stacktrace -> stacktrace(catch_backtrace()) (remove some underscores #25615)

Presumably these are the catch block equivalents of backtrace and stacktrace, respectively.
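
A sketch of the replacement listed above for catch_stacktrace, i.e. calling `stacktrace` on the result of `catch_backtrace()` inside a catch block:

```julia
frames = try
    error("boom")
catch
    # Equivalent of the old catch_stacktrace(): symbolicate the backtrace
    # of the exception currently being handled.
    stacktrace(catch_backtrace())
end
```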

Tasks, processes, and signals

Streams

redirect_stderr
redirect_stdin
redirect_stdout
nb_available -> bytesavailable (Rename nb_available to bytesavailable #25634)

It would be nice to have a more general IO -> IO redirection function into which all of these could be combined, e.g. redirect(STDOUT, io), thereby removing both underscores and exports.

Promotion

promote_rule
promote_shape
promote_type

See #23999 for a relevant discussion regarding promote_rule.

Printing

print_with_color -> printstyled (see Deprecate print_with_color #25522)
print_shortest (see deprecate print_shortest? #25745)
escape_string (see Deprecate (un)escape_string to Unicode.(un)escape #25620)
unescape_string

escape_string and unescape_string are a little odd in that they can print to a stream or return a string. See #25620 for a proposal to move/rename these.
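
A sketch of the two modes mentioned here, using the Julia 1.x signatures: with an IO argument first the function prints to the stream, and without one it returns a string.

```julia
raw = "tab\there"

escaped = escape_string(raw)   # returns a new string with the tab escaped

buf = IOBuffer()
escape_string(buf, raw)        # prints the escaped form to the stream instead
streamed = String(take!(buf))

roundtrip = unescape_string(escaped)
```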

Code loading

include_dependency
include_string

include_dependency. Is this even used outside of Base? I can't think of a situation where you would want this instead of include in any typical scenario.

include_string. Isn't this just an officially sanctioned version of eval(parse())?
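
A minimal comparison of the two, assuming the Julia 1.x spellings (`include_string` with an explicit module, and `Meta.parse` in place of `parse`). It only covers a single expression, so it does not show every way the two could differ.

```julia
code = "1 + 2"

via_include = include_string(Main, code)      # evaluate the code string in Main
via_eval = Core.eval(Main, Meta.parse(code))  # parse one expression, then evaluate it
```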

Things I didn't bother categorizing

gc_enable -> GC.enable (Move gc and gc_enable to their own module #25616)
get_zero_subnormals
set_zero_subnormals
time_ns

get_zero_subnormals and set_zero_subnormals could do with more descriptive names. Do they need to be exported?

JeffBezanson · 2018-01-17T04:44:58Z

+1 for method_exists => methodexists and object_id => objectid. It's also kind of silly that catch_stacktrace even exists. It can be deprecated to its definition, stacktrace(catch_backtrace()).

JeffBezanson · 2018-01-22T22:08:50Z

How do we feel about de-underscoring C_NULL? I've gotten pretty used to it, but I also buy the argument that none of the other C* names have an underscore.

iamed2 · 2018-01-22T22:22:21Z

The other C names are types, while C_NULL is a constant. I think it's good how it is and follows naming guidelines.

ararslan · 2018-01-22T22:23:50Z

> and follows naming guidelines.

How so?

StefanKarpinski · 2018-01-22T22:26:55Z

Constants are often all caps with underscores – C_NULL follows that. As @iamed2 said, it's a value, not a type, so the Cfoo naming convention doesn't necessarily apply.

iamed2 · 2018-01-22T22:35:44Z

I mistakenly thought https://github.com/JuliaLang/julia/blob/master/doc/src/manual/variables.md#stylistic-conventions referenced constants but it doesn't. It probably should.

juliohm · 2018-01-25T02:30:21Z

I suggest a consistent, mathematically sound, interface for general Hilbert spaces in which vectors are not Julia Arrays. Function names like vecdot, vecnorm, etc. could well be replaced by the general concepts of inner and norm as discussed in #25565.

StefanKarpinski · 2018-01-25T05:34:06Z

As I've said a few times, this is not a catchall issue for things one wants to change.

JeffBezanson · 2018-01-29T20:45:38Z

I believe the only items remaining under this umbrella for 1.0 are #25501 and #25717.

ararslan · 2018-01-29T20:57:14Z

I'd like to do something with (get|set)_zero_subnormals but maybe the best short-term solution is to just unexport them.

nalimilan · 2018-02-02T18:26:09Z

Something we should probably review is how numbers are treated in the context of collection operations like map and collect. It was pointed out that the former returns a scalar but the latter returns a 0D array.
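
A short sketch of the inconsistency, as it behaves in Julia 1.x:

```julia
mapped = map(x -> x + 1, 1)  # a number in, a plain number out
collected = collect(1)       # a number in, a zero-dimensional array out
```

So `map` treats the number as a scalar while `collect` treats it as a zero-dimensional collection.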

StefanKarpinski added this to the 1.0 milestone Feb 2, 2017

StefanKarpinski self-assigned this Feb 2, 2017

StefanKarpinski mentioned this issue Feb 9, 2017

implement unique! #20549

Closed

yurivish mentioned this issue Dec 19, 2017

Change argument order to read when reading a type #25189

Closed

bramtayl mentioned this issue Dec 19, 2017

Moar keywords #25188

Closed

mauro3 mentioned this issue Dec 21, 2017

RFC: IdDict replaces ObjectIdDict #25210

Merged

nalimilan mentioned this issue Jan 2, 2018

RFC: Do not consider iterators as scalars in broadcast #25356

Closed

bramtayl mentioned this issue Jan 4, 2018

Keywords unlocked part 1 #25395

Closed

nalimilan mentioned this issue Jan 7, 2018

Find better names for traits TypeOrder, TypeArithmetic and TypeRangeStep #25440

Closed

JeffBezanson added and removed the triage label ("This should be discussed on a triage call") Jan 10, 2018

JeffBezanson mentioned this issue Jan 17, 2018

remove some underscores #25615

Merged

JeffBezanson closed this as completed Jan 29, 2018

vtjnash mentioned this issue Nov 27, 2018

reliable line numbers in code info printing #29893

Merged

tkf mentioned this issue Sep 30, 2019

Decide API for setproperties and constructorof JuliaObjects/ConstructionBase.jl#1

Merged

Tokazama mentioned this issue Nov 8, 2022

Functional non-mutating methods. #46453

Open
