Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ruby codegen: support generation of rbs files #15633

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

HoneyryderChuck
Copy link

this introduces support for a new protoc option, --rbs_out, which points to a directory where ruby type definition files, defined in the RBS format, are stored.

rbs is the type signature syntax blessed by the ruby core team, used by static analysis tools such as steep, which integrates with VS Code, the irb console (for features such as autocompletion). and typeprof.

It relies on type definitions written into .rbs files.

The protobuf library already exposes type definitions in gem_rbs_collection, which is used to source type definitions for libraries which do not want, or can't maintain type definitions themselves.

(protobuf could arguably import these into this repository, lmk if you're interested).

This should fix gaps such as better IDE integration, such as the ones described
here.

The plan is to do roughly the same type of integration as was done for .pyi annotations, which also write to separate files: add the cli option and the rbs generator.

protobuf classes in ruby rely on dynamic attribution of the base class, which makes what I'm trying to achieve a bit difficult. Ideally the type hierarchy could be specified statically in the ruby source code.

class Bar < AbstractMessage
class Bar <
Google::Protobuf::DescriptorPool.generated_pool.lookup("Bar").msgclass

@HoneyryderChuck HoneyryderChuck requested review from a team as code owners January 29, 2024 17:37
@HoneyryderChuck HoneyryderChuck requested review from ericsalo and removed request for a team January 29, 2024 17:37
Copy link

google-cla bot commented Jan 29, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@HoneyryderChuck
Copy link
Author

@acozzette can I ask for your input on directions here, given your thoughts described in #9495 ?

@zhangskz zhangskz requested review from acozzette and removed request for a team and ericsalo February 5, 2024 16:47
@acozzette
Copy link
Member

@HoneyryderChuck This looks like a great direction to me, but I will let @haberman review this since he is much more knowledgeable about Ruby than I am.

@HoneyryderChuck
Copy link
Author

@haberman any feedback? waiting for comments about the direction, before coming back to this.

@haberman haberman added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 6, 2024
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 6, 2024
@haberman
Copy link
Member

haberman commented May 6, 2024

Sorry for the delay. I think this is a good direction, and I appreciate the small and incremental PR. I think we should move forward with this.

It looks like the CLA check is failing.

@HoneyryderChuck
Copy link
Author

thx for the feedback @haberman . Did you have a look at #15881 , which may be already the endgame?

@haberman
Copy link
Member

haberman commented May 9, 2024

#15881 has a much greater scope, and I think we'd need to individually evaluate its many parts.

I'm convinced that we want a .rbs code generator that has static types for all of the generated APIs. I'm less certain about:

  • Adding .rbs files for our public core APIs
  • Adding .rbs files for our internal core APIs
  • Adding .rbs files for our unit tests
  • Adding Steep type checking to our build

@HoneyryderChuck
Copy link
Author

makes sense 👍 will deal with CLA and move on from there.

this introduces support for a new protoc option, `--rbs_out`, which
points to a directory where ruby type definition files, defined in the
RBS format, are stored.

[rbs](https://github.com/ruby/rbs) is the type signature syntax blessed
by the ruby core team, used by static analysis tools such as
[steep](https://github.com/soutaro/steep), which integrates with VS
Code, the `irb` console (for features such as autocompletion). and
[typeprof](https://github.com/ruby/typeprof).

It relies on type definitions written into `.rbs` files.

The `protobuf` library already exposes type definitions in
[gem_rbs_collection](https://github.com/ruby/gem_rbs_collection/tree/main/gems/google-protobuf/3.22),
which is used to source type definitions for libraries which do not
want, or can't maintain type definitions themselves.

(`protobuf` could arguably import these into this repository, lmk if
you're interested).

This should fix gaps such as better IDE integration, such as the ones
described
[here](protocolbuffers#9495).

The plan is to do roughly the same type of integration as was done for `.pyi`
annotations, which also write to separate files: add the cli option and
the rbs generator.

protobuf classes in ruby rely on dynamic attribution of the base class,
which makes what I'm trying to achieve a bit difficult. Ideally the type
hierarchy could be specified statically in the ruby source code.

```ruby
class Bar < AbstractMessage
class Bar <
Google::Protobuf::DescriptorPool.generated_pool.lookup("Bar").msgclass
```

```
@HoneyryderChuck HoneyryderChuck requested a review from a team as a code owner May 10, 2024 15:47
@HoneyryderChuck HoneyryderChuck changed the title draft: ruby codegen: support generation of rbs files ruby codegen: support generation of rbs files May 10, 2024
@HoneyryderChuck
Copy link
Author

@haberman went with cherry-picking @qnighy 's commit, and removing the parts not related with sole generation of rbs files on target protos. There was a lot done there that I'd struggle to replicate with the same quality, as I'm still fighting with the build system to generate binaries I can test with locally.

FWIW RBS definitions to protobuf itself are already available here; while it's possible to have the definitions within this repo itself as per the other MR, it's a choice of the maintainers.

@qnighy
Copy link
Contributor

qnighy commented May 13, 2024

Honestly, from the experience working on the PR, I agree that it is probably too early to introduce a typechecker to an existing large project like google-protobuf, which naturally have idioms incompatible with the current type system.

That said, I think it is better to bundle the public type definition here.

The reason: firstly, just similarly to other ecosystems like TypeScript, type definition files are not mere derivative of the runtime code; writing type definitions inherently involves additional design decision. For example, whether to give a type an alias or not, and its name, depends on the author of the type definitions.

Secondly, in google-protobuf, the runtime and the generated code are expected to work in cooperation, and this would also be true in type definitions as well. Therefore, design decision made in the runtime type definition naturally depends on the decisions in the generator.

So, to consistently maintain type definitions, I think it should bundle the one for the runtime as well.

@haberman
Copy link
Member

Those arguments make sense to me, but we also have to balance the benefits against the maintenance cost. The primary motivation for introducing type signatures for generated code is to make generated interfaces more discoverable and to help IDEs, because the Ruby generated code is currently very opaque, especially since we merged #12462.

The public APIs from the core runtime will also have this issue to some extent, since many interfaces are implemented in C and are probably not discoverable to IDEs. We have Rubydoc embedded in comments, like this example, but that won't help IDEs:

/*
* call-seq:
* DescriptorPool.lookup(name) => descriptor
*
* Finds a Descriptor, EnumDescriptor or FieldDescriptor by name and returns it,
* or nil if none exists with the given name.
*/

But once we're talking about internal-only APIs (like everything in ffi/) and especially unit tests, I think we start to get diminishing returns. The cost is having to write everything twice (once in .rb and once in .rbs), using an unfamiliar language/tool that nobody on the team currently knows, and we don't get any benefits unless we perform type checking in our own project. I think the maintenance burden would outweigh the benefit.

Steep describes itself as "Gradual Typing", which I assume means that type signatures are not required to be complete. Anything that has type signatures is checked, but everything else can continue to work even if it's not typed.

I think the sweet spot would be if we could generate .rbs for generated code, and possibly for the public interfaces in the core runtime, and add just enough type checking in our CI tests to ensure that those type signatures are correct with respect to the actual code. But we wouldn't bother with typing the internal-only interfaces or unit tests.

How does that sound?

@HoneyryderChuck
Copy link
Author

@haberman that sounds reasonable, but I don't think we need to do this as part of this PR. FWIW anyone forcibly generating rbs defs for their protobufs will very likely know about rbs-collection already, and get typedefs for core runtime.

There'd still be value in adding typedefs to core runtime; I could see steep integration in CI via the added rbs-based suite catching basic bugs, and could be gradually adopted by the core team (or not).

@haberman
Copy link
Member

I totally agree. If anything, I'm arguing to limit the scope of each individual PR and make things as incremental as possible.

@HoneyryderChuck
Copy link
Author

@haberman thx for the feedback. Would you mind adding the safe for tests label, so we can continue moving forward?

@haberman haberman added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 14, 2024
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 14, 2024
@haberman
Copy link
Member

I ran the tests, looks like the build failed.

@HoneyryderChuck
Copy link
Author

@haberman is it possible for the bot not to remove the "safe for tests" label?

@haberman
Copy link
Member

@haberman is it possible for the bot not to remove the "safe for tests" label?

Unfortunately not, the purpose of the tag is to visually verify each commit to ensure that our CI infrastructure isn't being exploited either for resources (eg. bitcoin mining) or to exfiltrate repository secrets.

@haberman haberman added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 29, 2024
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 29, 2024
@HoneyryderChuck
Copy link
Author

@haberman would you mind having a look? The CI failures seem unrelated to this change (windows bazel build failing on objective C / java, and some security check around workflow files).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants