Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue to track comments and updates to the proposed Checked C extension changes. #470

Open
sulekhark opened this issue Aug 24, 2021 · 3 comments

Comments

@sulekhark
Copy link
Contributor

This page proposes a set of changes to the Checked C extension to improve backward compatibility. Creating this issue to track review/suggestions/updates to this proposal.

@mwhicks1
Copy link
Collaborator

Overall this looks good! Here are a few comments.

Including an extra header

The proposal states

You would modify your C files to include a header file that results in Checked C annotations and changes being removed when the compiler does not support Checked C.

This implies that a normal compiler will interpret the header, which serves to undo the Checked C syntax, but that the Checked C clang will override normal #include processing so as to ignore this header. Two questions:

  • Must the user include this header to enable the alternative syntax when using the Checked C compiler?
  • Will there be a way for the normal Checked C compiler to treat the header as a normal compiler would, so that it has the effect of undoing Checked C syntax?

It feels unfortunate that this is done via a header inclusion; it's a change to the source file itself. E.g., if you could just change environment variables or something like that, that would be preferred (to me).

Type qualifiers on pointers

I like the idea of treating Checked types as type qualifiers. Note that this idea was done in Cyclone in ~2005, so it predates the Condit paper you cite. (See the Cyclone manual, which presents notation that prefixes qualifiers with @, and the introduction of this idea in the 2006 SCP paper, where you can see the @aqual notation.)

You might include examples with nested qualifiers -- I presume you want things like

int * _Ptr * _Array p;

to indicate an array of singleton pointers. What happens if pointers already have qualifiers on them? E.g., many standard library functions use restrict. Would it be something like the following?

int * _Ptr restrict * _Array p;

I don't know how the C parser handles multiple qualifiers, but I presume it does and you can do the same thing.

Checked arrays (and blocks?)

Just want to make sure: You are basically saying that the syntax for the _Checked (and _Nt_checked) keywords is unchanged because they can be #defined away, whether attached to code blocks or to array declarations; is that right?

Polymorphic types

The requirement that programmers remember to typedef type variables seems like it's asking for trouble:

#ifndef __checkedc__
typedef void T
#endif

That is, I can imagine someone easily forgetting to do this. Moreover, type abstraction for generics is local, whereas the typedef will have to be global. There could be conflicts with other typedefs or variable names in the code, no?

Itypes

What about itypes? I don't see anything about these, and yet they are an important/pervasive part of the language. Are they basically similar to _Bounds(...) replacing : bounds(...) ?

@mattmccutchen-cci
Copy link
Contributor

mattmccutchen-cci commented Sep 7, 2021

Can we please have a more specific title for this issue and the wiki page? Maybe "erasable" Checked C syntax?

Re Mike's specific points:

  • Including an extra header: I think the idea is that the user's files will include optcheckedc.h unconditionally, but all the macro definitions in optcheckedc.h will be inside #ifndef __checkedc, so when the user's files are processed by the Checked C compiler, the macros won't get defined and the compiler will process the annotations as normal. This should be clarified in the wiki page.

    Will there be a way for the normal Checked C compiler to treat the header as a normal compiler would, so that it has the effect of undoing Checked C syntax?

    Why do we need this?

  • Polymorphic types: The wiki page was recently updated to solve this problem by using a _TyVar macro instead of typedefs.

  • Itypes: Yes, the end of the Bounds annotations section says there will be _Itype analogous to _Bounds.

One other problem I noticed:

  • The *_bounds_cast macros need to accept a relative alignment parameter as described in the Checked C specification.

Aside from that, the proposal looks good to me for all the Checked C language features that it covers (I may not be aware of some of the less common language features). It's possible I would find additional problems if I thought about it harder, but I think the chance that those problems would require significant rework is low enough that it's probably fine to proceed to implementation and address any further problems as we find them.

The syntax looks very cluttered and hard to read in the more complex cases such as bsearch, but I don't know if anything can be done about that within the design constraints. For complex pieces of code, we could offer the alternative approach of maintaining two copies of the code, one in plain C and one using dedicated (non-erasable) Checked C syntax, and have the Checked C compiler verify that they are consistent. (This might be similar to the int *p CHKC(_Ptr<int>); proposal that Mike relayed to us back in May.) For the subset of cases that involve declarations of redeclarable global program elements (functions and variables) that have only itypes rather than fully checked types, since the compiler already supports merging of unchecked and itype declarations, we can already more-or-less do this by putting the Checked C declaration inside #ifdef __checkedc, modulo the need to use #pragma CHECKED_SCOPE push/off/pop around every unchecked declaration (which would get annoying). We'd still need a solution for fully checked types and for non-redeclarable elements (local variables, structs, etc.), which might end up requiring significant compiler changes in total. But note that we already need a solution for non-redeclarable global elements for the Checked C system headers (see #452 for structs in particular), independent of the erasable syntax proposal.

As the "Motivation" section of the wiki page says, one problem that seems to be out of scope of the current proposal is that users get no guarantee of spatial memory safety when compiling with a plain-C compiler since the runtime checks won't be inserted. It would be great if users could compile with the Checked C compiler, verify that there are no errors, and then turn around and compile with a specialized plain-C compiler required by their target system and have a guarantee of spatial memory safety. This probably could be achieved if the Checked C compiler reported an error instead of inserting an implicit runtime check when it cannot prove that an operation is safe, thus forcing the user to add the necessary explicit _Dynamic_checks (which could be a useful feature independent of erasable syntax for users who want to audit the potential runtime failure sites in their code). If the compiler currently leaves it to a later LLVM optimization pass to determine the necessity of runtime checks, it would need to be changed to do that during compilation proper using its own bounds proof logic (as it does when one pointer is assigned to another). Shall I file a separate issue for this to possibly be considered in the future? Update 2021-09-28: I filed microsoft/checkedc-clang#1188 for an option to report insertion of implicit runtime checks, so that no longer needs to be tracked in this issue.

@sulekhark
Copy link
Contributor Author

sulekhark commented Sep 7, 2021

A proposal for name change (to emphasize that these are type qualifiers for pointer types, and to have uniform names for type qualifiers introduced by Checked C):

  • _Single to _SPtr
  • _Array to _APtr
  • _Nt_array to _NtPtr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants