Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add feature toggles (DSP-910) #1742

Merged
merged 46 commits into from Nov 5, 2020
Merged

Conversation

benjamingeer
Copy link

@benjamingeer benjamingeer commented Oct 26, 2020

  • Implementation framework in FeatureFactory.scala, KnoraSettings.scala, and KnoraRoute.scala
  • Proof of concept: replace the implementation of ListsRouteADM using a feature toggle (to be used for DSP-597)
  • Update all v2 and admin routes so they can use feature toggles
  • Fix R2RSpec so tests can override the application configuration
  • Tests: FeatureToggleR2RSpec
  • Docs:
    • Design/development docs
    • API docs

Resolves DSP-910.

@subotic subotic changed the base branch from develop to main November 2, 2020 07:38
@benjamingeer benjamingeer self-assigned this Nov 3, 2020
@benjamingeer benjamingeer added API/Admin API/V2 enhancement improve existing code or new feature labels Nov 3, 2020
Copy link
Contributor

@SepidehAlassi SepidehAlassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for implementing this. I just see the following problem about not being able to specify the version number when turning a toggle off:
Imagine we have initial newListRoute:1 and months later we add newListRoute:2 both are by default off. After a while, we decide to make a release with newListRoute:1 by default on, but newListRoute:2 to be by default off because other subsystems still need time to catch up with changes of newListRoute:2.
Now, imagine a client pulls this release and is ready with support of newListRoute:2 and wants to use it. He has to turn newListRoute:1 off and turn newListRoute:2 on. If he cannot specify the version of the toggle that must be off, as it is now to achieve what he wants he needs to do

 X-Knora-Feature-Toggles: new-foo=off, new-foo:2=on

that is kind of confusing. I would suggest specification of version for turning a toggle off to be possible, i.e. user must be able to set

 X-Knora-Feature-Toggles: new-foo:1=off, new-foo:2=on

License along with Knora. If not, see <http://www.gnu.org/licenses/>.
-->

# Feature Toggles
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very well explained!

```

A version number must be given when enabling a toggle.
It is an error to specify a version number when disabling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, in my opinion, the user should be able to specify which version of the toggle must be disabled.

```

A version number must be given when enabling a toggle.
It is an error to specify a version number when disabling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above!

/**
* A user representing the Knora API server, used for initialisation on startup.
*/
private val systemUser = KnoraSystemInstances.Users.SystemUser
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this deleted?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was unused.

/**
* Indicates that a feature toggle is off.
*/
case object Off extends FeatureToggleState
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the reason explained above, I believe Off state must also have the version number as Off(version:Int) to indicate which version of a feature toggle is off.

/**
* A matchable object indicating that a feature toggle is off.
*/
case object Off extends MatchableState[Nothing]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

following the discussion above, I believe Off state should also know the version number

overrideAllowed: Boolean,
expirationDate: Option[Instant],
developerEmails: Set[String])

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add empty line here

* @param overrideAllowed `true` if this configuration can be overridden, e.g. by per-request feature
* toggle configuration.
* @param expirationDate the expiration date of the feature.
* @param developerEmails one or more email addresses of developers who can be contacted about the feature.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why we need the developer's email? since feature toggles last a year, maybe until it is removed a developer quits, then his email address would be a useless piece of information.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's useful to know who to contact if you have questions about a feature. If a developer quits, we can update the email address.

}
}

"not accept a version number when disabling a toggle" in {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not?

@benjamingeer
Copy link
Author

benjamingeer commented Nov 4, 2020

After a while, we decide to make a release with newListRoute:1 by default on, but newListRoute:2 to be by default off because other subsystems still need time to catch up with changes of newListRoute:2.

This would be done with a base configuration like this:

new-list-route {
    available-versions = [ 1, 2 ]
    default-version = 1
    enabled-by-default = yes
}

Now, imagine a client pulls this release and is ready with support of newListRoute:2 and wants to use it. He has to turn newListRoute:1 off and turn newListRoute:2 on.

No, he just has to request:

 X-Knora-Feature-Toggles: newListRoute:2=on

I think it wouldn't make sense to specify a version when disabling a feature, because you can't enable more than one version at a time. It's not possible to enable both newListRoute:1 and newListRoute:2. So there are only two possibilities:

  1. You turn on one version of the feature.
  2. You turn off all versions of the feature.

To turn on one version of the feature, you just enable it, specifying the version you want. There's no need to disable the other versions, because you can only get one version.

@benjamingeer
Copy link
Author

@benjamingeer
Copy link
Author

@SepidehAlassi I've added docs to clarify things, and added a test. Is this better?

@SepidehAlassi
Copy link
Contributor

@SepidehAlassi I've added docs to clarify things, and added a test. Is this better?

yes, it is very clear now, thank you.

Copy link
Contributor

@SepidehAlassi SepidehAlassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes, it looks good to me.


# Feature Toggles

Some Knora features can be turned or off on a per-request basis.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess: "Some Knora features can be turned on or off on a per-request basis."?

came from.

- An HTTP response should indicate which features are turned
on.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the response header also going to return turned off toggles?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it only returns enabled toggles. I was thinking that if a toggle is turned off by default because the feature isn't ready to use yet, there wouldn't be much point in returning it. But maybe it would be clearer and simpler to return everything. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think returning everything might not be a good idea. Since the lifetime of toggles is a year, we might have many (>10 even) toggles in place that are turned off but still available. IMO, returning all of them would confuse the user.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the only time this header might get looked at by a developer is if there is a problem. In that case, I would probably like to have the whole picture. Ideally, there would also be something like x-knora-api-version. Those two headers would then paint the whole picture (in combination with a table in the documentation).

As for the confusion bit, well toggles are going to add another layer of complexity for understanding what is happening with the API. Hiding them, would in this case probably provide more confusion, because a turned off toggle, which should maybe be on can be valuable information during debugging.

should have expiration dates except for long-lived ops toggles like `fast-baz` above.

`KnoraSettingsFeatureFactoryConfig` reads this base configuration on startup. If
a feature toggle has an expiration date in the past, the application will not start.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, this is a hard one. I wouldn't punish ops for developers not removing a toggle in time ;-) Also what should then happen, if the expiration date is passed while the app is deployed into production. I hope the app will not stop working? I think a log output would be sufficient.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to:

the application will log the error message "SHOW-STOPPER!!!" and the server will then self-destruct

Just kidding. 😃

/**
* The old lists route feature.
*/
private val oldListsRouteADMFeature = new OldListsRouteADMFeature(routeData)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe move the factory and features to a sub-package, e.g., org.knora.webapi.routing.admin.lists?

Copy link
Collaborator

@subotic subotic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this :-)

@SepidehAlassi
Copy link
Contributor

@benjamingeer I keep thinking about how we are going to add non-breaking new features, can you please not merge this PR until we discuss this thoroughly.

Please imagine the following scenario:
We haveoldListRoute (the original existing form), newListRoute:1 (contains some breaking changes), and newListRoute:2 (contains even more breaking changes). That would mean we have three versions of ListsRoute.scala file, 3 versions of its spec, and 3 versions of its documentation. At this point, we decide to add a totally new feature that is non-breaking (like a new route), how are we going to handle it? I see the following options, please correct me if I am missing something:

  1. Are we going to define a new toggle for it newListRoute:3 to wrap the new feature? That would mean the new feature would not be available to those who have not yet the support for newListRoute:2. That means clients cannot benefit from the new features for a long time because they do not have the support for the undelaying breaking changes.

  2. Are we going to add the new non-breaking feature to all three existing versions of the code (oldListRoute, newListRoute:1, and newListRoute:2)? This would make the new non-breaking feature available to all clients no matter which version of toggle they support, even those who still use the good old oldListRoute. However, that would mean adding the new piece of code to all 3 versions of ListsRoute.scala, its tests have to be added to all 3 versions of the ListsRouteSpec.scala, and its documentation has to be added to all three versions of lists.md. That would mean modifying 9 files instead of 3. The more toggles there are for routes, the more files to modify.

  3. Are we going to add a completely new independent toggle for it, newRouteforList:1, with a very short lifetime? That means, if newRouteforList:1=on use the copy of ListsRoutes.scala that has the new non-breaking feature, otherwise use the oldListRoute. In this case, once the short-lived toggle newRouteforList:1 is removed, we should add the same changes to the codes under toggles newListRoute:1, and newListRoute:2 that are still alive.

I would appreciate knowing your opinion on this, it can well be that I am missing a fourth perfect option.

@subotic
Copy link
Collaborator

subotic commented Nov 5, 2020

Does it probably depend on the granularity of a feature toggle? In the case of newListRoute:1 the whole route is under the toggle. Maybe in your example, it would make sense to only have parts of the route under the toggle?

Or, the implementation is broken down one more level, where each subroute goes into its separate file and the factory part is only referencing the separate implementations. Those separate implementations are then unit tested.

I'm sure, that there are a lot more questions open, that we didn't think of yet. We need to jump into the cold water and see what happens.

- Move lists features into subpackage.
- If a feature toggle is expired, just log a warning.
- Update docs.
@benjamingeer
Copy link
Author

benjamingeer commented Nov 5, 2020

I've made the format of the request and response headers identical (listing all toggles in the response header), because I think this will make the response header easier to read: you don't have to figure out that something is turned off because it's not there, you can see that it's turned off.

I changed KnoraSettings so that if a feature toggle has expired, you just get a warning in the log on startup.

I moved the list factory into a sub-package.

@subotic you wrote:

Ideally, there would also be something like x-knora-api-version

Doesn't the existing server version header include that?

@SepidehAlassi About non-breaking new features, I agree with Ivan. In general I think it wouldn't be bad to go with your option (2), especially if we do as Ivan suggests:

the implementation is broken down one more level, where each subroute goes into its separate file and the factory part is only referencing the separate implementations. Those separate implementations are then unit tested.

That would reduce code duplication.

But maybe we can take a different approach in each case, depending on how urgent the new feature is, how much work is involved in implementing support for the breaking changes, etc.

If it's OK with both of you, I'll merge this now.

@subotic
Copy link
Collaborator

subotic commented Nov 5, 2020

Fine with me :-) Thanks!

@SepidehAlassi
Copy link
Contributor

@benjamingeer it is also fine with me, thanks for all your work.

@benjamingeer
Copy link
Author

Many thanks to both of you for the very helpful reviews!

@benjamingeer benjamingeer merged commit 2e6db2e into main Nov 5, 2020
@benjamingeer benjamingeer deleted the wip/DSP-910-feature-toggle branch November 5, 2020 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API/Admin API/V2 enhancement improve existing code or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants