So in the past week a bunch of us have been talking about API Keys, OAuth, passwords, and other means of managing authn and authz in the web apps that are up and coming (specifically mentioned were copr and datagrepper). Puiterwijk has put in some time reading the OAuth specifications and on Friday he walked me through how OAuth is supposed to work. I'll give a summary of his talk here and then we can kick off some discussion.
OAuth is a standardized method for a user to grant access to resources that they own to people and things that are not themselves. Currently this is being used to allow a user to control the access to data and actions that may be performed on one web service by another web service. The concepts and mechanisms can be used in any situation where the user wants to limit what the software they are using can do on their behalf.
= Part I: What is OAuth? =
== Let's name all the things ==
* Protected Resource is either data or a function that you want to use.
* Resource Server is the server that hosts the data.
* Client is the program that needs access to the Protected Resource.
* Resource Owner is the person (may also be a system, but we'll concentrate on actual people for this summary) who is authorized to grant clients access to the Resource.
* Authorization Server is the server that grants tokens and codes.
.. note:: `access` is used to mean that the client can use the protected resource. That usage might cause changes to data or cause other actions to be taken (like kicking off a build). Be careful not to read `access` as "ability to read the data".
Example time:
There's a resource called 'full name of toshio'. It's hosted on the Fedora Account System so FAS is the resource server. Since toshio is my account, I am the resource owner for it. When I log into the PackageDB, PackageDB wants to display my full name to greet me. PackageDB needs to contact FAS for that information. Thus PackageDB is the client. If we create an OAuth server, that server will be the Authorization Server that verifies my identity and asks me to grant access to 'full name of toshio' to the PackageDB. Once I've done so, it will issue the tokens and codes that actually let the PackageDB get the information it wants from FAS.
== Flow of a basic request from start to finish ==
* The client program needs to get access to a protected resource.
* The client asks the authorization server for a client-id and tells the server which permissions it needs.
* The authorization server gives a url to the client.
* The client program redirects the user to that url so the user can grant permissions to the client.
* The authorization server authenticates the user (ie: they log in to the authorization server).
* The authorization server asks the user to confirm they want to grant the requested permissions to the client.
* If the answer is no, the protocol ends.
* If the answer is yes, the user is redirected to the client with an `authorization code` in the request.
* The client sends the `authorization code` to the authorization server.
* The authorization server generates an `access token` with the specific permissions that the client requested, expires the `authorization code`, and returns the `access token` to the client.
* The client requests the protected resource from the resource server using the `access token`.
* The resource server verifies that the `access token` is valid. If it is, it allows access.
.. note:: the `authorization code` is only good for retrieving a single `access token` for the particular set of permissions that the user confirmed.
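The flow above can be sketched as a toy, in-memory simulation. This is only an illustration under stated assumptions: the class names, method names, and the ``fas:full_name`` permission string are hypothetical, nothing here talks over the network, and a real authorization server would also authenticate the client itself.

```python
import secrets


class AuthorizationServer:
    """Toy in-memory authorization server (hypothetical, not a real library)."""

    def __init__(self):
        self._codes = {}   # authorization code -> (client_id, permissions)
        self._tokens = {}  # access token -> (client_id, permissions)

    def authorize(self, client_id, permissions, user_approves):
        """Called once the user has logged in and answered the prompt."""
        if not user_approves:
            return None  # the protocol ends here
        code = secrets.token_urlsafe(16)
        self._codes[code] = (client_id, frozenset(permissions))
        return code

    def exchange(self, client_id, code):
        """Swap an authorization code for an access token, exactly once."""
        grant = self._codes.pop(code, None)  # pop() expires the code
        if grant is None or grant[0] != client_id:
            raise PermissionError("unknown or already used authorization code")
        token = secrets.token_urlsafe(32)
        self._tokens[token] = grant
        return token

    def permissions_for(self, token):
        """Let a resource server look up what a token grants."""
        grant = self._tokens.get(token)
        return grant[1] if grant else frozenset()


class ResourceServer:
    """Toy resource server that asks the authorization server on every request."""

    def __init__(self, auth_server, resources):
        self._auth = auth_server
        self._resources = resources  # permission name -> protected data

    def get(self, permission, token):
        if permission not in self._auth.permissions_for(token):
            raise PermissionError("token does not grant " + permission)
        return self._resources[permission]


# PackageDB (the client) wants toshio's full name from FAS (the resource server).
auth = AuthorizationServer()
fas = ResourceServer(auth, {"fas:full_name": "Toshio Kuratomi"})

code = auth.authorize("packagedb", ["fas:full_name"], user_approves=True)
token = auth.exchange("packagedb", code)
full_name = fas.get("fas:full_name", token)
```

Calling ``auth.exchange("packagedb", code)`` a second time raises ``PermissionError``, which mirrors the note that an authorization code is only good for a single access token.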
.. note:: A client can request access to multiple resources at once. Assuming the resource owner accepted all of them, the access token the client receives at the end will allow access to all of those. A client typically has one access token from an authorization server that grants it all needed permissions on all of the resource servers that the authorization server can give out permissions for. It is possible for a client to have multiple access tokens with different permissions from the same authorization server but the client would have to keep track of which permissions were granted by which token (and the user would have had to confirm that the client should be granted each set of permissions).
.. question:: an access token can contain permissions for multiple resource servers. How do we secure the token from being used maliciously by a different resource server? ie: I get an access token which grants some permissions on both fas and bodhi. I send that access token to fas to retrieve some information. What prevents fas from hanging onto that token and using it to access the protected resources on bodhi that it grants without my knowledge?
== But wait, there's more! ==
We've now seen one authorization via OAuth. But OAuth is flexible. There are a few variations to be aware of:
* Other ways to request the access token. The example above is what works best for third-party web clients. However, there are other flows that might work better for CLI apps or "trusted" web clients:
  - Implicit: the user gets the access token directly from the authorization server rather than through an authorization code. This shortcut is useful when the client is entirely in the browser (no third-party server involved). With a third-party server, the authorization code makes it so the user never sees the actual access token, only the authorization code. If the client is running on the user's machine anyhow, there's no sense in that step.
  - Resource owner password credentials: The resource owner provides their credentials (username and password) to the client. The client retrieves the access token from the authorization server using the credentials. Then it discards the credentials and only keeps the access token for further requests.
  - Client credentials: Just defines that if the client is the resource server, it can authenticate itself to access its own resources... I'm a little unclear on this but I think one use would be for a resource server to use its externally available functions (which are protected by OAuth) rather than having to write an equivalent function that is usable internally. puiterwijk mentions a different use: having a strict separation between tenants in the resource server's model and then having to prove you have permission to access the resource from a different tenant (not something we're likely to do).
* Verification of the access token can take many forms.
  - The authorization server could notify the resource server whenever a new access token is issued/revoked.
  - The resource server could ask the authorization server to verify the token each time it receives one.
  - The token could be signed by the auth server and thus be verifiable in and of itself. The token could then contain the list of permissions so that the resource server would just consult the token to know what was available. This should not be preferred as it makes revoking a token harder.
* The authorization server may or may not know about the range of permissions that it can grant. The resource server needs to interpret what the permissions the access token grants mean, so if the authorization server grants a made-up permission the application should just ignore it.
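To make the third verification option concrete, here is a minimal sketch of a self-verifying token using an HMAC. The token layout, the shared ``SECRET``, and the function names are all assumptions made for illustration; they are not taken from the OAuth specifications. The sketch also shows the resource server ignoring permissions it does not recognize.

```python
import base64
import hashlib
import hmac
import json

# Assumption: the authorization server and resource server share this key.
SECRET = b"shared-between-auth-and-resource-server"


def issue_token(user, permissions, expires_at):
    """Authorization server side: serialize the claims and sign them."""
    payload = json.dumps(
        {"user": user, "permissions": sorted(permissions), "expires_at": expires_at},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(signature).decode("ascii")
    )


def verify_token(token, now, known_permissions):
    """Resource server side: return the usable permissions, or None if invalid."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # forged or tampered token
    claims = json.loads(payload)
    if claims["expires_at"] <= now:
        return None  # expired
    # Made-up permissions that this server does not know are simply dropped.
    return set(claims["permissions"]) & set(known_permissions)


token = issue_token("toshio", {"copr:build", "made-up"}, expires_at=1000)
granted = verify_token(token, now=500, known_permissions={"copr:build", "copr:create"})
```

Nothing in this sketch lets the authorization server take a token back before ``expires_at``, which is the revocation drawback mentioned in the list above.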
.. question:: Is it possible for the user to grant some of the requested permissions and deny others? Or is it all or nothing?
== Refreshing a token and its caveats ==
An access token can have an expire time. The expire time can be coupled with a second token called a refresh token. Usually the refresh token would expire sometime after the access token would expire. When the access token expires, the refresh token could be used by the client to request a new access token without prompting the user. This is intended to keep an attacker who is sniffing packets from amassing enough ciphertext, from repeated uses of a single access token, to brute-force that token.
This sort of automatic expiration and refresh **is not** meant to protect the user in case the access token is copied without their knowledge (because the refresh token can be copied at the same time).
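A minimal sketch of the expire-and-refresh mechanics follows. The ``TokenIssuer`` class, its method names, and the two lifetimes are hypothetical placeholders, and the sketch assumes the server hands out a new refresh token on every refresh, which is one possible design rather than a requirement.

```python
import secrets


class TokenIssuer:
    """Toy issuer that tracks expiry times in memory (illustrative only)."""

    ACCESS_TTL = 3600        # seconds; placeholder value
    REFRESH_TTL = 24 * 3600  # seconds; placeholder value

    def __init__(self):
        self._access = {}   # access token -> expiry time
        self._refresh = {}  # refresh token -> expiry time

    def issue(self, now):
        """Hand out a fresh access token together with a refresh token."""
        access = secrets.token_urlsafe(32)
        refresh = secrets.token_urlsafe(32)
        self._access[access] = now + self.ACCESS_TTL
        self._refresh[refresh] = now + self.REFRESH_TTL
        return access, refresh

    def is_valid(self, access, now):
        return self._access.get(access, 0) > now

    def refresh(self, refresh_token, now):
        """Trade a refresh token for a new token pair without asking the user."""
        expiry = self._refresh.pop(refresh_token, 0)  # each refresh token is single use
        if expiry <= now:
            raise PermissionError("refresh token expired or unknown; log in again")
        return self.issue(now)


issuer = TokenIssuer()
access, refresh = issuer.issue(now=0)
# One hour later the access token is dead, but the client can renew silently.
new_access, new_refresh = issuer.refresh(refresh, now=3600)
```

Note that ``refresh()`` accepts the refresh token from whoever presents it, so a copied refresh token works just as well for an attacker as for the real client. That is the caveat described above.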
= Part II: How do we use this? =
This section is less about OAuth itself and more about proposals for how we can best code OAuth usage in our web applications to be secure and featureful.
== Session vs token ==
Currently we have a concept of a session in all of our web apps. You log in. Once you're logged in, the web app knows that future connections from your web browser/CLI/etc are being made by you. At some point the session expires or you explicitly log out. At that point, the session is over. The expiration time for most of our apps is currently 20 minutes of idle time but we've talked about increasing this in the past. Sessions in my mind should last tens of minutes to hours. Certainly no more than a day. A session conceptually tells the server that the user is present and interacting with the website (by saying that the user has "recently" authenticated).
Tokens are more akin to passwords coupled with a restricted set of permissions. They're intended to be valid for days to weeks. Refresh tokens can (but don't necessarily) be used to keep a low amount of ciphertext in the system while still making authentication via access token transparent to the client. Conceptually, they tell the server that the **client** (not user) is the same one that was granted the permissions.
=== Using tokens to implement sessions ===
* Sessions need to be short term -- expiration would need to be low (perhaps an hour). No possibility to refresh the token. If you need to continue, you have to re-send your username + password (+ otp?)
* We want this specific token to represent that the user is present, not just that the client has been delegated permissions.
* It would make sense for the token to give out all permissions that the user has (at least, on this resource server) because the user is present. Example token permission: "*@*" permissions token
* If possible, saving this type of session token into a wallet/keyring would make sense as that would encrypt the on-disk representation. However, we'd also have to account for the fact that these services might not be present.
* Suggested to have access tokens with a validity of 5 minutes and refresh tokens of 20 minutes. This would approximate our current cookie-based idle timeout.
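As a rough worked example of the 5 minute / 20 minute suggestion, the function below replays a list of request times and reports when the user would have to log in again. It is a sketch under two assumptions that are not in the proposal itself: the first request is the login, and a refresh reissues both tokens counted from the moment of the refresh. The function name is made up.

```python
ACCESS_TTL = 5 * 60    # seconds, from the suggestion above
REFRESH_TTL = 20 * 60  # seconds, from the suggestion above


def session_end(request_times):
    """Return the time of the first request that needs a new login, or None.

    request_times is a sorted list of timestamps in seconds; the first entry
    is taken to be the login itself.
    """
    if not request_times:
        return None
    access_expiry = request_times[0] + ACCESS_TTL
    refresh_expiry = request_times[0] + REFRESH_TTL
    for t in request_times[1:]:
        if t < access_expiry:
            continue  # access token still valid; nothing gets renewed
        if t < refresh_expiry:
            # Refresh: both tokens are reissued, counted from now.
            access_expiry = t + ACCESS_TTL
            refresh_expiry = t + REFRESH_TTL
            continue
        return t  # both tokens have expired
    return None


still_active = session_end([0, 600, 1700])   # refreshes at 600 and again at 1700
short_idle = session_end([0, 299, 1201])     # idle for only about 15 minutes
```

Under these assumptions the timeout is only an approximation of a 20 minute idle timer: a request that arrives while the access token is still valid does not extend the refresh token, so the effective idle limit lands somewhere between 15 and 20 minutes.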
.. question:: Can we also have a maximum number of refreshes or maximum time before the user has to reenter their credentials (username + password (+otp?))
== Some proposed best practices ==
* OAuth allows for very granular permissions. You could put a separate permission on each resource that a client can request. However, it doesn't require you to be granular because the application interprets the meaning of the permission. A lazy resource server could have a single permission that covered anything that can be performed on the server but this means that a stolen token can be used to do anything that that user could do on that resource server. We should attempt to identify common use cases and code separate permissions for them. ie: "building a package in a copr" would belong in a separate permission from "creating a new copr".
* An access token should not be taken to represent the presence of the user. It means the user has delegated permission to perform this action to some "client". It is possible that the client is a command line app or an API and the user is interacting with it directly but it cannot be assumed that this is the case.
* Following from that, changing authentication methods, password, yubikey, security questions, etc should never be allowed via an access token. We want the user to be present to change these settings.
* Tokens and sessions should not contain information about the authentication status. They should not contain what permissions are held or when the session expires. These are for the resource server and authorization server to determine.
* Also following from that -- we should write things to allow for a session to be sufficient for allowing users to perform actions. Access tokens describe a subset of the functions that the user themselves is allowed to perform.
* Client side -- we want to have different permissions if the user is running the cli from the command line vs running the cli from a cron job. A user running from the cli could be said to have a session.
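Several of these practices can be combined into one small authorization check on the resource server. The sketch below is hypothetical throughout: the action names, the ``copr:build`` and ``copr:create`` permission strings, and the shape of the ``credential`` tuple are invented for illustration.

```python
# Map each action to the permission a token must carry. None marks actions
# that require the user to be present and can never be delegated to a token.
ACTIONS = {
    "build_package": "copr:build",
    "create_copr": "copr:create",
    "change_password": None,
}


def is_allowed(action, credential):
    """Decide whether a request may perform an action.

    credential is ("session", None) for a logged-in user, or
    ("token", permissions) for a client holding an access token.
    """
    kind, granted = credential
    if action not in ACTIONS:
        return False
    if kind == "session":
        # The user is present, so a session is sufficient for any action.
        return True
    required = ACTIONS[action]
    if required is None:
        # Changing authentication settings is never allowed via a token.
        return False
    # Unknown or made-up permissions in the token are simply never matched.
    return required in granted


can_build = is_allowed("build_package", ("token", {"copr:build"}))
can_create = is_allowed("create_copr", ("token", {"copr:build"}))
can_change_password = is_allowed("change_password", ("token", {"copr:build", "copr:create"}))
```

With separate permissions, a stolen token that only carries ``copr:build`` cannot create a new copr, and no token at all can change the password.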
IRC log (since this is all paraphrased and I could have misunderstood what puiterwijk meant):
http://toshio.fedorapeople.org/puiterwijk-oauth.html
-Toshio
On Fri, Mar 8, 2013 at 11:07 PM, Toshio Kuratomi a.badger@gmail.com wrote:
= Part I: What is OAuth? =
<snip>
== Flow of a basic request from start to finish ==
- The client program needs to get access to a protected resource.
- The client asks the authorization server for a client-id and tells the
server which permissions it needs
- The authorization server gives a url to the client
- The client program redirects the user to that url so the user can grant permissions to the client
- The authorization server authenticates the user (ie: they login to the authorization server).
- The authorization server asks the user to confirm they want to grant the requested permissions to the client.
- If the answer is no, the protocol ends.
- If the answer is yes, the user is redirected to the client with an `authorization code` in the request
- The client sends the `authorization code` to the authorization server.
- The authorization server generates an `access token` with the specific permissions that the client requested, expires the `authorization code`,
and returns the `access token` to the client.
- The client requests the protected resource from the resource server
using the `access token`.
- The resource server verifies that the `access token` is valid. If it is,
it allows access
.. note:: the `authorization code` is only good for retrieving a single `access token` for the particular set of permissions that the user confirmed.
.. note:: A client can request access to multiple resources at once. Assuming the resource owner accepted all of them, the access token the client receives at the end will allow access to all of those. A client typically has one access token from an authorization server that grants it all needed permissions on all of the resource servers that the authorization server can give out permissions for. It is possible for a client to have multiple access tokens with different permissions from the same authorization server but the client would have to keep track of which permissions were granted by which token (and the user would have had to confirm that the client should be granted each set of permissions).
.. question:: an access token can contain permissions for multiple resource servers. How do we secure the token from being used maliciously by a different resource server? ie: I get an access token which grants some permissions on both fas and bodhi. I send that access token to fas to retrieve some information. What prevents fas from hanging onto that token and using it to access the protected resources on bodhi that it grants without my knowledge?
Every request sent to the auth server or resource server has to be signed
with a consumer secret related to the token/access_token, which means that no program other than the one that obtained the access token can get through.
== But wait, there's more! ==
We've now seen one authorization via OAuth. But OAuth is flexible. There are a few variations to be aware of:
- Other ways to request the access token. The example above is what works
best for third-party web clients. However, there's other flows that might work better for CLI apps or "trusted" web clients
- Implicit: the user gets the access token directly from the authorization
server rather than through an authorization code. This shortcut is useful when the client is entirely in the browser (no third-party server involved). With a third-party server, the authorization code makes it so the user never sees the actual access token, only the authorization code. If the client is running on the user's machine anyhow, there's no sense in that step.
- Resource owner password credentials: The resource owner provides their credentials (username and password) to the client. The client
retrieves the access token from the authorization server using the credentials. Then it discards the credentials and only keeps the access token for further requests.
- Client credentials: Just defines that if the client is the resource
server, it can authenticate itself to access its own resources... I'm a little unclear on this but I think one use would be for a resource server to use its externally available functions (which are protected by oauth) rather than having to write an equivalent function that is usable internally. puiterwijk mentions a different use: having a strict separation between tenants in the resource server's model and then having to prove you have permission to access the resource from a different tenant (not something we're likely to do).
- Verification of the access token can take many forms.
- The authorization server could notify the resource server whenever a
new access token is issued/revoked
- The resource server could ask the authorization server to verify the
token each time it receives one
- The token could be signed by the auth server and thus be verifiable in
and of itself. The token could then contain the list of permissions so that the resource server would just consult the token to know what was available. This should not be preferred as it makes revoking a token harder.
- The authorization server may or may not know about the range of
permissions that it can grant. The resource server needs to interpret what the permissions the access token grants mean so if the authorization server grants a made-up permission the application should just ignore it.
.. question:: Is it possible for the user to grant some of the requested permissions and deny others? Or is it all or nothing?
It's all or nothing. Obviously, if you deny access to the requested resources, the related token gets revoked.
We have a case at work where we have 20 tokens for one resource server. It's just a matter of security level/choice.
== Refreshing a token and its caveats ==
An access token can have an expire time. The expire time can be coupled with a second token called a refresh token. Usually the refresh token would expire sometime after the access token would expire. When the access token expires, the refresh token could be used by the client to request a new access token without prompting the user. This is intended to keep an attacker who is sniffing packets from amassing enough ciphertext, from repeated uses of a single access token, to brute-force that token.
This sort of automatic expiration and refresh **is not** meant to protect the user in case the access token is copied without their knowledge (because the refresh token can be copied at the same time).
= Part II: How do we use this? =
This section is less about OAuth itself but some proposals about how we can best code OAuth usage in our web applications to be secure and featureful.
== Session vs token ==
Currently we have a concept of a session in all of our web apps. You log in. Once you're logged in, the web app knows that future connections from your web browser/CLI/etc are being made by you. At some point the session expires or you explicitly log out. At that point, the session is over. The expiration time for most of our apps is currently 20 minutes of idle time but we've talked about increasing this in the past. Sessions in my mind should last tens of minutes to hours. Certainly no more than a day. A session conceptually tells the server that the user is present and interacting with the website (by saying that the user has "recently" authenticated).
Tokens are more akin to passwords coupled with a restricted set of permissions. They're intended to be valid for days to weeks. Refresh tokens can (but don't necessarily) be used to keep a low amount of ciphertext in the system while still making authentication via access token transparent to the client. Conceptually, they tell the server that the **client** (not user) is the same one that was granted the permissions.
=== Using tokens to implement sessions ===
- Sessions need to be short term -- expiration would need to be low
(perhaps an hour). No possibility to refresh the token. If you need to continue, you have to re-send your username + password (+ otp?)
- We want this specific token to represent that the user is present, not
just that the client has been delegated permissions.
- It would make sense for the token to give out all permissions that the
user has (at least, on this resource server) because the user is present. Example token permission: "*@*" permissions token
- If possible, saving this type of session token into a wallet/keyring
would make sense as that would encrypt the on-disk representation. However, we'd also have to account for the fact that these services might not be present.
- Suggested to have access tokens with validity of 5 minutes. refresh
tokens of 20 minutes. This would approximate our current cookie-based idle timeout.
.. question:: Can we also have a maximum number of refreshes or maximum time before the user has to reenter their credentials (username + password (+otp?))
== Some proposed best practices ==
- OAuth allows for very granular permissions. You could put a separate permission on each resource that a client can request. However, it
doesn't require you to be granular because the application interprets the meaning of the permission. A lazy resource server could have a single permission that covered anything that can be performed on the server but this means that a stolen token can be used to do anything that that user could do on that resource server. We should attempt to identify common use cases and code separate permissions for them. ie: "building a package in a copr" would belong in a separate permission from "creating a new copr".
I'm definitely +1 on this one.
- An access token should not be taken to represent the presence of the
user. It means the user has delegated permission to perform this action to some "client". It is possible that the client is a command line app or an API and the user is interacting with it directly but it cannot be assumed that this is the case.
+1 Also, all allowed access should be revoke-able by the user at any time.
- Following from that, changing authentication methods, password, yubikey, security questions, etc should never be allowed via an access token. We
want the user to be present to change these settings.
+1
- Tokens and sessions should not contain information about the
authentication status. They should not contain what permissions are held or when the session expires. These are for the resource server and authorization server to determine.
Yeah, it's the resource server which actually defines what third parties are allowed to get from it, even if a token is granted.
- Also following from that -- we should write things to allow for a
session to be sufficient for allowing users to perform actions. access tokens describe a subset of the functions that the user themselves is allowed to perform.
- Client side -- we want to have different permissions if the user is
running the cli from the command line vs running the cli from a cron job. A user running from the cli could be said to have a session.
Hmm... I'm not sure we can really prevent a user from running the exact same
command line from a terminal and from a crontab, unless we have strong policies on user/admin operations. SOP!
IRC log (since this is all paraphrased and I could have misunderstood what
puiterwijk meant):
On Fri, 2013-03-08 at 14:07 -0800, Toshio Kuratomi wrote:
Here are some of my thoughts on the question. My main question is "What does it bring us?" (© Seth); more precisely, what does it bring us compared to an approach where each app implements its own token mechanism (cf. copr)?
Advantages:
* One central place where, in case of a problem, all the tokens associated with someone can be removed
* Potentially easy/easier integration in gnome-online-account, which already integrates Google's OAuth mechanism
* Standard mechanism
* Possibility to restrict a token's action
  - Well this can already be discussed. It has been made clear that it is an all-or-nothing mechanism (a bit like when installing an app on an Android phone, you have the choice between giving this app access to your contacts, the gps... or not using the app). Meaning, since most if not all our tools will be able to perform most of the actions possible on the website, it pretty much comes back to having one big token to use the API of the website.
Disadvantages:
* One more application to develop, deploy and maintain, and a rather sensitive one if we want it to work with all the OAuth clients.
* One more highly critical application that will store the API tokens for everyone; if it is broken, the attacker can do a whole bunch of stuff on a whole bunch of webapps.
* Implementations are rather specific and one implementation might not work with another one.
* It means we have to all agree on this and actually implement it :) (which might be the hardest part considering we've not even agreed on a framework :-))
In the end, to me the killer feature is the integration with gnome-online-account (goa), as this is clearly something we would like to have. All our CLIs could then rely on this and call dbus to retrieve the API login and tokens (example for the Google OAuth [1]). In addition, to my understanding, gnome-online-account stores its information in the session's (gnome-)keyring, so we would have some level of protection. A foreseeable problem: integration in other desktops? I have no idea if there is an equivalent of goa for KDE, LXDE, XFCE, MATE & co.
Actually, how are the CLIs supposed to work if we don't integrate OAuth with goa? The CLI spits out a URL that the user should visit? But then the OAuth server will have to give the user the API login and token, which the user will have to put somewhere on the system himself, which already loses one of the benefits of OAuth, which is that the user (normally) doesn't see this information.
So to me, I'm not sure yet if the pros outweigh the cons. I see two major gains:
- potentially easier integration in goa
- easier cleanup in case of a security breach
but the costs of developing, maintaining and keeping such an app secure are also very high.
I guess it comes down to, are we really going to use the fine-level permission system oauth brings us? And if so how (cf my remark above)?
This is the state of my current thoughts :) Pierre
[1] https://github.com/derflocki/gnome-shell-google-calendar/commit/b07e3bf77fcb...
On Mon, Mar 11, 2013 at 07:05:57PM +0100, Pierre-Yves Chibon wrote:
- Possibility to restrict a token's action
- Well this can already be discussed. It has been made clear that it
is an all-or-nothing mechanism (a bit like when installing an app on an Android phone, you have the choice between giving this app access to your contacts, the gps... or not using the app). Meaning, since most if not all our tools will be able to perform most of the actions possible on the website, it pretty much comes back to having one big token to use the API of the website.
You'd need a big token if you intend to use all of the api. You wouldn't if you only intend to use a portion of it.
The thing that we seem to be getting hung up on is that a CLI application can potentially access all of the API of a single resource server and potentially other resource servers as well. What I would propose here is that we utilize puiterwijk and my thoughts on emulating sessions to have one big token per resource server. But that token has a very limited lifetime. That makes it inappropriate for scripting and cron jobs. When you want to make a shell script that runs on its own, you need to get a different token that has more limited permissions. That sort of token would have a longer or perhaps indefinite lifetime (indefinite when you consider the presence of refresh tokens).
Actually, how are the CLIs supposed to work if we don't integrate OAuth with goa?
There are several methods depending on how much we want the user to trust the commandline app. puiterwijk and I agreed that we could use the "Resource owner password credentials" method -- this would require giving the username and password to the CLI app, which would then perform the necessary calls to the authorization server to get the access token. It could also be done using one of the other methods if you don't want to require the user to trust the CLI app we write :-)
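For illustration, here is roughly what the token request for the "Resource owner password credentials" method could look like from a CLI. The form field names follow the OAuth 2.0 framework; the ``TOKEN_URL``, the ``pkgdb:acls`` scope name, and the function name are made up, and the sketch only builds the request without sending it.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Hypothetical endpoint; a real deployment would publish its own token URL.
TOKEN_URL = "https://auth.example.org/token"


def build_password_grant_request(username, password, scope):
    """Build (but do not send) the POST that trades credentials for a token."""
    body = urlencode(
        {
            "grant_type": "password",
            "username": username,
            "password": password,
            "scope": " ".join(scope),  # space-separated list of permissions
        }
    ).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_password_grant_request("toshio", "not-a-real-password", ["pkgdb:acls"])
fields = parse_qs(request.data.decode("ascii"))
```

After a successful response the CLI would keep only the access token and discard the username and password, as described in the summary of this method.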
The CLI spits out a URL that the user should visit? But then the OAuth server will have to give the user the API login and token, which the user will have to put somewhere on the system himself, which already loses one of the benefits of OAuth, which is that the user (normally) doesn't see this information.
This is pretty normal though. I know that I've encountered this in both a micro-blogging and a photo management GUI application in Fedora. Hiding this from the user in this case is also not a feature in terms of security. It's only a feature in terms of convenience. The ability of OAuth to hide the access token when the user and client are on two separate machines run by two separate entities (for instance, Facebook wanting to use Flickr photos and you) is good for security, but when the user is on the same machine as the client we cannot actually hide the token from the user.
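As a sketch of the "CLI prints a URL" approach, the snippet below builds the link a command line client could show to the user. The query parameter names come from the OAuth 2.0 authorization request; the ``AUTHORIZE_URL``, the client id, the scope name, and the local redirect address are hypothetical.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint of the authorization server.
AUTHORIZE_URL = "https://auth.example.org/authorize"


def authorization_url(client_id, scope, redirect_uri):
    """Return the URL to show the user, plus the state value to check later."""
    state = secrets.token_urlsafe(16)  # lets the client match the reply to this request
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": " ".join(scope),
            "state": state,
        }
    )
    return AUTHORIZE_URL + "?" + query, state


url, state = authorization_url(
    "pkgdb-cli", ["pkgdb:acls"], "http://localhost:8080/callback"
)
params = parse_qs(urlsplit(url).query)
print("Please visit this URL to grant access:")
print(url)
```

How the resulting code or token gets back to the CLI (a local callback, or the user pasting it in) is a design choice this sketch leaves open.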
-Toshio
On Mon, 2013-03-11 at 11:28 -0700, Toshio Kuratomi wrote:
On Mon, Mar 11, 2013 at 07:05:57PM +0100, Pierre-Yves Chibon wrote:
- Possibility to restrict a token's action
- Well this can already be discussed. It has been made clear that it
is an all-or-nothing mechanism (a bit like when installing an app on an Android phone, you have the choice between giving this app access to your contacts, the gps... or not using the app). Meaning, since most if not all our tools will be able to perform most of the actions possible on the website, it pretty much comes back to having one big token to use the API of the website.
You'd need a big token if you intend to use all of the api. You wouldn't if you only intend to use a portion of it.
The thing that we seem to be getting hung up on is that a CLI application can potentially access all of the API of a single resource server and potentially other resource servers as well.
hm, atm we have one CLI per application, tbh, I'd rather keep it this way.
What I would propose here is that we utilize puiterwijk and my thoughts on emulating sessions to have one big token per resource server.
So we're back pretty much on 1 token / application (pkgdb, copr...) which would be the approach we would take if we implement token w/o oauth.
But that token has a very limited lifetime. That makes it inappropriate for scripting and cron jobs. When you want to make a shell script that runs on its own, you need to get a different token that has more limited permissions. That sort of token would have a longer or perhaps indefinite lifetime (indefinite when you consider the presence of refresh tokens).
Meaning, we need to have a way to generate a token for the user (where he can choose which permission he associates with the token), no?
Actually, how is the CLI supposed to work if we don't integrate oauth with goa?
There are several methods depending on how much we want the user to trust the commandline app. puiterwijk and I agreed that we could use the "Resource owner password credentials" method -- this would require giving the username and password to the CLI app which would then perform the necessary calls to the authorization server to get the access token. It could also be done using one of the other methods if you don't want to require the user to trust the CLI app we write :-)
But then that's pretty much what we are already doing with, for example, pkgdb-cli, and to me that defeats a bit the point of tokens. Where would we use them then? In my mind, tokens are there to replace the use of the password, which if intercepted gives you access to more than the token itself (ie: with your password I can change your password, with your api token I can change the ACLs on your package)
Pierre
On Mon, Mar 11, 2013 at 07:42:50PM +0100, Pierre-Yves Chibon wrote:
On Mon, 2013-03-11 at 11:28 -0700, Toshio Kuratomi wrote:
On Mon, Mar 11, 2013 at 07:05:57PM +0100, Pierre-Yves Chibon wrote:
- Possibility to restrict a token's action
- Well this can already be discussed. It has been made clear that it
is an all-or-nothing mechanism (a bit like when installing an app on an Android phone, you have the choice between giving this app access to your contacts, the gps... or not using the app). Meaning, since most if not all our tools will be able to perform most of the actions possible on the website, it's pretty much coming back to having one big token to use the API of the website.
You'd need a big token if you intend to use all of the api. You wouldn't if you only intend to use a portion of it.
The thing that we seem to be getting hung up on is that a CLI application can potentially access all of the API of a single resource server and potentially other resource servers as well.
hm, atm we have one CLI per application, tbh, I'd rather keep it this way.
That's not quite true...
Things like fedora-active-user hit multiple resource servers. Thankfully, only one of those requires authn/z to get the information it needs.
There are plenty of one-off scripts in infrastructure that need to access the API of multiple resources (pkgdb, koji, fas, bodhi are all intertwined).
What I would propose here is that we utilize puiterwijk and my thoughts on emulating sessions to have one big token per resource server.
So we're back pretty much on 1 token / application (pkgdb, copr...) which would be the approach we would take if we implement token w/o oauth.
No.
We're 1 token per application *if and only if* the user is actively using the application at the moment.
We've got to separate out the session use case which is really only going to work with single token from the scripted use case. We can make both of those more secure if we treat them separately.
But that token has a very limited lifetime. That makes it inappropriate for scripting and cron jobs. When you want to make a shell script that runs on its own, you need to get a different token that has more limited permissions. That sort of token would have a longer or perhaps indefinite lifetime (indefinite when you consider the presence of refresh tokens).
Meaning, we need to have a way to generate a token for the user (where he can choose which permission he associates with the token), no?
That really depends on implementation. We need to be able to generate targeted tokens. But how we generate them is up in the air. Here are some ways:

* Browse to a fas-oauth web page that has lists of permissions that you can pick through to assemble a single token with a long expiry.
* Run a command ( /usr/bin/copr --generate-token build ) that makes the request for only the needs that the command has.
* Run a command that prints out a url that you use to retrieve the token.

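As a rough illustration of the second option, here is a minimal Python sketch of a CLI working out the smallest set of permissions for the subcommands a token should cover. The permission names and the `permissions_for` / `scope_string` helpers are hypothetical, made up for this example; they are not an existing copr or FAS API. The only thing borrowed from OAuth 2 is that a scope value is a space-delimited list.

```python
# Hypothetical permission names; the real ones would be defined by each
# resource server.
PERMISSIONS_BY_COMMAND = {
    "build": {"copr:build-package"},
    "new-repo": {"copr:create-copr"},
}


def permissions_for(commands):
    """Return the smallest permission set covering the given subcommands."""
    needed = set()
    for command in commands:
        if command not in PERMISSIONS_BY_COMMAND:
            raise ValueError("unknown command: %s" % command)
        needed |= PERMISSIONS_BY_COMMAND[command]
    return needed


def scope_string(commands):
    """Format the permissions as a space-delimited OAuth 2 scope value."""
    return " ".join(sorted(permissions_for(commands)))
```

A cron job that only ever builds would then ask for `scope_string(["build"])` and nothing else, so a stolen copy of its token could not create new coprs.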
Actually, how is the CLI supposed to work if we don't integrate oauth with goa?
There are several methods depending on how much we want the user to trust the commandline app. puiterwijk and I agreed that we could use the "Resource owner password credentials" method -- this would require giving the username and password to the CLI app which would then perform the necessary calls to the authorization server to get the access token. It could also be done using one of the other methods if you don't want to require the user to trust the CLI app we write :-)
But then that's pretty much what we are already doing with, for example, pkgdb-cli, and to me that defeats a bit the point of tokens. Where would we use them then? In my mind, tokens are there to replace the use of the password, which if intercepted gives you access to more than the token itself (ie: with your password I can change your password, with your api token I can change the ACLs on your package)
I don't see how this defeats it. In order to retrieve a token, you need to authenticate yourself, correct. So whenever you retrieve a token you have to use your username and password (or other acceptable credential). There's always going to be the risk of interception there. The security benefit is we get to avoid ever storing the username and password on disk with any of the token methods we're discussing. Not that the username and password don't have to flow from you over the network to the server at some point in time.
-Toshio
On 03/11/2013 07:05 PM, Pierre-Yves Chibon wrote:
Disadvantages:
- One more application to develop, deploy and maintain, and a
rather sensitive one if we want it to work with all the oauth clients.
- One more highly critical application: it will store the API tokens
for everyone, so if it is broken the attacker can do a whole bunch of stuff on a whole bunch of webapps.
- Implementations are rather specific and one implementation might not
work with another one.
This is correct, but as long as we specify the variables in the protocol (like the URLs and the way to get tokens), and we and all developers using it stick to this for all Fedora apps, this shouldn't be a problem.
- It means we have to all agree on this and actually implement it :)
(which might be the hardest part considering we've not even agreed on a framework :-))
Well, I was planning on building a high-level (python) library to make implementing it for app writers as easy as possible.
Actually, how is the CLI supposed to work if we don't integrate oauth with goa? The cli spits out a URL that the user should visit? But then the oauth server will have to give the user the api login and token, which the user will have to put somewhere on the system himself. That already loses one of the benefits of oauth, which is that the user (normally) doesn't see this information.
CLI could use the "Resource owner password credentials" as well, so it can do this by itself.
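For reference, a minimal Python sketch of what the CLI would send for the "Resource owner password credentials" method. The parameter names (`grant_type=password`, `username`, `password`, `scope`) and the `access_token` response field come from the OAuth 2.0 spec; the function names and the scope value used in the example are assumptions for illustration, and the actual HTTPS POST to the authorization server's token endpoint is left out.

```python
from urllib.parse import urlencode


def build_password_grant_body(username, password, scope):
    """Build the form-encoded body for a password credentials token request."""
    return urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "scope": scope,
    })


def token_from_response(response):
    """Keep only the access token from a parsed JSON token response.

    The caller is expected to drop the username and password once this
    returns, so that only the token is ever written to disk.
    """
    return response["access_token"]
```

The point of the second helper is that the credentials are used once and then discarded; only the token is kept for further requests.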
On Fri, 8 Mar 2013 14:07:28 -0800 Toshio Kuratomi a.badger@gmail.com wrote:
So in the past week a bunch of us have been talking about API Keys, OAuth, passwords, and other means of managing authn and authz in the web apps that are up and coming (specifically mentioned were copr and datagrepper). Puiterwijk has put in some time reading the OAuth specifications and on Friday he walked me through how OAuth is supposed to work. I'll give a summary of his talk here and then we can kick off some discussion.
...big snip...
.. note:: A client can request access to multiple resources at once. Assuming the resource owner accepted all of them, the access token the client receives at the end will allow access to all of those. A client typically has one access token from an authorization server that grants it all needed permissions on all of the resource servers that the authorization server can give out permissions for. It is possible for a client to have multiple access tokens with different permissions from the same authorization server but the client would have to keep track of which permissions were granted by which token (and the user would have had to confirm that the client should be granted each set of permissions).
So, if you got a token for 'build new packages in existing copr' and later went and got a 'make new copr' token in addition to your existing permission, you would have two tokens? Or you could ask for one new token with both permissions? Either would be valid?
Additionally, you could get a new token for 'change toshios full name' and get a new token for it, or a new token with it and 'build new packages in existing copr' in the same token?
.. question:: an access token can contain permissions for multiple resource servers. How do we secure the token from being used maliciously by a different resource server? ie: I get an access token which grants some permissions on both fas and bodhi. I send that access token to fas to retrieve some information. What prevents fas from hanging onto that token and using it to access the protected resources on bodhi that it grants without my knowledge?
Yeah, good question.
What does an auth token look like? Just a file with encoded data? Can you see what a token is issued by or for based on the token?
...snip...
- Resource owner password credentials: The resource owner provides
their credentials (username and password) to the client. The client retrieves the access token from the authorization server using the credentials. Then it discards the credentials and only keeps the access token for further requests.
This is nasty because the client could save the username/pass and reuse it for other things, no?
...snip...
- Verification of the access token can take many forms.
- The authorization server could notify the resource server
whenever a new access token is issued/revoked
Not sure that gets us too much.
- The resource server could ask the authorization server to verify
the token each time it receives one
- The token could be signed by the auth server and thus be
verifiable in and of itself. The token could then contain the list of permissions so that the resource server would just consult the token to know what was available. This should not be preferred as it makes revoking a token harder.
Yeah.
- The authorization server may or may not know about the range of
permissions that it can grant. The resource server needs to interpret what the permissions the access token grants mean so if the authorization server grants a made-up permission the application should just ignore it.
.. question:: Is it possible for the user to grant some of the requested permissions and deny others? Or is it all or nothing?
It might be better to reject a request entirely if you can't grant all the permissions? Otherwise is there a clear way to note only the permissions you actually granted? ie, "Can I build new coprs and packages in this copr" and it rejects new coprs but grants build, the caller could be confused when it goes to try and use the token for something that wasn't granted.
...snip...
=== Using tokens to implement sessions ===
Is this something we want to do?
- Sessions need to be short term -- expiration would need to be low
(perhaps an hour). No possibility to refresh the token. If you need to continue, you have to re-send your username + password (+ otp?)
- We want this specific token to represent that the user is present,
not just that the client has been delegated permissions.
- It would make sense for the token to give out all permissions that
the user has (at least, on this resource server) because the user is present. Example token permission: "*@*" permissions token
- If possible, saving this type of session token into a
wallet/keyring would make sense as that would encrypt the on-disk representation. However, we'd also have to account for the fact that these services might not be present.
- Suggested to have access tokens with validity of 5 minutes.
refresh tokens of 20 minutes. This would approximate our current cookie-based idle timeout.
.. question:: Can we also have a maximum number of refreshes or maximum time before the user has to reenter their credentials (username + password (+otp?))
Only by expiring the token and making them get a new one I guess?
== Some proposed best practices ==
- Oauth allows for very granular permissions. You could put a
separate permission on each resource that a client can request. However, it doesn't require that you are granular or not because the application interprets the meaning of the permission. A lazy resource server could have a single permission that covered anything that can be performed on the server but this means that a stolen token can be used to do anything that that user could do on that resource server. We should attempt to identify common use cases and code separate permissions for them. ie: "building a package in a copr" would belong in a separate permission from "creating a new copr".
Completely agreed.
- An access token should not be taken to represent the presence of
the user. It means the user has delegated permission to perform this action to some "client". It is possible that the client is a command line app or an api and the user is interacting with it directly but it cannot be assumed that this is the case.
- Following from that, changing authentication methods, password,
yubikey, security questions, etc should never be allowed via an access token. We want the user to be present to change these settings.
Also completely agreed.
- Tokens and sessions should not contain information about the
authentication status. They should not contain what permissions are held or when the session expires. These are for the resource server and authorization server to determine.
I asked about this above. Yeah, if a token can be an anonymous-looking blob I think that's best. Of course an attacker could watch how and where you use a token, but if they obtained that from other means than local access they may not be able to do that.
- Also following from that -- we should write things to allow for a
session to be sufficient for allowing users to perform actions. access tokens describe a subset of the functions that the user themselves is allowed to perform.
I guess that's assuming we replace sessions with tokens...
- Client side -- we want to have different permissions if the user is
running the cli from the command line vs running the cli from a cron job. A user running from the cli could be said to have a session.
Sure. But if we kept sessions and tokens separate we could do that too, no?
So, what are our initial use cases for this? I guess coprs is a big one. Any other obvious ones on the map right now?
I'd like to be careful about this, and not just convert everything, but have a few apps that could use it written up with use cases, etc.
kevin
<snip>
.. note:: A client can request access to multiple resources at once. Assuming the resource owner accepted all of them, the access token the client receives at the end will allow access to all of those. A client typically has one access token from an authorization server that grants it all needed permissions on all of the resource servers that the authorization server can give out permissions for. It is possible for a client to have multiple access tokens with different permissions from the same authorization server but the client would have to keep track of which permissions were granted by which token (and the user would have had to confirm that the client should be granted each set of permissions).
So, if you got a token for 'build new packages in existing copr' and later went and got a 'make new copr' token in addition to your existing permission, you would have two tokens? Or you could ask for one new token with both permissions? Either would be valid?
Either would be valid. It really depends on how you write your permission security on the resource server side.
Additionally, you could get a new token for 'change toshios full name' and get a new token for it, or a new token with it and 'build new packages in existing copr' in the same token?
.. question:: an access token can contain permissions for multiple resource servers. How do we secure the token from being used maliciously by a different resource server? ie: I get an access token which grants some permissions on both fas and bodhi. I send that access token to fas to retrieve some information. What prevents fas from hanging onto that token and using it to access the protected resources on bodhi that it grants without my knowledge?
Yeah, good question.
What does an auth token look like? Just a file with encoded data? Can you see what a token is issued by or for based on the token?
It's just a key string. And yes, you can see what a token is issued by, as the first token you get is generated by the oauth provider to be tied to your client program.
...snip...
- Resource owner password credentials: The resource owner provides
their credentials (username and password) to the client. The client retrieves the access token from the authorization server using the credentials. Then it discards the credentials and only keeps the access token for further requests.
This is nasty because the client could save the username/pass and reuse it for other things, no?
No. One of the purposes of oauth is to not use the user's credentials between the client and the resource servers. If your client asks you for a user/password, just nuke it right away, and vice versa for the resource owner.
...snip...
On Mon, Mar 11, 2013 at 12:28:47PM -0600, Kevin Fenzi wrote:
So, if you got a token for 'build new packages in existing copr' and later went and got a 'make new copr' token in addition to your existing permission, you would have two tokens? Or you could ask for one new token with both permissions? Either would be valid?
I believe that either would be valid. puiterwijk can tell us if the spec says something about older tokens being invalidated if a new one is requested but I don't think it would be able to. Say that you have a client running on machine foo and the same software on machine bar. The server wouldn't be able to tell that you had two separate clients that you needed two separate tokens for in this case.
Additionally, you could get a new token for 'change toshios full name' and get a new token for it, or a new token with it and 'build new packages in existing copr' in the same token?
According to my understanding, you can do it either way as long as the authorization server is allowed to issue tokens for both.
.. question:: an access token can contain permissions for multiple resource servers. How do we secure the token from being used maliciously by a different resource server? ie: I get an access token which grants some permissions on both fas and bodhi. I send that access token to fas to retrieve some information. What prevents fas from hanging onto that token and using it to access the protected resources on bodhi that it grants without my knowledge?
Yeah, good question.
What does an auth token look like? Just a file with encoded data? Can you see what a token is issued by or for based on the token?
As discussed later, you could put various pieces of information in here but best practice would seem to indicate it's pretty anonymized (perhaps even just a hash?) The interpretation of what it does is then done on the server.
- Resource owner password credentials: The resource owner provides
their credentials (username and password) to the client. The client retrieves the access token from the authorization server using the credentials. Then it discards the credentials and only keeps the access token for further requests.
This is nasty because the client could save the username/pass and reuse it for other things, no?
This isn't really any different than what we have now in the CLI world, though. You have to trust the CLI programs to do the right thing. For third party websites, you would never want to use this model as you can never trust the client there to do the right thing. We could allow both methods if we want to cater to people who trust the CLI to do this conveniently vs those who are paranoid and want their web browser to be the only thing they trust. (Enter your username, password, and otp or the access_token you receive from visiting this [url])
...snip...
- Verification of the access token can take many forms.
- The authorization server could notify the resource server
whenever a new access token is issued/revoked
Not sure that gets us too much.
Keeps us from having to query the authz server every time a token is presented. I think it's a bit of an optimization and probably premature to think about doing it this way as a first pass.
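To show what that optimization would amount to, here is a minimal Python sketch of a resource server keeping its own record of tokens, updated whenever the authorization server announces an issue or a revocation. The class and method names are hypothetical, and how the notifications are actually delivered is left open.

```python
class NotifiedTokenCache:
    """Resource-server-side record of tokens, kept current by notifications."""

    def __init__(self):
        self._permissions_by_token = {}

    def on_issued(self, token, permissions):
        # Called when the authorization server announces a new token.
        self._permissions_by_token[token] = frozenset(permissions)

    def on_revoked(self, token):
        # Called when the authorization server announces a revocation.
        self._permissions_by_token.pop(token, None)

    def allows(self, token, permission):
        # Answered locally, without a round trip to the authorization server.
        return permission in self._permissions_by_token.get(token, frozenset())
```

The trade-off is that a missed revocation notice leaves a token usable on that resource server, which is one reason asking the authorization server each time is the simpler first pass.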
- The authorization server may or may not know about the range of
permissions that it can grant. The resource server needs to interpret what the permissions the access token grants mean so if the authorization server grants a made-up permission the application should just ignore it.
.. question:: Is it possible for the user to grant some of the requested permissions and deny others? Or is it all or nothing?
It might be better to reject a request entirely if you can't grant all the permissions? Otherwise is there a clear way to note only the permissions you actually granted? ie, "Can I build new coprs and packages in this copr" and it rejects new coprs but grants build, the caller could be confused when it goes to try and use the token for something that wasn't granted.
I believe the client gets failure or an access token and doesn't know what permissions are embodied by that token. It probably does make sense to fail in most circumstances. But I think there might be cases where it makes sense not to fail. Perhaps being able to customize the list of permissions a token would have in another UI would take care of those use cases though...
=== Using tokens to implement sessions ===
Is this something we want to do?
Maybe.... If we used tokens to implement sessions, we'd be able to cut down on the number of different mechanisms that one could use to gain permission to access a protected resource. That is good in several different ways: less places where bugs can crop up, less things to test, less things to code and maintain.... It also would mean that resource server methods that took oauth tokens would also take sessions with very few changes.
OTOH, I don't want to shoehorn something that is a bad fit.
On yet another hand, I don't know if we'd be able to entirely get rid of the cookie implementation of sessions. Web browsing might still rely on cookie-based sessions. API might purely use access token sessions. (This might still have a side benefit though -- if access tokens aren't something handled automatically by the web browser and cookies don't work on the api, then csrf protection isn't needed on the api.)
.. question:: Can we also have a maximum number of refreshes or maximum time before the user has to reenter their credentials (username + password (+otp?))
Only by expiring the token and making them get a new one I guess?
I think this would be an implementation detail for the oauth-server. The server would record either the time the first token in the chain was issued or the number of refreshes. After a limit is exceeded, it would invalidate (or not send) the next refresh token along with the access token. When that access token expired, the user would have to reauthenticate.
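A minimal Python sketch of that bookkeeping, assuming the server tracks both limits at once (either one alone would also work). The `TokenChain` name, its defaults, and the use of plain numbers of seconds for time are assumptions for illustration only.

```python
class TokenChain:
    """Tracks one chain of access tokens started by a single login."""

    def __init__(self, started_at, max_refreshes=3, max_age=3600):
        self.started_at = started_at      # when the user last authenticated
        self.max_refreshes = max_refreshes
        self.max_age = max_age            # seconds since started_at
        self.refreshes = 0

    def may_refresh(self, now):
        within_count = self.refreshes < self.max_refreshes
        within_age = (now - self.started_at) < self.max_age
        return within_count and within_age

    def refresh(self, now):
        """Record a refresh; return False when the user must reauthenticate."""
        if not self.may_refresh(now):
            return False
        self.refreshes += 1
        return True
```

When `refresh` returns False the server would simply not hand out another refresh token, and the user is asked for credentials once the current access token expires.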
== Some proposed best practices ==
- Also following from that -- we should write things to allow for a
session to be sufficient for allowing users to perform actions. access tokens describe a subset of the functions that the user themselves is allowed to perform.
I guess that's assuming we replace sessions with tokens...
This is actually generic to me. It would be made easier if we replace sessions with tokens.
I see two use cases:

* Case one: I have a script that runs /usr/bin/copr build toshio/pkgdb on every commit to the pkgdb repo.
* Case two: I want to run /usr/bin/copr new-repo toshio/messing-about myself. Then I want to run /usr/bin/copr build toshio/messing-about on a specific srpm. I would prefer not to have to retype my credentials twice if I'm running those one right after the other.

Both of these tasks will use /usr/bin/copr. However, the first requires that the permissions reside on disk unencrypted and remain valid for a long period. The second requires that the permissions be stored for a short period (perhaps on disk but also potentially in a password manager if you're running one in your desktop session).
The second task is what we conceptually think of as a session. The user is present and doing somewhat arbitrary things that the user wants to do at that time. It could be implemented with a cookie or with an authorization token.
What this best practice is trying to accomplish is to make a single api function serve both use cases. It could be done by making the api function accept both cookie auth and token authz but then you have to code the server for both cases.
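A minimal Python sketch of the single-code-path idea: the api function only looks at the permissions attached to whatever token was presented, so a short-lived session token carrying everything and a long-lived script token carrying one permission go through the same check. The `"*"` marker, the permission name, and both function names are hypothetical.

```python
ALL_PERMISSIONS = "*"  # hypothetical marker for a session-style token


def is_allowed(token_permissions, required):
    """Return True if a token's permissions cover the required permission."""
    return ALL_PERMISSIONS in token_permissions or required in token_permissions


def build_package(token_permissions, copr, srpm):
    """One api function; it does not care which kind of token was presented."""
    if not is_allowed(token_permissions, "copr:build-package"):
        raise PermissionError("token does not grant copr:build-package")
    return "queued build of %s in %s" % (srpm, copr)
```

Case one would call this with a token holding only the build permission; case two would call it with the session token.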
- Client side -- we want to have different permissions if the user is
running the cli from the command line vs running the cli from a cron job. A user running from the cli could be said to have a session.
Sure. But if we kept sessions and tokens separate we could do that too, no?
Yep. client side, we can separate sessions vs restricted-permissions-tokens no matter what. What this best practice is saying is that if we combine the mechanism by which we're serving tokens and sessions, we still want to handle them differently on the client end.
So, what are our initial use cases for this? I guess coprs is a big one. Any other obvious ones on the map right now?
datagrepper wants to implement auth tokens as well. pingou and I (mostly pingou :-) are working on revamping the pkgdb api. It would be the right time to add a consistent authn/z piece to the api.
I'd like to be careful about this, and not just convert everything, but have a few apps that could use it written up with use cases, etc.
yeah -- I'd like to but... I'm not sure we have as much time as we'd like here.
-Toshio
On Mon, 2013-03-11 at 12:28 -0600, Kevin Fenzi wrote:
So, what are our initial use cases for this? I guess coprs is a big one. Any other obvious ones on the map right now?
Right now we have three applications for which there is work in progress in designing them an api:

- copr - already handles api token but in its own way.
- tagger - atm, in the API itself there isn't really a login mechanism, it relies on the user IP.
- pkgdb - we're not as far as we have any login mechanism in place, it's probably the least advanced of the three.

Pierre
infrastructure@lists.fedoraproject.org