Thursday, September 11, 2008

Protected Sharing on the Open Web

One feature I have wished for on the web for quite some time is the ability to securely share family photos with my extended family and close friends. Currently, all photo sharing sites (that I’ve been able to find) require all parties to have an account at that photo sharing site in order to securely share the photos. Note that I don’t want the current solution of “security-by-obscurity” where a big random URL is created and emailed to the group.

I think we can build a much better sharing environment using existing and emerging specifications like OpenID, OAuth, Portable Contacts and XRDS-Simple. Here is a use case and one way it could work.

I have an account at flickr and I create an album (flickr set) that I want to share with my extended family. Previously, I’ve associated my flickr account with my plaxo account (using OAuth) to enable flickr to access my contacts (via “Portable Contacts”). Flickr needs to use XRDS-Simple to find my “portable contacts” service and OAuth discovery to set up the connection between the two services.

  1. I tell flickr I want the new album (“Family photos”) protected and shared only with those people in my contacts lists that are labeled as “Family”.
  2. Flickr marks the album as “protected” and remembers that those allowed to view the album are anyone who is a member of my “Family” tag at my “Portable Contacts” service.
  3. I send out an email to my family members sending them the direct URL to the protected resource (note that flickr could also do this for me since it has a connection to my portable contacts service).
  4. A family member receives the email and clicks the URL to the protected album at flickr.
  5. Flickr recognizes this is a protected resource and returns both the OAuth information for how to access the protected resource as well as HTML telling the user that the resource is protected and the user needs to authenticate.
  6. The family member logs into flickr using their OpenID (not currently supported).
  7. Flickr takes the OpenID and asks my “Portable Contacts” service whether this OpenID has a tag of “Family” (basically a membership query; see previous post).
  8. If the user's OpenID is a contact with a tag of “Family” then they get access to the album, otherwise they are denied.
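A minimal sketch of the decision in steps 5 through 8, assuming the membership query is available as a plain function. The `Contact` and `Album` types, the `is_member` helper, and the example OpenIDs are all hypothetical; neither flickr nor Portable Contacts exposes anything with these names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    display_name: str
    openids: set = field(default_factory=set)
    tags: set = field(default_factory=set)


@dataclass
class Album:
    name: str
    protected: bool = False
    required_tag: Optional[str] = None


def is_member(contacts, openid, tag):
    """Membership query: does some contact with this OpenID carry the tag?"""
    return any(openid in c.openids and tag in c.tags for c in contacts)


def can_view(album, contacts, visitor_openid):
    """Apply the album's sharing rule to an authenticated (or anonymous) visitor."""
    if not album.protected:
        return True
    if visitor_openid is None:
        # Step 5: the visitor must authenticate before any check can be made.
        return False
    # Steps 7 and 8: ask the contacts service, then allow or deny.
    return is_member(contacts, visitor_openid, album.required_tag)


my_contacts = [
    Contact("Mom", {"https://mom.example.org/"}, {"Family"}),
    Contact("Coworker", {"https://coworker.example.org/"}, {"Work"}),
]
family_album = Album("Family photos", protected=True, required_tag="Family")

print(can_view(family_album, my_contacts, "https://mom.example.org/"))
print(can_view(family_album, my_contacts, "https://coworker.example.org/"))
```

In a real deployment `is_member` would be a remote call to the contacts service made with the OAuth credentials set up earlier, not a loop over a local list.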

What’s currently missing to make this a reality are...
  • Relying parties accepting OpenIDs
  • Users knowing they have an OpenID and using them
  • Portable Contacts adding “membership” type APIs
  • Portable Contacts supporting an explicit 'urls' type of 'openid'

In finalizing this blog post, I read David Recordon's summary of the Portable Contacts hackathon held last night. The following quote shows this is very near reality, Yeah!

Brian Ellin of JanRain has successfully combined OpenID, XRDS-Simple, OAuth, and the Portable Contacts API to start showing how each of these building blocks should come together. Upon visiting his demo site he logs in using his OpenID. From there, the site discovers that Plaxo hosts his address book and requests access to it via OAuth. Finishing the flow, his demo site uses the Portable Contacts API to access information about his contacts directly from Plaxo. End to end, login with an OpenID and finish by giving the site access to your address book without having to fork over your password.

Tagging for contacts

Has anyone else had this problem? You need to IM someone and you can't remember which group you filed their name under. I realize that if I just kept an alphabetized list, and I remembered their name, this wouldn't be a problem. However, sometimes it's not that I’m looking for a particular person, but someone on a particular team.

Basically, what I’ve found is that I want to attach “tags” to my contacts that describe attributes about that person. Then I can find people by my own “folksonomy” whenever I need to. This would allow me to “query” my contacts (or IM client) by a “tag”. So I could say, “Show me all the architects at AOL that are currently online?”. It also allows me to do queries like “Does Bob have an ‘Extended Family’ tag?”. This is really a membership query and can be thought of as “Is Bob a member of the group ‘Extended Family’?”

Combining tags with contact data allows for all sorts of interesting capabilities. For instance, my Adium IM client could query my Portable Contacts service for all contacts with at least one IM identifier present and return all IM identifiers and tags. With this information Adium could auto-create groups, show me a “tag-cloud” of who’s online, etc. Another use would be true access-controlled sharing of protected resources. I’ll have another post on that soon.

With the emerging Portable Contacts specification, I think there is a great opportunity to enable this kind of capability in an open standard. The portable contacts spec already supports both tagging and filtering. What is a little unclear from a quick read of the specification is whether filters can be combined. However it should be easy with this specification to support the queries listed above. For a membership style query, the following should suffice...
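As a rough sketch of what such a membership query could look like, the snippet below builds a filter URL and then checks the returned entries for a given OpenID. The `filterBy`, `filterOp` and `filterValue` parameters and the `/@me/@all` path follow my reading of the draft Portable Contacts spec; the base URL, the choice of `equals` for a plural field like `tags`, and the `urls` entries with a `type` of `openid` are assumptions, not something the spec or any provider is known to guarantee.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real one would be found through XRDS-Simple discovery.
BASE_URL = "https://contacts.example.com/portablecontacts"


def tag_query_url(tag):
    """Build a URL asking for all of my contacts that carry the given tag."""
    params = {"filterBy": "tags", "filterOp": "equals", "filterValue": tag}
    return f"{BASE_URL}/@me/@all?{urlencode(params)}"


def has_openid(response, openid):
    """Client-side half of the membership check: is this OpenID among the entries?"""
    for entry in response.get("entry", []):
        for url in entry.get("urls", []):
            # Assumes the service labels OpenIDs with a 'urls' type of 'openid'.
            if url.get("type") == "openid" and url.get("value") == openid:
                return True
    return False


# A hand-written stand-in for the JSON a service might return.
sample_response = {
    "entry": [
        {
            "displayName": "Bob",
            "tags": ["Extended Family"],
            "urls": [{"value": "https://bob.example.org/", "type": "openid"}],
        }
    ]
}

print(tag_query_url("Extended Family"))
print(has_openid(sample_response, "https://bob.example.org/"))
```

Because it is unclear whether filters can be combined, the sketch filters by tag on the server and matches the OpenID on the client.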


Wednesday, September 03, 2008

Continuing the discussion...

This post is in response to the thoughts Praveen posted on his blog regarding Open Identity Tokens.

These thoughts around an Open Identity Token are more focused on enabling sharing access-controlled resources than trying to duplicate existing OAuth functionality. One use case that OAuth doesn't currently solve is my desire to access a protected resource where I DON'T have an account at the service provider managing the protected resource. An Open Identity Token allows the service provider to allow access to the protected resource (not mine, but my friend's) without my having to have an account at the service provider.

My vision was that an Open Identity Token could be passed as part of a standard OAuth invocation allowing multiple verifiable identities to be specified in the API call. The OAuth mechanisms bind the Consumer, the Service Provider and the Identity into a single token. This doesn't leave room for additional identities.

Some specific comments to the points raised follow...

"Bob’s discovery service" might not know Alice with the same Id (OpenID) - Bob might have only entered Alice's alternate email address that doesn't even resolve to the same OpenID that Alice currently used to sign in

I think this problem is out of scope for identity tokens. This is really an identity "association" problem and a problem space that I believe portable contacts could grow into solving. Just as with Adium (IM client for the Mac) I can merge multiple IM identifiers into a single identity, I should be able to do the same with portable contacts. I would love to see portable contacts grow into allowing membership based APIs so that a service provider could contact my portable contacts server and say... "Is this identity token a member of George's 'hiking buddies'?".

The site might need to go back to Alice's OpenID provider for each (notification) service that it wants to invoke on behalf of the user. Bob might use notification service A, David might use notification service B, and so on... - where A, B, etc.. totally different services.

I don't think that the site would need to go back to Alice's OpenID provider. However, it would need to go to each of her friends' discovery "service" to find their notification service. One of the goals is to allow the dynamic distribution enabled by the "web". To me this is the benefit of having a discovery "service". Any person or service can contact my discovery service and find my preferred services (much like the use case you outline at the end of your post). While in many cases my identity provider may also be my discovery service, it doesn't have to be that way and the protocols shouldn't require that.

If the Identity/OpenID Provider provides a Open Token verification API, then it would have no way to make sure the token is being used at the same place for which it is granted. This goes back to the same problem that was solved by doing a RP Discovery in OpenID2.0.

Actually, I don't think the identity provider cares where the identity token is presented. The purpose of this identity token is to provide a "verifiable" identity (in the same vein that Amazon provides the "real name" feature). To me, the key is can the service provider that receives the identity token have confidence that this Consumer is "allowed" to send the identity token to the service provider. That's why the Consumer uses a nonce and a different hash value than is delivered to the Consumer by the identity provider.

This requires a real discovery service (process) instead of a simple XRDS (static) file hosted some where - since the services defined in the XRDS will be anyway protected, there shouldn't be any harm in saying "my notification service is here and oh btw, it's only open to a restricted list of people so you might not be able to send notification to me".

This is a great point. It does require more processing logic than just returning a file on disk. However, any protected resource requires a service, even when using OAuth so I don't see that as a big hurdle. Even if we separated the discovery information into public and non-public, it would still simplify the logic to be able to serve the non-public data based on an identity token versus a full OAuth UI experience.

Open Identity Token seems less trust worthy - of course the same problem that people attribute to OpenID but at least in OpenID case, it is not directly meant for a specific service invocation - it's merely for knowing who the user is and the RP/SP can do more things before it allows the user to do certain things.

I guess I'd argue that the trust of the token is dependent on a lot of factors. Does the service provider trust the identity provider? (this is the same trust question that OpenID faces). Can I trust the security of the protected token? (as described in my post it's probably only as good as OpenID "dumb mode", but that is pretty easily solvable). I'd also argue that there is great value in having a verifiable identity for certain operations. Yes, I can get a verifiable identity by using front-channel requests and asking the user to authenticate... again... but if it's not needed, why put the user through that experience? Also, using an Open Identity Token where a verifiable identity is required significantly reduces the number of interactions between the Consumer and Service Provider.
  • With an Open Identity Token...
    1. Consumer accesses protected resource at the Service Provider with Open Identity Token
    2. Service Provider verifies Open Identity Token with Identity Provider
    3. Service Provider performs access-control check and returns response

  • With standard OAuth...
    1. Consumer accesses protected resource at the Service Provider
    2. Service Provider returns error and requests authorization
    3. Consumer requests the RequestToken
    4. Consumer sends user to the Service Provider 'Authorize' endpoint
    5. The Service Provider uses the check_immediate method to attempt to authenticate the user (Assuming the Consumer sent the Service Provider an OpenID for the user)
    6. The OpenID Provider returns that the user is logged in
    7. The Service Provider invokes the Consumer's callback method (front-channel)
    8. The Consumer requests the AccessToken
    9. The Consumer re-tries the initial request for the protected resource (and gets access if the identity associated with the OAuth AccessToken is in the ACL)

In general in the current social networking era where things (notification) are more publish/subscribe model, not sure how important it is to solve this use case. Most of the user's anyway still use their email addresses not a notification service.

With an Open Identity Token, there is no requirement that the UserIdentifier value be the user's OpenID. It could be the user's email address, an opaque blob, a signed SAML Assertion, etc. Again, I consider the identifier association problem to be "out of scope" for identity tokens.

Thanks for the great comments. This is exactly the kind of discussion I hoped to start. One of my driving motivations in this is to enable easy access-controlled sharing such that my parents and in-laws don't have to have accounts at my personal photo service, and yet I can ensure that only the people I want can access my personal photos. I've never liked the security-by-obscurity model used for "privately" sharing my photos with others who don't use my preferred service.

Open Identity Token

Assuming that an “Open Identity Token” is useful, here are some of my initial thoughts.

  • The identity token needs to clearly identify the identity provider that issued the token, some value that identifies the user, and I believe the party the token was initially issued to (the Consumer in OAuth speak).
  • The value that identifies the user must support opaque values to prevent this token becoming a global correlation handle (if desired by the involved parties). Of course, the user identifier could be the user’s OpenID.
  • The identity token must be signed in some way and protected from replay attacks.

If we go back to the use case from yesterday’s post, using an Open Identity Token would enable the flow to work like this.

  1. Alice logs into the hiking site with her OpenID
    • When the site invokes the OpenID flow, it asks Alice’s OpenID provider to return an Open Identity Token
    • The site receives the OpenID assertion and Open Identity Token
  2. Alice uploads a GPS track and some photos of a new trail she hiked over the Labor Day weekend.
  3. At the conclusion of her upload, the site asks Alice if it should notify her friends about her activity.
  4. Alice thinks that’s a great idea and agrees.
  5. So the site queries Alice’s Portable Contacts service, using pre-established OAuth credentials, and retrieves Alice’s list of contacts with a tag of “hiking buddy”.
  6. Now for each of these friends, the site has to discover the “notification” service and send it the new activity message.
  7. One of Alice’s friends, Bob, only exposes the endpoint and metadata of his “notification” service to a restricted list of people
  8. The site queries Bob’s discovery service presenting Alice’s Open Identity Token
  9. Bob’s discovery service validates Alice’s Open Identity Token and then returns the non-public service endpoint and metadata

In this flow, Alice does not have to interact via some user interface with Bob’s discovery service. Of course the identity represented in the Open Identity Token needs to be resolvable into an identifier that Bob’s discovery service can use.
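One way to picture the last two steps is a discovery service that filters its answer by the caller's verified identity. This is only a sketch: the `Service` type, the `discover` function, and the idea of a per-owner allow list are illustrative assumptions, and token verification is assumed to have already produced `verified_user` (or `None`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    type: str
    endpoint: str
    public: bool


def discover(services, allowed_users, verified_user: Optional[str] = None):
    """Return the services this caller may see.

    Public entries go to everyone; non-public entries only go to callers
    whose verified identity is on the owner's allow list.
    """
    trusted = verified_user is not None and verified_user in allowed_users
    return [s for s in services if s.public or trusted]


bobs_services = [
    Service("blog", "https://bob.example.org/blog", public=True),
    Service("notification", "https://notify.example.net/bob", public=False),
]
bobs_allow_list = {"alice"}

print([s.type for s in discover(bobs_services, bobs_allow_list)])
print([s.type for s in discover(bobs_services, bobs_allow_list, "alice")])
```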

Finally, here are some initial technical implementation ideas...

I was thinking that the identity provider could construct the token and “sign” it with an HMAC_SHA? hash. The signature-base-string would be IdentityProvider:Consumer:UserIdentifier and the identity provider would construct a random value to use as the secret in the HMAC_SHA? hash. This value would need to be remembered based on the IdentityProvider:Consumer pair. What would be returned (probably base64’d) as the Open Identity Token would be “IdentityProvider:Consumer:UserIdentifier,hash”.
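Here is a minimal sketch of the issuing step, assuming HMAC-SHA256 stands in for the unspecified "HMAC_SHA?" and that the secret is kept in an in-memory dictionary. The `issue_token` name and the example identifiers are hypothetical. The identifiers deliberately contain no colons; a real format would need escaping, since OpenID URLs do contain them.

```python
import base64
import hashlib
import hmac
import secrets

# Random secret remembered per (IdentityProvider, Consumer) pair.
_pair_secrets = {}


def issue_token(idp, consumer, user_id):
    """Identity provider side: build 'IdP:Consumer:UserIdentifier,hash', base64'd."""
    secret = _pair_secrets.setdefault((idp, consumer), secrets.token_bytes(32))
    base_string = f"{idp}:{consumer}:{user_id}"
    # HMAC-SHA256 is an assumption; the post leaves the SHA variant open.
    digest = hmac.new(secret, base_string.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(f"{base_string},{digest}".encode()).decode()


token = issue_token("idp.example.com", "hikes.example.com", "alice")
print(base64.urlsafe_b64decode(token).decode())
```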

When the Consumer that receives the token wants to use it in an API call, it constructs a unique Open Identity Token (to protect against replay of the token) for the API call. This token uses the hash received from the Identity Provider as the secret in a new HMAC_SHA? hash. The signature base string for this hash would be “IdentityProvider:Consumer:UserIdentifier:Nonce”. What would go on the wire as the token would be base64(“IdentityProvider:Consumer:UserIdentifier:Nonce,hash”).
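The Consumer's step could be sketched as follows, again assuming HMAC-SHA256 and a hex-encoded hash from the identity provider. The `per_call_token` name, the random hex nonce, and the sample token are all illustrative; the sample hash is a placeholder string, not a real signature.

```python
import base64
import hashlib
import hmac
import secrets


def per_call_token(idp_token, nonce=None):
    """Consumer side: derive a one-time token from the token the IdP issued."""
    decoded = base64.urlsafe_b64decode(idp_token).decode()
    base_string, idp_hash = decoded.rsplit(",", 1)
    nonce = nonce or secrets.token_hex(16)
    call_base = f"{base_string}:{nonce}"
    # The hash received from the identity provider is the secret for this HMAC.
    digest = hmac.new(idp_hash.encode(), call_base.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(f"{call_base},{digest}".encode()).decode()


# Placeholder for a token previously received from the identity provider.
sample_idp_token = base64.urlsafe_b64encode(
    ("idp.example.com:hikes.example.com:alice," + "ab" * 32).encode()
).decode()

first = per_call_token(sample_idp_token)
second = per_call_token(sample_idp_token)
print(first != second)
```

Each call gets a fresh nonce, so two API calls never put the same token on the wire.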

When a Service Provider receives the Open Identity Token, it can verify the token by sending it to the IdentityProvider specified in the token. For OpenID this would require a new extension and “API” method. Note that in the verification step, if the user identifier was opaque in the token it can be resolved into something the Service Provider can use. This allows for generating tokens that are unique to a specific context (no global correlation) while still providing the Service Provider with the data they need.

In this model, only the identity provider can validate the Open Identity Token because it is the only entity (besides the Consumer) that has the secret used by the Consumer in signing the token. All the identity provider needs to do is look up the hash it gave to that Consumer and then use it in an HMAC_SHA? hash of the “IdentityProvider:Consumer:UserIdentifier:Nonce” string and finally compare hashes.
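The validation step might look like the sketch below. It assumes HMAC-SHA256, recomputes the hash it gave the Consumer from the per-pair secret rather than storing it, and rejects a nonce it has already seen. The nonce bookkeeping, the `verify` name, and the demo secret are my assumptions; the post only says replay must be prevented.

```python
import base64
import hashlib
import hmac

# Secret the identity provider remembers per (IdentityProvider, Consumer) pair.
PAIR_SECRETS = {("idp.example.com", "hikes.example.com"): b"demo-secret"}
_seen_nonces = set()


def _hmac_hex(key, message):
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()


def verify(token):
    """Return the user identifier if the per-call token is valid, else None."""
    try:
        decoded = base64.urlsafe_b64decode(token).decode()
        call_base, presented = decoded.rsplit(",", 1)
        idp, consumer, user_id, nonce = call_base.split(":")
    except ValueError:
        return None
    secret = PAIR_SECRETS.get((idp, consumer))
    if secret is None or (consumer, nonce) in _seen_nonces:
        return None
    # Rebuild the hash originally handed to the Consumer, then the per-call hash.
    idp_hash = _hmac_hex(secret, f"{idp}:{consumer}:{user_id}")
    expected = _hmac_hex(idp_hash.encode(), call_base)
    if not hmac.compare_digest(expected.encode(), presented.encode()):
        return None
    _seen_nonces.add((consumer, nonce))
    return user_id


def demo_token(user_id, nonce):
    """Stand-in for the Consumer: build a per-call token for the demo pair."""
    base = f"idp.example.com:hikes.example.com:{user_id}"
    idp_hash = _hmac_hex(PAIR_SECRETS[("idp.example.com", "hikes.example.com")], base)
    call_base = f"{base}:{nonce}"
    raw = f"{call_base},{_hmac_hex(idp_hash.encode(), call_base)}"
    return base64.urlsafe_b64encode(raw.encode()).decode()


t = demo_token("alice", "nonce-1")
print(verify(t))  # first presentation
print(verify(t))  # replay of the same token
```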

I realize that going back to the Identity Provider to verify the token does have some drawbacks: privacy (leaking where this identity token is used), complexity and performance (an extra lookup/validation is required). But since I’m not a security expert, I’m hoping that others will be able to modify these ideas to allow for direct Service Provider validation. It’s just critical to me that the mechanism used to generate the Open Identity Token allow for both un-defined “circles of trust” (e.g. OpenID) as well as more closed or dynamic “circles of trust” (e.g. SAML). This might be as simple as leveraging the OAuth signature method and then supporting RSA signing.

Tuesday, September 02, 2008

Protecting "discovery" information?

I’ve been thinking a lot lately about discovery of personal services (e.g. endpoint and metadata of my “portable contacts” service, endpoint and metadata of my preferred “email service”, etc). One problem with enabling discovery of this kind of information is that it leaks information about me. For example, I might not want the world to know where I keep my personal photos.

So the question is... How, in the world of open identity protocols, do I restrict access to a subset of my “discovery” information? The obvious answer is for the discovery service (or service provider in general) to restrict access based on the identity of the invoking party. However, how is that invoking identity presented? At first thought it seems like OpenID and OAuth should suffice, but it turns out this doesn’t work too well in practice.

Let’s take the following example and walk it through.
“Alice logs into her hiking site, uploads a GPS track and photos, and notifies her friends of the new information.”
  1. Alice logs into one of her favorite web sites, her hiking site.
  2. Alice uploads a GPS track and some photos of a new trail she hiked over the Labor Day weekend.
  3. At the conclusion of her upload, the site asks Alice if it should notify her friends about her activity.
  4. Alice thinks that’s a great idea and agrees.
  5. So the site queries Alice’s Portable Contacts service, using pre-established OAuth credentials, and retrieves Alice’s list of contacts with a tag of “hiking buddy”.
  6. Now for each of these friends, the site has to discover the “notification” service and send it the new activity message.
  7. One of Alice’s friends, Bob, only exposes the endpoint and metadata of his “notification” service to a restricted list of people.

It’s at this point that things begin to break down. How does the site identify Alice to Bob’s discovery service so that it can attempt to discover Bob’s “notification” service? There currently isn’t a binding for OAuth to be used with XRDS discovery, and even if there were, it would mean that Alice would have to have an “account” at Bob’s discovery service in order for the discovery service to be able to authenticate Alice and establish OAuth credentials. While this would only have to be done once with Bob’s discovery service, the user experience would have to be repeated with each of Alice’s friends’ discovery services. That seems like overkill for the simple purpose of identifying Alice to Bob's discovery service.

A possible solution would be an “open identity token” that could be created by an identity provider and passed to any service provider. I have some thoughts on this that I hope to expound on in another post.