One of Jisc’s activities is to monitor and, where possible, influence regulatory developments that affect us and our customer universities, colleges and schools as operators of large computer networks. Since Janet and its customer networks are classified by Ofcom as private networks, postings here are likely to concentrate on the regulation of those networks. Postings here are, to the best of our knowledge, accurate on the date they are made, but may well become out of date or unreliable at unpredictable times thereafter. Before taking action that may have legal consequences, you should talk to your own lawyers. NEW: To help navigate the many posts on the General Data Protection Regulation, I've classified them as most relevant to developing a GDPR compliance process, GDPR's effect on specific topics, or how the GDPR is being developed. Or you can just use my free GDPR project plan.

Choosing the Right Identifier

Wednesday, June 6, 2012 - 13:42

In discussing a legal framework for federated access management we’ve concluded that the right basis for exchanging attributes is that a particular attribute is “necessary” to provide a service. That implies both that service providers shouldn’t ask for attributes they don’t need, and also that where there is a choice of attributes that could be used, they should choose the one that includes the smallest amount of unnecessary information.

Identifiers are one area where this choice may well arise, since the eduPerson standard contains a range of different identifiers that can all be used to distinguish one user from another. Since the purpose of an identifier is to link together a series of actions by the same individual, it seems to be the range across which those links need to be made that should guide the choice of identifier.

  • If the requirement is only to link different visits to the same service by the same individual (for example to remember the choices or searches I made on my previous visit) then what’s needed is a label, unique to me, that will always be provided when I use that service. However the same label doesn’t need to be provided to any other service and I don’t need to know what the value of the label is. eduPerson provides a unique, opaque, per-service label, eduPersonTargetedID.
  • If the requirement is to link together my activities on different services then the per-service uniqueness is not wanted but all the other characteristics are. In particular I still don’t need to know the value of the label. As far as I know, eduPerson doesn’t contain this type of identifier (from some identity providers eduPersonTargetedID may have these characteristics but that can't be relied upon), so services with this requirement tend to use the next one...
  • Next is an identifier that can be used to link services and which I know myself. In other words it’s still unique, but it’s not opaque. This type of identifier is most useful for solving the “invitation problem”, where the operator of an on-line service wants to grant access to me, as a real-world individual. One solution is for me to tell the operator a unique identifier which will then be shared between our two computers. To make them memorable, such identifiers are often constructed in the same way as an e-mail address, with a unique local name, an @ character and the domain used by my identity provider. However it’s important to remember that they aren’t e-mail addresses: sending e-mail to them may well not work. eduPerson's human-friendly unique identifier is eduPersonPrincipalName.
  • Finally there’s the identifier that can be used if I want to link my real world relationships and reputation into the on-line world, for example if I want to continue an off-line discussion in an on-line collaboration tool. In the real world we use personal names for that, so this is one of the few kinds of service where knowing my personal name really is necessary, rather than just an optional personalisation. Unlike the identifiers above, though, personal names aren’t unique, so it’s important for the computer also to have one of the unique identifiers above, to ensure that my stuff isn’t mixed up with that of someone else who happens to have the same name. In fact a name isn't really a good thing for a computer to try to process at all, because I may well give different versions, using full names or initials and perhaps even different orders, at different times and in different places. Humans can (usually) recognise that they all refer to the same person, but that's really hard for a computer to do.
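
To make the first two kinds of label more concrete, here is a minimal sketch of one way an identity provider might derive them. This is an illustration under my own assumptions, not something the eduPerson standard prescribes: the function names, the use of a keyed hash (HMAC-SHA256), the secret and the internal user key are all hypothetical.

```python
import hashlib
import hmac


def per_service_label(secret: bytes, user_key: str, service_id: str) -> str:
    """Hypothetical helper: an opaque label that is stable for one user at
    one service, and different at every other service."""
    # Assumption: a NUL byte separates the fields so that differently split
    # inputs cannot produce the same message.
    message = f"service\x00{service_id}\x00{user_key}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def cross_service_label(secret: bytes, user_key: str) -> str:
    """Hypothetical helper: an opaque label that is the same for one user at
    every service, so activities on different services can be linked."""
    message = f"shared\x00{user_key}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# Illustrative values only; a real secret would be kept private by the
# identity provider.
secret = b"idp-secret-kept-private"
journal = per_service_label(secret, "staff-0042", "https://journal.example.org")
wiki = per_service_label(secret, "staff-0042", "https://wiki.example.org")
```

Here `journal` and `wiki` are both stable across visits, but they differ from each other, so the two services cannot use them to link my activities. The value returned by `cross_service_label` would allow exactly that linking, while still telling neither service (nor me) anything about who I am.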

These different characteristics and purposes can be summarised as follows:

Purpose                      Unique?  Opaque?  Per-service?  eduPerson example
Linking visits               Y        Y        Y             eduPersonTargetedID
Linking services             Y        Y        n             (none)
Linking to an individual     Y        n        n             eduPersonPrincipalName
Linking off/on-line worlds   n        n        n             cn
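
For a service provider deciding what to ask for, the summary can be written down as a small lookup. This is only a sketch: the purpose keys, the `IdentifierProfile` type and the `attributes_to_request` function are names I have made up for illustration, and pairing cn with eduPersonPrincipalName is one possible choice of unique identifier, not the only one.

```python
from typing import List, NamedTuple, Optional


class IdentifierProfile(NamedTuple):
    unique: bool
    opaque: bool
    per_service: bool
    attribute: Optional[str]  # None where eduPerson has no matching identifier


# One entry per row of the summary above (keys are hypothetical).
PROFILES = {
    "linking_visits": IdentifierProfile(True, True, True, "eduPersonTargetedID"),
    "linking_services": IdentifierProfile(True, True, False, None),
    "linking_to_individual": IdentifierProfile(True, False, False, "eduPersonPrincipalName"),
    "linking_offline_online": IdentifierProfile(False, False, False, "cn"),
}


def attributes_to_request(purpose: str) -> List[str]:
    """Return the smallest set of identifier attributes for a purpose."""
    profile = PROFILES[purpose]
    if profile.attribute is None:
        # No eduPerson identifier fits, so fall back to the next row,
        # as services with this requirement tend to do.
        return ["eduPersonPrincipalName"]
    requested = [profile.attribute]
    if not profile.unique:
        # Names are not unique, so also ask for a unique identifier
        # (assumption: eduPersonPrincipalName is the one chosen).
        requested.append("eduPersonPrincipalName")
    return requested
```

For example, `attributes_to_request("linking_visits")` returns only `["eduPersonTargetedID"]`, while `attributes_to_request("linking_offline_online")` returns `["cn", "eduPersonPrincipalName"]`, reflecting the point that a name needs a unique identifier alongside it.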