Messaging Abstractions Discussion Document

This document is a mutable input to developer design discussions and should not be considered a final design.

Intro

For general details of v3 messaging API, see also MessagingAPIRefactoring.

v2 made use of a statically defined hierarchy of message context subtypes to expose various properties related to message processing. v3 uses a more flexible design based around the notion of contexts which expose class-indexed (and therefore type-safe) subcontexts. The actual nature and use of message contexts for various message processing use cases is not dictated by the fundamental API. Each processing case can use the contexts and abstractions appropriate for its needs.
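To make the "class-indexed subcontext" idea concrete, here is a minimal sketch. It is not the actual v3 API: `SketchContext`, `PeerEntitySketchContext` and their members are hypothetical names invented for illustration. It only shows why indexing children by their class gives type-safe lookup without casts at the call site.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a v3-style context; not the real OpenSAML class. */
class SketchContext {
    /** Subcontexts keyed by their own class, so there is at most one per type. */
    private final Map<Class<? extends SketchContext>, SketchContext> subcontexts = new HashMap<>();

    /** Stores the child under its runtime class, replacing any previous child of that class. */
    void addSubcontext(SketchContext child) {
        subcontexts.put(child.getClass(), child);
    }

    /** Returns the child registered under exactly the given class, or null if absent. */
    <T extends SketchContext> T getSubcontext(Class<T> type) {
        // Class.cast performs a checked cast, so callers never cast themselves.
        return type.cast(subcontexts.get(type));
    }
}

/** Example SAML-flavoured subcontext; the name and field are illustrative assumptions. */
class PeerEntitySketchContext extends SketchContext {
    String entityId;
}
```

A component that only cares about the peer entity would call `ctx.getSubcontext(PeerEntitySketchContext.class)` and ignore whatever else hangs off the same context, which is what lets each processing case pick its own abstractions.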

Goals of this document:

to sketch out various general and specific messaging abstractions, with a focus (at least initially) on SAML message processing
to identify issues of messaging uncovered within v2 and attempt to avoid them, esp with respect to how we identify actors/entities in a message processing workflow

Actors/Entities - Identifier Terminology and Roles

OpenSAML v2 defined the following messaging-related properties for identifying actors/entities. Their semantics were never formally or precisely defined. All are simple String property types.

inbound message issuer - The issuer of the inbound message being processed.
- Defined on root MessageContext of openws, intended for general message cases.
- In a server-side context, this was the "requester" who sent the inbound message.
- For client-side, this (hypothetically) was the "responder" which sent the inbound response to your request.
outbound message issuer - The issuer of the outbound message being processed.
- Defined on root MessageContext of openws, intended for general message cases.
- In a server-side context this was the self-identifying entityId under which you are issuing a response.
- For client-side, this (hypothetically) was the self-identifying entityId under which you are issuing a request.
peer entity - The party with which you are communicating.
- Defined on SAMLMessageContext, so SAML-specific.
- Has associated Endpoint, EntitiesDescriptor, RoleDescriptor, and role QName.
- The intent (I think) was to mean "the other party vis-a-vis SAML protocol communication" which is agnostic as to whether you are a client/server, or requester/responder.
local entity - Pretty self-explanatory, it's "yourself".
- Defined on SAMLMessageContext, so SAML-specific.
- Has associated Endpoint, EntitiesDescriptor, RoleDescriptor, and role QName.
- The IdP can respond under multiple entityIds; this is the one in use for the current response being generated, i.e. "asserting party Id". It's also used in artifact generation.
relying party - Nominally the party to which an assertion (or more generally a response?) is being issued.
- Carried (indirectly) on shib-common ProfileRequestContext, defined on RelyingPartyConfiguration.
- RPC is in turn populated on ProfileRequestContext via lookup based on inbound message issuer.
provider - "Entity ID of the responder when communicating with the relying party." (Javadoc). Note: Out of context, the name is somewhat ambiguous.
- Carried (indirectly) on shib-common ProfileRequestContext, defined on RelyingPartyConfiguration.
- RPC is in turn populated on ProfileRequestContext via lookup based on inbound message issuer; the value comes from a static config property for a given RPC.
presenting entity - The party that is presenting an AuthnRequest as a part of the Liberty ID-WSF SSOS profile.
- Only used in the v2 IdP delegation extension
- The ECP (e.g. portal) presenting the message is an active participant in the exchange, unlike the passive browser intermediary in standard SAML SSO.
- Used to authenticate the inbound message via client TLS and HoK subject confirmation.
- Message issuer (in the SAML protocol sense) and relying party in this case is still the upstream SAML SP.

Additionally, SAML Core 3 Protocols discusses in an implied manner the notions of "requester" and "responder", and SAML Core 3.2.2.2 implicitly defines those terms where protocol message status code URIs are defined:

urn:oasis:names:tc:SAML:2.0:status:Requester
urn:oasis:names:tc:SAML:2.0:status:Responder

The only place in SAML (that I know of) where roles in message processing flows are actually defined in detail is SAML 2 Core 3.4, re: the AuthnRequest processing model. Conceptually they (mostly) overlap with what OpenSAML used in v2, albeit in some cases with differing terminology:

Requester - The entity who creates the authentication request and to whom the response is to be returned.
Presenter - The entity who presents the request to the identity provider and either authenticates itself during the transmission of the message, or relies on an existing security context to establish its identity. If not the requester, the presenter acts as an intermediary between the requester and the responding identity provider.
Attesting Entity - The entity or entities expected to be able to satisfy one of the <SubjectConfirmation> elements of the resulting assertion(s).
Relying Party - The entity or entities expected to consume the assertion(s) to accomplish a purpose defined by the profile or context of use, generally to establish a security context.
Identity Provider - The entity to whom the presenter gives the request and from whom the presenter receives the response.

Current v3 Messaging Related Contexts For Discussion

Generic message metadata (message issuer, issue instant, message id): BasicMessageMetadataContext
- Keep or go for SAML-specific contexts only?
Common SAML-specific contexts: saml-api common/messaging/context
- Most of these were mapped straight from the v2 analogs, b/c the existing code is structured around these notions. So these were just a straight port to the new context API.
IdP relying party storage: RelyingPartyContext

Known Issues and Questions

Actor/Entity Identification

Our own terminology from v2 differs from that of SAML - Can we harmonize, e.g. use terminology like SAML requester and SAML responder?
- Not sure:
  1. The above model assumes an AuthnRequest model, we are targeting more generic message processing requirements
  2. We need to model both client and server sides of things, and other outliers like ECP, delegation, etc
  3. Importantly: code that needs to be agnostic as to whether the other party is a requester or a responder wouldn't work with those terms, at least without having logic to select the right one and/or different instances of those components. E.g. SAML protocol signature validation, need to get the Issuer of the inbound message to feed to trust engine. Doesn't know or care whether inbound message is a request or response.
We are mostly focused on SAML cases, but we know there are other protocols/specs that we may want to implement, and that have different processing models, or that at least use different terminology to describe. Some resulting common abstractions are seemingly obvious (e.g. message issuer/requester), but we need to decide whether:
1. We carry the generalization forward by having actual single generalized components,
  1. See BasicMessageMetadataContext, which has been used so far in the IdP.
2. Stick mostly or entirely with abstractions that are specific to a given domain, e.g. SAML, CAS, OpenID. E.g. do we keep the above generic message issuer abstraction, or only a SAML-specific issuer context?
  1. We do currently have other code that operates on a more general level than SAML, e.g. the client cert security policy rule that uses the concept of generic message issuer
Actual v2 code sometimes inconsistently conflated the actor notions of "message issuer", "relying party" and perhaps "peer entity". It usually "worked", but only coincidentally, because for the SAML cases we support, they are always the same. We need to
1. clarify and document the exact meanings of those terms
2. eliminate duplication of abstractions if they exist (maybe there aren't duplicates for what we need to support in the library)
3. ensure that components use the one that is conceptually correct for what the component does. E.g. one does not issue an Assertion or base attribute filtering policy to an "inbound message issuer", but rather to a "relying party".
We probably want to account for both client-side and server-side abstractions, and to support both IdP-side and SP-side cases, at least in the general design of the contexts and abstractions we define. However, dropping some use cases might simplify things, albeit at the expense of features.
v2 does a lot of copying of context properties into other context properties (like "message issuer" into "peer entity").
- We should try and minimize if we can and have a single canonical place where a property value exists.
- However, this isn't necessarily wrong in all cases. For example, some property A might conditionally be populated from either property B or C, depending on the profile, or might change over time, etc. (e.g. third party request extension)
Do we want to consider representing SAML entity actors by the appropriate SAML structure, e.g. saml:Issuer?
- There has always been a slight issue with representing these as a String, since technically they are scoped by a format, but in reality this may be too much trouble to implement. We've never had a case where it was actually an issue and it's hard to imagine one.
- There really isn't a common thing to use, SAML 1 and SAML 2 would be different structures, so we'd have duplicate components/actions, or at least lots of conditional logic, for things that are shared between SAML 1 and SAML 2.
Peer/local entity vs inbound/outbound message context issuers: is the distinction implicit in the new model? Can they be collapsed?
Do we want or need to represent relying party explicitly in OpenSAML, or is this too high level and therefore only an IdP-level concept?

Other Context Abstraction Issues

SamlMetadataContext - IdP right now treats it as a direct child of MessageContext; Javadoc states that it implicitly refers to the "issuer" of the message represented by the context.
- In at least outbound case, need both peer and local metadata
- Seems would make most sense if usage is symmetrical and done the same way for all entities represented in the context
- Action AddSamlMetadataToMessageContext would need to change; probably need 2 of them (for peer and local) or have it take an enum param as to which to populate
v2 associated the metadata role QName with local and peer entity explicitly. What Chad had in v3 so far puts it on protocol context, so just 1 slot. Seems it really should be associated with the info identifying the actor, not the protocol under which the message exchange is happening
What Chad had in v3 so far has relayState on protocol context. We have now added a binding context, for reasons that arose during the encoder/decoder refactoring - should move it there?
- Binding context is specific to each message context, so if move to binding context then something has to copy from inbound binding context to outbound binding context.
- Alternative is to put it on the operation context (in IdP the profile context), but so far things like encoders and decoders only know about the direct message context on which they operate.
In general, given 1) the context hierarchy model and the 2 message contexts (inbound and outbound), and 2) that OpenSAML is lower level than the IdP, there is an implicit need somewhere (likely the IdP) to duplicate some data/contexts from the inbound to the outbound message contexts. Can we avoid this?
- Maybe. Perhaps the model of storing/retrieving information in the MessageContext because that is what the component is handed per its interface is too simple. Maybe instead we make liberal use of the ContextDataLookupFunction notion, allowing storing the data in one place, e.g. the InOutOperationContext (in the IdP the ProfileRequestContext subclass thereof).
- Examples
  - Info about the inbound requester is needed both in things that process the inbound message context and the outbound message context, e.g. inbound-oriented handlers and message encoders.
  - SAML protocol info and esp relay state are relevant and used in both the inbound and outbound message context. Those apply to the whole message exchange, not just the inbound and/or outbound message.
    - Issue with the relay state: Decoders know about and extract relay state. But since they create and return the decoded message context, there is at that point no parent to the MessageContext. So probably have to live with storing relay state on the binding context and then either copying it somewhere for encoding, or use a ContextDataLookupFunction to get at it in the inbound message context.
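As a rough illustration of the lookup-function idea, the sketch below lets an encoder that is only handed the outbound message context obtain relay state that lives on the inbound side, with no copying. Everything here is an assumption made for the example: `ExchangeSketch`, `MessageCtxSketch` and `EncoderSketch` are invented names, plain `java.util.function.Function` stands in for the ContextDataLookupFunction notion, and relay state is a bare field rather than a binding subcontext.

```java
import java.util.function.Function;

/** Hypothetical operation-level context owning both message contexts. */
class ExchangeSketch {
    final MessageCtxSketch inbound = new MessageCtxSketch(this);
    final MessageCtxSketch outbound = new MessageCtxSketch(this);
}

/** Hypothetical message context; relay state would really sit on a binding subcontext. */
class MessageCtxSketch {
    final ExchangeSketch parent;
    String relayState;

    MessageCtxSketch(ExchangeSketch parent) {
        this.parent = parent;
    }
}

/** Encoder that does not know where relay state is stored; the lookup strategy is injected. */
class EncoderSketch {
    private final Function<MessageCtxSketch, String> relayStateLookup;

    EncoderSketch(Function<MessageCtxSketch, String> relayStateLookup) {
        this.relayStateLookup = relayStateLookup;
    }

    /** Appends the relay state if the lookup finds one. No URL-encoding is done in this sketch. */
    String encode(MessageCtxSketch outboundCtx, String payload) {
        String relayState = relayStateLookup.apply(outboundCtx);
        return relayState == null ? payload : payload + "&RelayState=" + relayState;
    }
}
```

Wiring it up as `new EncoderSketch(ctx -> ctx.parent.inbound.relayState)` keeps a single canonical location for the value. Note this assumes the decoded inbound context has been attached to a parent by the time encoding runs, which is exactly the ordering question raised above.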

Other issues

Do we maintain the reusable components in OpenSAML that represent general lower level SAML processing needs and not specific to the (higher level) IdP?
- Mostly talking about things that were formerly called SecurityPolicyRules, and would now be the more general MessageHandler abstraction, e.g. inbound signature validation, replay detection, message issue instant eval.
- Chad didn't want to in general (why?), so some things he has currently implemented as actions in the IdP, e.g. for replay and message issue instant
  - This wouldn't seem to make as much sense for the more complex security and trust engine related rules
- Brent generally thinks our original notion was correct and that code should be pushed generally speaking to the lowest level where it makes sense

Noteworthy cases or outliers:

SAML third party request extension - Here, the message starts out being processed as the SAML Issuer entity (e.g. signature verification, etc). As soon as the extension is processed, processing should then proceed as if the request effectively was from the third party entity
- This is probably a case where the message issuer/requester (or peer entity property?) starts out as the value indicated in the SAML Issuer, but later transitions to the third party entity value. This allows downstream components to not know or care whether the third party extension is in use.
Message issuer != relying party (or peer entity) - We often make this assumption, however it is not strictly true.
- In the part of the Liberty delegation stuff that we didn't implement yet, the SAML protocol Issuer is requesting an Assertion(s) on behalf of other relying parties.
Relying party number > 1 - Again looking at the Liberty delegation stuff, you might be issuing one or more Assertions to multiple relying parties, so having a single relying party doesn't always work.
- Chad and I discussed, probably would not worry about this case unless and until we need to. Start with a relying party context that holds a single RP (which we currently have here in idp-profile-api as RelyingPartyContext). If a later profile supports multiple, add a new context that holds a collection of RP info, profiles that can handle would loop over them.
Message issuer (in the SAML protocol sense) != relevant entity for (some) initial security evaluation
- Also in delegation, the inbound message client TLS has to be eval'ed not on basis of (SAML) message issuer, but on SOAP Sender header, which identifies the ECP.
- Had to have a specialization of the client cert auth security policy rule which pulls the issuer from the soap header instead of the standard message context slot.

Proposals

Trying to anticipate and plan around all possible and general message processing needs is hard. The new message context model doesn't require us to have statically defined context properties, which are replaced with a dynamic and flexible emergent context tree. So:
- Get rid of generalized message abstractions, such as "message issuer", in favor of domain specific contexts, at least for identifying actor/entity in the processing flow. This simplifies and minimizes terminology and eliminates overlap around concepts like inbound/outbound message issuer and peer/local entity.
  - Could possibly keep the generalized message metadata context for message id and issue instant, esp if decide to keep components in OpenSAML that validate those
- For SAML use cases would be nice to use terminology of "requester" and "responder" which harmonize with the SAML spec. But as noted above this causes problems for components that don't know or care. They care about "the other party" in the SAML protocol exchange.
  - Instead propose stick with something like v2, e.g. local and peer entity.
    - Are there better terms? "Local" entity I think is pretty uncontroversial. Is there something better than "peer"?
    - TODO: make sure that this doesn't cause a similar issue with components like requester/responder. Don't think so, since we used in v2.
  - If there is a need to explicitly represent notion of "requester" and "responder", we could define enum which is optionally carried on local and peer entity contexts
- For the small minority of things like client cert eval rule (assuming we keep in OpenSAML) that DO operate on a more general level than SAML, need to either:
  1. make it abstract and have SAML-specific concrete class that knows what context abstraction to use.
  2. Use a functor plugin type of thingy that abstracts away how the inbound message "issuer" is accessed and mutated
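Option 2 might look something like the following sketch. It is purely illustrative: `InboundMsgSketch` and `ClientCertRuleSketch` are made-up names, the map of expected certificate subjects is a toy placeholder for a real trust engine, and only the "accessed" half of the functor is shown (mutation would need a second strategy).

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical inbound message data; both fields are illustrative assumptions. */
class InboundMsgSketch {
    String samlIssuer;  // entityId from the SAML protocol message Issuer
    String soapSender;  // entityId from a SOAP Sender header (delegation case)
}

/** Generic client cert rule: it never hard-codes where the "issuer" comes from. */
class ClientCertRuleSketch {
    private final Function<InboundMsgSketch, String> issuerLookup;
    /** Toy stand-in for a trust engine: entityId -> expected certificate subject. */
    private final Map<String, String> expectedSubjects;

    ClientCertRuleSketch(Function<InboundMsgSketch, String> issuerLookup,
                         Map<String, String> expectedSubjects) {
        this.issuerLookup = issuerLookup;
        this.expectedSubjects = expectedSubjects;
    }

    /** True only if the presented cert subject matches the one expected for the looked-up entity. */
    boolean evaluate(InboundMsgSketch msg, String presentedCertSubject) {
        String entityId = issuerLookup.apply(msg);
        return entityId != null
                && presentedCertSubject != null
                && presentedCertSubject.equals(expectedSubjects.get(entityId));
    }
}
```

With this shape, the standard SAML deployment would inject `m -> m.samlIssuer`, while the delegation case would inject `m -> m.soapSender`, so no specialized subclass of the rule is needed.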
Represent actors/entities with a dedicated context that at a minimum has a property slot for the entityId
- Allows natural association of other actor/entity-specific information as subcontexts and avoids the restriction of having only 1 subcontext of a given type in a context, e.g. SAML metadata. Contexts underneath are construed as "scoped" to that actor/entity
- See SamlLocal- and SamlPeerEntityContexts above, which each would contain for example a SamlEndpoint- and SamlMetadataContext.
- Probably also add a SamlPresenterContext for the known use case of delegation.
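A minimal sketch of how per-entity contexts would sidestep the one-subcontext-per-type restriction follows. All class names (`BaseCtxDemo`, `PeerEntityDemo`, `LocalEntityDemo`, `MetadataDemo`, `ProtocolRoleDemo`) are hypothetical stand-ins rather than the real Saml*Context classes, and the optional role enum reflects the requester/responder idea floated earlier in these proposals.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical class-indexed context base: at most one child per concrete class. */
class BaseCtxDemo {
    private final Map<Class<?>, BaseCtxDemo> children = new HashMap<>();

    void add(BaseCtxDemo child) {
        children.put(child.getClass(), child);
    }

    <T extends BaseCtxDemo> T get(Class<T> type) {
        return type.cast(children.get(type));
    }
}

/** Optional protocol role that could be carried on an entity context. */
enum ProtocolRoleDemo { REQUESTER, RESPONDER }

/** Common shape for an actor/entity: an entityId slot plus scoped subcontexts. */
abstract class EntityCtxDemo extends BaseCtxDemo {
    String entityId;
    ProtocolRoleDemo role; // may stay null when the component does not care
}

class PeerEntityDemo extends EntityCtxDemo { }

class LocalEntityDemo extends EntityCtxDemo { }

/** Stand-in for metadata about one entity; the field is an illustrative assumption. */
class MetadataDemo extends BaseCtxDemo {
    String roleName;
}
```

Because each `MetadataDemo` is a child of an entity context rather than of the message context, a message context can hold peer and local metadata at the same time, e.g. `msg.get(PeerEntityDemo.class).get(MetadataDemo.class)` versus `msg.get(LocalEntityDemo.class).get(MetadataDemo.class)`.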
Code review etc and mindful coding in IdP to make sure that we use e.g. peer entity context when we mean the SAML protocol message requester, and relying party when we mean relying party (e.g. consumer of issued assertion)
In v2 context properties were often just data that had been extracted from the message and copied into the context. See for example all the decoder base classes. One idea behind the new (sub)contexts in v3 was the notion of having "view contexts", i.e. smart contexts that know how to look up the requested data from the message lazily and on-demand by tree-walking (i.e. from MessageContext.getMessage()), possibly caching it for efficient lookup later. Try and use where possible to streamline and avoid having lots of code that just extracts data and calls setters on contexts.
- Smart contexts could also compute/calculate/resolve etc information that isn't directly in the message but is derivable from it.
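The "view context" notion could be sketched roughly as below. This is an assumption-laden toy: `FakeMessageSketch` stands in for a real SAML message object (counting reads so the caching is observable), `IssuerViewSketch` is an invented name, and the sketch is not thread-safe.

```java
import java.util.function.Function;

/** Hypothetical message whose issuer is costly to extract; counts how often it is read. */
class FakeMessageSketch {
    final String issuer;
    int reads;

    FakeMessageSketch(String issuer) {
        this.issuer = issuer;
    }

    String readIssuer() {
        reads++;
        return issuer;
    }
}

/** View context: resolves the issuer from the message on first access, then serves the cached value. */
class IssuerViewSketch<M> {
    private final M message;
    private final Function<M, String> extractor;
    private String cachedIssuer;
    private boolean resolved;

    IssuerViewSketch(M message, Function<M, String> extractor) {
        this.message = message;
        this.extractor = extractor;
    }

    String getIssuer() {
        if (!resolved) {
            // Lazy: nothing is extracted until some component actually asks.
            cachedIssuer = extractor.apply(message);
            resolved = true;
        }
        return cachedIssuer;
    }
}
```

No decoder has to call a setter here; a consumer just asks the view, e.g. `new IssuerViewSketch<>(msg, FakeMessageSketch::readIssuer).getIssuer()`. One consequence to keep in mind is that a cached view will not notice a later change to the underlying message, which matters for cases like the third party request extension where the effective value changes mid-flow.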

TODO

Sketch out a basic message processing lifecycle showing what populates and uses the various abstractions, and how and when, esp for edge cases like third party request and delegation