Filter Specification

Purpose

unblu and some implementation partners provide filters for many integration scenarios (see this page for further information). Sometimes, however, an adapted integration may be required. This custom filter specification describes how a filter can be implemented.

Implementation Options

Depending on the integration method, the filter can be implemented thus:

  1. Integrated into and running on the web server.
  2. Integrated into and running on the application server.
  3. Integrated into and running within a (reverse) proxy.
  4. Implemented as a (reverse) proxy.
  5. Running in the end user's browser.

This document will not explain all integrations and implementation variations in detail. Instead it will focus on providing all necessary information to allow developers to implement a fitting solution on their own.

Requirements overview

The following sections split the requirements according to the responsibilities of the filter:

  1. Rule-based javascript injection.
  2. Rule-based resource catch-and-forward to unblu.
  3. Rule-based request proxying (only required when implemented as a reverse proxy).

"Injection" filter requirements

If the filter is integrated into or implemented as a proxy or if the filter is integrated into the customer's web or application server, it must

  • have access to all http(s) requests and responses that are exchanged between the browser and the backend server.
  • be able to intercept requests and/or modify responses (headers and content).

If the filter is running in the end user's browser, it must

  • be included on all web pages intended to be co-browsable

"Resource forwarding" requirements

The filter must have access to all http(s) requests and responses that are exchanged between the browser and the backend server. It can be implemented as a reverse proxy or as a filter directly running on the customer's web or application server or the end user's browser. It must be able to intercept requests, forward response bodies to unblu and/or modify responses (headers and content). In addition it must be able to redirect certain requests (messages from the browser) to the collaboration server instead of letting the backend process them.

"Proxying" filter requirements

When implemented as a proxy, the filter must be able to

  • redirect certain requests (requests from the browser) to the collaboration server instead of letting the backend process them.

Configuration

In order to keep the configuration in one place, the filter needs to be configured dynamically from the collaboration server. Local configuration should be limited to things that cannot be provided by the server (such as locating the collaboration server) and to purely implementation-specific, technical configuration. When the filter starts up, it must read its configuration from the server. The configuration contains all information the filter needs in order to do its work. Because the collaboration server might be reconfigured without the filter being restarted, a special header containing a configuration version identifier is included in every response unblu sends to the filter. As soon as the version identifier changes, the filter must reload the configuration from the server.

Forward requests to unblu

The configuration defines a list of path prefixes (e.g. /unblu/) that identify requests that need to be forwarded to the collaboration server. All requests from the customer's browser starting with such a prefix must be forwarded to the server instead of being processed as normal requests to the backend web or application server.

Cache content

The filter must send the content of the response bodies (enriched by additional information) to the collaboration server so that it can be stored for (later) playback in the unblu player. The filter has to decide whether a response needs to be sent to the collaboration server according to rules defined in the configuration.

Code injection

The filter must be able to inject javascript and css code into the body of the html response. The content that needs to be injected and the rules defining whether a response needs injection or not are defined in the configuration.

Regex replacements (optional)

If the filter supports this optional feature, it must be able to perform regular expression replacements in textual http responses (html, css, javascript...). A list of replacements is included in the filter configuration.

String replacement

Parts of the injected code need to be dynamic (for instance the id of the current cache content). Therefore, the filter needs to perform simple string replacements in the javascript content before injecting it into the html body of the response. These string replacements are based on the environment variables that the filter provides (see "Environment variables").

Rule evaluation

The decision whether a page needs to be sent to the collaboration server or whether it needs code injection must be made by the filter itself. A round-trip to the collaboration server is not acceptable. Therefore, the filter needs to evaluate a set of rules (provided in the configuration) in order to decide. These rules need to be able to take the environment variables into account (see "Environment variables").

Environment variables

The filter needs to provide a predefined set of environment variables to be used in string replacement and rule evaluation. Some of these environment variables are static (such as the unbluPath, the path to the collaboration server); others are dynamic and need to be provided in a request scope (requestUri, cookie values, contentType...).

Non-functional requirements

The filter must smoothly continue to deliver backend requests even if the collaboration server is (temporarily) unavailable.

UML Overview

UML overview of the java implementation

Environment variables / String substitution

Filter implementations must provide environment variables and string substitution functionality. This functionality is used by the rule evaluation system and by the code injection.

Variable names must be case insensitive.

Scoping

Environment variables must be provided in three different scopes: filter, configuration and request. The filter scope stores globally valid variables such as the URL of the collaboration server. The configuration scope holds the variables supplied with the configuration loaded from the collaboration server. The request scope holds variables that are only valid for a single request (cookie values, response content type...). If a variable key is not found in the request scope, the lookup must fall back to the configuration scope; the configuration scope must in turn fall back to the filter scope.
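The scope chain can be pictured as a small linked structure. The following is a minimal sketch, not the reference implementation; the class name Environment and its methods are assumptions made for illustration only.

```python
class Environment:
    """One scope of environment variables with an optional parent scope.

    Hypothetical helper: the name and API are illustrative, not part of the spec.
    """

    def __init__(self, parent=None):
        self._parent = parent
        self._values = {}

    def set(self, key, value):
        # Keys are stored upper-cased so that lookups are case insensitive.
        self._values[key.upper()] = value

    def get(self, key, default=None):
        key = key.upper()
        if key in self._values:
            return self._values[key]
        if self._parent is not None:
            # Fall back to the enclosing scope.
            return self._parent.get(key, default)
        return default


# request scope -> configuration scope -> filter scope
filter_scope = Environment()
filter_scope.set("UNBLU_URL", "http://localhost:8080")
configuration_scope = Environment(parent=filter_scope)
request_scope = Environment(parent=configuration_scope)
request_scope.set("CONTENT_TYPE", "text/html")
```

A lookup such as request_scope.get("unblu_url") walks up the chain until the filter scope answers it, while the filter scope never sees request variables.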

String substitution

Filter implementations must be able to search for placeholders in strings and replace them with variable values from the environment. The expansion of placeholders with their value must be performed recursively as variable values can also hold placeholders.

Placeholders have the following format: ${PLACEHOLDER_NAME}. Only the letters 'a' to 'z' (in either case), - and _ are allowed in variable keys. The following regular expression can be used to find placeholders: \$\{[a-zA-Z_-]+\}.

Lookup of placeholders is case insensitive.
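Recursive expansion could look roughly like the sketch below. The function name expand, the max_depth guard against self-referencing values and the choice to leave unknown placeholders untouched are assumptions; the specification does not define the behavior in those cases.

```python
import re

# Letters, "-" and "_" between "${" and "}".
PLACEHOLDER = re.compile(r"\$\{([a-zA-Z_-]+)\}")


def expand(text, variables, max_depth=10):
    """Replace ${NAME} placeholders, repeating while values contain placeholders."""
    # Upper-case the keys once so that the lookup is case insensitive.
    lookup = {key.upper(): value for key, value in variables.items()}
    for _ in range(max_depth):
        expanded = PLACEHOLDER.sub(
            # Unknown names are left as they are (assumption).
            lambda match: str(lookup.get(match.group(1).upper(), match.group(0))),
            text,
        )
        if expanded == text:
            break
        text = expanded
    return text
```

For example, with UNBLU_PUBLIC_PATH set to /unblu and SCRIPT set to ${unblu_public_path}/js/bar.js, expanding src=${Script} takes two passes before it stops changing.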

Sources of environment variables

Environment variables are defined from several sources:

  • local filter configuration (things the filter must know on its own like the URL of the collaboration server)
  • collaboration server (the server can send environment variables in its responses)
  • http request headers
  • http response headers

The following table lists all environment variables that filter implementations must support:

| Variable name | Scope | Source | Remarks |
| --- | --- | --- | --- |
| UNBLU_PATH | Filter | Local filter configuration | Deprecated, use UNBLU_PUBLIC_PATH instead |
| UNBLU_PUBLIC_PATH | Filter | Local filter configuration | The path prefix for requests that are redirected to the collaboration server (usually /unblu) |
| UNBLU_SYSTEM_PATH | Filter | Local filter configuration | Path prefix for requests from the filter to the collaboration server (for the filter-server communication) |
| UNBLU_URL | Filter | Local filter configuration | The URL of the collaboration server (e.g. http://localhost:8080) |
| ORIGINAL_URL | Request | Request | The URL of the current request |
| UNBLU_START_TIME | Filter | Response headers | Initially empty. As soon as a response with an x-unblu-start-time header has been processed, UNBLU_START_TIME must be set to the value of the header |
| FILTER_START_TIME | Filter | Filter implementation | Set during filter start up. Timestamp of the start time of the filter (milliseconds since 1.1.1970 0:00 UTC) |
| START_TIME | Filter | Filter implementation | MAX(FILTER_START_TIME, UNBLU_START_TIME) |
| ORIGINAL_PATH | Request | Request | The path component of the URL of the current request |
| CONTENT_TYPE | Request | Response headers | The content type of the response. If the original header contains a character set, it must be stripped away |
| CONTENT_LENGTH | Request | Response headers | The content length of the response, if known |
| CHARACTER_SET | Request | Response headers | The character set of the response, if available |
| DEFAULT_CHARACTER_SET | Filter | Local filter configuration | The default character set of the filter |
| COOKIE_<cookieName> | Request | Request headers | For every cookie the client sends to the filter, an environment variable with the pattern COOKIE_<cookieName> must be generated |
| REQUEST_HEADER_<headerName> | Request | Request headers | For every request header the client sends to the filter, an environment variable with the pattern REQUEST_HEADER_<headerName> must be generated |
| RESPONSE_HEADER_<headerName> | Request | Response headers | For every response header the backend sends to the filter, an environment variable with the pattern RESPONSE_HEADER_<headerName> must be generated |
| (any variable sent by the unblu server) | Configuration | Filter configuration from the unblu server | Every time the filter loads its configuration, it must put all supplied environment variables into its configuration scope |
| (any variable sent by the unblu server) | Request | CacheContentId response from the server | Every time the filter sends an "add-to-cache" message to the server, it must put all supplied environment variables into the request scope |
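As an illustration of the request-scoped rows, the sketch below derives the variables from a request URL and two header dictionaries. The function name build_request_variables is hypothetical, and using header and cookie names verbatim inside the variable names is an assumption, since the table does not say how they are normalized.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def build_request_variables(url, request_headers, response_headers):
    """Return the request-scope variables for one request/response pair."""
    variables = {"ORIGINAL_URL": url, "ORIGINAL_PATH": urlsplit(url).path}
    for name, value in request_headers.items():
        variables["REQUEST_HEADER_" + name] = value
        if name.lower() == "cookie":
            # One COOKIE_<cookieName> variable per cookie sent by the client.
            for cookie_name, morsel in SimpleCookie(value).items():
                variables["COOKIE_" + cookie_name] = morsel.value
    for name, value in response_headers.items():
        variables["RESPONSE_HEADER_" + name] = value
        lowered = name.lower()
        if lowered == "content-length":
            variables["CONTENT_LENGTH"] = value
        elif lowered == "content-type":
            # Split "text/html; charset=UTF-8" into media type and parameters.
            media_type, _, parameters = value.partition(";")
            variables["CONTENT_TYPE"] = media_type.strip()
            for parameter in parameters.split(";"):
                key, _, charset = parameter.partition("=")
                if key.strip().lower() == "charset":
                    variables["CHARACTER_SET"] = charset.strip().strip('"')
    return variables
```

The resulting dictionary would then be copied into the request scope before any rule is evaluated.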

Rule Evaluation Support

Filter implementations must provide rule evaluation support. Rules are boolean expressions that consist of comparisons (equals, greater than, startsWith...), conjunctive ("and") and disjunctive ("or") combinations, and negations ("not"). Rule evaluation is used where the filter needs to decide whether a response must be sent to the collaboration server for caching and whether a code injection must be performed.

Rule evaluation always takes place within an environment scope (filter, configuration or request). The evaluation of every rule within an environment results in a boolean value.

Rule Types

Comparison

type-property: comparison

Comparison rules consist of the following parts:

  • leftSide: the left side of the comparison
  • operator: the operator of the comparison
  • rightSide: the right side of the comparison
  • caseSensitive: whether or not the comparison should consider case

The operator property can have one of the following values:

| Operator | True if |
| --- | --- |
| equals | left and right side are equal (string comparison) |
| startsWith | left side string value starts with right side string value |
| endsWith | left side string value ends with right side string value |
| contains | left side string value contains right side string value |
| = | left side is equal to right side (numeric comparison) |
| > | left side is bigger than right side |
| < | left side is smaller than right side |
| >= | left side is bigger than or equal to right side |
| <= | left side is smaller than or equal to right side |

For example, a comparison rule could compare the response content type of a request against a string literal:

  • leftSide: ${CONTENT_TYPE}
  • operator: equals
  • rightSide: text/html
  • caseSensitive: false

And

type-property: and

And rules consist of a list of compound rules ("rules" property).

And rules evaluate to true if all components evaluate to true.

Or

type-property: or

Or rules consist of a list of compound rules ("rules" property).

Or rules evaluate to true if at least one of the components evaluates to true.

Not

type-property: not

Not rules consist of a single rule ("rule" property).

Not rules evaluate to true if the contained rule evaluates to false.
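The four rule types lend themselves to a small recursive evaluator. The sketch below works on rules given as dictionaries in the JSON form described under "Data Types"; the function names, the single-pass placeholder expansion, the assumption that variable keys are upper-case, and treating non-numeric operands of numeric operators as "false" are all assumptions rather than specified behavior.

```python
import operator
import re

STRING_OPERATORS = {
    "equals": operator.eq,
    "startsWith": str.startswith,
    "endsWith": str.endswith,
    "contains": operator.contains,  # contains(left, right) means: right in left
}
NUMERIC_OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def expand(text, variables):
    # Simplified single-pass expansion; unknown names become "" (assumption).
    return re.sub(r"\$\{([a-zA-Z_-]+)\}",
                  lambda m: variables.get(m.group(1).upper(), ""), text)


def compare(rule, variables):
    left = expand(rule["leftSide"], variables)
    right = expand(rule["rightSide"], variables)
    name = rule["operator"]
    if name in NUMERIC_OPERATORS:
        try:
            return NUMERIC_OPERATORS[name](float(left), float(right))
        except ValueError:
            return False  # assumption: non-numeric operands never match
    if not rule.get("caseSensitive", True):
        left, right = left.lower(), right.lower()
    return STRING_OPERATORS[name](left, right)


def evaluate(rule, variables):
    """Evaluate a rule dictionary against a dict of upper-case variable names."""
    kind = rule["class"]
    if kind == "ComparisonRule":
        return compare(rule, variables)
    if kind == "AndRule":
        return all(evaluate(child, variables) for child in rule["rules"])
    if kind == "OrRule":
        return any(evaluate(child, variables) for child in rule["rules"])
    if kind == "NotRule":
        return not evaluate(rule["rule"], variables)
    raise ValueError("unknown rule class: " + kind)
```

Because evaluation only reads the variables passed in, the same function can be used in any scope, and no round-trip to the collaboration server is needed.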

UML

UML of the java implementation

Conditional Code Injection

The filter configuration holds a list of ConditionalCodeInjections. A conditional code injection consists of a Rule (condition) and a list of CodeInjections that must be performed if the condition evaluates to true (in the request scope environment).

Every CodeInjection consists of a reference (place in the html code where the injection must be performed), a type (internal or external javascript, internal or external style sheet) and a value holding the string value of the code injection. The value can hold placeholders so it must be expanded in the request environment prior to injection.

Reference

The code injection reference represents the spot in the html code where the injection string must be placed.

BEFORE_BODY_CLOSE

Before body close means that the string must be inserted just in front of the closing body tag.

<html>
<head>
</head>
<body>
whatever content
INJECTION GOES HERE</body>
</html>

AFTER_HEAD_START

After head start means that the string must be inserted just after the opening head tag.

<html>
<head>
INJECTION GOES HERE
<title></title>
</head>
<body>
whatever content
</body>
</html>

BEFORE_HEAD_CLOSE

Before head close means that the string must be inserted just before the closing head tag.

<html>
<head>
<title></title>
INJECTION GOES HERE
</head>
<body>
whatever content
</body>
</html>

AFTER_LAST_META

After last meta means that the string must be inserted just after the last meta tag that occurs between starting head and ending head tag.

<html>
<head>
<title></title>
<meta>
<meta>
<anyothertag>
<meta>
INJECTION GOES HERE
<anyothertag>
</head>
<body>
whatever content
</body>
</html>

Type

The code injection type defines how the injection string must be modified before it is inserted in the html.

INTERNAL_JAVASCRIPT

Expanded value:

var unbluConfig = {sessionCookieName: "x-unblu-sid", contentId: "iidzzllei889088d88kke8dujd"}

Injected string:

<script type="text/javascript" charset="UTF-8">
var unbluConfig = {sessionCookieName: "x-unblu-sid", contentId: "iidzzllei889088d88kke8dujd"}
</script>

EXTERNAL_JAVASCRIPT

Expanded value:

/unblu/javascript/consultant.js

Injected string:

<script type="text/javascript" charset="UTF-8" src="/unblu/javascript/consultant.js"></script>

INTERNAL_STYLE_SHEET

Expanded value:

.unbluSupportLink { background-color: #FF0000 }

Injected string:

<style type="text/css">
.unbluSupportLink { background-color: #FF0000 }
</style>

EXTERNAL_STYLE_SHEET

Expanded value:

/unblu/css/consultant.css

Injected string:

<link rel="stylesheet" href="/unblu/css/consultant.css" type="text/css" media="all"></link>

HTML_CONTENT

Expanded value:

<div id="myDiv"></div>

Injected string:

<div id="myDiv"></div>
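The reference and type handling described above can be sketched as two small functions: one wraps the expanded value according to its type, the other finds the insertion point. This is a simplified illustration that locates tags with regular expressions instead of a real HTML parser; the function names and the decision to return the page unchanged when the reference cannot be found are assumptions.

```python
import re

# Wrappers taken from the "Injected string" examples of each type.
WRAPPERS = {
    "INTERNAL_JAVASCRIPT": '<script type="text/javascript" charset="UTF-8">\n{}\n</script>',
    "EXTERNAL_JAVASCRIPT": '<script type="text/javascript" charset="UTF-8" src="{}"></script>',
    "INTERNAL_STYLE_SHEET": '<style type="text/css">\n{}\n</style>',
    "EXTERNAL_STYLE_SHEET": '<link rel="stylesheet" href="{}" type="text/css" media="all"></link>',
    "HTML_CONTENT": "{}",
}


def render(injection_type, expanded_value):
    """Turn an already expanded value into the string to inject."""
    return WRAPPERS[injection_type].format(expanded_value)


def find_position(html, reference):
    """Return the index at which to insert, or None if the reference is absent."""
    flags = re.IGNORECASE
    if reference == "AFTER_HEAD_START":
        match = re.search(r"<head(\s[^>]*)?>", html, flags)
        return match.end() if match else None
    if reference == "BEFORE_HEAD_CLOSE":
        match = re.search(r"</head\s*>", html, flags)
        return match.start() if match else None
    if reference == "BEFORE_BODY_CLOSE":
        matches = list(re.finditer(r"</body\s*>", html, flags))
        return matches[-1].start() if matches else None
    if reference == "AFTER_LAST_META":
        head = re.search(r"<head(\s[^>]*)?>(.*?)</head\s*>", html, flags | re.DOTALL)
        if head is None:
            return None
        metas = list(re.finditer(r"<meta\b[^>]*>", head.group(2), flags))
        return head.start(2) + metas[-1].end() if metas else None
    raise ValueError("unknown reference: " + reference)


def inject(html, reference, snippet):
    position = find_position(html, reference)
    if position is None:
        return html  # assumption: leave the page untouched
    return html[:position] + snippet + html[position:]
```

A caller would expand the value in the request scope first, then call render and inject for each CodeInjection whose condition holds.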

Conditional Regex Replacements (optional)

The filter configuration holds a list of ConditionalRegexReplacement instances. Every ConditionalRegexReplacement holds a rule (condition property). If this condition evaluates to true in the request scope environment the regular expression-based string replacement must be performed on the response body. All matches of the given pattern must be replaced with the given replacement string. The replacement must be expanded in the request scope environment as it can contain placeholders.
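The order of operations (check the condition, expand the replacement, replace all matches) can be sketched as follows. The function name is hypothetical, rule evaluation and placeholder expansion are passed in as callables to keep the example self-contained, and the use of Python's re syntax is an assumption: the specification does not state which regular expression dialect the patterns use.

```python
import re


def apply_regex_replacements(body, replacements, variables, evaluate, expand):
    """Apply every replacement whose condition holds to the response body.

    evaluate(rule, variables) -> bool and expand(text, variables) -> str are
    supplied by the caller (hypothetical helpers for this sketch).
    """
    for item in replacements:
        if evaluate(item["condition"], variables):
            # The replacement may contain placeholders, so expand it first.
            replacement = expand(item["replacement"], variables)
            # re.sub replaces all matches of the pattern.
            body = re.sub(item["pattern"], replacement, body)
    return body
```

Replacements are applied in list order here, so a later pattern sees the output of an earlier one; whether that ordering matters depends on the configured patterns.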

Communication with collaboration server

Filter implementations need to communicate with the collaboration server in the following situations:

  • forward (proxy) requests that start with ${UNBLU_PUBLIC_PATH} to the collaboration server
  • read configuration
  • send HTTP responses to collaboration server for caching

In the first case the filter simply proxies incoming requests to the server by translating the request path and forwarding query string, body and parts of the headers.

For the other cases (read configuration, cache contents), the server provides a simple HTTP interface.

Proxy requests to unblu

The filter has to forward all requests that have a path that starts with ${UNBLU_PUBLIC_PATH} to the collaboration server.

For instance, if the filter is reachable at http://foo.com/, the unblu server URL is http://localhost:8080/unblu/ and ${UNBLU_PUBLIC_PATH} is /unblu, then a request to http://foo.com/unblu/js/bar.js has to be forwarded to http://localhost:8080/unblu/js/bar.js. The request to the collaboration server has to include the original query string and the body of the request (if it is a POST request). When forwarding a request to the server, generally all request headers (as sent from the browser) must also be set in the request to the server, and all response headers (as sent from the server) must also be set in the response to the browser.

In most cases, forwarding requests to unblu is realized using a reverse proxy that is already in place (e.g., Apache mod_proxy or similar). The proxy must at least implement the "GET" and "POST" HTTP methods.
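Where no ready-made reverse proxy is available, the path translation from the example above could be sketched like this. The function name unblu_target is hypothetical, and matching the prefix only on a path segment boundary (so that /unblushing is not forwarded) is an assumption; the specification only says "starts with".

```python
from urllib.parse import urlsplit


def unblu_target(request_url, public_path, unblu_url):
    """Return the collaboration server URL for a request, or None if it is not forwarded."""
    parts = urlsplit(request_url)
    prefix = public_path.rstrip("/")
    if parts.path != prefix and not parts.path.startswith(prefix + "/"):
        return None
    # Append whatever follows the public path to the server URL.
    target = unblu_url.rstrip("/") + parts.path[len(prefix):]
    if parts.query:
        # The original query string must be passed on.
        target += "?" + parts.query
    return target
```

Headers and the request body still have to be copied separately when the forwarded request is actually sent.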

HTTP interface

When the filter reads configuration from the collaboration server or sends a response to the server for caching, it uses a simple HTTP interface.

The HTTP interface is reachable through the URL ${UNBLU_SYSTEM_URL}/filterBackend.

Requests to this interface consist of a set of key/value pairs. These pairs can be sent in the form of proprietary headers (x-unblu-<name>:<value>) or as request parameters in the query string (e.g., http://localhost:8080/unblu/rest/filterBackend?x-unblu-action=add-to-cache...). In addition to proprietary headers or query string parameters, binary data (the body of responses to be cached) is transmitted in the body of a POST request. If data is transmitted, content-type has to be set to application/octet-stream and the content-length header must be set.

Responses from the HTTP interface always consist of a JSON string holding either a response (depending on the action) or an error message.

In addition to the response body, the HTTP interface responses include the following proprietary headers:

  • x-unblu-configuration-version: the currently valid configuration version
  • x-unblu-session-invalid: true if the session is not valid (anymore)

The following parameters must be set in every request to the HTTP interface:

  • interface-version: the version of the interface the filter implements (for now this is always 1)
  • action: the action that the filter wants to perform
  • configuration-version: the version of the current configuration the filter uses, if available, or an empty string if no configuration has been loaded so far

Depending on the "action" parameter additional parameters are required:

Action "ping"

Type of response: empty object

Action "read-configuration"

This action reads the configuration for the filter and returns it as a JSON object.

Type of response: FilterConfiguration

Action "add-to-cache"

This action adds a response to the cache.

Type of response: CacheContentId

The following additional parameters are required:

  • cache-date: the date the response was received
  • content-type: the content type of the response body (without the character set part)
  • character-set: the character set the response body is encoded with, if available
  • http-status-code: the status code of the response
  • original-url: the URL of the request as it was sent to the filter
  • file-name: a human-readable file name (used for downloads). In generic implementations, the "filename" fragment of the content-disposition header can be used (if present)

Note: You need a valid session id (received after logging in), which has to be transmitted in the cookies.

The body of the response must be sent to unblu in the body of the POST request.
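Putting the parameters together, an "add-to-cache" request could be assembled as in the sketch below, which builds the request object without sending it. The function name and its arguments are hypothetical; the parameters are sent as x-unblu-<name> headers, and the format of cache-date is passed through unchanged because the specification does not define it.

```python
import urllib.request


def build_add_to_cache_request(backend_url, body, configuration_version,
                               session_cookie, **parameters):
    """Build (but do not send) a POST request for the "add-to-cache" action.

    Keyword arguments such as cache_date or original_url become
    x-unblu-cache-date, x-unblu-original-url and so on.
    """
    headers = {
        "x-unblu-interface-version": "1",
        "x-unblu-action": "add-to-cache",
        "x-unblu-configuration-version": configuration_version,
        # The HTTP content type of the transfer itself, not of the cached response.
        "Content-Type": "application/octet-stream",
        "Content-Length": str(len(body)),
        "Cookie": session_cookie,
    }
    for name, value in parameters.items():
        headers["x-unblu-" + name.replace("_", "-")] = value
    return urllib.request.Request(backend_url, data=body, headers=headers, method="POST")


request = build_add_to_cache_request(
    "http://localhost:8080/unblu/rest/filterBackend",  # URL taken from the example above
    b"<html></html>",
    configuration_version="7",                # hypothetical version string
    session_cookie="x-unblu-sid=abc",         # hypothetical session cookie
    cache_date="1315000000000",               # format is an assumption
    content_type="text/html",
    character_set="UTF-8",
    http_status_code="200",
    original_url="http://foo.com/index.html",
    file_name="index.html",
)
```

Note that x-unblu-content-type describes the cached response, while the Content-Type header of the request itself stays application/octet-stream.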

Data Types

All returned JSON data types are based on a base type with the following structure.

{
    "class": <dataTypeName>,
    "environment": {
        "key1": "value1",
        "key2": "value2",
        "keyn": "valuen"
    }
}

"class" holds the name of the data type; "environment" holds a set of key/value pairs. The elements of the environment must be added to the environment of the scope the request is executed in (read-configuration: configuration scope, add-to-cache: request scope).

FilterBackendError

Returned whenever something went wrong.

{
    "class": "FilterBackendError",
    "environment": {
        "key1": "value1",
        "key2": "value2",
        ...
    },
    "code": <int error code>,
    "message": <error message>
}

CacheContentId

In responses to "add-to-cache"

{
    "class": "CacheContentId",
    "environment": {
        "key1": "value1",
        "key2": "value2",
        ...
    },
    "contentId": <string id of the cache content>
}

FilterConfiguration

In response to "read-configuration"

{
    "class": "FilterConfiguration",
    "environment": {
        "key1": "value1",
        "key2": "value2",
        ...
    },
    "version": <string version of the configuration>,
    "cacheCondition": <Rule that must be evaluated in order to decide if a response must be added to the cache>,
    "codeInjections": <array of ConditionalCodeInjection objects>
}

ConditionalRegexReplacement

A regular expression based string replacement that must be performed if the condition evaluates to true in the request scope

{
    "class": "ConditionalRegexReplacement",
    "condition": <Rule, if it evaluates to true, the replacement must be performed>,
    "pattern": <the regex pattern>,
    "replacement": <the string that must be inserted as a replacement for all matches>
}

ConditionalCodeInjection

A set of code injections that must be performed if a given condition (rule) evaluates to true.

{
    "class": "ConditionalCodeInjection",
    "condition": <Rule that decides if the injection takes place for the current request>,
    "injections": <array of CodeInjection objects>
}

CodeInjection

A code injection.

{
    "class": "CodeInjection",
    "reference": <BEFORE_BODY_CLOSE|AFTER_HEAD_START|BEFORE_HEAD_CLOSE|AFTER_LAST_META>, // place in the HTML where the injection must go
    "type": <INTERNAL_JAVASCRIPT|EXTERNAL_JAVASCRIPT|INTERNAL_STYLE_SHEET|EXTERNAL_STYLE_SHEET|HTML_CONTENT>, // type of the injection
    "value": <string value of the injection, must be expanded within the request scope>
}

ComparisonRule

{
    "class": "ComparisonRule",
    "leftSide": <leftSide string value>,
    "rightSide": <rightSide string value>,
    "operator": <equals|startsWith|endsWith|contains|=|<|<=|>|>=>,
    "caseSensitive": <boolean>
}

AndRule

{
    "class": "AndRule",
    "rules": <array of Rules>
}

OrRule

{
    "class": "OrRule",
    "rules": <array of Rules>
}

NotRule

{
    "class": "NotRule",
    "rule": <Rule>
}

Request Processing

This section describes the actual work a filter has to do. A filter implementation must intercept all requests (whether as a module in the HTTP server or as a proxy). For every request the following steps need to be performed (pseudo code).

if (path startsWith ${UNBLU_PUBLIC_PATH}) {
    proxy request to unblu
} else {
    let backend process request
    build request scope environment variables
    perform regex replacements
    if (cacheCondition evaluates to true) {
        send response to unblu HTTP filter interface "add-to-cache" action
        add returned environment variables to request scope
        set ${UNBLU_USER_COOKIE_NAME} "set-cookie" header if ${UNBLU_USER_ID} is set
        set ${UNBLU_SESSION_COOKIE_NAME} "set-cookie" header if ${UNBLU_USER_SESSION_ID} is set
    }
    perform code injections if their conditions evaluate to true in request scope environment
    if (a code injection took place) {
        clear "etag" header in response
        clear "last-modified" header in response
    }
    send (potentially modified) response body to the browser
}

If the collaboration server is (temporarily) unavailable, the filter must not disturb the normal operation of the backend HTTP server. In order to achieve this it might make sense to remember the state of the server within the filter. Doing this you can avoid running into timeouts at every request.

This could be done as follows:

As soon as the filter fails to call the collaboration server, it starts a background thread that waits for the server to come up again. Until the server is up again, the filter does not attempt to send requests to the server. As soon as unblu is up again, the filter resumes sending requests to it. In order to check if the server is up, the "ping" action of the HTTP interface can be used.
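A variation of this idea that avoids a dedicated background thread is to remember the time of the last failure and to retry the "ping" only after a pause. The sketch below shows that swapped-in approach, not the thread-based one described above; the class name, the retry_interval default and the injected ping and clock callables are assumptions.

```python
import time


class ServerAvailability:
    """Remembers whether the collaboration server is considered reachable."""

    def __init__(self, ping, retry_interval=5.0, clock=time.monotonic):
        self._ping = ping                      # callable: True if the "ping" action succeeds
        self._retry_interval = retry_interval  # seconds between retries (assumed default)
        self._clock = clock
        self._down_since = None

    def report_failure(self):
        # Called when any request to the collaboration server fails.
        self._down_since = self._clock()

    def is_available(self):
        if self._down_since is None:
            return True
        if self._clock() - self._down_since < self._retry_interval:
            # Still inside the pause: skip unblu and serve the backend response as is.
            return False
        if self._ping():
            self._down_since = None
            return True
        # Ping failed: start a new pause.
        self._down_since = self._clock()
        return False
```

The filter would call is_available() before each "add-to-cache" call and report_failure() whenever a call times out, so that at most one request per interval pays for a timeout.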

Lifecycle

This section describes the life cycle of a filter.

Startup

When the filter starts up, it must load its initial configuration from the server (by sending a "read-configuration" request to the filter HTTP interface). The filter must store the version of the configuration for later comparison with the server's actual configuration version.

Reconfiguration

Every time the filter sends a request to the collaboration server (whether it calls the HTTP interface or forwards a request to the server), it must compare the value of the x-unblu-configuration-version header with the version of the configuration it got when it last loaded the configuration from the server. If the versions do not match, the filter must reload the configuration and apply it.
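The version check can be kept in one small holder object, as in the following sketch. The class and method names are hypothetical, load_configuration stands for whatever performs the "read-configuration" call and returns the parsed FilterConfiguration, and the response headers are assumed to be available as a dictionary with lower-case names.

```python
class ConfigurationHolder:
    """Keeps the current configuration and reloads it when the server version changes."""

    def __init__(self, load_configuration):
        self._load = load_configuration  # callable performing "read-configuration"
        self.configuration = None
        self.version = ""                # empty string until a configuration was loaded

    def reload(self):
        self.configuration = self._load()
        self.version = self.configuration["version"]

    def observe_response_headers(self, headers):
        # Call this for every response received from the collaboration server.
        server_version = headers.get("x-unblu-configuration-version")
        if server_version is not None and server_version != self.version:
            self.reload()
```

Calling reload() once at startup covers the "Startup" step; afterwards observe_response_headers() is the only hook needed for reconfiguration.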

Development

When developing a filter implementation a local tomcat running a collaboration server can be used as the filter backend.

To deploy the collaboration server to a local Tomcat, it makes sense to rename the original product.<productId><version><qualifier>.war (e.g. product.com.unblu.review-2.2.0-2011090191106.war) to ROOT.war. After renaming it, simply copy it into the webapps directory of the Tomcat installation. If autodeploy is enabled in Tomcat, the collaboration server will be available at <tomcatSchema>://<tomcatHost>:<tomcatport>/unblu (e.g. http://localhost:8080/unblu). The UNBLU_URL is http://localhost:8080/unblu/ in this case.
