Unblu 6 (latest)

Purpose

Unblu and some implementation partners provide filters for many integration scenarios (see this page for further information). Sometimes, however, an adapted integration may be required. This custom filter specification describes how a filter can be implemented.

Implementation Options

Depending on the integration method, the filter can be implemented in one of the following ways:

  1. Integrated into and running on the web server.

  2. Integrated into and running on the application server.

  3. Integrated into and running within a (reverse) proxy.

  4. Implemented as a (reverse) proxy.

  5. Running in the end user’s browser.

This document will not explain all integrations and implementation variations in detail. Instead it will focus on providing all necessary information to allow developers to implement a fitting solution on their own.

Requirements overview

The following sections split the requirements according to the responsibilities of the filter:

  1. Rule-based JavaScript injection.

  2. Rule-based resource catch-and-forward to Unblu.

  3. Rule-based request proxying (only required when implemented as a reverse proxy).

"Injection" filter requirements

If the filter is integrated into or implemented as a proxy or if the filter is integrated into the customer’s web or application server, it must

  • have access to all HTTP(S) requests and responses that are exchanged between the browser and the backend server.

  • be able to intercept requests and/or modify responses (headers and content).

If the filter is running in the end user’s browser, it must

  • be included on all web pages intended to be co-browsable

"Resource forwarding" requirements

The filter must have access to all HTTP(S) requests and responses that are exchanged between the browser and the backend server. It can be implemented as a reverse proxy or as a filter directly running on the customer’s web or application server or the end user’s browser. It must be able to intercept requests, forward response bodies to Unblu and/or modify responses (headers and content). In addition it must be able to redirect certain requests (messages from the browser) to the collaboration server instead of letting the backend process them.

"Proxying" filter requirements

When implemented as a proxy, the filter must be able to

  • redirect certain requests (requests from the browser) to the collaboration server instead of letting the backend process them.

Configuration

In order to keep the configuration in one place, the filter needs to be configured dynamically from the collaboration server. Local configuration should be limited to things that cannot be provided by the server (such as locating the collaboration server) and to purely implementation-specific, technical configuration. When the filter starts up, it must read its configuration from the server. The configuration contains all information the filter needs in order to do its work. Because the collaboration server might be reconfigured without the filter being restarted, a special header containing a configuration version identifier is included in every response Unblu sends to the filter. As soon as the version identifier changes, the filter must reload the configuration from the server.

Forward requests to Unblu

The configuration defines a list of path prefixes (e.g. /unblu/) that identify requests that need to be forwarded to the collaboration server. All requests from the customer’s browser whose path starts with one of these prefixes must be forwarded to the server instead of being processed as normal requests by the backend web or application server.

Cache content

The filter must send the content of the response bodies (enriched by additional information) to the collaboration server so that it can be stored for (later) playback in the Unblu player. The filter has to decide whether a response needs to be sent to the collaboration server according to rules defined in the configuration.

Code injection

The filter must be able to inject JavaScript and CSS code into the body of the HTML response. The content that needs to be injected and the rules defining whether a response needs injection or not are defined in the configuration.

Regex replacements (optional)

The filter must be able to perform regular expression replacements in textual HTTP responses (HTML, CSS, JavaScript…​). A list of replacements is included in the filter configuration.

String replacement

Parts of the injected code need to be dynamic (for instance the id of the current cache content). Therefore, the filter needs to perform simple string replacements in the JavaScript content before injecting it into the HTML body of the response. These string replacements are based on the environment variables that the filter provides (see "Environment variables").

Rule evaluation

The decision whether a page needs to be sent to the collaboration server or whether it needs code injection must be taken by the filter itself. A round-trip to the collaboration server is not acceptable. Therefore, the filter needs to evaluate a set of rules (provided in the configuration) in order to decide. These rules need to be able to take the environment variables into account (see "Environment variables").

Environment variables

The filter needs to provide a predefined set of environment variables to be used in string replacement and rule evaluation. Some of these environment variables are static (such as unbluPublicPath, the path to the collaboration server), others are dynamic and need to be provided in a request scope (requestUri, cookie values, contentType…).

Non functional requirements

The filter must smoothly continue to deliver backend requests even if the collaboration server is (temporarily) unavailable.

UML Overview

[Figure: UML overview of the Java implementation]

Environment variables / String substitution

Filter implementations must provide environment variables and string substitution functionality. This functionality is used by the rule evaluation system and by the code injection.

Variable names must be case insensitive.

Scoping

Environment variables must be provided in three different scopes: filter, configuration and request. The filter scope stores globally valid variables such as the URL of the collaboration server. The configuration scope holds the variables supplied with the filter configuration. The request scope holds variables that are only valid for a single request (cookie values, response content type…). If a variable key is not found in the request scope, the lookup must fall back to the configuration scope; if it is not found there either, it must fall back to the filter scope.
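The fallback chain between the scopes can be sketched as a small class. This is only an illustration in Python, not the Java implementation referenced in this document; the class name `Environment` and its methods are hypothetical.

```python
from typing import Dict, Optional


class Environment:
    """One scope of environment variables with fallback to a parent scope."""

    def __init__(self, parent: Optional["Environment"] = None) -> None:
        self._parent = parent
        self._values: Dict[str, str] = {}

    def set(self, name: str, value: str) -> None:
        # Variable names are case insensitive, so keys are normalised.
        self._values[name.upper()] = value

    def get(self, name: str) -> Optional[str]:
        key = name.upper()
        if key in self._values:
            return self._values[key]
        # Not found in this scope: fall back to the enclosing scope, if any.
        if self._parent is not None:
            return self._parent.get(key)
        return None


# Fallback order: request -> configuration -> filter
filter_scope = Environment()
configuration_scope = Environment(parent=filter_scope)
request_scope = Environment(parent=configuration_scope)

filter_scope.set("UNBLU_URL", "http://localhost:8080")
request_scope.set("CONTENT_TYPE", "text/html")
```

A new request scope would be created for every request, while the filter and configuration scopes live as long as the filter or the loaded configuration.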

String substitution

Filter implementations must be able to search for placeholders in strings and replace them with variable values from the environment. The expansion of placeholders with their value must be performed recursively as variable values can also hold placeholders.

Placeholders have the following format: ${PLACEHOLDER_NAME}. Only the letters 'a' to 'z' (in either case), - and _ are allowed in variable keys. The following regular expression can be used to find placeholders: \$\{[a-zA-Z_-]+\}.

Lookup of placeholders is case insensitive.
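As an illustration, recursive placeholder expansion could look like the following Python sketch. The function name `expand`, the `max_depth` guard and the decision to leave unknown placeholders untouched are assumptions of this example; the specification does not prescribe them.

```python
import re
from typing import Mapping

# Placeholder syntax from the text above: ${NAME} with letters, '-' and '_'.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_-]+)\}")


def expand(text: str, env: Mapping[str, str], max_depth: int = 10) -> str:
    """Recursively replace ${NAME} placeholders with values from env.

    Keys in env are expected to be upper case (lookup is case insensitive).
    Unknown placeholders are left untouched and max_depth guards against
    self-referencing values; both choices are assumptions of this sketch.
    """
    def replace(match: "re.Match[str]") -> str:
        return env.get(match.group(1).upper(), match.group(0))

    for _ in range(max_depth):
        expanded = PLACEHOLDER.sub(replace, text)
        if expanded == text:
            break  # nothing left to expand
        text = expanded
    return text
```

Repeating the substitution until the string stops changing handles values that themselves contain placeholders, for example a script path defined in terms of ${UNBLU_PUBLIC_PATH}.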

Sources of environment variables

Environment variables are defined from several sources:

  • local filter configuration (things the filter must know on its own like the URL of the collaboration server)

  • collaboration server (the server can send environment variables in its responses)

  • HTTP request headers

  • HTTP response headers

The following table lists all environment variables that filter implementations must support:

| Variable name | Scope | Source | Remarks |
| :-- | :-- | :-- | :-- |
| UNBLU_URL | Filter | Local filter configuration | The URL of the collaboration server (e.g. http://localhost:8080) |
| UNBLU_PUBLIC_PATH | Filter | Local filter configuration | The path prefix for requests that are redirected to the collaboration server (usually /unblu) |
| UNBLU_SYSTEM_PATH | Filter | Local filter configuration | Path prefix for requests from the filter to the collaboration server (for the filter-server communication) |
| UNBLU_APIKEY | Filter | Local filter configuration | The APIKEY to be used when communicating with the collaboration server |
| ORIGINAL_URL | Request | Request | The URL of the current request |
| UNBLU_START_TIME | Filter | Response headers | Initially empty. As soon as a response with an x-unblu-start-time header has been processed, UNBLU_START_TIME must be set to the value of the header |
| FILTER_START_TIME | Filter | Filter implementation | Set during filter start up. Timestamp of the start time of the filter (milliseconds since 1.1.1970 0:00 UTC) |
| START_TIME | Filter | Filter implementation | MAX(FILTER_START_TIME, UNBLU_START_TIME) |
| ORIGINAL_PATH | Request | Request | The PATH component of the URL of the current request |
| CONTENT_TYPE | Request | Response headers | The content type of the response. If the original header contains a character set, it must be stripped away |
| CONTENT_LENGTH | Request | Response headers | The content length of the response, if known |
| CHARACTER_SET | Request | Response headers | The character set of the response, if available |
| DEFAULT_CHARACTER_SET | Filter | Local filter configuration | The default character set of the filter |
| COOKIE_<cookieName> | Request | Request headers | For every cookie the client sends to the filter, an environment variable with the pattern COOKIE_<cookieName> must be generated |
| REQUEST_HEADER_<headerName> | Request | Request headers | For every request header the client sends to the filter, an environment variable with the pattern REQUEST_HEADER_<headerName> must be generated |
| RESPONSE_HEADER_<headerName> | Request | Response headers | For every response header the backend sends to the filter, an environment variable with the pattern RESPONSE_HEADER_<headerName> must be generated |
| (whatever is sent by the Unblu server) | Configuration | Filter configuration from Unblu server | Every time the filter loads its configuration, it must put all supplied environment variables into its configuration scope |
| (whatever is sent by the Unblu server) | Request | ContentId response from the server | Every time the filter sends a "cacheContent" message to the server, it must put all supplied environment variables into the request scope |
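The request-scope variables that are derived from the request and from the backend response could be built roughly as follows. This is a Python sketch under assumptions: the function name `build_request_variables` is hypothetical, and the way header and cookie names are turned into variable names (upper-cased, hyphens kept) is a choice made for this example only.

```python
from http.cookies import SimpleCookie
from typing import Dict, Mapping
from urllib.parse import urlsplit


def build_request_variables(
    url: str,
    request_headers: Mapping[str, str],
    response_headers: Mapping[str, str],
) -> Dict[str, str]:
    """Derive request-scope variables from one request/response pair."""
    env: Dict[str, str] = {
        "ORIGINAL_URL": url,
        "ORIGINAL_PATH": urlsplit(url).path,
    }
    for name, value in request_headers.items():
        env["REQUEST_HEADER_" + name.upper()] = value
        if name.lower() == "cookie":
            # One COOKIE_<cookieName> variable per cookie sent by the client.
            for cookie_name, morsel in SimpleCookie(value).items():
                env["COOKIE_" + cookie_name.upper()] = morsel.value
    for name, value in response_headers.items():
        env["RESPONSE_HEADER_" + name.upper()] = value
        lowered = name.lower()
        if lowered == "content-length":
            env["CONTENT_LENGTH"] = value
        elif lowered == "content-type":
            # Split "text/html; charset=UTF-8" into type and character set.
            media_type, *params = [part.strip() for part in value.split(";")]
            env["CONTENT_TYPE"] = media_type
            for param in params:
                key, _, val = param.partition("=")
                if key.strip().lower() == "charset":
                    env["CHARACTER_SET"] = val.strip().strip('"')
    return env
```

Header or cookie names containing characters outside the allowed placeholder alphabet cannot be referenced through ${...}; how to treat them is left open here.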

Rule Evaluation Support

Filter implementations must provide rule evaluation support. Rules are boolean expressions that consist of various comparisons (equal, greater than, startsWith…​), conjunctive ("and") and disjunctive ("or") combinations and negations ("not"). Rules evaluations are used where the filter needs to decide whether a response needs to be sent to the collaboration server for caching and to decide if a code injection needs to be performed or not.

Rule evaluation always takes place within an environment scope (filter, configuration or request). The evaluation of every rule within an environment results in a boolean value.

Rule Types

Comparison

type-property: comparison

Comparison rules consist of the following parts:

  • leftSide: the left side of the comparison

  • operator: the operator of the comparison

  • rightSide: the right side of the comparison

  • caseSensitive: whether or not the comparison should consider case

The operator property can have one of the following values:

| operator | true if |
| :-- | :-- |
| equals | left and right side are equal (string comparison) |
| startsWith | left side string value starts with right side string value |
| endsWith | left side string value ends with right side string value |
| contains | left side string value contains right side string value |
| = | left side is equal to right side (numeric comparison) |
| > | left side is bigger than right side |
| < | left side is smaller than right side |
| >= | left side is bigger or equal to right side |
| <= | left side is smaller or equal to right side |

For example, a comparison rule could compare the response content type of a request against a string literal:

  • leftSide: ${CONTENT_TYPE}

  • operator: equals

  • rightSide: text/html

  • caseSensitive: false

And

type-property: and

And rules consist of a list of compound rules ("rules" property).

And rules evaluate to true if all components evaluate to true.

Or

type-property: or

Or rules consist of a list of compound rules ("rules" property).

Or rules evaluate to true if at least one of the components evaluates to true.

Not

type-property: not

Not rules consist of a single rule ("rule" property).

Not rules evaluate to true if the contained rule evaluates to false.

UML

[Figure: UML of the Java implementation (filter rule UML)]

Conditional Code Injection

The filter configuration holds a list of ConditionalCodeInjections. A conditional code injection consists of a Rule (condition) and a list of CodeInjections that must be performed if the condition evaluates to true (in the request scope environment).

Every CodeInjection consists of a reference (place in the HTML code where the injection must be performed), a type (internal or external JavaScript, internal or external style sheet) and a value holding the string value of the code injection. The value can hold placeholders so it must be expanded in the request environment prior to injection.

Reference

The code injection reference represents the spot in the HTML code where the injection string must be placed.

BEFORE_BODY_CLOSE

Before body close means that the string must be inserted just in front of the closing body tag.

<html>
<head>
</head>
<body>
whatever content
INJECTION GOES HERE</body>
</html>

AFTER_HEAD_START

After head start means that the string must be inserted just after the opening head tag.

<html>
<head>
INJECTION GOES HERE
<title></title>
</head>
<body>
whatever content
</body>
</html>

BEFORE_HEAD_CLOSE

Before head close means that the string must be inserted just before the closing head tag.

<html>
<head>
<title></title>
INJECTION GOES HERE
</head>
<body>
whatever content
</body>
</html>

AFTER_LAST_META

After last meta means that the string must be inserted just after the last meta tag that occurs between starting head and ending head tag.

<html>
<head>
<title></title>
<meta>
<meta>
<anyothertag>
<meta>
INJECTION GOES HERE
<anyothertag>
</head>
<body>
whatever content
</body>
</html>
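The four references above can be located with a few case-insensitive searches. The following Python sketch assumes regex-based matching is good enough (it does not account for tags inside comments or scripts); the function names `injection_index` and `inject`, and the fallback used when no meta tag exists, are hypothetical choices for this example.

```python
import re
from typing import Optional

_FLAGS = re.IGNORECASE | re.DOTALL


def injection_index(html: str, reference: str) -> Optional[int]:
    """Return the offset at which to insert, or None if the anchor is missing."""
    if reference == "BEFORE_BODY_CLOSE":
        matches = list(re.finditer(r"</body\s*>", html, _FLAGS))
        return matches[-1].start() if matches else None
    head_open = re.search(r"<head(\s[^>]*)?>", html, _FLAGS)
    if head_open is None:
        return None
    if reference == "AFTER_HEAD_START":
        return head_open.end()
    head_close = re.search(r"</head\s*>", html[head_open.end():], _FLAGS)
    if head_close is None:
        return None
    close_at = head_open.end() + head_close.start()
    if reference == "BEFORE_HEAD_CLOSE":
        return close_at
    if reference == "AFTER_LAST_META":
        head_content = html[head_open.end():close_at]
        metas = list(re.finditer(r"<meta(\s[^>]*)?>", head_content, _FLAGS))
        # Assumption: without any meta tag, insert right after <head>.
        return head_open.end() + (metas[-1].end() if metas else 0)
    raise ValueError(f"unknown reference: {reference}")


def inject(html: str, reference: str, snippet: str) -> str:
    """Insert snippet at the reference, or return html unchanged."""
    index = injection_index(html, reference)
    return html if index is None else html[:index] + snippet + html[index:]
```

Returning the page unchanged when the anchor tag is missing keeps the backend response intact, which matches the requirement that the filter must not disturb normal operation.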

Type

The code injection type defines how the injection string must be modified before it is inserted in the HTML.

INTERNAL_JAVASCRIPT

Expanded value:

var unbluConfig = {sessionCookieName: "x-unblu-sid", contentId: "iidzzllei889088d88kke8dujd"}

Injected string:

<script type="text/javascript" charset="UTF-8">
var unbluConfig = {sessionCookieName: "x-unblu-sid", contentId: "iidzzllei889088d88kke8dujd"}
</script>

EXTERNAL_JAVASCRIPT

Expanded value:

/unblu/javascript/consultant.js

Injected string:

<script type="text/javascript" charset="UTF-8" src="/unblu/javascript/consultant.js"></script>

INTERNAL_STYLE_SHEET

Expanded value:

.unbluSupportLink { background-color: #FF0000 }

Injected string:

<style type="text/css">
.unbluSupportLink { background-color: #FF0000 }
</style>

EXTERNAL_STYLE_SHEET

Expanded value:

/unblu/css/consultant.css

Injected string:

<link rel="stylesheet" href="/unblu/css/consultant.css" type="text/css" media="all"></link>

HTML_CONTENT

Expanded value:

<div id="myDiv"></div>

Injected string:

<div id="myDiv"></div>
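The mapping from injection type to injected string shown in the examples above can be written as one small function. This is a Python sketch; the function name `wrap_injection` is hypothetical, and the value is inserted as-is without any attribute escaping, which is an assumption of this example.

```python
def wrap_injection(injection_type: str, value: str) -> str:
    """Turn an expanded injection value into the string that is inserted."""
    if injection_type == "INTERNAL_JAVASCRIPT":
        return f'<script type="text/javascript" charset="UTF-8">\n{value}\n</script>'
    if injection_type == "EXTERNAL_JAVASCRIPT":
        return f'<script type="text/javascript" charset="UTF-8" src="{value}"></script>'
    if injection_type == "INTERNAL_STYLE_SHEET":
        return f'<style type="text/css">\n{value}\n</style>'
    if injection_type == "EXTERNAL_STYLE_SHEET":
        return f'<link rel="stylesheet" href="{value}" type="text/css" media="all"></link>'
    if injection_type == "HTML_CONTENT":
        return value  # inserted unchanged
    raise ValueError(f"unknown injection type: {injection_type}")
```

The value passed in is expected to be already expanded in the request scope environment.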

Conditional Regex Replacements (optional)

The filter configuration holds a list of ConditionalRegexReplacement instances. Every ConditionalRegexReplacement holds a rule (condition property). If this condition evaluates to true in the request scope environment the regular expression-based string replacement must be performed on the response body. All matches of the given pattern must be replaced with the given replacement string. The replacement must be expanded in the request scope environment as it can contain placeholders.
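A possible shape for this step is sketched below in Python. Several things are assumptions: the function name `apply_replacements`, the two callables passed in for rule evaluation and placeholder expansion, the use of Python's `re` dialect (the patterns sent by the server may target a different regex flavour), and the decision to insert the replacement literally without interpreting group references.

```python
import re
from typing import Any, Callable, Iterable, Mapping


def apply_replacements(
    body: str,
    replacements: Iterable[Mapping[str, Any]],
    condition_holds: Callable[[Any], bool],
    expand: Callable[[str], str],
) -> str:
    """Apply every replacement whose condition is true to the response body."""
    for item in replacements:
        if condition_holds(item["condition"]):
            replacement = expand(item["replacement"])
            # A function replacement inserts the string literally, so
            # backslashes and group references in it are not interpreted.
            body = re.sub(item["pattern"], lambda _m, r=replacement: r, body)
    return body
```

Here `condition_holds` stands for rule evaluation in the request scope and `expand` for placeholder expansion in the same scope.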

Communication with collaboration server

Filter implementations need to communicate with the collaboration server in the following situations:

  • forward (proxy) requests that start with ${UNBLU_PUBLIC_PATH} to the collaboration server

  • read configuration

  • send HTTP responses to collaboration server for caching

In the first case the filter simply proxies incoming requests to the server by translating the request path and forwarding query string, body and parts of the headers.

For the other cases (read configuration, cache contents), the server provides a simple HTTP interface.

Proxy requests to Unblu

The filter has to forward all requests that have a path that starts with ${UNBLU_PUBLIC_PATH} to the collaboration server.

For instance, if the filter is reachable at http://foo.com/, the Unblu server URL is http://localhost:8080/unblu/ and ${UNBLU_PUBLIC_PATH} is /unblu, then a request to http://foo.com/unblu/js/bar.js has to be forwarded to http://localhost:8080/unblu/js/bar.js. The request to the collaboration server has to include the original query string and the body of the request (if it is a POST request). When forwarding a request to the server, generally all request headers (as sent from the browser) must also be set in the request to the server, and all response headers (as sent from the server) must also be set in the response to the browser.

In most cases, forwarding requests to Unblu is realized using a reverse proxy that is already in place (e.g. Apache mod_proxy or similar). The proxy must at least implement the "GET" and "POST" HTTP methods.
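Where no ready-made reverse proxy is used, the path translation from the example above could be sketched like this in Python. The function name `translate_to_unblu` is hypothetical. Note one deliberate choice that goes slightly beyond the text: the prefix is matched on a path-segment boundary, so /unblubar is not treated as starting with /unblu.

```python
from typing import Optional
from urllib.parse import urlsplit


def translate_to_unblu(request_url: str, public_path: str, unblu_url: str) -> Optional[str]:
    """Map a browser request URL to the collaboration server, or return None.

    None means the request is not below public_path and belongs to the backend.
    """
    parts = urlsplit(request_url)
    prefix = public_path.rstrip("/")
    if parts.path != prefix and not parts.path.startswith(prefix + "/"):
        return None
    remainder = parts.path[len(prefix):]  # e.g. "/js/bar.js"
    target = unblu_url.rstrip("/") + remainder
    # The original query string must be forwarded unchanged.
    return target + ("?" + parts.query if parts.query else "")
```

Request method, body and headers still have to be copied to the forwarded request as described above; this sketch covers only the URL.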

HTTP interface

When the filter reads configuration from the collaboration server or sends a response to the server for caching, it uses a simple HTTP interface.

The HTTP interface is reachable through the URL ${UNBLU_SYSTEM_URL}/filterBackend.

Requests to this interface consist of a set of key/value pairs. These pairs can be sent in the form of proprietary headers (x-unblu-<name>:<value>) or as request parameters in the query string (e.g. http://localhost:8080/unblu/rest/filterBackend?x-unblu-action=add-to-cache…). In addition to proprietary headers or query string parameters, binary data (the body of responses to be cached) is transmitted in the body of a POST request. If data is transmitted, content-type has to be set to application/octet-stream and the content-length header must be set.

Responses from the HTTP interface always consist of a JSON string holding either a response (depending on the action) or an error message.

In addition to the response body, the HTTP interface responses include the following proprietary headers:

  • x-unblu-configuration-version (the currently valid configuration version)

  • x-unblu-session-invalid: true if the session is not valid (anymore)

The following parameters must be set in every request to the HTTP interface:

  • interface-version: the version of the interface that the filter implements (for Unblu 5+ this is always 2, for older versions of Unblu use 1)

  • apikey: the apikey as specified in the filter base configuration

  • action: the action that the filter wants to perform

  • configuration-version: the version of the current configuration the filter uses, if available, or an empty string if no configuration has been loaded so far

Depending on the "action" parameter additional parameters are required:

Action "ping"

Type of response: empty object

Action "read-configuration"

This action reads the configuration for the filter and returns it as a JSON object.

Type of response: FilterConfiguration

Action "add-to-cache"

This action adds a response to the cache.

Type of response: CacheContentId

The following additional parameters are required:

  • cache-date: the date the response was received

  • content-type: the content type of the response body (without the character set part)

  • character-set: the character set the response body is encoded with, if available

  • http-status-code: the status code of the response

  • original-URL: the URL of the request as it was sent to the filter

  • file-name: a human-readable file name (used for downloads); in generic implementations, the "filename" fragment of the content-disposition header can be used (if present)

You need to have a valid session id (received after logging in), which has to be transmitted in the cookies.

The body of the response must be sent to Unblu in the body of the POST request.
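Put together, the headers of an "add-to-cache" request could be assembled as in the following Python sketch. The function name `add_to_cache_headers` is hypothetical, the format of cache-date is not specified in this document and is passed through as given, and the session cookie string is whatever the filter received after logging in.

```python
from typing import Dict, Optional


def add_to_cache_headers(
    apikey: str,
    configuration_version: str,  # "" if no configuration has been loaded yet
    session_cookie: str,         # complete Cookie header value for the session
    body: bytes,                 # response body to be cached
    cache_date: str,             # format not specified here; passed through
    content_type: str,           # without the character set part
    http_status_code: int,
    original_url: str,
    character_set: Optional[str] = None,
    file_name: Optional[str] = None,
) -> Dict[str, str]:
    """Build the headers for a POST to the filter HTTP interface."""
    params: Dict[str, str] = {
        "interface-version": "2",  # Unblu 5+ according to the text above
        "apikey": apikey,
        "action": "add-to-cache",
        "configuration-version": configuration_version,
        "cache-date": cache_date,
        "content-type": content_type,
        "http-status-code": str(http_status_code),
        "original-URL": original_url,
    }
    if character_set:
        params["character-set"] = character_set
    if file_name:
        params["file-name"] = file_name
    # Parameters travel as proprietary x-unblu-<name> headers.
    headers = {"x-unblu-" + name: value for name, value in params.items()}
    headers["Content-Type"] = "application/octet-stream"
    headers["Content-Length"] = str(len(body))
    headers["Cookie"] = session_cookie
    return headers
```

The same key/value pairs could alternatively be sent as query string parameters; headers are used here only to keep the example short.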

Data Types

All returned JSON data types are based on a base type with the following structure.

{
	"class": <dataTypeName>,
	"environment": {
		"key1": "value1",
		"key2": "value2",
		"keyn": "valuen"
	}
}

"class" holds the name of the data type, "environment" holds a set of key/value pairs. The elements in the environment must be added to the environment of the scope the request is executed in (read-configuration: filter scope, add-to-cache: request scope).

FilterBackendError

Whenever something went wrong.

{
	"class": "FilterBackendError",
	"environment": {
		"key1": "value1",
		"key2": "value2",
		...
	},
	"code": <int error code>,
	"message": <error message>
}
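A filter could turn such error objects into exceptions right after decoding the response. The following Python sketch shows one way to do this; the exception class and the function name `parse_backend_response` are hypothetical, as are the defaults used when "code" or "message" is absent.

```python
import json
from typing import Any, Dict


class FilterBackendError(Exception):
    """Raised when the HTTP interface answers with a FilterBackendError object."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_backend_response(text: str) -> Dict[str, Any]:
    """Decode a response body and turn error objects into exceptions."""
    data = json.loads(text)
    if data.get("class") == "FilterBackendError":
        raise FilterBackendError(data.get("code", -1), data.get("message", ""))
    return data
```

Callers can then treat every returned object as a successful response and read its "environment" member without checking the class first.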

CacheContentId

In responses to "add-to-cache"

{
	"class": "CacheContentId",
	"environment": {
		"key1": "value1",
		"key2": "value2",
		...
	},
	"contentId": <string id of the cache content>
}

FilterConfiguration

In response to "read-configuration"

{
	"class": "FilterConfiguration",
	"environment": {
		"key1": "value1",
		"key2": "value2",
		...
	},
	"version": <string version of the configuration>,
	"cacheCondition": <Rule that must be evaluated in order to decide if a response must be added to the cache>,
	"codeInjections": <array of ConditionalCodeInjection objects>
}

ConditionalRegexReplacement

A regular expression based string replacement that must be performed if the condition evaluates to true in the request scope

{
	"class": "ConditionalRegexReplacement",
	"condition": <Rule, if it evaluates to true, the replacement must be performed>,
	"pattern": <the regex pattern>,
	"replacement": <the string that must be inserted as a replacement for all matches>
}

ConditionalCodeInjection

A set of code injections that must be performed if a given condition (rule) evaluates to true.

{
	"class": "ConditionalCodeInjection",
	"condition": <Rule that decides if the injection takes place for the current request>,
	"injections": <array of CodeInjection objects>
}

CodeInjection

A code injection.

{
	"class": "CodeInjection",
	"reference": <BEFORE_BODY_CLOSE|AFTER_HEAD_START|BEFORE_HEAD_CLOSE|AFTER_LAST_META>, // place in the HTML where the injection must go
	"type": <INTERNAL_JAVASCRIPT|EXTERNAL_JAVASCRIPT|INTERNAL_STYLE_SHEET|EXTERNAL_STYLE_SHEET|HTML_CONTENT>, // type of the injection
	"value": <string value of the injection, must be expanded within the request scope>
}

ComparisonRule

{
	"class": "ComparisonRule",
	"leftSide": <leftSide string value>,
	"rightSide": <rightSide string value>,
	"operator": <equals|startsWith|endsWith|contains|=|<|<=|>|>=>,
	"caseSensitive": <boolean>
}

AndRule

{
	"class": "AndRule",
	"rules": <array of Rules>
}

OrRule

{
	"class": "OrRule",
	"rules": <array of Rules>
}

NotRule

{
	"class": "NotRule",
	"rule": <Rule>
}
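The rule data types above lend themselves to a recursive evaluator. The Python sketch below works on decoded JSON objects; the function name `evaluate`, the `expand` callable (placeholder expansion in the current scope), the default for a missing caseSensitive property and the treatment of non-numeric operands in numeric comparisons are assumptions of this example.

```python
from typing import Any, Callable, Dict, Mapping

STRING_OPS: Dict[str, Callable[[str, str], bool]] = {
    "equals": lambda a, b: a == b,
    "startsWith": lambda a, b: a.startswith(b),
    "endsWith": lambda a, b: a.endswith(b),
    "contains": lambda a, b: b in a,
}
NUMERIC_OPS: Dict[str, Callable[[float, float], bool]] = {
    "=": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
}


def evaluate(rule: Mapping[str, Any], expand: Callable[[str], str]) -> bool:
    """Evaluate a rule; expand() resolves placeholders in the current scope."""
    kind = rule["class"]
    if kind == "AndRule":
        return all(evaluate(r, expand) for r in rule["rules"])
    if kind == "OrRule":
        return any(evaluate(r, expand) for r in rule["rules"])
    if kind == "NotRule":
        return not evaluate(rule["rule"], expand)
    if kind == "ComparisonRule":
        left, right = expand(rule["leftSide"]), expand(rule["rightSide"])
        op = rule["operator"]
        if op in NUMERIC_OPS:
            try:
                return NUMERIC_OPS[op](float(left), float(right))
            except ValueError:
                return False  # assumption: non-numeric operands never match
        if not rule.get("caseSensitive", True):
            left, right = left.lower(), right.lower()
        return STRING_OPS[op](left, right)
    raise ValueError(f"unknown rule class: {kind}")
```

Because evaluation only needs the rule object and the scope, it runs entirely inside the filter and requires no round-trip to the collaboration server.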

Request Processing

This section describes the actual work a filter has to do. A filter implementation must intercept all requests (whether as a module in the HTTP server or as a proxy). For every request the following steps need to be performed (pseudo code).

if (path startsWith ${UNBLU_PUBLIC_PATH}) {
	proxy request to Unblu
} else {
	let backend process request
	build request scope environment variables
	perform regex replacements
	if (cacheCondition evaluates to true) {
		send response to Unblu HTTP filter interface "add-to-cache" action
		add returned environment variables to request scope
		set ${UNBLU_USER_COOKIE_NAME} "set-cookie" header if ${UNBLU_USER_ID} is set
		set ${UNBLU_SESSION_COOKIE_NAME} "set-cookie" header if ${UNBLU_USER_SESSION_ID} is set
	}
	perform code injections if their conditions evaluate to true in request scope environment
	if (a code injection took place) {
		clear "etag" header in response
		clear "last-modified" header in response
	}
	send (potentially modified) response body to the browser
}

If the collaboration server is (temporarily) unavailable, the filter must not disturb the normal operation of the backend HTTP server. In order to achieve this it might make sense to remember the state of the server within the filter. Doing this you can avoid running into timeouts at every request.

This could be done as follows:

As soon as the filter fails to reach the collaboration server, it starts a background thread that waits for the server to come up again. Until the server is up again, the filter does not attempt to send requests to it. As soon as Unblu is up again, the filter resumes sending requests. To check whether the server is up, the "ping" action of the HTTP interface can be used.
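One way to remember the server state is sketched below in Python. The class name `ServerAvailability`, its methods and the retry interval are hypothetical; `ping` stands for a callable that performs the "ping" action and returns True on success.

```python
import threading
import time
from typing import Callable


class ServerAvailability:
    """Remembers whether the collaboration server is reachable."""

    def __init__(self, ping: Callable[[], bool], retry_seconds: float = 5.0) -> None:
        self._ping = ping                  # returns True when "ping" succeeds
        self._retry_seconds = retry_seconds
        self._lock = threading.Lock()
        self._up = threading.Event()
        self._up.set()                     # optimistic: assume the server is up

    def is_up(self) -> bool:
        return self._up.is_set()

    def report_failure(self) -> None:
        """Call after a failed request; starts one background probe."""
        with self._lock:
            if not self._up.is_set():
                return                     # a probe is already running
            self._up.clear()
        threading.Thread(target=self._probe, daemon=True).start()

    def wait_until_up(self, timeout: float) -> bool:
        return self._up.wait(timeout)

    def _probe(self) -> None:
        while True:
            try:
                if self._ping():
                    break
            except Exception:
                pass                       # treat any error as "still down"
            time.sleep(self._retry_seconds)
        self._up.set()
```

During request processing, the filter would check `is_up()` before contacting Unblu and simply pass the backend response through unmodified while the server is marked as down.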

Lifecycle

This section describes the life cycle of a filter.

Startup

When the filter starts up, it must load its initial configuration from the server (by sending a "read-configuration" request to the filter HTTP interface). The filter must store the version of the configuration for later comparison to the server’s actual configuration version.

Reconfiguration

Every time the filter sends a request to the collaboration server (whether it calls the HTTP interface or forwards a request to the server), it must compare the value of the x-unblu-configuration-version header with the version of the configuration it received when it last loaded the configuration from the server. If the versions do not match, the filter must reload the configuration and apply it.
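The version comparison could be kept in one place, as in this Python sketch. The class name `ConfigurationHolder` and its methods are hypothetical; `load` stands for a callable that performs the "read-configuration" action and returns the decoded FilterConfiguration object.

```python
from typing import Any, Callable, Mapping, Optional


class ConfigurationHolder:
    """Keeps the current configuration and reloads it when the version changes."""

    def __init__(self, load: Callable[[], Mapping[str, Any]]) -> None:
        self._load = load      # performs the "read-configuration" call
        self.configuration: Optional[Mapping[str, Any]] = None
        self.version = ""      # empty until a configuration is loaded

    def reload(self) -> None:
        self.configuration = self._load()
        self.version = str(self.configuration["version"])

    def observe(self, response_headers: Mapping[str, str]) -> bool:
        """Check a server response; return True if a reload was triggered."""
        headers = {k.lower(): v for k, v in response_headers.items()}
        seen = headers.get("x-unblu-configuration-version")
        if seen is None or seen == self.version:
            return False
        self.reload()
        return True
```

Calling `observe` on every response from the collaboration server, including proxied ones, is enough to pick up a reconfiguration without restarting the filter.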

Development

When developing a filter implementation, a local Tomcat running a collaboration server can be used as the filter backend.

To deploy the collaboration server to a local Tomcat, it makes sense to rename the original product.<productId><version><qualifier>.war (e.g. product.com.unblu.review-2.2.0-2011090191106.war) to <root>.war. After renaming, it can simply be copied into the webapps directory of the Tomcat installation. If autodeploy is enabled in Tomcat, the collaboration server will be available at <tomcatSchema>://<tomcatHost>:<tomcatport>/unblu (e.g. http://localhost:8080/unblu). The UNBLU_URL is http://localhost:8080/unblu/ in this case.