Filter Specification

1. Purpose

Unblu and some implementation partners provide filters for many integration scenarios (see this page for further information). Sometimes, however, a custom integration is required. This custom filter specification describes how such a filter can be implemented.

2. Implementation Options

Depending on the way of integration, the filter can be implemented

  1. integrated into and running on the web server
  2. integrated into and running on the application server
  3. integrated into and running within a (reverse) proxy
  4. implemented as a (reverse) proxy
  5. running in the end user's browser

This document does not explain every integration and implementation variation in detail. Instead, it focuses on providing all the information developers need to implement a fitting solution on their own.

3. Requirements overview

With the following chapters, requirements are split into the responsibilities of the filter:

  1. rule based javascript injection
  2. rule based resource catch-and-forward to unblu
  3. rule based request proxying (only required when implemented as a reverse proxy)

3.1. "Injection" filter requirements

If the filter is integrated into or implemented as a proxy, or if it is integrated into the customer's web or application server, it must

  • have access to all http(s) requests and responses that are exchanged between the browser and the backend server
  • be able to intercept requests and/or modify responses (headers and content).

If the filter is running in the end user's browser, it must

  • be included on all web pages supposed to be co-browsable

3.2. "Resource forwarding" requirements

The filter must have access to all http(s) requests and responses that are exchanged between the browser and the backend server. It can be implemented as a reverse proxy or as a filter running directly on the customer's web or application server or in the end user's browser. It must be able to intercept requests, forward response bodies to unblu and/or modify responses (headers and content). In addition, it must be able to redirect certain requests (messages from the browser) to the unblu server instead of letting the backend process them.

3.3. "Proxying" filter requirements

When implemented as a proxy, the filter must be able to

  • redirect certain requests (requests from the browser) to the unblu server instead of letting the backend process them.

3.4. Configuration

In order to keep the configuration in one place, the filter needs to be configured dynamically from the unblu server. Local configuration should be limited to things that cannot be provided by the unblu server (such as the location of the unblu server) and to purely implementation-specific, technical configuration. When the filter starts up, it must read its configuration from the unblu server. The configuration contains all the information the filter needs in order to do its work. Because the unblu server might be reconfigured without the filter being restarted, a special header containing a configuration version identifier is included in every response unblu sends to the filter. As soon as the version identifier changes, the filter must reload the configuration from the server.

3.5. Forward requests to unblu

The configuration defines a list of path prefixes (e.g. /unblu/) that identify requests that need to be forwarded to the unblu server. All requests from the customer's browser whose path starts with one of these prefixes must be forwarded to the unblu server instead of being processed as normal requests to the backend web or application server.

3.6. Cache content

The filter must send the content of the response bodies (enriched by additional information) to the unblu server so that it can be stored for (later) playback in the unblu player. The filter has to decide whether a response needs to be sent to the unblu server according to rules defined in the configuration.

3.7. Code injection

The filter must be able to inject javascript and css code into the body of the html response. The content that needs to be injected and the rules defining whether a response needs injection or not are defined in the configuration.

3.8. Regex replacements (optional)

The filter must be able to perform regular expression replacements in textual http responses (html, css, javascript...). A list of replacements is included in the filter configuration.

3.9. String replacement

Parts of the injected code need to be dynamic (for instance the id of the current cache content). Therefore the filter needs to perform simple string replacements in the javascript content before injecting it into the html body of the response. These string replacements are based on the #Environment Variables that the filter provides.

3.10. Rule evaluation

The decision whether a page needs to be sent to the unblu server, or whether it needs code injection, must be taken by the filter itself. A roundtrip to the unblu server is not acceptable. Therefore the filter needs to evaluate a set of rules (provided in the configuration) in order to decide. These rules need to be able to take the #Environment Variables into account.

3.11. Environment variables

The filter needs to provide a predefined set of environment variables to be used in #String replacement and #Rule evaluation. Some of these environment variables are static (such as the unbluPath, the path to the unblu server); others are dynamic and need to be provided in a request scope (requestUri, cookie values, contentType...).

3.12. Non functional requirements

The filter must smoothly continue to deliver backend requests even if the unblu server is (temporarily) not available.

4. UML Overview

UML overview of the java implementation

5. Environment variables / String substitution

Filter implementations must provide environment variables and string substitution functionality. This functionality is used by the rule evaluation system and by the code injection.
Variable names must be case insensitive.

6. Scoping

Environment variables must be provided in three different scopes: filter, configuration and request. The filter scope stores globally valid variables such as the URL of the unblu server. The request scope holds variables that are only valid for a single request (cookie values, response content type...). The request scope environment must fall back on the configuration scope environment if a variable key is not found within the request scope; the configuration scope must in turn fall back on the filter scope.
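The fallback chain between the scopes can be sketched as follows. This is a minimal illustration, not the reference implementation: the class name Environment and its methods are hypothetical, and normalizing keys to upper case is just one way to satisfy the case-insensitivity requirement.

```python
class Environment:
    """One scope of variables with an optional parent scope to fall back on."""

    def __init__(self, parent=None):
        self._parent = parent
        self._values = {}

    def set(self, name, value):
        # Variable names are case insensitive, so keys are normalized.
        self._values[name.upper()] = value

    def get(self, name, default=None):
        key = name.upper()
        scope = self
        # Walk from the current scope towards the filter scope.
        while scope is not None:
            if key in scope._values:
                return scope._values[key]
            scope = scope._parent
        return default


# request scope -> configuration scope -> filter scope
filter_scope = Environment()
configuration_scope = Environment(parent=filter_scope)
request_scope = Environment(parent=configuration_scope)

filter_scope.set("UNBLU_URL", "http://localhost:8080")
request_scope.set("CONTENT_TYPE", "text/html")
```

A new request scope would be created for every request, while the configuration scope is replaced whenever the configuration is reloaded.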

7. String substitution

Filter implementations must be able to search for placeholders in strings and replace them with variable values from the environment. The expansion of placeholders with their value must be performed recursively as variable values can also hold placeholders.

Placeholders have the following format: ${PLACEHOLDER_NAME}. Only the characters a to z, - and _ are allowed in variable keys. The following regular expression can be used to find placeholders: \$\{[a-zA-Z_-]+\}.

Lookup of placeholders is case insensitive.
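A minimal sketch of recursive, case-insensitive placeholder expansion is shown below. It writes the placeholder pattern with an explicit character class. Two behaviours are assumptions that this specification does not define: unknown placeholders are left untouched, and expansion stops after max_depth passes so that variables referring to each other cannot loop forever. The function name expand is hypothetical.

```python
import re

# ${NAME} where NAME consists of letters, "-" and "_".
PLACEHOLDER = re.compile(r"\$\{([a-zA-Z_-]+)\}")


def expand(text, variables, max_depth=10):
    """Replace ${NAME} placeholders repeatedly until nothing changes."""
    lookup = {key.upper(): value for key, value in variables.items()}

    def substitute(match):
        # Assumption: an unknown placeholder is kept as it is.
        return lookup.get(match.group(1).upper(), match.group(0))

    for _ in range(max_depth):
        expanded = PLACEHOLDER.sub(substitute, text)
        if expanded == text:
            break
        text = expanded
    return text
```

For example, with UNBLU_PUBLIC_PATH set to /unblu and a second variable SCRIPT set to ${UNBLU_PUBLIC_PATH}/javascript/consultant.js, expanding ${script} takes two passes and yields /unblu/javascript/consultant.js.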

// TODO add documentation for conditional substitution

8. Sources of environment variables

Environment variables are defined from several sources:

  • local filter configuration (things the filter must know on its own like the URL of the unblu server)
  • unblu server (the server can send environment variables in its responses)
  • http request headers
  • http response headers

The following table lists all environment variables that filter implementations must support:

| Variable Name | Scope | Source | Remarks |
| --- | --- | --- | --- |
| UNBLU_PATH | Filter | Local filter configuration | Deprecated, use UNBLU_PUBLIC_PATH instead |
| UNBLU_PUBLIC_PATH | Filter | Local filter configuration | The path prefix for requests that are redirected to the unblu server (usually /unblu) |
| UNBLU_SYSTEM_PATH | Filter | Local filter configuration | Path prefix for requests from the filter to the unblu server (for the filter-server communication) |
| UNBLU_URL | Filter | Local filter configuration | The URL of the unblu server (e.g. http://localhost:8080) |
| ORIGINAL_URL | Request | Request | The URL of the current request |
| UNBLU_START_TIME | Filter | Response Headers | Initially empty. As soon as a response with an x-unblu-start-time header has been processed, UNBLU_START_TIME must be set to the value of the header |
| FILTER_START_TIME | Filter | Filter implementation | Set during filter startup. Timestamp of the start time of the filter (milliseconds since 1.1.1970 0:00 UTC) |
| START_TIME | Filter | Filter implementation | MAX(FILTER_START_TIME, UNBLU_START_TIME) |
| ORIGINAL_PATH | Request | Request | The path component of the URL of the current request |
| CONTENT_TYPE | Request | Response Headers | The content type of the response. If the original header contains a character set, it must be stripped away |
| CONTENT_LENGTH | Request | Response Headers | The content length of the response, if known |
| CHARACTER_SET | Request | Response Headers | The character set of the response, if available |
| DEFAULT_CHARACTER_SET | Filter | Local filter configuration | The default character set of the filter |
| COOKIE_<cookieName> | Request | Request Headers | For every cookie the client sent to the filter, an environment variable with the pattern COOKIE_<cookieName> must be generated |
| REQUEST_HEADER_<headerName> | Request | Request Headers | For every request header the client sent to the filter, an environment variable with the pattern REQUEST_HEADER_<headerName> must be generated |
| RESPONSE_HEADER_<headerName> | Request | Response Headers | For every response header the backend sent to the filter, an environment variable with the pattern RESPONSE_HEADER_<headerName> must be generated |
| (whatever is sent by the unblu server) | Configuration | Filter configuration from unblu server | Every time the filter loads its configuration, it must put all supplied environment variables into its configuration scope |
| (whatever is sent by the unblu server) | Request | ContentId response from the server | Every time the filter sends a "cacheContent" message to the server, it must put all supplied environment variables into the request scope |
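The request-scoped variables derived from headers could be built roughly as follows. This is a sketch under assumptions: the function name and its parameters are hypothetical, headers are passed as plain dictionaries, and cookie parsing is delegated to the standard library, which is stricter than some browsers.

```python
from http.cookies import SimpleCookie


def build_request_variables(original_url, path, request_headers, response_headers):
    """Derive the request-scope variables from one request/response pair."""
    variables = {"ORIGINAL_URL": original_url, "ORIGINAL_PATH": path}

    for name, value in request_headers.items():
        variables["REQUEST_HEADER_" + name] = value
        if name.lower() == "cookie":
            cookies = SimpleCookie()
            cookies.load(value)
            for cookie_name, morsel in cookies.items():
                variables["COOKIE_" + cookie_name] = morsel.value

    for name, value in response_headers.items():
        variables["RESPONSE_HEADER_" + name] = value
        lowered = name.lower()
        if lowered == "content-type":
            # CONTENT_TYPE must not contain the character set part.
            media_type, _, parameters = value.partition(";")
            variables["CONTENT_TYPE"] = media_type.strip()
            for parameter in parameters.split(";"):
                key, _, charset = parameter.partition("=")
                if key.strip().lower() == "charset":
                    variables["CHARACTER_SET"] = charset.strip().strip('"')
        elif lowered == "content-length":
            variables["CONTENT_LENGTH"] = value.strip()
    return variables
```

Because variable lookup is case insensitive, the header and cookie names can be stored as received; the environment that holds them is expected to normalize the keys.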

9. Rule evaluation

Filter implementations must provide rule evaluation support. Rules are boolean expressions that consist of various comparisons (equal, greater than, startsWith...), conjunctive ("and") and disjunctive ("or") combinations, and negations ("not"). Rule evaluations are used where the filter needs to decide whether a response needs to be sent to the unblu server for caching and whether a code injection needs to be performed.
Rule evaluation always takes place within an environment scope (filter, configuration or request). The evaluation of every rule within an environment results in a boolean value.

10. Rule Types

10.1. Comparison

type-property: comparison

Comparison rules consist of the following parts:

  • leftSide: the left side of the comparison
  • operator: the operator of the comparison
  • rightSide: the right side of the comparison
  • caseSensitive: whether or not the comparison should consider case

The operator property can have one of the following values:

| operator | true if |
| --- | --- |
| equals | left and right side are equal (string comparison) |
| startsWith | left side string value starts with right side string value |
| endsWith | left side string value ends with right side string value |
| contains | left side string value contains right side string value |
| = | left side is equal to right side (numeric comparison) |
| > | left side is bigger than right side |
| < | left side is smaller than right side |
| >= | left side is bigger or equal to right side |
| <= | left side is smaller or equal to right side |

For example a comparison rule could compare the response content type of a request against a string literal:

  • leftSide: ${CONTENT_TYPE}
  • operator: equals
  • rightSide: text/html
  • caseSensitive: false

10.2. And

type-property: and

And rules consist of a list of compound rules ("rules" property).
And rules evaluate to true if all components evaluate to true.

10.3. Or

type-property: or

Or rules consist of a list of compound rules ("rules" property).
Or rules evaluate to true if at least one of the components evaluates to true.

10.4. Not

type-property: not

Not rules consist of a single rule ("rule" property).
Not rules evaluate to true if the contained rule evaluates to false.
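The four rule types can be evaluated with a small recursive function. The sketch below works on the JSON representation of the rules (class names ComparisonRule, AndRule, OrRule and NotRule as listed under Data Types). Several details are assumptions not fixed by this specification: both sides of a comparison are expanded before comparing, numeric operators return false when an operand is not a number, and caseSensitive defaults to true when absent. The expand argument is a hypothetical callable that resolves placeholders in the current scope.

```python
import operator

NUMERIC = {"=": operator.eq, ">": operator.gt, "<": operator.lt,
           ">=": operator.ge, "<=": operator.le}
TEXTUAL = {"equals": operator.eq, "startsWith": str.startswith,
           "endsWith": str.endswith, "contains": operator.contains}


def evaluate(rule, expand):
    """Evaluate a rule (a dict parsed from JSON) to a boolean."""
    kind = rule["class"]
    if kind == "AndRule":
        return all(evaluate(part, expand) for part in rule["rules"])
    if kind == "OrRule":
        return any(evaluate(part, expand) for part in rule["rules"])
    if kind == "NotRule":
        return not evaluate(rule["rule"], expand)
    if kind != "ComparisonRule":
        raise ValueError("unknown rule class: " + kind)

    left = expand(rule["leftSide"])
    right = expand(rule["rightSide"])
    name = rule["operator"]
    if name in NUMERIC:
        try:
            return NUMERIC[name](float(left), float(right))
        except ValueError:
            return False  # assumption: non-numeric operands never match
    if not rule.get("caseSensitive", True):
        left, right = left.lower(), right.lower()
    return TEXTUAL[name](left, right)
```

Evaluating the content type example above with ${CONTENT_TYPE} expanded to Text/HTML yields true, because caseSensitive is false.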

11. UML

UML of the java implementation

12. Code Injection

The filter configuration holds a list of ConditionalCodeInjections. A conditional code injection consists of a Rule (condition) and a list of CodeInjections that must be performed if the condition evaluates to true (in the request scope environment).
Every CodeInjection consists of a reference (the place in the html code where the injection must be performed), a type (internal or external javascript, internal or external style sheet) and a value holding the string value of the code injection. The value can hold placeholders, so it must be expanded in the request environment prior to injection.

13. Reference

The code injection reference identifies the spot in the html code where the injection string must be placed.

13.1. BEFORE_BODY_CLOSE

before body close means that the string must be inserted just in front of the closing body tag.

<html>
<head>
</head>
<body>
what ever content
INJECTION GOES HERE</body>
</html>

 

13.2. AFTER_HEAD_START

after head start means that the string must be inserted just after the opening head tag.

<html>
<head>
INJECTION GOES HERE
<title></title>
</head>
<body>
what ever content
</body>
</html>

13.3. BEFORE_HEAD_CLOSE

before head close means that the string must be inserted just before the closing head tag.

<html>
<head>
<title></title>
INJECTION GOES HERE
</head>
<body>
what ever content
</body>
</html>

13.4. AFTER_LAST_META

after last meta means that the string must be inserted just after the last meta tag that occurs between starting head and ending head tag.

<html>
<head>
<title></title>
<meta>
<meta>
<anyothertag>
<meta>
INJECTION GOES HERE
<anyothertag>
</head>
<body>
what ever content
</body>
</html>
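Locating the four reference positions could look like the following sketch. It scans the html with regular expressions, which is an assumption made for brevity: comments, scripts or malformed markup containing tag-like text are not handled. The function name inject is hypothetical, and returning the page unchanged when the anchor tag is missing is also an assumption.

```python
import re


def inject(html, reference, snippet):
    """Insert snippet at the position named by reference."""
    flags = re.IGNORECASE
    position = None
    if reference == "BEFORE_BODY_CLOSE":
        matches = list(re.finditer(r"</body\s*>", html, flags))
        if matches:
            position = matches[-1].start()
    elif reference == "AFTER_HEAD_START":
        match = re.search(r"<head(\s[^>]*)?>", html, flags)
        if match:
            position = match.end()
    elif reference == "BEFORE_HEAD_CLOSE":
        match = re.search(r"</head\s*>", html, flags)
        if match:
            position = match.start()
    elif reference == "AFTER_LAST_META":
        # Only meta tags in front of the closing head tag are considered.
        head_end = re.search(r"</head\s*>", html, flags)
        limit = head_end.start() if head_end else len(html)
        metas = list(re.finditer(r"<meta\b[^>]*>", html[:limit], flags))
        if metas:
            position = metas[-1].end()
    else:
        raise ValueError("unknown reference: " + reference)

    if position is None:
        return html  # assumption: no anchor tag, no injection
    return html[:position] + snippet + html[position:]
```

A streaming filter would need to buffer enough of the response to find these positions before passing the body on.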

 

14. Type

The code injection type defines how the injection string must be modified before it is inserted into the html.

14.1. INTERNAL_JAVASCRIPT

Expanded value:

var unbluConfig = {sessionCookieName: "x-unblu-sid", contentId: "iidzzllei889088d88kke8dujd"}

Injected string:

<script type="text/javascript" charset="UTF-8">
var unbluConfig = {sessionCookieName: "x-unblu-sid", contentId: "iidzzllei889088d88kke8dujd"}
</script>

14.2. EXTERNAL_JAVASCRIPT

Expanded value:

/unblu/javascript/consultant.js

Injected string:

<script type="text/javascript" charset="UTF-8" src="/unblu/javascript/consultant.js"></script>

14.3. INTERNAL_STYLE_SHEET

Expanded value:

.unbluSupportLink { background-color: #FF0000 }

Injected string:

<style type="text/css">
.unbluSupportLink { background-color: #FF0000 }
</style>

14.4. EXTERNAL_STYLE_SHEET

Expanded value:

/unblu/css/consultant.css

Injected string:

<link rel="stylesheet" href="/unblu/css/consultant.css" type="text/css" media="all"></link>

14.5. HTML_CONTENT

Expanded value

<div id="myDiv"></div>

Injected string:

<div id="myDiv"></div>
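The wrapping per injection type can be expressed as a lookup of templates. The sketch below reproduces the injected strings from the examples above; the function name render_injection is hypothetical, and the value is assumed to be already expanded.

```python
# One template per injection type; {} marks where the expanded value goes.
TEMPLATES = {
    "INTERNAL_JAVASCRIPT":
        '<script type="text/javascript" charset="UTF-8">\n{}\n</script>',
    "EXTERNAL_JAVASCRIPT":
        '<script type="text/javascript" charset="UTF-8" src="{}"></script>',
    "INTERNAL_STYLE_SHEET":
        '<style type="text/css">\n{}\n</style>',
    "EXTERNAL_STYLE_SHEET":
        '<link rel="stylesheet" href="{}" type="text/css" media="all"></link>',
    "HTML_CONTENT": "{}",
}


def render_injection(injection_type, expanded_value):
    """Wrap an expanded value in the markup required by its injection type."""
    try:
        template = TEMPLATES[injection_type]
    except KeyError:
        raise ValueError("unknown injection type: " + injection_type)
    return template.format(expanded_value)
```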

15. Regex Replacements (optional)

The filter configuration holds a list of ConditionalRegexReplacement instances. Every ConditionalRegexReplacement holds a rule (condition property). If this condition evaluates to true in the request scope environment, the regular expression based string replacement must be performed on the response body. All matches of the given pattern must be replaced with the given replacement string. The replacement must be expanded in the request scope environment as it can contain placeholders.
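The replacement loop might be sketched as follows. The callables evaluate and expand are hypothetical stand-ins for rule evaluation and placeholder expansion in the request scope. Two points are assumptions: the expanded replacement is inserted literally (group references in the replacement are not interpreted), and the pattern is treated as a Python regular expression, whose syntax can differ from the regex flavour of the unblu server.

```python
import re


def apply_regex_replacements(body, replacements, evaluate, expand):
    """Apply every ConditionalRegexReplacement whose condition holds."""
    for item in replacements:
        if not evaluate(item["condition"]):
            continue
        replacement = expand(item["replacement"])
        # Passing a function makes re.sub insert the string literally.
        body = re.sub(item["pattern"], lambda match: replacement, body)
    return body
```

Replacements are applied in list order, so a later pattern sees the output of earlier ones.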

16. Communication with unblu server

Filter implementations need to communicate with the unblu server in the following situations:

  • forward (proxy) requests that start with ${UNBLU_PUBLIC_PATH} to the unblu server
  • read configuration
  • send HTTP responses to unblu server for caching

In the first case, the filter simply proxies incoming requests to the unblu server by translating the request path and forwarding query string, body and parts of the headers.
For the other cases (read configuration, cache contents), the unblu server provides a simple HTTP interface.

17. Proxy requests to unblu

The filter has to forward all requests that have a path that starts with ${UNBLU_PUBLIC_PATH} to the unblu server.
For instance, if the filter is reachable at http://foo.com/, the unblu server URL is http://localhost:8080/unblu/ and ${UNBLU_PUBLIC_PATH} is /unblu, then a request to http://foo.com/unblu/js/bar.js has to be forwarded to http://localhost:8080/unblu/js/bar.js. The request to the unblu server has to include the original query string and the body of the request (if it is a POST request). When forwarding a request to the unblu server, generally all request headers (as sent from the browser) must also be set in the request to the unblu server, and all response headers (as sent from the unblu server) must also be set in the response to the browser.
In most cases, forwarding requests to unblu is realized using a reverse proxy that is already in place (e.g. Apache mod_proxy or similar). The proxy must at least implement the "GET" and "POST" HTTP methods.
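The path translation from the example above can be sketched like this. The function name is hypothetical. Matching the prefix only on a path segment boundary (so that /unblushing is not treated as an unblu request) is an assumption that goes slightly beyond the plain "starts with" wording.

```python
from urllib.parse import urlsplit


def translate_to_unblu(request_url, public_path, unblu_url):
    """Return the unblu target URL, or None if the backend should handle it."""
    parts = urlsplit(request_url)
    prefix = public_path.rstrip("/")
    if parts.path != prefix and not parts.path.startswith(prefix + "/"):
        return None
    # Replace the public path prefix with the unblu server URL.
    target = unblu_url.rstrip("/") + parts.path[len(prefix):]
    if parts.query:
        target += "?" + parts.query
    return target
```

Forwarding the body and copying the headers in both directions is left to the HTTP client or proxy module in use.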

18. HTTP interface

When the filter reads configuration from the unblu server or sends a response to the unblu server for caching, it uses a simple HTTP interface.
The HTTP interface is reachable through the URL ${UNBLU_SYSTEM_URL}/filterBackend.

Requests to this interface consist of a set of key/value pairs. These pairs can be sent in the form of proprietary headers (x-unblu-<name>: <value>) or as request parameters in the query string (e.g. http://localhost:8080/unblu/rest/filterBackend?x-unblu-action=add-to-cache...). In addition to proprietary headers or query string parameters, binary data (the body of responses to be cached) is transmitted in the body of a POST request. If data is transmitted, the content-type header has to be set to application/octet-stream and the content-length header must be set.

Responses from the HTTP interface always consist of a JSON string holding either a response (depending on the action) or an error message.
In addition to the response body, the HTTP interface responses include the following proprietary headers:

  • x-unblu-configuration-version (the currently valid configuration version)
  • x-unblu-session-invalid: true if the session is not valid (anymore)

The following parameters must be set in every request to the HTTP interface:

  • interface-version: the version of the interface the filter implements (for now this is always 1)
  • action: the action that the filter wants to perform
  • configuration-version: the version of the configuration the filter currently uses, or an empty string if no configuration has been loaded so far

Depending on the "action" parameter additional parameters are required...
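A request to the HTTP interface could be assembled as in the following sketch, which uses the header form of the parameters. The function name and the backend_url parameter are hypothetical; the caller is expected to supply the full URL of the filterBackend endpoint. Using GET for requests without a body is an assumption, as this specification only states that binary data is sent in a POST request.

```python
import urllib.request


def build_backend_request(backend_url, action, configuration_version="",
                          parameters=None, body=None):
    """Build (but do not send) a request to the unblu filter HTTP interface."""
    headers = {
        "x-unblu-interface-version": "1",
        "x-unblu-action": action,
        "x-unblu-configuration-version": configuration_version,
    }
    # Action-specific parameters, e.g. those required by "add-to-cache".
    for name, value in (parameters or {}).items():
        headers["x-unblu-" + name] = str(value)
    if body is not None:
        headers["Content-Type"] = "application/octet-stream"
        headers["Content-Length"] = str(len(body))
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(backend_url, data=body,
                                  headers=headers, method=method)
```

The returned object can be passed to urllib.request.urlopen together with a timeout, so that an unreachable unblu server does not block the request for long.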

18.1. Action "ping"

Type of response: empty object

18.2. Action "read-configuration"

This action reads the configuration for the filter and returns it as a JSON object.
Type of response: FilterConfiguration

18.3. Action "add-to-cache"

This action adds a response to the cache.
Type of response: ContentId

The following additional parameters are required:

  • cache-date: the date the response was received
  • content-type: the content type of the response body (without the character set part)
  • character-set: the character set, the response body is encoded with if available
  • http-status-code: the status code of the response
  • original-url: the url of the request as it was sent to the filter
  • file-name: a human-readable file name (used for downloads); in generic implementations, the "filename" fragment of the content-disposition header can be used (if present)

Note: You need to have a valid unblu session id (received after logging in) which has to be transmitted in the Cookies.

The body of the response must be sent to unblu in the body of the post request.

18.4. Data Types

All returned JSON data types are based on a base type with the following structure:

{
	"class": <dataTypeName>,
	"environment": {
		"key1": "value1",
		"key2": "value2",
		"keyn": "valuen"
	}
}

"class" holds the name of the data type, "environment" holds a set of key/value pairs. The elements in the environment must be added to the environment of the scope the request is executed in (read-configuration: configuration scope, add-to-cache: request scope).

18.4.1. FilterBackendError

Whenever something went wrong

{
	"class": "FilterBackendError",
	"environment": {
		"key1": "value1",
		"key2": "value2",
		...
	},
	"code": <int error code>,
	"message": <error message>
}
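Handling a response body from the HTTP interface might look like the sketch below: the JSON is parsed, a FilterBackendError is turned into an exception, and the supplied environment is merged into the scope the request was executed in. The exception class, the function name and the use of a plain dictionary as the scope are hypothetical.

```python
import json


class FilterBackendError(Exception):
    """Raised when the unblu server answers with a FilterBackendError object."""

    def __init__(self, code, message):
        super().__init__("unblu error %s: %s" % (code, message))
        self.code = code
        self.message = message


def handle_backend_response(body_text, scope):
    """Parse a response and merge its environment into the given scope."""
    data = json.loads(body_text)
    if data.get("class") == "FilterBackendError":
        raise FilterBackendError(data.get("code"), data.get("message"))
    for key, value in data.get("environment", {}).items():
        # Variable names are case insensitive, so keys are normalized.
        scope[key.upper()] = value
    return data
```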

 

18.4.2. CacheContentId

In responses to "add-to-cache"

 

{
	"class": "CacheContentId",
	"environment": {
		"key1": "value1",
		"key2": "value2",
		...
	},
	"contentId": <string id of the cache content>
}

 

18.4.3. FilterConfiguration

In responses to "read-configuration"

{
	"class": "FilterConfiguration",
	"environment": {
		"key1": "value1",
		"key2": "value2",
		...
	},
	"version": <string version of the configuration>,
	"cacheCondition": <Rule that must be evaluated in order to decide if a response must be added to the cache>,
	"codeInjections": <array of ConditionalCodeInjection objects>
}

 

18.4.4. ConditionalRegexReplacement

A regular expression based string replacement that must be performed if the condition evaluates to true in the request scope

{
	"class": "ConditionalRegexReplacement",
	"condition": <Rule, if it evaluates to true, the replacement must be performed>,
	"pattern": <the regex pattern>,
	"replacement": <the string that must be inserted as a replacement for all matches>
}

 

18.4.5. ConditionalCodeInjection

A set of code injections that must be performed if a given condition (rule) evaluates to true

{
	"class": "ConditionalCodeInjection",
	"condition": <Rule that decides if the injection takes place for the current request>,
	"injections": <array of CodeInjection objects>
}

 

18.4.6. CodeInjection

A code injection

{
	"class": "CodeInjection",
	"reference": <BEFORE_BODY_CLOSE|AFTER_HEAD_START|BEFORE_HEAD_CLOSE|AFTER_LAST_META>, // place in the HTML where the injection must go
	"type": <INTERNAL_JAVASCRIPT|EXTERNAL_JAVASCRIPT|INTERNAL_STYLE_SHEET|EXTERNAL_STYLE_SHEET|HTML_CONTENT>, // type of the injection
	"value": <string value of the injection, must be expanded within the request scope>
}

 

18.4.7. ComparisonRule

{
	"class": "ComparisonRule",
	"leftSide": <leftSide string value>,
	"rightSide": <rightSide string value>,
	"operator": <equals|startsWith|endsWith|contains|=|<|<=|>|>=>,
	"caseSensitive": <boolean>
}

 

18.4.8. AndRule

{
	"class": "AndRule",
	"rules": <array of Rules>
}

 

 

18.4.9. OrRule

{
	"class": "OrRule",
	"rules": <array of Rules>
}

 

18.4.10. NotRule

{
	"class": "NotRule",
	"rule": <Rule>
}

 

19. Request Processing

This section describes the actual work a filter has to do. A filter implementation must intercept all requests (whether as a module in the HTTP server or as a proxy). For every request the following steps need to be performed (pseudo code)

if (path startsWith ${UNBLU_PUBLIC_PATH}) {
	proxy request to unblu
} else {
	let backend process request
	build request scope environment variables
	perform regex replacements
	if (cacheCondition evaluates to true) {
		send response to unblu HTTP filter interface "add-to-cache" action
		add returned environment variables to request scope
		set ${UNBLU_USER_COOKIE_NAME} "set-cookie" header if ${UNBLU_USER_ID} is set
		set ${UNBLU_SESSION_COOKIE_NAME} "set-cookie" header if ${UNBLU_USER_SESSION_ID} is set
	}
	perform code injections if their conditions evaluate to true in request scope environment
	if (a code injection took place) {
		clear "etag" header in response
		clear "last-modified" header in response
	}
	send (potentially modified) response body to the browser
}

 

 

If the unblu server is (temporarily) not available, the filter must not disturb the normal operation of the backend HTTP server. To achieve this, it might make sense to remember the state of the unblu server within the filter. This way, running into a timeout on every request can be avoided.

This could be done as follows:

As soon as a call from the filter to the unblu server fails once, the filter starts a background thread that waits for unblu to come up again. Until unblu is up again, the filter does not attempt to send any further requests to unblu. As soon as unblu is up again, the filter resumes sending requests to it. In order to check whether unblu is up, the "ping" action of the HTTP interface can be used.
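One possible shape of this availability tracking is sketched below. The class name, the retry interval and the ping callable are hypothetical; ping is assumed to return True once the "ping" action succeeds and False otherwise, without raising.

```python
import threading
import time


class UnbluAvailability:
    """Remembers whether unblu is reachable and polls it while it is down."""

    def __init__(self, ping, retry_interval=5.0):
        self._ping = ping
        self._retry_interval = retry_interval
        self._up = threading.Event()
        self._up.set()
        self._lock = threading.Lock()
        self._watcher = None

    def is_up(self):
        return self._up.is_set()

    def wait_until_up(self, timeout=None):
        return self._up.wait(timeout)

    def report_failure(self):
        """Call this when a request to unblu failed or timed out."""
        with self._lock:
            self._up.clear()
            if self._watcher is None or not self._watcher.is_alive():
                self._watcher = threading.Thread(
                    target=self._wait_for_unblu, daemon=True)
                self._watcher.start()

    def _wait_for_unblu(self):
        # Background loop: poll until the ping succeeds, then mark unblu as up.
        while not self._ping():
            time.sleep(self._retry_interval)
        self._up.set()
```

During request processing, the filter would check is_up() before caching or injecting and simply pass the backend response through while unblu is down.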

20. Lifecycle

This section describes the life cycle of a filter.

20.1. Startup

When the filter starts up, it must load its initial configuration from the server by sending a "read-configuration" request to the unblu filter HTTP interface. The filter must store the version of the configuration for later comparison with the server's current configuration version.

20.2. Reconfiguration

Every time the filter sends a request to the unblu server (whether it calls the HTTP interface or forwards a request to unblu), it must compare the value of the x-unblu-configuration-version header with the version of the configuration it received when it last loaded the configuration from the server. If the versions do not match, the filter must reload the configuration and apply it.
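The version check could be sketched as follows. The class and method names are hypothetical, and load_configuration stands in for a callable that performs the "read-configuration" action and returns the parsed FilterConfiguration.

```python
class ConfigurationHolder:
    """Keeps the current configuration and reloads it when the version changes."""

    def __init__(self, load_configuration):
        self._load = load_configuration
        self.configuration = None
        self.version = ""  # empty string until a configuration has been loaded

    def reload(self):
        self.configuration = self._load()
        self.version = self.configuration["version"]

    def observe_response_headers(self, headers):
        """Call this with the headers of every response received from unblu."""
        announced = None
        for name, value in headers.items():
            if name.lower() == "x-unblu-configuration-version":
                announced = value
        if announced is not None and announced != self.version:
            self.reload()
```

At startup the filter calls reload() once; afterwards every response from unblu is passed to observe_response_headers().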

21. Development

When developing a filter implementation, a local Tomcat running an unblu server can be used as the filter backend.
To deploy unblu to a local Tomcat, it makes sense to rename the original product.<productId><version><qualifier>.war (e.g. product.com.unblu.review-2.2.0-2011090191106.war) to ROOT.war. After renaming, it can simply be copied into the webapp directory of the Tomcat installation. If autodeploy is enabled in Tomcat, unblu will be available at <tomcatSchema>://<tomcatHost>:<tomcatport>/unblu/ (e.g. http://localhost:8080/unblu/). The UNBLU_URL is http://localhost:8080/unblu in this case.
