Resource history

Resource history and SecureFlow Manager

The resource history is related to the SecureFlow Manager (SFM). One of the main purposes of the SFM is to capture the resources on a web page sent from your organization’s backend and transfer them to the Collaboration Server. From there, the resources are transferred to the Agent Desk. The page’s Document Object Model (DOM), on the other hand, is captured by the JavaScript in the visitor’s browser and isn’t transferred to the resource history.

When a co-browsing session starts and the visitor’s browser requests images or other resources from your organization’s backend, the SFM is triggered as soon as the response arrives at the proxy. So, while the backend is supplying resources to the visitor’s browser, the SFM is duplicating those resources and uploading them to the Collaboration Server.

The default behavior is that the DOM and image data are stored in temporary memory during a session. They aren’t stored permanently. You can change this behavior with the configuration property com.unblu.server.storage.blob.nodeStoreType.

Without the SFM in your proxy’s filter chain, agents can’t view session-specific data. The absence of the SFM doesn’t break the session, but content generated specifically for that session, such as images, isn’t visible to the agent. Resources that are only accessible with valid authentication also aren’t displayed to agents.

The SFM may also be required if the agent’s browser can’t access images or stylesheets from their location on your organization’s network.

Resource types

Typical resource types are binary resources like images, documents (PDF, DOC, and so on), or multimedia content such as audio or video. Textual resources such as style sheet (CSS) files, and even HTML, can also be classed as resources. There are some exceptions to this rule. The following styles are typically contained directly in a visual and then transformed into resources on the server:

Styles in HTML attributes
Styles in <style> tags

Resource storage and access

An Unblu resource actually consists of two parts:

The resource object, which contains metadata about the resource.
The resource data, which is stored in a blob (binary large object).

Resource object

The resource object provides the following information:

The UUID representing the resource
The URI of the original resource
The MIME type
The charset (if textual resource)
The state: PENDING, REQUESTING, REQUESTED, INVALID, MATERIALIZED, DELIVERABLE
The origin: CSS_PROPERTY, CSS_IMPORT, STYLE_ATTRIBUTE, TAG, OTHER
The reference to the blob containing the actual data of the resource
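
The fields listed above can be pictured as a small record type. The following Python sketch is only an illustration: the class and attribute names are assumptions made for this example and aren't taken from the Unblu code base. Only the state and origin values come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class ResourceState(Enum):
    PENDING = "PENDING"
    REQUESTING = "REQUESTING"
    REQUESTED = "REQUESTED"
    INVALID = "INVALID"
    MATERIALIZED = "MATERIALIZED"
    DELIVERABLE = "DELIVERABLE"


class ResourceOrigin(Enum):
    CSS_PROPERTY = "CSS_PROPERTY"
    CSS_IMPORT = "CSS_IMPORT"
    STYLE_ATTRIBUTE = "STYLE_ATTRIBUTE"
    TAG = "TAG"
    OTHER = "OTHER"


@dataclass
class ResourceObject:
    """Metadata about a resource (hypothetical field names)."""
    uuid: UUID                # UUID representing the resource
    original_uri: str         # URI of the original resource
    mime_type: str
    charset: Optional[str]    # only set for textual resources
    state: ResourceState
    origin: ResourceOrigin
    blob_id: UUID             # reference to the blob holding the data


# Example: an image referenced from a tag that hasn't arrived yet.
logo = ResourceObject(
    uuid=uuid4(),
    original_uri="https://www.example.com/img/logo.png",
    mime_type="image/png",
    charset=None,
    state=ResourceState.PENDING,
    origin=ResourceOrigin.TAG,
    blob_id=uuid4(),
)
```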

Blob (resource data)

A blob contains the actual data of the resource. There are different kinds of blobs:

Basic blob
Typed blob
Dummy blob, marked with a specific ID d2874f28-96e3-416b-bfed-ecb932b064fa

The purpose of the dummy blob is to indicate that there is no data available. It’s used in resources where the blob isn’t available yet.

Basic blob

Information provided by basic blobs:

Checksum (currently a CRC32 checksum)
Length (in bytes)
Creation date
Binary data

Typed blob

Typed blobs provide the same information as basic blobs, plus:

ID
MIME type
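
To make the relationship between the two blob kinds concrete, here is a minimal Python sketch. The class and attribute names are hypothetical and chosen for this example only. The CRC32 checksum and the dummy blob ID are taken from the text above, and the default MIME type is an assumption.

```python
import zlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import UUID, uuid4

# ID of the dummy blob, as stated in the text above.
DUMMY_BLOB_ID = UUID("d2874f28-96e3-416b-bfed-ecb932b064fa")


@dataclass(frozen=True)
class BasicBlob:
    data: bytes
    created: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def checksum(self) -> int:
        # CRC32 over the binary data.
        return zlib.crc32(self.data)

    @property
    def length(self) -> int:
        # Length in bytes.
        return len(self.data)


@dataclass(frozen=True)
class TypedBlob(BasicBlob):
    # A typed blob adds an ID and a MIME type to the basic blob.
    blob_id: UUID = field(default_factory=uuid4)
    mime_type: str = "application/octet-stream"  # assumed default


blob = TypedBlob(data=b"hello", mime_type="text/plain")
```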

Resource storage

Resource storage is split into three areas:

The resource store
The blob store
The resource table

The first two are used to store the respective parts of a resource (see above). The resource table is used to resolve a resource when its backend URI is known but its UUID isn’t.

By default, all three stores are located in memory. The resource store and the resource table have session scope, that is, their content is dropped when the session ends. The scope of the blob store, on the other hand, is global: a resource used in multiple sessions is only stored in memory once.
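
The difference between session scope and global scope can be sketched as follows. This is a simplified in-memory model, not the actual implementation. All class and method names are hypothetical, and deduplicating blobs by a SHA-256 digest of their content is an assumption made for the example.

```python
import hashlib
from typing import Optional
from uuid import UUID, uuid4


class BlobStore:
    """Global scope: one instance shared by all sessions."""

    def __init__(self) -> None:
        self._id_by_digest: dict[str, UUID] = {}
        self._data_by_id: dict[UUID, bytes] = {}

    def put(self, data: bytes) -> UUID:
        # Identical content is stored once (assumed: keyed by digest).
        digest = hashlib.sha256(data).hexdigest()
        blob_id = self._id_by_digest.get(digest)
        if blob_id is None:
            blob_id = uuid4()
            self._id_by_digest[digest] = blob_id
            self._data_by_id[blob_id] = data
        return blob_id

    def get(self, blob_id: UUID) -> Optional[bytes]:
        return self._data_by_id.get(blob_id)

    def __len__(self) -> int:
        return len(self._data_by_id)


class SessionStores:
    """Session scope: resource store and resource table."""

    def __init__(self, blobs: BlobStore) -> None:
        self._blobs = blobs
        self.resources: dict[UUID, UUID] = {}      # resource UUID -> blob UUID
        self.resource_table: dict[str, UUID] = {}  # backend URI -> resource UUID

    def add(self, backend_uri: str, data: bytes) -> UUID:
        resource_id = uuid4()
        self.resources[resource_id] = self._blobs.put(data)
        self.resource_table[backend_uri] = resource_id
        return resource_id

    def find_by_uri(self, backend_uri: str) -> Optional[UUID]:
        return self.resource_table.get(backend_uri)

    def close(self) -> None:
        # Session content is dropped; the global blob store is untouched.
        self.resources.clear()
        self.resource_table.clear()


blobs = BlobStore()
session_a = SessionStores(blobs)
session_b = SessionStores(blobs)
session_a.add("https://www.example.com/img/logo.png", b"logo-bytes")
rid_b = session_b.add("https://www.example.com/img/logo-copy.png", b"logo-bytes")
session_a.close()
```

In this model, both sessions reference the same blob, and closing one session leaves the blob available to the other.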

Resource request URI

When the resource history is enabled, the URIs in a visual are transformed to point to the Collaboration Server. They are composed as follows:

http://unbluserver/<restricted-path-prefix>/player/resource/res/<blob-UUID>#<resource-UUID>

The resource-UUID fragment of the URI (the part after the #) is never sent to the Collaboration Server when the agent’s browser requests a resource. It’s only used in the browser.

The resource UUID is included in the URI because it’s appended, for example, to URIs contained in a CSS file. When the CSS is parsed on the server, the resource UUID can be extracted and the relevant information (especially inbound and outbound references) retrieved and processed. When the agent’s browser later requests the resource, the request doesn’t include the resource UUID.
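
The following Python sketch illustrates how such a URI splits into the part that is sent to the server and the fragment that stays in the browser. The function names are hypothetical, and the string restricted-path-prefix stands in for the real prefix, which isn’t specified here.

```python
from typing import Optional, Tuple
from urllib.parse import urldefrag
from uuid import UUID, uuid4


def build_resource_uri(base: str, blob_id: UUID, resource_id: UUID) -> str:
    # base is assumed to include the host and the restricted path prefix.
    return f"{base}/player/resource/res/{blob_id}#{resource_id}"


def split_resource_uri(uri: str) -> Tuple[str, UUID, Optional[UUID]]:
    # The fragment (after '#') is never part of the HTTP request.
    request_uri, fragment = urldefrag(uri)
    blob_id = UUID(request_uri.rsplit("/", 1)[-1])
    resource_id = UUID(fragment) if fragment else None
    return request_uri, blob_id, resource_id


blob_id, resource_id = uuid4(), uuid4()
uri = build_resource_uri(
    "http://unbluserver/restricted-path-prefix", blob_id, resource_id
)
request_uri, parsed_blob, parsed_resource = split_resource_uri(uri)
```

Here, request_uri is what a browser would actually request: it identifies only the blob, which is why identical content can be cached once.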

The fact that it’s actually the blob being retrieved and not the surrounding resource is important. Since blobs are stored in global scope, the agent’s browser only has to retrieve them once. It can then cache them locally. If there are many resources with the same content, then caching is highly effective.

In extreme cases, the agent’s browser can even load a page faster than the visitor’s browser. This can happen if the original webpage contains hundreds of images with differing URIs but always the same data. In the agent browser, all those resources would have the same blob, resulting in the same URI, and thus would be retrieved only once.

How to enable resource history

To enable the resource history, set com.unblu.visual.resourcehistory.enabled to true.

All the configuration properties related to the resource history can be found in the configuration properties reference.

Behavior with the resource history enabled

When the resource history is enabled, the agent’s browser only interacts with the Collaboration Server. It doesn’t request resources from anywhere else.

This provides a high level of security. If the visitor’s browser sends a URI for a resource that hasn’t been added to the resource history, the Collaboration Server has no such resource and returns a 404 Not Found response, so the agent’s browser can’t resolve that resource.
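
As a rough illustration of this behavior, the lookup can be thought of as follows. The function is hypothetical, and the dictionary stands in for the blob store.

```python
from http import HTTPStatus
from typing import Dict, Tuple
from uuid import UUID, uuid4


def serve_blob(blob_store: Dict[UUID, bytes], blob_id: UUID) -> Tuple[HTTPStatus, bytes]:
    # Only blobs that were added to the resource history can be served.
    data = blob_store.get(blob_id)
    if data is None:
        return HTTPStatus.NOT_FOUND, b""
    return HTTPStatus.OK, data


known_id = uuid4()
store = {known_id: b"image-bytes"}
found = serve_blob(store, known_id)
missing = serve_blob(store, uuid4())
```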

Browser caching

When the resource history is enabled, resources are transferred to Unblu from, for example, a reverse proxy that all traffic to the end user passes through. Unblu only monitors traffic once a session has started. An end user may therefore browse the website before the session starts, retrieve images and stylesheets, and have them stored in their browser cache from that moment on. Once the session starts, those cached resources no longer pass through the reverse proxy and thus aren’t transferred to Unblu. Unblu detects this and sends the visitor’s browser commands to re-request the files. However, this can take some time, and performance is typically slower with the resource history enabled than without.

Behavior with the resource history disabled

When the resource history is disabled, resource URIs in visuals are transferred from the visitor’s browser to the agent’s browser unchanged. The resources aren’t transferred to the server or stored. Instead, the agent’s browser requests resources directly from the original backend web server.

The Collaboration Server checks all the URIs requested by the visitor’s browser and only transmits them to the agent’s browser if they point to a resource on the backend web server. The agent’s browser won’t request URIs that point to resources on an unknown web server.
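
A check of this kind could look like the sketch below. It assumes that the backend web server is identified by a set of allowed host names. That is an assumption made for this example, and how the Collaboration Server actually determines the backend isn’t described here.

```python
from typing import Iterable, List, Set
from urllib.parse import urlsplit


def filter_uris(uris: Iterable[str], allowed_hosts: Set[str]) -> List[str]:
    """Keep only http(s) URIs whose host is a known backend host."""
    allowed = {host.lower() for host in allowed_hosts}
    kept = []
    for uri in uris:
        parts = urlsplit(uri)
        # urlsplit() returns the host name in lowercase, or None.
        if parts.scheme in ("http", "https") and parts.hostname in allowed:
            kept.append(uri)
    return kept


forwarded = filter_uris(
    [
        "https://www.example.com/img/logo.png",
        "https://unknown.example.net/tracker.gif",
        "javascript:alert(1)",
    ],
    {"www.example.com"},
)
```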

Resource processing

Depending on the resource type, a resource is processed before being used, or is used in its original form. A typical processor is the CSS processor.

The main purpose of the CSS processor is to identify resource references within the CSS and replace them with an Unblu resource URI. The Collaboration Server features two kinds of CSS processors:

Full CSS parser-based processor

Used when the resource history is turned on. It simulates a CSS parser as present in the browser and filters the CSS. The full CSS parser not only scans for URIs, it also drops unknown, unsafe, or problematic CSS rules.

Simple regex-based processor

Only scans for URIs and verifies and replaces them.
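
To give an idea of what a regex-based processor does, here is a minimal Python sketch. It isn’t the actual Unblu processor. It only handles url(...) references, the resolve callback is hypothetical, and replacing unverified URIs with about:invalid is an assumption made for the example.

```python
import re
from typing import Callable, Optional

# Matches url(...), with optional single or double quotes around the URI.
URL_PATTERN = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def rewrite_css_uris(css: str, resolve: Callable[[str], Optional[str]]) -> str:
    """Replace every url(...) reference with the URI returned by resolve().

    resolve() returns None for URIs that fail verification.
    """

    def replace(match: "re.Match[str]") -> str:
        replacement = resolve(match.group(2))
        if replacement is None:
            return 'url("about:invalid")'  # assumed handling
        return f'url("{replacement}")'

    return URL_PATTERN.sub(replace, css)


known = {"/img/bg.png": "/res/blob-1"}
css_in = 'body { background: url("/img/bg.png"); } .a { background: url(http://other/x.png) }'
css_out = rewrite_css_uris(css_in, known.get)
```

A regular expression like this can’t tell a real reference from one inside a comment or string, which is one reason a full parser is the more robust choice.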

Dependency processing

Resources can have the following states:

PENDING: The Collaboration Server has seen a resource with a certain URI, but it isn’t present in the resource store. It’s expected to arrive on the server later on.
REQUESTING: A resource is supposed to have arrived on the server but hasn’t arrived so far. It’s requested from the client again.
REQUESTED: A missing resource has been requested.
INVALID: A missing resource is invalid when requesting it results in any HTTP status code other than 200.
MATERIALIZED: A resource has arrived but may need processing.
DELIVERABLE: A resource has arrived and has been processed. It’s ready for delivery.

Changes to a resource’s state entail changes in its data, that is, the related blob. If the blob changes, the URIs in visuals (and possibly dependent resources such as CSS files) need to be updated. Therefore, a change in state always leads to a cascade of resource updates.
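
The states can be read as a small state machine. The transitions in the following Python sketch are inferred from the descriptions above and aren’t a documented part of the product, so treat them as an assumption.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    REQUESTING = "REQUESTING"
    REQUESTED = "REQUESTED"
    INVALID = "INVALID"
    MATERIALIZED = "MATERIALIZED"
    DELIVERABLE = "DELIVERABLE"


# Assumed transitions, inferred from the state descriptions.
ALLOWED = {
    State.PENDING: {State.MATERIALIZED, State.REQUESTING},
    State.REQUESTING: {State.REQUESTED},
    State.REQUESTED: {State.MATERIALIZED, State.INVALID},
    State.MATERIALIZED: {State.DELIVERABLE},
    State.INVALID: set(),
    State.DELIVERABLE: set(),
}


def advance(current: State, new: State) -> State:
    """Return the new state, or raise if the transition isn't allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


# A resource that arrives on its own and is then processed.
state = advance(State.PENDING, State.MATERIALIZED)
state = advance(state, State.DELIVERABLE)
```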

Resources can have incoming (inbound) or outgoing (outbound) references.

For example, a CSS file contains a URI to a background image. When the referenced image arrives at the server (outbound reference), the CSS is reparsed to replace the placeholder URI with the new, valid one pointing to the background image in the resource store. (The placeholder exists because the CSS was already parsed once, and all missing resources were collected at that point.) Because the CSS is reparsed, a new entry is generated in the resource store. This triggers the replacement of the previous <link> tag URI with the new one from the reparsed CSS (inbound reference).
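
The cascade in this example can be sketched in a few lines of Python. The sketch is a simplified model with hypothetical names. It assumes that a blob URI is derived from the blob’s content and that about:invalid serves as the placeholder, neither of which is confirmed by the text above.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List

PLACEHOLDER = "about:invalid"  # assumed placeholder URI


@dataclass
class Stylesheet:
    source: str          # original CSS text
    outbound: List[str]  # original URIs the CSS refers to


def blob_uri(data: bytes) -> str:
    # Assumption: the URI changes whenever the blob content changes.
    return "/res/" + hashlib.sha256(data).hexdigest()[:12]


def render(sheet: Stylesheet, arrived: Dict[str, str]) -> str:
    """Replace outbound references with blob URIs or the placeholder."""
    css = sheet.source
    for uri in sheet.outbound:
        css = css.replace(uri, arrived.get(uri, PLACEHOLDER))
    return css


sheet = Stylesheet("body{background:url(/img/bg.png)}", ["/img/bg.png"])
arrived: Dict[str, str] = {}

# Initial parse: the image is missing, so a placeholder is used.
first_css = render(sheet, arrived)
link_href = blob_uri(first_css.encode())

# The image arrives (outbound reference), so the CSS is reparsed.
arrived["/img/bg.png"] = blob_uri(b"image-bytes")
second_css = render(sheet, arrived)

# The reparsed CSS is a new blob, so the <link> URI must be replaced
# (inbound reference).
new_link_href = blob_uri(second_css.encode())
```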
