Architecture

Here I will describe how RDDR is structured and link to some of the classes that make it up.

Overview

alternate text

RDDR Deployment Block Diagram

Above is a simplified view of an RDDR deployment. RDDR sits on either side of a set of instances of the same application. Incoming requests to the proxy are replicated to all application instances, and their responses are diffed and merged before being sent back. Any outgoing requests from the application instances pass through an outgoing proxy which diffs and merges the requests before passing the request on to the target microservice. The response back is then replicated to all app instances. If the application relies on multiple separate backend services, an outgoing request proxy should be created for each. The entire RDDR deployment is placed behind a production-grade Envoy proxy.

RDDR shines when the application instances differ from one another. Subtle variations in the application that do not change nominal behavior can help to catch bugs and prevent exploitation of them.

An ideal deployment will also consist of two copies of the application that are identical. These form the “filter pair” which helps RDDR to distinguish between random noise and real bugs. Because the filter pair are known to be identical, any differences in their behavior will be ignored across the entire set. See Filtering Non-Deterministic Noise for more information.

Running RDDR

RDDR is packaged as a Python module with a main method.

rddr.__main__.main()

Entrypoint for the RDDR proxy. Parses the config file and starts RDDR.

It should be executed like so once installed:

python -m rddr [-c path/to/config.yaml]

RDDR accepts the following arguments:

usage: python -m rddr [-h] [-c CONFIG]

optional arguments:
-h, --help            show this help message and exit
-c CONFIG, --config CONFIG

The main method parses the config file and starts the top level RDDR class, shown below:

class rddr.rddr.Rddr(config)

Bases: object

Top level class for Rddr. Encapsulates one incoming proxy and one or more outgoing proxy.

Parameters

config (dict) – RDDR configuration dictionary

run()

Endless loop. Calls run() on all proxies configured in separate threads.

Proxies

RDDR implements separate proxies for incoming versus outgoing requests. Both of these proxy classes share the RddrProxy parent class.

class rddr.proxies.proxy.RddrProxy(mp_manager, config)

Bases: abc.ABC

This is an abstract class, a parent to the incoming and outgoing proxies used by RDDR.

Proxies are built on the asyncio library in Python 3.8. This framework was found to be faster and cleaner than the prior state machine-based implementations.

Parameters

config (dict) – Dictionary of user config provided to RDDR at the command line

Incoming Proxy

The RddrIncomingProxy class implements one proxy between a client and N variants of a server.

class rddr.proxies.incoming_proxy.RddrIncomingProxy(mp_manager, config)

Bases: rddr.proxies.proxy.RddrProxy

Implements an incoming proxy for RDDR. Replicates incoming requests to N applications and diffs their responses before forwarding their response.

Parameters

config (dict) – Dictionary of user config provided to RDDR at the command line

init_server()

Start an asyncio server for this proxy. Passes the _new_client member method as the new client callback.

Outgoing Proxy

The RddrOutgoingProxy class implements one proxy between N variants of an application and one server they want to query.

class rddr.proxies.outgoing_proxy.RddrOutgoingProxy(mp_manager, config, dest)

Bases: rddr.proxies.proxy.RddrProxy

Implements an outgoing proxy for RDDR. Merges outgoing requests from N applications to some other microservice and replicates the response back.

Parameters
  • config (dict) – Dictionary of user config provided to RDDR at the command line

  • dest (str) – Destination address where incoming requests will be forwarded. String format: “<HOST>:<PORT>”

init_server()

Start an asyncio server for this proxy. Passes the _new_client member method as the new client callback.

Support for Different Transport Protocols

We intend RDDR to support a variety of transport protocols. The latest version of RDDR supports:

  • Unencrypted TCP

  • SSL

Each proxy (incoming and outgoing) can be configured for a different transport protocol. This is useful in the cloud, where applications can be stitched from many smaller microservices that all speak different protocols. See Configuration for more on configuring the protocol.

class rddr.protocols.protocol.RddrProtocol

Bases: abc.ABC

Abstract class to be used as a base for concrete protocols.

abstract create_server(host, port)

Coroutine. Creates a server socket.

Parameters
  • host (str) – Hostname/IP to bind server to

  • port (int) – Port to bind server to

Return type

None

get_stream_addr(stream)

Given an asyncio stream (either StreamReader or StreamWriter) returns a string representing the host and port of the party on the other end of the streams. Format: “<host>:<port>” May return None if address cannot be retrieved.

Parameters

stream (Union[StreamReader, StreamWriter]) – StreamReader or StreamWriter used to communicate with remote party

Return type

Optional[str]

abstract async open_connection(host, port)

Abstract coroutine. Opens a connection to host:port. Returns (StreamReader, StreamWriter)

Parameters
  • host (str) – Host to connect to

  • port (int) – Port to connect to

Return type

Tuple[StreamReader, StreamWriter]

TCP

class rddr.protocols.protocol_tcp.RddrProtocolTcp

Bases: rddr.protocols.protocol.RddrProtocol

Support for unencrypted TCP.

create_server(host, port)

Coroutine. Creates a server socket.

Parameters
  • host (str) – Hostname/IP to bind server to

  • port (int) – Port to bind server to

Return type

None

async open_connection(host, port)

Creates a new unencrypted socket for forwarding data. Returns (StreamReader, StreamWriter).

Parameters
  • host (str) – Host to connect to

  • port (int) – Port to connect to

Return type

Tuple[StreamReader, StreamWriter]

SSL

class rddr.protocols.protocol_ssl.RddrProtocolSsl(cert='certs/clientcert.pem', key='certs/clientkey.pem')

Bases: rddr.protocols.protocol.RddrProtocol

Support for SSL on top of TCP

Parameters
  • cert (str) – Path to the certificate file

  • key (str) – Path to the key file

create_server(host, port)

Coroutine. Creates a server socket with SSL context.

Parameters
  • host (str) – Hostname/IP to bind server to

  • port (int) – Port to bind server to

Return type

None

async open_connection(host, port)

Creates a new socket and wraps it in an encrypted session for forwarding data. Returns (StreamReader, StreamWriter).

Parameters
  • host (str) – Host to connect to

  • port (int) – Port to connect to

Return type

Tuple[StreamReader, StreamWriter]

Filtering Non-Deterministic Noise

You can deploy two identical app instances which will help to filter any non-deterministic noise in the system. See the filter parameter in Configuration.

Consider a query that fetches a random number from the application. Every instance will generate a different random number. Without filtering, the incoming proxy would flag this divergence as a potential bug. However, if the proxy sees that the two identical apps also differ in their responses, we can safely ignore this region of the response and in doing so we’ve filtered out the non-deterministic noise.

Support for Diffing Various Application Data

Filtering non-deterministic noise as described above requires the proxies to do a minimal amount of parsing of the data. For example, if the data is in JSON format, we may want ignore certain non-deterministic keys of the data structure. For a text file, we may instead want to ignore certain lines. We need a way to tokenize the data being transferred so that we can ignore particular tokens. Since the tokenizing algorithm is likely to vary across applications, we have defined a simple interface for others to extend.

Interface Specification

Simply implement a class that inherits from AbstractRddrDiff and implement the functions diff_traffic, modify_traffic, and optionally render_denial and validate_params:

class rddr.AbstractRddrDiff(mp_manager, shared_state, do_filter=False, logger=None, params=None)

Bases: abc.ABC

Defines the interface for all RDDR diff plugins. Users may extend this class to add support for a particular protocol to RDDR. Diff plugins may optionally specify configuration parameters that a user may provide through the config YAML file. The diff-params key of the YAML file is reserved on each proxy for use by the diff plugin applied to that proxy. The schema expected by the plugin should be well-specified. Diff plugins should implement validate_params to validate the schema of the user-provided diff-params.

Parameters
  • do_filter (bool) – If True, will use the first two traffic streams as a filter pair to filter out non-deterministic noise.

  • logger (Optional[Logger]) – The logger instance to use for printing messages.

  • params (Optional[dict]) – Miscellaneous user-provided config for the plugin, from the user’s YAML config file. Subclasses should define clearly what they expect to be passed as parameters.

diff_traffic(traffic)

Diffs the traffic from N instances. Also indicates how many bytes of each traffic stream has been processed and whether or not more bytes of the stream are needed to process it. This default implementation will never detect divergence, always processes the entire stream and never requests more bytes. Subclasses may raise the RddrInsufficientData exception if diff_traffic was called on partial data (i.e. more data is required from the instances to make a decision). The proxy tunnel will handle this exception by reading from the instances once more before calling diff_traffic again.

Parameters

traffic (List[bytes]) – List of bytestrings from N instances.

Return type

List[Tuple[int, bool]]

Returns

A list of 2-tuples, one tuple for each traffic stream provided through the “traffic” argument. Each tuple is of the form (int, bool). The first element of the tuple is the number of bytes of that stream that have been differenced and can safely be sent along to the client. If this value is zero, no bytes have yet been parsed. If this value is less than zero, then the streams differ from one another, and the traffic SHOULD NOT be forward to the client. The second element of the tuple is a flag indicating whether or not more bytes are required from the traffic source in order to parse this stream. This is useful if the plugin tokenizes the streams and has to this point received a partial token and requires more bytes to fully difference everything.

modify_traffic(traffic, n_instances)

This function replicates one incoming stream into N for each of the N application variants. In the process, it may make modifications to the replica for each instance as necessary. This can be necessary if there are unique tokens that need to be substituted for each instance, as in the case of CSRF tokens in HTML forms. This default implementation makes no modifications to the traffic.

Parameters
  • traffic (bytes) – Request to modify per recipient in addrlist.

  • n_instances (int) – Number of app instances in this deployment

Return type

List[bytes]

Returns

List of the traffic to send to each of the app variants.

render_denial()

The diff interface can implement a custom error message appropriate for the application layer protocol being handled. An error message, for example. Default implementation returns empty byte string.

Return type

bytes

Returns

Bytestring to be sent back to the client if divergent behavior is seen.

validate_params()

Validates the diff-params key in the user config file. By default, does nothing.

We have packaged four classes which implement the interface for JSON, HTTP, raw bytes, and Postgres respectively. These are shown below:

JSON

class rddr_diff_builtins.RddrJsonDiff(mp_manager, shared_state, do_filter=False, logger=None, params=None)

Bases: rddr.diff_interface.AbstractRddrDiff

Diff tool for JSON documents that ships with RDDR. JSON is expected to be embedded in an HTTP response. Differences key by key. Does not modify incoming traffic.

Parameters
  • do_filter (bool) –

  • logger (Optional[Logger]) –

  • params (Optional[dict]) –

diff_traffic(traffic)

Parses JSON documents embedded in HTTP responses. May request more bytes of a given stream if a partial JSON document has been received and cannot yet be parsed. Differences key by key.

See interface definition rddr.AbstractRddrDiff.diff_traffic() for more.

Parameters

traffic (List[bytes]) – List of traffic from app instances.

Return type

List[Tuple[int, bool]]

render_denial()

Returns an HTTP response string containing a 500 error and an “access denied” message, with the RDDR logo. See static/denied.html for the content.

Return type

bytes

HTTP

class rddr_diff_builtins.RddrHttpDiff(mp_manager, shared_state, do_filter=False, logger=None, params=None)

Bases: rddr.diff_interface.AbstractRddrDiff

Diff tool for HTTP that ships with RDDR. Capable of handling CSRF tokens. N instances may generate form tokens or other per-instance tokens. Plugin will save these tokens and send one along to the client. Upon seeing the client’s token later, will substitute the token appropriate for each server.

diff_traffic(traffic)

Diffs HTML delimited by line breaks.

Upon encountering noise within a line (i.e. the filter pair differ), will extract the largest contiguous set of characters within the line that differ and save the value reported by each server. These tokens can be reinserted in a user’s subsequent requests on sight. The reinsertion is implemented by modify_traffic. This is necessary when an application being N-versioned uses anti-CSRF tokens in its user input forms. The proxy must send the appropriate token back to each instance of the application for it to service the user’s request.

See interface definition rddr.AbstractRddrDiff.diff_traffic() for more.

Parameters

traffic (List[bytes]) – List of traffic from app instances.

Return type

List[Tuple[int, bool]]

modify_traffic(traffic, n_instances)

Return a list of bytestrings, one to send to each application instance.

This method will re-insert any saved tokens it finds in the user’s traffic with the token originally sent by each instance. See diff_traffic for further explanation of the utility of this feature.

Parameters
  • traffic (bytes) – Request to modify per recipient in addrlist.

  • n_instances (int) – Number of app instances in this deployment

Return type

List[bytes]

render_denial()

Returns an HTTP response string containing a 500 error and an “access denied” message, with the RDDR logo. See static/denied.html for the content.

Return type

bytes

validate_params()

Validates the diff-params config field for this particular class.

Bytewise

class rddr_diff_builtins.RddrByteDiff(mp_manager, shared_state, do_filter=False, logger=None, params=None)

Bases: rddr.diff_interface.AbstractRddrDiff

Parameters
  • do_filter (bool) –

  • logger (Optional[Logger]) –

  • params (Optional[dict]) –

diff_traffic(traffic)

Validates that messages match byte for byte.

See interface definition rddr.AbstractRddrDiff.diff_traffic() for more.

Parameters

traffic (List[bytes]) – List of traffic from app instances. Key = instance address “host:port” Value = Bytes response

Return type

List[Tuple[int, bool]]

PostgreSQL

class rddr_diff_builtins.RddrPostgresDiff(mp_manager, shared_state, do_filter=False, logger=None, params=None)

Bases: rddr.diff_interface.AbstractRddrDiff

This class enables support for diffing Postgres traffic across N application instances. This diff plugin supports diff-params. diff-params should be a dictionary with one key: tokens. tokens is a list of lists of bytestrings, one bytestring per application instance. This allows you to preconfigure tokens you expect to be different among the Postgres instances. An example is the string reported for the server version – different variants will provide different strings. By specifying that here, you can avoid flagging that as divergent behavior.

Parameters
  • do_filter (bool) –

  • logger (Optional[Logger]) –

  • params (Optional[dict]) –

diff_traffic(traffic)

Validates that Postgres messages match. Ignores certain packet types. See member _backend_pkt_types_to_ignore for the full list of ignored packet types. Prior to diffing, will substitute tokens preconfigured in the config file under the diff-params key for the associated proxy.

See interface definition rddr.AbstractRddrDiff.diff_traffic() for more.

Parameters

traffic (List[bytes]) – List of traffic from app instances.

Return type

List[Tuple[int, bool]]

render_denial()

The diff interface can implement a custom error message appropriate for the application layer protocol being handled. An error message, for example. Default implementation returns empty byte string.

Return type

bytes

Returns

Bytestring to be sent back to the client if divergent behavior is seen.

validate_params()

Validates the diff-params config field for this particular class.

These classes each implement the functions diff_traffic and modify_traffic.

Users may write their own classes that implement the above functions to tailor RDDR’s filtering engine for their own application. Simply include the name of your class via the diff-class field of your config file.