Architecture
This page describes how RDDR is structured and links to some of the classes that make it up.
Overview
RDDR Deployment Block Diagram
Above is a simplified view of an RDDR deployment. RDDR sits on either side of a set of instances of the same application. Incoming requests to the proxy are replicated to all application instances, and their responses are diffed and merged before being sent back. Any outgoing requests from the application instances pass through an outgoing proxy which diffs and merges the requests before passing the request on to the target microservice. The response back is then replicated to all app instances. If the application relies on multiple separate backend services, an outgoing request proxy should be created for each. The entire RDDR deployment is placed behind a production-grade Envoy proxy.
RDDR shines when the application instances differ from one another. Subtle variations in the application that do not change nominal behavior can help to catch bugs and prevent exploitation of them.
An ideal deployment will also consist of two copies of the application that are identical. These form the “filter pair” which helps RDDR to distinguish between random noise and real bugs. Because the filter pair are known to be identical, any differences in their behavior will be ignored across the entire set. See Filtering Non-Deterministic Noise for more information.
Running RDDR
RDDR is packaged as a Python module with a main method.
-
rddr.__main__.main()
Entrypoint for the RDDR proxy. Parses the config file and starts RDDR.
It should be executed like so once installed:
python -m rddr [-c path/to/config.yaml]
RDDR accepts the following arguments:
usage: python -m rddr [-h] [-c CONFIG]
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
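The usage text above is what a standard argparse parser would print. The following is a minimal sketch under that assumption; the function name build_parser and the help string are illustrative and are not taken from RDDR's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented usage line."""
    parser = argparse.ArgumentParser(prog="python -m rddr")
    # Assumption: the config path is optional and has no documented default.
    parser.add_argument("-c", "--config", help="path to the YAML config file")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(["-c", "path/to/config.yaml"])
print(args.config)
```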
The main method parses the config file and starts the top level RDDR class, shown below:
Proxies
RDDR implements separate proxies for incoming versus outgoing requests.
Both of these proxy classes share the
RddrProxy parent class.
-
class rddr.proxies.proxy.RddrProxy(config)
Bases: abc.ABC
This is an abstract class, a parent to the incoming and outgoing proxies used by RDDR.
Proxies are built on the asyncio library in Python 3.8. This framework was found to be faster and cleaner than the prior state machine-based implementations.
- Parameters
config (dict) – Dictionary of user config provided to RDDR at the command line
Incoming Proxy
The RddrIncomingProxy
class implements one proxy between a client and N variants
of a server.
-
class rddr.proxies.incoming_proxy.RddrIncomingProxy(config)
Bases: rddr.proxies.proxy.RddrProxy
Implements an incoming proxy for RDDR. Replicates incoming requests to N applications and diffs their responses before forwarding a single response to the client.
- Parameters
config (dict) – Dictionary of user config provided to RDDR at the command line
-
async init_server()
Start an asyncio server for this proxy. Passes the _new_client member method as the new client callback.
-
async run()
Serves forever.
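The replicate-then-diff flow of an incoming proxy can be sketched with the asyncio streams API. This is an illustration only, not RDDR's implementation. The names replicate and make_new_client, the byte-for-byte comparison, and the framing (each side signals the end of its message by closing its write half) are all assumptions made to keep the example short.

```python
import asyncio
from typing import List, Tuple


async def replicate(request: bytes, backends: List[Tuple[str, int]]) -> List[bytes]:
    """Send the same request to every backend and collect one response each."""

    async def ask(host: str, port: int) -> bytes:
        reader, writer = await asyncio.open_connection(host, port)
        writer.write(request)
        await writer.drain()
        writer.write_eof()              # assumed framing: EOF ends the request
        response = await reader.read()  # read until the backend closes
        writer.close()
        await writer.wait_closed()
        return response

    return list(await asyncio.gather(*(ask(h, p) for h, p in backends)))


def make_new_client(backends: List[Tuple[str, int]], denial: bytes = b"denied"):
    """Build a callback with the (reader, writer) shape asyncio.start_server expects."""

    async def new_client(reader, writer):
        request = await reader.read()   # until the client sends EOF
        responses = await replicate(request, backends)
        # Simplest possible diff: every response must match byte for byte.
        consistent = all(r == responses[0] for r in responses)
        writer.write(responses[0] if consistent else denial)
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    return new_client
```

A server built from this callback would be started with asyncio.start_server(make_new_client(backends), host, port).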
Outgoing Proxy
The RddrOutgoingProxy
class implements one proxy between N variants of an application
and one server they want to query.
-
class rddr.proxies.outgoing_proxy.RddrOutgoingProxy(config, dest)
Bases: rddr.proxies.proxy.RddrProxy
Implements an outgoing proxy for RDDR. Merges outgoing requests from N applications to some other microservice and replicates the response back.
- Parameters
config (dict) – Dictionary of user config provided to RDDR at the command line
dest (str) – Destination address where incoming requests will be forwarded. String format: “<HOST>:<PORT>”
-
async init_server()
Start an asyncio server for this proxy. Passes the _new_client member method as the new client callback.
-
async run()
Serves forever.
Support for Different Transport Protocols
We intend RDDR to support a variety of transport protocols. The latest version of RDDR supports:
Unencrypted TCP
SSL
Each proxy (incoming and outgoing) can be configured for a different transport protocol. This is useful in the cloud, where applications can be stitched from many smaller microservices that all speak different protocols. See Configuration for more on configuring the protocol.
-
class rddr.protocols.protocol.RddrProtocol
Bases: abc.ABC
Abstract class to be used as a base for concrete protocols.
-
abstract async create_server(new_client_cb, host, port)
Coroutine. Creates a server using the asyncio interface.
- Parameters
new_client_cb (Callable[[StreamReader,StreamWriter],None]) – Callback, called when a new client connects. Receives instances of StreamReader and StreamWriter to communicate with the client.
host (str) – Hostname/IP to bind server to
port (int) – Port to bind server to
- Return type
None
-
get_stream_addr(stream)
Given an asyncio stream (either StreamReader or StreamWriter), returns a string representing the host and port of the party on the other end of the stream. Format: “<host>:<port>”. May return None if the address cannot be retrieved.
- Parameters
stream (Union[StreamReader,StreamWriter]) – StreamReader or StreamWriter used to communicate with remote party
- Return type
Optional[str]
-
abstract async open_connection(host, port)
Abstract coroutine. Opens a connection to host:port. Returns (StreamReader, StreamWriter)
- Parameters
host (str) – Host to connect to
port (int) – Port to connect to
- Return type
Tuple[StreamReader,StreamWriter]
TCP
-
class rddr.protocols.protocol_tcp.RddrProtocolTcp
Bases: rddr.protocols.protocol.RddrProtocol
Support for unencrypted TCP.
-
async create_server(new_client_cb, host, port)
Coroutine. Creates a server using the asyncio interface with no encryption.
- Parameters
new_client_cb (Callable[[StreamReader,StreamWriter],None]) – Callback, called when a new client connects. Receives instances of StreamReader and StreamWriter to communicate with the client.
host (str) – Hostname/IP to bind server to
port (int) – Port to bind server to
- Return type
None
-
async open_connection(host, port)
Creates a new unencrypted socket for forwarding data. Returns (StreamReader, StreamWriter).
- Parameters
host (str) – Host to connect to
port (int) – Port to connect to
- Return type
Tuple[StreamReader,StreamWriter]
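A protocol with this interface maps closely onto the asyncio streams API. The class below is a hypothetical stand-in rather than RddrProtocolTcp itself. It deviates from the documented signatures in two ways, both for the sake of a self-contained example: create_server returns the server object so the caller can close it, and get_stream_addr accepts only a StreamWriter.

```python
import asyncio
from typing import Optional, Tuple


class TcpProtocolSketch:
    """Hypothetical stand-in for a plain-TCP protocol class."""

    async def create_server(self, new_client_cb, host: str, port: int):
        # asyncio.start_server calls new_client_cb(reader, writer) per client.
        return await asyncio.start_server(new_client_cb, host, port)

    async def open_connection(
        self, host: str, port: int
    ) -> Tuple[asyncio.StreamReader, asyncio.StreamWriter]:
        return await asyncio.open_connection(host, port)

    def get_stream_addr(self, writer: asyncio.StreamWriter) -> Optional[str]:
        # "peername" is a standard transport extra-info key; it may be missing.
        peer = writer.get_extra_info("peername")
        return f"{peer[0]}:{peer[1]}" if peer else None
```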
SSL
-
class rddr.protocols.protocol_ssl.RddrProtocolSsl(cert='certs/clientcert.pem', key='certs/clientkey.pem')
Bases: rddr.protocols.protocol.RddrProtocol
Support for SSL on top of TCP.
- Parameters
cert (str) – Path to the certificate file
key (str) – Path to the key file
-
async create_server(new_client_cb, host, port)
Coroutine. Creates a server using the asyncio interface with SSL context.
- Parameters
new_client_cb (Callable[[StreamReader,StreamWriter],None]) – Callback, called when a new client connects. Receives instances of StreamReader and StreamWriter to communicate with the client.
host (str) – Hostname/IP to bind server to
port (int) – Port to bind server to
- Return type
None
-
async open_connection(host, port)
Creates a new socket and wraps it in an encrypted session for forwarding data. Returns (StreamReader, StreamWriter).
- Parameters
host (str) – Host to connect to
port (int) – Port to connect to
- Return type
Tuple[StreamReader,StreamWriter]
Filtering Non-Deterministic Noise
You can deploy two identical app instances which will help to
filter any non-deterministic noise in the system.
See the filter parameter in Configuration.
Consider a query that fetches a random number from the application. Every instance will generate a different random number. Without filtering, the incoming proxy would flag this divergence as a potential bug. However, if the proxy sees that the two identical apps also differ in their responses, it can safely ignore this region of the response, thereby filtering out the non-deterministic noise.
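The random-number example can be sketched as a line-level comparison. This is a simplified illustration that assumes the first two responses come from the filter pair and that noise is confined to whole lines. The function name consistent_modulo_noise is hypothetical, and RDDR's own plugins may tokenize differently.

```python
from typing import List


def consistent_modulo_noise(responses: List[bytes]) -> bool:
    """Compare responses line by line, skipping lines where the filter pair differ.

    Assumes responses[0] and responses[1] come from the identical filter pair.
    """
    lines = [r.split(b"\n") for r in responses]
    if len({len(parts) for parts in lines}) != 1:
        return False  # different line counts: treat as divergence

    # Lines on which the two identical instances disagree are noise.
    noisy = {i for i, (a, b) in enumerate(zip(lines[0], lines[1])) if a != b}

    for i in range(len(lines[0])):
        if i in noisy:
            continue
        if any(parts[i] != lines[0][i] for parts in lines):
            return False  # a variant differs where the filter pair agree
    return True
```

With responses such as b"id: 1\nrand: 17", b"id: 1\nrand: 93" and b"id: 1\nrand: 4", the second line is masked as noise and the set is reported as consistent.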
Support for Diffing Various Application Data
Filtering non-deterministic noise as described above requires the proxies to do a minimal amount of parsing of the data. For example, if the data is in JSON format, we may want to ignore certain non-deterministic keys of the data structure. For a text file, we may instead want to ignore certain lines. We need a way to tokenize the data being transferred so that we can ignore particular tokens. Since the tokenizing algorithm is likely to vary across applications, we have defined a simple interface for others to extend.
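For JSON, the tokens can be the keys of the document. The sketch below is an assumption about how such tokenizing could work, not a description of RddrJsonDiff. It handles only top-level JSON objects, treats the first two responses as the filter pair, and uses the hypothetical function name json_consistent.

```python
import json
from typing import List


def json_consistent(responses: List[bytes]) -> bool:
    """Compare top-level JSON objects, ignoring keys the filter pair disagree on."""
    docs = [json.loads(r) for r in responses]

    # Keys whose values differ within the filter pair are treated as noise.
    keys = set(docs[0]) | set(docs[1])
    noisy = {k for k in keys if docs[0].get(k) != docs[1].get(k)}

    # Drop the noisy keys everywhere, then require exact equality.
    stripped = [{k: v for k, v in d.items() if k not in noisy} for d in docs]
    return all(s == stripped[0] for s in stripped)
```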
Simply implement a class that inherits from
AbstractRddrDiff
and implement the functions diff_traffic,
modify_traffic, and optionally render_denial:
-
class rddr.AbstractRddrDiff(do_filter=False, logger=None, params=None)
Bases: abc.ABC
Defines the interface for all RDDR diff plugins. Users may extend this class to add support for a particular protocol to RDDR.
- Parameters
do_filter (bool) – If True, will use the first two traffic streams as a filter pair to filter out non-deterministic noise.
logger (Optional[Logger]) – The logger instance to use for printing messages.
params (Optional[dict]) – Miscellaneous parameters for the diff class. Subclasses should define clearly what they expect to be passed as parameters.
-
diff_traffic(traffic)
Diffs the traffic from N instances. Returns True if traffic is consistent. Returns False if traffic diverges. This default implementation always returns True. Subclasses may raise the RddrInsufficientData exception if diff_traffic was called on partial data (i.e. more data is required from the instances to make a decision). The proxy tunnel will handle this exception by reading from the instances once more before calling diff_traffic again.
- Parameters
traffic (List[bytes]) – List of bytestrings from N instances.
- Return type
bool
-
modify_traffic(traffic, n_instances)
Return a list of the traffic to send to each of the app variants. This default implementation makes no modifications to the traffic.
- Parameters
traffic (bytes) – Request to modify per recipient in addrlist.
n_instances (int) – Number of app instances in this deployment
- Return type
List[bytes]
-
render_denial()
Return a bytestring to be sent back to the client if divergent behavior is seen. An error message, for example.
- Return type
bytes
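A custom plugin might look like the following sketch. To keep it runnable without RDDR installed, it defines local stand-ins for AbstractRddrDiff and RddrInsufficientData that mirror the documented signatures; a real plugin would subclass the class from the rddr package instead. The LineDiff class, its newline-based framing, and the stand-in's empty default for render_denial are assumptions.

```python
from typing import List, Optional


class RddrInsufficientData(Exception):
    """Local stand-in for the exception named in the documentation above."""


class AbstractRddrDiff:
    """Local stand-in mirroring the documented constructor and defaults."""

    def __init__(self, do_filter=False, logger=None, params: Optional[dict] = None):
        self.do_filter, self.logger, self.params = do_filter, logger, params or {}

    def diff_traffic(self, traffic: List[bytes]) -> bool:
        return True  # documented default: always consistent

    def modify_traffic(self, traffic: bytes, n_instances: int) -> List[bytes]:
        return [traffic] * n_instances  # documented default: no modification

    def render_denial(self) -> bytes:
        return b""  # assumption: the real default is not documented here


class LineDiff(AbstractRddrDiff):
    """Hypothetical plugin for newline-terminated messages."""

    def diff_traffic(self, traffic: List[bytes]) -> bool:
        # Ask the proxy to read more until every instance has sent a full line.
        if not all(t.endswith(b"\n") for t in traffic):
            raise RddrInsufficientData()
        return all(t == traffic[0] for t in traffic)

    def render_denial(self) -> bytes:
        return b"request denied\n"
```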
We have packaged four classes which implement the interface for JSON, HTTP, raw bytes, and Postgres respectively. These are shown below:
JSON
-
class rddr_diff_builtins.RddrJsonDiff(do_filter=False, logger=None, params=None)
Bases: rddr.diff_interface.AbstractRddrDiff
- Parameters
do_filter (bool) –
logger (Optional[Logger]) –
params (Optional[dict]) –
-
diff_traffic(traffic)
Return True iff the traffic is the same modulo non-deterministic behavior, if present.
- Parameters
traffic (List[bytes]) – List of traffic from app instances.
- Return type
List[Tuple[int,bool]]
-
render_denial()
Returns an HTTP response string containing a 500 error and an “access denied” message, with the RDDR logo. See static/denied.html for the content.
- Return type
bytes
HTTP
-
class rddr_diff_builtins.RddrHttpDiff(do_filter=False, logger=None, params=None)
Bases: rddr.diff_interface.AbstractRddrDiff
- Parameters
do_filter (bool) –
-
diff_traffic(traffic)
Diffs HTML delimited by line breaks.
Upon encountering noise within a line (i.e. the filter pair differ), will extract the largest contiguous set of characters within the line that differ and save the value reported by each server. These tokens can be reinserted in a user’s subsequent requests on sight. The reinsertion is implemented by modify_traffic. This is necessary when an application being N-versioned uses anti-CSRF tokens in its user input forms. The proxy must send the appropriate token back to each instance of the application for it to service the user’s request.
- Parameters
traffic (List[bytes]) – List of traffic from app instances.
- Return type
List[Tuple[int,bool]]
-
modify_traffic(traffic, n_instances)
Return a list of bytestrings, one to send to each application instance.
This method will replace any saved tokens it finds in the user’s traffic with the token originally sent by each instance. See diff_traffic for further explanation of the utility of this feature.
- Parameters
traffic (bytes) – Request to modify per recipient in addrlist.
n_instances (int) – Number of app instances in this deployment
- Return type
List[bytes]
-
render_denial()
Returns an HTTP response string containing a 500 error and an “access denied” message, with the RDDR logo. See static/denied.html for the content.
- Return type
bytes
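The token reinsertion described for modify_traffic can be illustrated as follows. This is a sketch under stated assumptions: the client is assumed to have been shown the first instance's token, and the saved_tokens mapping and the remember helper are hypothetical names, not part of RddrHttpDiff.

```python
from typing import Dict, List

# Hypothetical store: token shown to the client -> token issued by each instance.
saved_tokens: Dict[bytes, List[bytes]] = {}


def remember(tokens_per_instance: List[bytes]) -> None:
    """Record one noisy token per instance, keyed by the one the client saw."""
    # Assumption: the response forwarded to the client carried instance 0's token.
    saved_tokens[tokens_per_instance[0]] = tokens_per_instance


def modify_traffic(traffic: bytes, n_instances: int) -> List[bytes]:
    """Return one request per instance with each instance's own token swapped in."""
    out = [traffic] * n_instances
    for seen, per_instance in saved_tokens.items():
        if seen in traffic:
            out = [req.replace(seen, tok) for req, tok in zip(out, per_instance)]
    return out
```

After remember([b"tokA", b"tokB", b"tokC"]), a request containing tokA is rewritten so that the second instance receives tokB and the third receives tokC.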
Bytewise
-
class rddr_diff_builtins.RddrByteDiff(do_filter=False, logger=None, params=None)
Bases: rddr.diff_interface.AbstractRddrDiff
- Parameters
do_filter (bool) –
logger (Optional[Logger]) –
params (Optional[dict]) –
-
diff_traffic(traffic)
Validates that messages match byte for byte.
- Parameters
traffic (List[bytes]) – List of traffic from app instances. Key = instance address “host:port” Value = Bytes response
- Return type
List[Tuple[int,bool]]
PostgreSQL
-
class rddr_diff_builtins.RddrPostgresDiff(do_filter=False, logger=None, params=None)
Bases: rddr.diff_interface.AbstractRddrDiff
This class enables support for diffing Postgres traffic across N application instances. This diff plugin supports diff-params. diff-params should be a dictionary with one key: tokens. tokens is a list of lists of bytestrings, one bytestring per application instance. This allows you to preconfigure tokens you expect to be different among the Postgres instances. An example is the string reported for the server version – different variants will provide different strings. By specifying that here, you can avoid flagging that as divergent behavior.
- Parameters
do_filter (bool) –
logger (Optional[Logger]) –
params (Optional[dict]) –
-
diff_traffic(traffic)
Validates that Postgres messages match. Ignores certain packet types. See member _backend_pkt_types_to_ignore for the full list of ignored packet types. Prior to diffing, will substitute tokens preconfigured in the config file under the diff-params key for the associated proxy.
- Parameters
traffic (List[bytes]) – List of traffic from app instances.
- Return type
List[Tuple[int,bool]]
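The token substitution step can be pictured as replacing each instance's expected-different string with a common value before comparing. The function below is a guess at the general idea, not RddrPostgresDiff's code. The name substitute_tokens and the choice of the first instance's token as the canonical value are assumptions.

```python
from typing import List


def substitute_tokens(traffic: List[bytes], tokens: List[List[bytes]]) -> List[bytes]:
    """Normalize per-instance tokens so they no longer show up as differences.

    tokens follows the documented shape: a list of groups, each group holding
    one bytestring per application instance.
    """
    out = list(traffic)
    for group in tokens:
        canonical = group[0]  # assumption: normalize to the first instance's token
        out = [msg.replace(tok, canonical) for msg, tok in zip(out, group)]
    return out
```

For example, with tokens set to [[b"12.4", b"13.1"]], the messages b"server_version 12.4" and b"server_version 13.1" become identical after substitution.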
These classes each implement the functions
diff_traffic and modify_traffic.
Users may write their own classes that implement
the above functions to tailor RDDR’s filtering engine
for their own application. Simply include the name
of your class via the diff-class field of your config file.