Application-level error blog - Page 2

MYTH #1

Application-level error doesn’t matter because
it does not affect the sequence of reportable events.

 

Some people have argued as follows:

The point of the timestamp-accuracy requirements of RTS 25 is to establish a clearer sequence of events. As long as nothing happens between an event and its timestamp, there's no practical importance to delays in timestamping because they do not change the sequence of events. If my business logic is blocked from timestamping, it is also blocked from doing anything else, so there is no significance to the timestamping delay. For example, suppose my app makes a decision to deal, then sends out a corresponding order. And suppose that due to some OS scheduling issue, the timestamp on the decision to deal is assigned two seconds later than the actual decision (to use an extreme number). As long as the order does not go out until after the timestamp on the decision to deal has been assigned, there is no importance to the fact that the timestamp was delayed. The timestamp on the order-send event will still be after the decision to deal.

This argument is indeed effective in certain situations. The problem is that it is not effective in others. In general, plenty can happen between the event and the timestamp outside the context of the logic thread that is handling the event. The crucial point is that RTS 25 is meant to establish a clearer sequence of events not only within each thread but also across threads, across applications, and across institutions.

For example, suppose that in a given app, receipt of market data updates is timestamped by a thread that reads from a market data feed, while decisions to deal are timestamped by another thread that executes a trading algorithm. (As we've discussed elsewhere, MiFID 2 does not explicitly require you to treat market data updates as reportable events, but if you want to use market data to justify your trading decisions, you'd better record them with RTS 25-compliant timestamps.) Now suppose that the algo decides to deal based on market data update A but suffers a large timestamp-assignment delay when recording that decision to deal. By the time the timestamp is assigned, the market data thread has timestamped updates B, C, and D. To the observer looking at the event log, the state of the market at the time of the decision to deal appears to be D, when in fact it was A.
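
To make the cross-thread case concrete, here is a minimal Python sketch of how someone reading the event log would reconstruct the state of the market at the time of the decision to deal. The timestamps, the three-second delay, and the helper `market_state_at` are all made up for illustration. Real logs and real delays will look different.

```python
from bisect import bisect_right

# Hypothetical log written by the market data thread: (timestamp in seconds, update).
market_data_log = [(1.0, "A"), (2.0, "B"), (3.0, "C"), (4.0, "D")]


def market_state_at(timestamp, updates):
    """Return the latest update stamped at or before `timestamp`, or None."""
    times = [t for t, _ in updates]
    index = bisect_right(times, timestamp)
    return updates[index - 1][1] if index else None


# The algo actually decides at t=1.5, having seen only update A.
true_decision_time = 1.5

# Assumed timestamp-assignment delay on the algo thread (illustrative figure).
timestamp_delay = 3.0
recorded_decision_time = true_decision_time + timestamp_delay

actual_state = market_state_at(true_decision_time, market_data_log)
apparent_state = market_state_at(recorded_decision_time, market_data_log)

print("market state the algo really saw:", actual_state)
print("market state the log suggests:   ", apparent_state)
```

With these numbers the algo acted on A, but the log places the decision after D. The order of events within the algo thread is untouched, yet the order across the two threads is wrong.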

For another example, suppose an application timestamps the sending of an order. To ensure that the timestamp is not assigned before the message has been sent, the application uses a blocking send call and requests the timestamp after the call returns. But suppose there is a large delay in assigning that timestamp. The event log will say that the order was sent long after it actually was. (In practice, low-latency apps don't usually use blocking sends. But non-blocking sends have a similar issue: the actual send can be delayed until long after the timestamp has been assigned.)
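
The two send patterns can be sketched with a fake, manually advanced clock so that the numbers are deterministic. Everything here is an assumption for illustration: the `FakeClock` class, the function names, the 1 ms send duration, and the stall and queueing delays. No real network or OS scheduler is involved.

```python
class FakeClock:
    """Manually advanced clock, in microseconds, so the example is repeatable."""

    def __init__(self):
        self.now_us = 0

    def advance(self, microseconds):
        self.now_us += microseconds


def blocking_send_then_stamp(clock, stall_us):
    """Blocking send, then the timestamp is read after the call returns."""
    clock.advance(1_000)            # assumed: the blocking send takes 1 ms
    actual_send_us = clock.now_us   # the message is on the wire here
    clock.advance(stall_us)         # thread stalls before it can read the clock
    logged_us = clock.now_us        # timestamp that ends up in the event log
    return actual_send_us, logged_us


def stamp_then_nonblocking_send(clock, queue_delay_us):
    """Timestamp is read at hand-off; the message leaves some time later."""
    logged_us = clock.now_us        # timestamp that ends up in the event log
    clock.advance(queue_delay_us)   # message waits in a buffer
    actual_send_us = clock.now_us   # the message is on the wire here
    return actual_send_us, logged_us


sent, logged = blocking_send_then_stamp(FakeClock(), stall_us=2_000_000)
blocking_error_us = logged - sent

sent, logged = stamp_then_nonblocking_send(FakeClock(), queue_delay_us=500_000)
nonblocking_error_us = logged - sent

print("blocking send: log is late by", blocking_error_us, "us")
print("non-blocking send: log is early by", -nonblocking_error_us, "us")
```

In the blocking case the logged time trails the real send by the full stall. In the non-blocking case the error has the opposite sign: the log claims the order went out before it really did.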

The need to consider the broader goal of event reconstruction is the same reason that "stop the world" garbage collection in Java is not some sort of exemption from RTS 25. The only world that stops is the world within the particular JVM instance. The rest of the world keeps on moving.
