How do I design high-frequency trading systems and its architecture. Part III

This is the last part of a three-part series of articles I’ve been writing. In this part, I’m going to explain what I’ve found to be the best approach to ultra-low-latency systems.

Even though I’ve been focused on trading systems, this can be applied to any low-latency system: communication, audio, video, etc.

So, the pattern I use is the following:

Busy/wait, or spinning: this is not usually categorized as a pattern. In fact, it is considered an “anti-pattern” and is usually not recommended. But when we are designing low-latency systems, we don’t care how nice the code is or whether it follows good practices. We only care about latency.


The process sits in a tight loop waiting for something, and that loop consumes 100% of the cycles of the CPU core it runs on. In our case, we are going to be reading market data from our limit order book module, and if certain strategy criteria are met, we will send the orders needed to execute that trade. This is by far the fastest way to get the data made available by other modules.

But that’s not all: with this kind of process, we avoid most cache misses and CPU context switches, something I talked about in my last article.

The following is a basic code snippet showing how it works:
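A minimal C++ sketch of such a loop, assuming the order book module publishes its best ask as an atomic integer price in ticks. The names `spin_until_ask_at_or_below`, `best_ask_ticks` and `running` are hypothetical and not taken from the original system:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Spins until the published best ask is at or below `limit_ticks`, or until
// `running` is cleared. Returns the ask that met the criterion, or -1 if the
// loop was stopped first. All names here are illustrative assumptions.
inline std::int64_t spin_until_ask_at_or_below(
    const std::atomic<std::int64_t>& best_ask_ticks,
    const std::atomic<bool>& running,
    std::int64_t limit_ticks) {
    assert(limit_ticks > 0);
    while (running.load(std::memory_order_relaxed)) {
        // Re-read the shared value on every iteration: no sleep, no yield,
        // no blocking call, so the thread never gives up its core.
        const std::int64_t ask = best_ask_ticks.load(std::memory_order_acquire);
        if (ask > 0 && ask <= limit_ticks) {
            return ask;  // criterion met: the caller would send the order now
        }
    }
    return -1;
}
```

In this sketch the order book thread would store new prices into `best_ask_ticks`, and a value of 0 is taken to mean that no price has been published yet.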


But not everything is as good as it seems. Busy/wait processes are very hard to design and can be dangerous for overall performance, since a spinning process can monopolize CPU time and drag down the performance of the entire system.

Now, the key to using busy/wait in our systems is to set the thread affinity to a specific CPU core. That means we tell the operating system to run the busy/wait process on one CPU core only (it could be core 1, 2, 3, etc.). We can “pin” as many processes as we have CPU cores. If we don’t do this, the thread scheduler is free to run the spinning process on any core, so it can end up taking CPU time from everything else on the machine.
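A minimal sketch of pinning on Linux, assuming glibc's `sched_setaffinity` (other operating systems use different calls). The helper names `pin_current_thread_to_core` and `first_allowed_core` are hypothetical:

```cpp
#include <cassert>
#include <sched.h>

// Restricts the calling thread to a single CPU core (Linux only).
// Passing pid 0 to sched_setaffinity means "the calling thread".
inline bool pin_current_thread_to_core(int core_id) {
    assert(core_id >= 0 && core_id < CPU_SETSIZE);
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(core_id, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}

// Returns the lowest-numbered core the calling thread is currently allowed
// to run on, or -1 if the affinity mask cannot be read.
inline int first_allowed_core() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) return -1;
    for (int core = 0; core < CPU_SETSIZE; ++core) {
        if (CPU_ISSET(core, &mask)) return core;
    }
    return -1;
}
```

Each busy/wait thread would call `pin_current_thread_to_core` once at startup, before entering its loop, with a different core per thread.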

With this type of method, the threading model, I/O model and memory management should be designed to work together to achieve the best overall performance. This goes against the OOP concept of loose coupling, but it is necessary to avoid the runtime cost of dynamic polymorphism.
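One common way to avoid dynamic polymorphism in C++ is to pass the strategy as a template parameter, so the call is resolved at compile time instead of through a virtual table. A minimal sketch, where `BuyBelowLimit` and `count_buy_signals` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical strategy: signals a buy when the ask is at or below a limit.
struct BuyBelowLimit {
    std::int64_t limit_ticks;
    bool should_buy(std::int64_t ask_ticks) const {
        return ask_ticks <= limit_ticks;
    }
};

// The strategy type is a template parameter, so should_buy() is bound at
// compile time and can be inlined. No virtual call is involved.
template <typename Strategy>
int count_buy_signals(const Strategy& strategy,
                      const std::int64_t* asks, int count) {
    assert(asks != nullptr || count == 0);
    int signals = 0;
    for (int i = 0; i < count; ++i) {
        if (strategy.should_buy(asks[i])) ++signals;
    }
    return signals;
}
```

The price is the tighter coupling mentioned above: the calling code must be compiled against the concrete strategy type.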


Of course, you still need to take care of synchronization (locks) where it is needed. My approach is to design my data structures so that they need as little synchronization as possible.
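One widely used structure that needs no locks at all is a single-producer/single-consumer ring buffer. The sketch below assumes exactly one writer thread and one reader thread, and is not necessarily the structure used in the system described here. `SpscQueue` is a hypothetical name, and the 64-byte alignment assumes a 64-byte cache line:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Call from the producer thread only. Returns false when the queue is full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        buffer_[head & (Capacity - 1)] = value;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Call from the consumer thread only. Returns false when the queue is empty.
    bool try_pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail == head) return false;
        out = buffer_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return true;
    }

private:
    T buffer_[Capacity];
    // Keep the two indices on separate cache lines to avoid false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
};
```

Because each index is written by only one thread, the two sides never contend for a lock. A consumer running a busy/wait loop would simply call `try_pop` on every iteration.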

In conclusion, this is the most sensitive part of our system, and in my experience this technique gives the best latency.


6. Position & Risk Management

All orders sent by the strategy should be consolidated into positions, so you can keep track of your open and closed orders and, most importantly, your exposure to the market. Ideally, your strategy should keep a flat exposure, but in certain strategies, like market making, you may allow a controlled exposure (if holding inventory).

[Figure 14]
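Consolidating orders into positions can be sketched as a running net quantity per instrument, updated on every fill. This is a simplified illustration: `PositionBook`, `Side` and `on_fill` are hypothetical names, and a string-keyed map is used for readability rather than speed:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

enum class Side { Buy, Sell };

class PositionBook {
public:
    // Applies an executed quantity to the net position of the instrument.
    void on_fill(const std::string& symbol, Side side, std::int64_t quantity) {
        assert(quantity > 0);
        net_[symbol] += (side == Side::Buy) ? quantity : -quantity;
    }

    // Net position: positive is long, negative is short, zero is flat.
    std::int64_t net_position(const std::string& symbol) const {
        const auto it = net_.find(symbol);
        return it == net_.end() ? 0 : it->second;
    }

    // True when there is no exposure in any instrument.
    bool is_flat() const {
        for (const auto& entry : net_) {
            if (entry.second != 0) return false;
        }
        return true;
    }

private:
    std::unordered_map<std::string, std::int64_t> net_;
};
```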

From a stop loss per position or for the overall exposure, up to portfolio management, the risk module is an important piece that interacts with your strategy and monitors all open positions and the overall market exposure in real time.

The following are some popular risk management rules:

  • Position limit: Control the upper limit of the position in a specified instrument, or of the sum of all positions in the instruments of a specified product.
  • Single-order limit: Control the upper limit of the volume of a single order. Sometimes the lower limit is controlled as well, which means the quantity of your order must be a multiple of it.
  • Money control: Ensure that the margin of all positions does not exceed the account balance.
  • Illegal price detection: Ensure the price is within a reasonable range, for example that it does not exceed the price limit and is not too far from the current price.
  • Self-trading detection: Ensure that orders from different strategies cannot trade against each other.
  • Order cancellation rate: Track the order cancellation rate and ensure it does not exceed the limit set by the exchange.
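Some of the rules above (position limit, single-order limit with a lot size, and illegal price detection) can be sketched as one pre-trade check. The struct, enum and function names are hypothetical, prices are assumed to be integers in ticks, and the order of the checks is an arbitrary choice:

```cpp
#include <cassert>
#include <cstdint>

struct RiskLimits {
    std::int64_t max_position;         // largest absolute net position allowed
    std::int64_t max_order_quantity;   // upper limit for a single order
    std::int64_t lot_size;             // order quantity must be a multiple of this
    std::int64_t max_price_deviation;  // allowed distance from the reference price, in ticks
};

enum class RiskResult { Accepted, OrderTooLarge, BadLotSize, PositionLimit, PriceOutOfRange };

inline std::int64_t abs64(std::int64_t v) { return v < 0 ? -v : v; }

// signed_quantity is positive for a buy and negative for a sell.
inline RiskResult check_order(const RiskLimits& limits,
                              std::int64_t current_position,
                              std::int64_t signed_quantity,
                              std::int64_t price_ticks,
                              std::int64_t reference_price_ticks) {
    assert(limits.lot_size > 0 && signed_quantity != 0);
    const std::int64_t quantity = abs64(signed_quantity);
    if (quantity > limits.max_order_quantity) return RiskResult::OrderTooLarge;
    if (quantity % limits.lot_size != 0) return RiskResult::BadLotSize;
    // Position the account would hold if the order were fully filled.
    if (abs64(current_position + signed_quantity) > limits.max_position) {
        return RiskResult::PositionLimit;
    }
    if (abs64(price_ticks - reference_price_ticks) > limits.max_price_deviation) {
        return RiskResult::PriceOutOfRange;
    }
    return RiskResult::Accepted;
}
```

Money control, self-trading detection and the cancellation rate need account and order state that this sketch leaves out.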


Also, within this module, you may want to analyze different allocations across strategies or trades. Many studies suggest that an allocation strategy can lower the volatility of your returns and act as insurance if things go wrong.


7. Monitoring systems

Since we are building a fully automated system that must be able to open and close positions within milliseconds, we need proper monitoring systems to control the overall operation.

[Figure 15]

Imagine what would happen if a human had to notice that some strategy is not doing what it should, or that some venue is not providing prices as it should. By the time we stop the system, unrecoverable losses may already have been made.

How many minutes would it take this person to shut the system down? 5 minutes? 1 minute?

We could have thousands of wrong open orders within that time frame. Scary!

That’s why we need to put monitoring systems in place, to check some of the following:

  • Overall PnL: if there is, let’s say, a flash crash, our system must be able to close all open positions and shut itself down.
  • Connectivity between venues: make sure that no venue has been disconnected, and activate the reconnection mechanisms if one has.
  • Monitoring latencies: let’s say some switch starts to fail, and you start to receive data with some delay. You would never notice it until you analyzed the logs. We need to monitor latencies between venues, to ensure data delivery and to alert us in case of any issue.
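The three checks above can be sketched as a single function that the monitoring loop evaluates on every cycle. The names, the thresholds and the priority order (shut down first, then reconnect, then alert) are assumptions made for illustration:

```cpp
#include <cassert>
#include <chrono>

struct MonitorLimits {
    double max_loss;                        // positive; shut down when PnL <= -max_loss
    std::chrono::microseconds max_latency;  // alert above this data latency
    std::chrono::milliseconds max_silence;  // longest gap without data from a venue
};

enum class Action { None, AlertLatency, Reconnect, ShutDown };

// Returns the most severe action required by the current readings.
inline Action evaluate(const MonitorLimits& limits,
                       double pnl,
                       std::chrono::microseconds latency,
                       std::chrono::milliseconds since_last_message) {
    assert(limits.max_loss > 0.0);
    if (pnl <= -limits.max_loss) return Action::ShutDown;  // close positions and stop
    if (since_last_message > limits.max_silence) return Action::Reconnect;
    if (latency > limits.max_latency) return Action::AlertLatency;
    return Action::None;
}
```

Because the decision is made by code rather than by a person watching a screen, the reaction time is bounded by how often the loop runs, not by how quickly someone notices the problem.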


If you have read these articles, I would like to hear from you and discuss new or different approaches. Please share!


Ariel Silahian

Keywords: #hft #quants #forex #fx #risk $EURUSD $EURGBP $EURJPY #trading


5 thoughts on “How do I design high-frequency trading systems and its architecture. Part III”

  1. Nice info about infrastructure.

    I am also working on building infrastructure for HFT trading. I am thinking of using Python and Spark for it. Do you have any experience with these, and what do you think the performance would be?

    Also, what database do you suggest for this kind of real-time data?


  2. Zubair

    Hi Ariel,

    Very informative article. I wanted to get your suggestion regarding my current project. We need to implement a trading gateway which will accept orders from thousands of traders, meaning it could have around 100k users. The gateway is supposed to receive orders, perform risk checks, forward them to the ECN, calculate positions/PnL and pass back order and position updates. What should I consider when implementing such a system? Currently I am using NetMQ for network communication between the gateway and the trading front end, but I am not sure how it will work when there are as many as 100k users.


  3. Mohamed SEKTAOUI

    That’s a very interesting article.
    I would add that one other advantage of the event-driven model is the handling of market data bursts. When you have these bursts, you have to consume the events one by one, and by the time you get to the last events the market has kept moving and you are already late to react to the new events…

