
High-Performance WebSocket Broker

A high-performance WebSocket service for streaming events to end users of a centralized exchange (CEX), built on a robust, scalable, low-latency architecture.

1. Core Requirements

Functional Requirements

  • Dynamic Integration with external event streams (e.g., Kafka, NATS, Redis Stream, etc.)

  • Routing Logic per event stream (e.g., queue-to-WebSocket channel binding)

  • Mapping/Transformation Layer per stream (e.g., raw -> client-specific format, aggregation, filtering)

  • Fault tolerance: if a server fails, clients can reconnect to another available server in the cluster, so service continues with minimal interruption.

Non-Functional Requirements

  • Massive Concurrency Support: 100K+ concurrent WebSocket connections

  • Low Latency: < 100ms end-to-end event delivery

  • Horizontal scalability

  • Multi-region support (optional)

Incoming Event Stream Types

  • Real-time trade events (trade stream)

  • Order book updates (depth stream)

  • Price ticker updates (ticker)

  • User-specific streams (authenticated via token)

2. Technology Stack

| Layer | Tech Choice |
|---|---|
| Language | Rust |
| WebSocket Server | tokio, rust-libp2p, axum |
| Broker/Queue | NATS, Kafka, or Redis Streams |
| Authentication | JWT / OAuth2 (token-based, during the initial handshake) / API key |
| Orchestration | Kubernetes (with auto-scaling) |
| Observability | Prometheus + Grafana + Loki/ELK |

3. Implementation Strategy

A. WebSocket Gateway (WS Gateway)

  • Written in Rust (tokio + axum, per the technology stack above) for high concurrency.

  • Handle subscribe/unsubscribe messages.

  • Authenticate with JWT.

  • Maintain a channel-to-client-list mapping.

  • Push data from broker to clients using in-memory pub/sub.

B. Event Broker

  • Matching engine publishes events to Kafka/NATS per topic: trades.btc_usdt, depth.eth_usdt, ticker.global.

  • Use partitions (in Kafka) or subjects (in NATS) for high fan-out performance.

C. Data Channel Format

{ "stream": "depth.btc_usdt", "data": { "asks": [["36200.5", "1.2"]], "bids": [["36190.1", "1.1"]], "timestamp": 1724548345123 } }

4. Scalability & Resilience

| Concern | Solution |
|---|---|
| Scaling WS | Stateless WS servers with sticky sessions or centralized pub/sub |
| Resilience | Retry on publish failure, circuit breakers |
| Backpressure | Drop slow clients, use bounded queues |
| Horizontal Scale | Deploy WS Gateway with Kubernetes HPA + custom autoscaler on CPU & connection count |
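The connection-count scaling mentioned in the table can be reasoned about with the replica rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric). The following is a minimal Rust sketch of that arithmetic only. The function name and the per-pod target used in the usage note are assumptions for illustration, and the HPA's tolerance band and stabilization window are deliberately left out.

```rust
/// Replica count suggested by the documented HPA rule:
/// desired = ceil(current_replicas * current_metric / target_metric).
///
/// `avg_connections_per_pod` is the observed custom metric and
/// `target_connections_per_pod` (assumed > 0) is the configured target.
/// Both names are hypothetical; they are not part of this design.
pub fn desired_replicas(
    current_replicas: u32,
    avg_connections_per_pod: f64,
    target_connections_per_pod: f64,
) -> u32 {
    let ratio = avg_connections_per_pod / target_connections_per_pod;
    // Never scale below one replica in this sketch.
    ((current_replicas as f64) * ratio).ceil().max(1.0) as u32
}
```

For example, with 4 gateway pods averaging 30,000 connections each and an assumed target of 20,000 per pod, `desired_replicas(4, 30_000.0, 20_000.0)` suggests 6 pods.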

5. Security

  • WebSocket authentication during handshake:

    • Clients provide JWT.

    • JWT is validated by Auth service.

  • Message signing for sensitive topics (like orders).

  • WAF or Layer-7 firewall (AWS WAF, Cloudflare, etc.).

6. Observability

Track:

  • Number of active connections

  • Events per second

  • Latency (matching engine → user)

  • Dropped events, client disconnects

Use:

  • Prometheus metrics exported by the gateway

  • Logs to Loki or ELK stack

  • Grafana dashboards
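As an illustration of the tracked values, here is a small Rust sketch that keeps counters in atomics and renders them in the Prometheus text exposition format. It uses only the standard library. The struct, its methods, and the metric names (`ws_active_connections` and so on) are hypothetical; a real gateway would more likely use a Prometheus client crate and also record latency as a histogram.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Process-wide gateway counters (names are assumptions of this sketch).
#[derive(Default)]
pub struct GatewayMetrics {
    active_connections: AtomicI64,
    events_delivered: AtomicU64,
    events_dropped: AtomicU64,
}

impl GatewayMetrics {
    pub fn connection_opened(&self) {
        self.active_connections.fetch_add(1, Ordering::Relaxed);
    }

    pub fn connection_closed(&self) {
        self.active_connections.fetch_sub(1, Ordering::Relaxed);
    }

    pub fn event_delivered(&self) {
        self.events_delivered.fetch_add(1, Ordering::Relaxed);
    }

    pub fn event_dropped(&self) {
        self.events_dropped.fetch_add(1, Ordering::Relaxed);
    }

    /// Render the counters as Prometheus text, suitable for a /metrics response body.
    pub fn render(&self) -> String {
        format!(
            "# TYPE ws_active_connections gauge\nws_active_connections {}\n\
             # TYPE ws_events_delivered_total counter\nws_events_delivered_total {}\n\
             # TYPE ws_events_dropped_total counter\nws_events_dropped_total {}\n",
            self.active_connections.load(Ordering::Relaxed),
            self.events_delivered.load(Ordering::Relaxed),
            self.events_dropped.load(Ordering::Relaxed),
        )
    }
}
```

Events per second is then derived on the Prometheus side by applying a rate over the delivered counter, so the gateway only needs to expose the running total.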

Deliverables in MVP

  • WebSocket Gateway with topic subscriptions

  • Kafka/NATS integration

  • Sample clients (web + CLI)

  • Dashboard for live metrics

  • Test with 10K-100K concurrent simulated clients using Artillery or k6

Acceptance Use Cases

1. Use Case: WebSocket Handshake with Authorization

Actors: Web Client / Mobile Client, WebSocket Gateway, Auth Service

Description: Clients initiate a WebSocket connection with a JWT or API key.

Flow:

  1. Client connects to /ws endpoint.

  2. During the upgrade request, the client includes a JWT in a header or query parameter.

  3. WebSocket Gateway validates the JWT with the Auth Service.

  4. On success:

    • User info (userId, roles, allowed pairs) is extracted and cached.

    • Connection is accepted.

  5. On failure: return HTTP 401 and reject upgrade.

Events:

  • onConnectionOpen()

  • onAuthValidate(token)

  • onConnectionAccepted() / onConnectionRejected()
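The handshake flow above could be sketched in Rust roughly as follows. This is a framework-free illustration using only the standard library: the `Authorization: Bearer` header, the `token` query parameter name, the lowercase header keys, and the `validate` callback standing in for the Auth Service call are all assumptions, not part of the specification.

```rust
use std::collections::HashMap;

/// Outcome of the handshake, mirroring onConnectionAccepted / onConnectionRejected.
#[derive(Debug, PartialEq)]
pub enum Handshake {
    Accepted { user_id: String },
    /// HTTP status returned instead of completing the upgrade.
    Rejected { status: u16 },
}

/// Take the token from `Authorization: Bearer <jwt>` or, failing that, `?token=<jwt>`.
/// Header names are expected in lowercase (an assumption of this sketch).
pub fn extract_token(headers: &HashMap<String, String>, query: &str) -> Option<String> {
    if let Some(value) = headers.get("authorization") {
        if let Some(token) = value.strip_prefix("Bearer ") {
            if !token.is_empty() {
                return Some(token.to_string());
            }
        }
    }
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(key, value)| *key == "token" && !value.is_empty())
        .map(|(_, value)| value.to_string())
}

/// `validate` stands in for the Auth Service: it returns the user id for a valid token.
pub fn authorize<F>(headers: &HashMap<String, String>, query: &str, validate: F) -> Handshake
where
    F: Fn(&str) -> Option<String>,
{
    match extract_token(headers, query).and_then(|t| validate(t.as_str())) {
        Some(user_id) => Handshake::Accepted { user_id },
        None => Handshake::Rejected { status: 401 },
    }
}
```

A missing token and an invalid token both end in the same 401 rejection here, which keeps the upgrade path from revealing which check failed.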

2. Use Case: Subscribe to Allowed Topics

Actors: Authenticated Client, WebSocket Gateway

Description: After a successful handshake, the client sends a subscription request.

Example Topics:

  • ticker.btc_usdt

  • depth.btc_usdt

  • orders.user123 (private)

Flow:

  1. Client sends:

{ "action": "subscribe", "streams": ["ticker.btc_usdt", "orders.user123"] }

  2. WebSocket Gateway:

    • Validates topic access (e.g., user123 may subscribe to orders.user123 but not to another user's private stream)

    • Registers the client in an internal channel (e.g., Redis pub/sub or in-memory topic map)

Events:

  • onSubscribeRequest(streams)

  • onTopicPermissionCheck()

  • onSubscriptionConfirmed()
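The topic permission check in this use case might look like the Rust sketch below. It assumes the `<kind>.<name>` stream naming used in the examples, treats `ticker`, `depth`, and `trades` as public, and treats `orders.<userId>` as visible only to its owner. The `Session` type and both function names are hypothetical.

```rust
/// Identity cached at handshake time (field name is an assumption of this sketch).
pub struct Session {
    pub user_id: String,
}

/// Public market streams are open to everyone; `orders.<userId>` only to that user.
pub fn can_subscribe(session: &Session, stream: &str) -> bool {
    match stream.split_once('.') {
        Some(("ticker", pair)) | Some(("depth", pair)) | Some(("trades", pair)) => {
            !pair.is_empty()
        }
        Some(("orders", owner)) => owner == session.user_id,
        // Unknown stream kinds are rejected rather than silently accepted.
        _ => false,
    }
}

/// Split the `streams` array of a subscribe request into (accepted, rejected).
pub fn check_subscriptions<'a>(
    session: &Session,
    streams: &[&'a str],
) -> (Vec<&'a str>, Vec<&'a str>) {
    streams.iter().copied().partition(|s| can_subscribe(session, s))
}
```

Returning the rejected streams separately lets the gateway confirm the allowed subscriptions and report the denied ones in a single response, instead of failing the whole request.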

3. Use Case: Receive Streamed Data from Kafka and Push to Clients

Actors: Kafka, WebSocket Gateway

Description: Internal services (e.g., Matching Engine) publish messages to Kafka. The Gateway consumes and routes them to subscribed clients.

Flow:

  1. Matching Engine publishes message:

{ "stream": "ticker.btc_usdt", "data": { "price": "36200.5", "volume": "1.25", "timestamp": 1724548381123 } }

  2. Kafka consumer inside WebSocket Gateway receives the message.

  3. Gateway looks up clients subscribed to ticker.btc_usdt.

  4. Message is broadcast to all connected WebSocket clients subscribed to that topic.

Events:

  • onKafkaMessage(topic, message)

  • onMessageDispatch(topic → clients[])
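The lookup-and-broadcast steps can be illustrated with an in-memory topic map. The Rust sketch below uses standard-library channels as a stand-in for per-connection write queues; the `Router` type, `ClientId`, and the method names are hypothetical, and the Kafka consumer itself is out of scope here.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;

pub type ClientId = u64;

/// Topic -> subscribed clients, each reachable through its outbound channel.
#[derive(Default)]
pub struct Router {
    topics: HashMap<String, HashMap<ClientId, Sender<String>>>,
}

impl Router {
    pub fn subscribe(&mut self, topic: &str, client: ClientId, tx: Sender<String>) {
        self.topics.entry(topic.to_string()).or_default().insert(client, tx);
    }

    pub fn unsubscribe(&mut self, topic: &str, client: ClientId) {
        let now_empty = match self.topics.get_mut(topic) {
            Some(clients) => {
                clients.remove(&client);
                clients.is_empty()
            }
            None => false,
        };
        // Drop the topic entry once nobody listens to it.
        if now_empty {
            self.topics.remove(topic);
        }
    }

    /// Fan one broker message out to every subscriber of `topic`.
    /// Clients whose receiving side is gone are pruned; returns the number reached.
    pub fn dispatch(&mut self, topic: &str, message: &str) -> usize {
        match self.topics.get_mut(topic) {
            Some(clients) => {
                clients.retain(|_, tx| tx.send(message.to_string()).is_ok());
                clients.len()
            }
            None => 0,
        }
    }
}
```

In this sketch the Kafka consumer loop would call `dispatch(topic, payload)` for each record, which corresponds to onKafkaMessage followed by onMessageDispatch.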

4. Use Case: Unsubscribe from a Topic

Actors: Authenticated Client

Description: Clients can unsubscribe from specific topics to save bandwidth.

Flow:

  1. Client sends:

{ "action": "unsubscribe", "streams": ["ticker.btc_usdt"] }

  2. Gateway removes client from internal routing list for that topic.

5. Use Case: Backpressure & Dropping Slow Clients

Actors: WebSocket Gateway

Description: If a client cannot consume messages fast enough, disconnect it so its queue does not grow without bound and exhaust gateway memory.

Flow:

  1. Gateway maintains a message queue per client (e.g., ring buffer or channel).

  2. If queue exceeds threshold or client write times out:

    • Send disconnect with reason

    • Close WebSocket
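A bounded per-client queue with a non-blocking push is one way to realize this flow. The Rust sketch below uses the standard library's `sync_channel` as the bounded queue; `ClientQueue`, `DropReason`, and the capacity value are assumptions for illustration, and write timeouts are not modeled.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Why a client should be dropped; would be reported as the close reason.
#[derive(Debug, PartialEq)]
pub enum DropReason {
    QueueFull,
    Disconnected,
}

/// Gateway-side handle to one client's bounded outbound queue.
pub struct ClientQueue {
    tx: SyncSender<String>,
}

impl ClientQueue {
    /// `capacity` is the per-client threshold (an assumed tuning knob, at least 1).
    /// The returned receiver is what the connection's writer task would drain.
    pub fn new(capacity: usize) -> (Self, Receiver<String>) {
        let (tx, rx) = sync_channel(capacity);
        (ClientQueue { tx }, rx)
    }

    /// Never blocks the dispatcher: a full queue means the client is too slow.
    pub fn push(&self, message: String) -> Result<(), DropReason> {
        match self.tx.try_send(message) {
            Ok(()) => Ok(()),
            Err(TrySendError::Full(_)) => Err(DropReason::QueueFull),
            Err(TrySendError::Disconnected(_)) => Err(DropReason::Disconnected),
        }
    }
}
```

On `Err(DropReason::QueueFull)` the gateway would send the disconnect reason and close the WebSocket, so one slow consumer cannot hold back delivery to everyone else on the topic.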

6. Use Case: Private Stream for Authenticated User

Actors: Authenticated Client, Order Service

Description: Each user receives their own order updates through a per-user stream such as orders.user123.

Flow:

  1. Order Service publishes private order update to Kafka topic: orders.user123

  2. WebSocket Gateway reads this Kafka topic

  3. Routes message only to the client authenticated as user123
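Routing a private message only to its owner comes down to mapping the user id embedded in the topic to that user's open connections. The Rust sketch below assumes the `orders.<userId>` naming from the example and that one user may hold several connections at once (for instance web and mobile); the `UserConnections` type and its methods are hypothetical.

```rust
use std::collections::HashMap;

/// userId -> ids of that user's authenticated connections.
#[derive(Default)]
pub struct UserConnections {
    by_user: HashMap<String, Vec<u64>>,
}

impl UserConnections {
    /// Called after a successful handshake, with the user id taken from the token.
    pub fn register(&mut self, user_id: &str, connection: u64) {
        self.by_user.entry(user_id.to_string()).or_default().push(connection);
    }

    /// Connections allowed to receive a message from a private topic.
    /// Topics that are not `orders.<userId>`, or whose user has no open
    /// connection, yield no recipients.
    pub fn recipients(&self, topic: &str) -> &[u64] {
        topic
            .strip_prefix("orders.")
            .and_then(|user_id| self.by_user.get(user_id))
            .map(|connections| connections.as_slice())
            .unwrap_or(&[])
    }
}
```

Because recipients are derived from the authenticated identity rather than from what a client asked to subscribe to, a message on orders.user123 cannot reach a connection authenticated as a different user.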

7. Use Case: Reconnect with Resume Support

Actors: Authenticated Client, WebSocket Gateway

Description: If the connection is lost, allow the client to reconnect and resume from its last known sequence number.

Flow:

  1. Client reconnects and sends:

{ "action": "resume", "streams": ["depth.btc_usdt"], "lastSequence": 21253412 }

  2. Gateway queries cache or buffer (e.g., Redis or local memory) for missed messages.

  3. Sends missed updates, then resumes real-time stream.
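The "local memory" variant of the missed-message buffer could be a bounded, per-stream replay buffer keyed by sequence number. The Rust sketch below assumes sequences increase by 1 per update. The `SnapshotRequired` outcome is an addition of this sketch for the case where the requested gap is older than what the buffer still holds; the specification does not say how that case is handled.

```rust
use std::collections::VecDeque;

/// Bounded per-stream history of (sequence, payload), oldest first.
pub struct ReplayBuffer {
    capacity: usize,
    entries: VecDeque<(u64, String)>,
}

/// Result of a resume request.
#[derive(Debug, PartialEq)]
pub enum Resume {
    /// Updates the client missed, in order (empty if it is already up to date).
    Missed(Vec<(u64, String)>),
    /// The gap is older than the buffer; the client needs a fresh snapshot.
    SnapshotRequired,
}

impl ReplayBuffer {
    pub fn new(capacity: usize) -> Self {
        ReplayBuffer { capacity, entries: VecDeque::with_capacity(capacity) }
    }

    /// Record a published update, evicting the oldest one when full.
    pub fn push(&mut self, sequence: u64, payload: String) {
        if self.entries.len() >= self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back((sequence, payload));
    }

    /// Everything after `last_sequence`, provided no update in between was evicted.
    pub fn resume_from(&self, last_sequence: u64) -> Resume {
        match self.entries.front() {
            Some((oldest, _)) if *oldest > last_sequence.saturating_add(1) => {
                Resume::SnapshotRequired
            }
            _ => Resume::Missed(
                self.entries
                    .iter()
                    .filter(|(seq, _)| *seq > last_sequence)
                    .cloned()
                    .collect(),
            ),
        }
    }
}
```

After replaying the `Missed` updates, the gateway would attach the client to the live stream. For order book depth in particular, replaying from a gap would produce an inconsistent book, which is why this sketch falls back to a snapshot instead.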

Summary Table

| Use Case | Description | Key Events |
|---|---|---|
| Handshake + Auth | JWT-based connection init | onAuthValidate |
| Subscribe Topics | Subscribe to market/private topics | onSubscribeRequest |
| Kafka to Clients | Consume Kafka & push to clients | onKafkaMessage |
| Unsubscribe | Stop receiving a stream | onUnsubscribeRequest |
| Backpressure | Drop or pause slow clients | onClientOverload |
| Private Streams | Per-user updates | onUserStreamPush |
| Resume Support | Reconnect and resume | onResumeStream |