April 11, 2026

Building Real-Time Chat with Go WebSockets

How I shipped a production-grade WebSocket chat service—Hub pattern, presence tracking, and memory-safe design—on Day 26 of my 100-day coding streak.


Why Real-Time Is Different

Most HTTP handlers are stateless. A request arrives, you read from a database, you write a response, you forget the caller exists. That simplicity is a gift—it's why horizontal scaling is easy.

WebSockets throw that away on purpose. A connection is a long-lived, stateful object that can receive messages at any moment. The server has to remember every open connection, know which room it belongs to, and be able to push data to it without the client asking.

That change in model is small on paper and enormous in practice. Before writing a single line of this service, I had to decide: who owns the connections?


The Hub Pattern

The answer is a Hub—a single goroutine-safe registry that every connection reports to.

type Hub struct {
    mu         sync.RWMutex
    clients    map[*Client]bool
    rooms      map[string]map[*Client]bool
    messages   map[string][]Message
    register   chan *Client
    unregister chan *Client
    broadcast  chan broadcastRequest
    stopChan   chan struct{}
}

Three channels do all the work:

Channel      Direction      Purpose
register     client → hub   Add a new connection
unregister   client → hub   Remove a closing connection
broadcast    client → hub   Deliver a message to a room
A single run() goroutine reads from all three. Because only run() mutates the maps, the channel handlers never race with each other. The sync.RWMutex exists for the handful of read-only query methods (GetActiveUsers, GetHistory) that run concurrently from HTTP handlers: they take the read lock, and run() takes the write lock around its mutations, so those readers never observe a half-updated map.
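A sketch of what that loop might look like, trimmed to the fields it uses. The handler bodies are my reading of the design, not the repo's code, and broadcastRequest's fields (RoomID, Payload) are my guess at its shape:

```go
package main

import "sync"

type Client struct{ Send chan []byte }

type broadcastRequest struct {
	RoomID  string
	Payload []byte
}

type Hub struct {
	mu         sync.RWMutex
	clients    map[*Client]bool
	rooms      map[string]map[*Client]bool
	register   chan *Client
	unregister chan *Client
	broadcast  chan broadcastRequest
	stopChan   chan struct{}
}

// run is the hub's single event loop: the only goroutine that
// mutates clients and rooms. The write lock keeps the HTTP-side
// RLock readers from seeing a map mid-mutation.
func (h *Hub) run() {
	for {
		select {
		case c := <-h.register:
			h.mu.Lock()
			h.clients[c] = true
			h.mu.Unlock()
		case c := <-h.unregister:
			h.mu.Lock()
			if _, ok := h.clients[c]; ok {
				delete(h.clients, c)
				for _, room := range h.rooms {
					delete(room, c)
				}
				close(c.Send)
			}
			h.mu.Unlock()
		case req := <-h.broadcast:
			h.mu.RLock()
			for c := range h.rooms[req.RoomID] {
				select {
				case c.Send <- req.Payload:
				default:
					// Slow client; its send buffer is full.
					close(c.Send)
				}
			}
			h.mu.RUnlock()
		case <-h.stopChan:
			return
		}
	}
}
```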

One subtlety caught me early: while run() is handling a register event, sendOnlineNotification calls Broadcast, which sends to the broadcast channel. If that channel is unbuffered, run() deadlocks: it blocks sending to a channel that only run() itself reads. The fix is a buffer:

broadcast: make(chan broadcastRequest, 256),

Small detail, nasty bug to debug at midnight.


Presence Tracking

"Who's online?" sounds like a simple question. The naïve answer—iterate every connected client and count—is O(n) on every HTTP request to /stats. Fine at ten users, painful at ten thousand.

The Hub's rooms map gives O(1) answers for free:

func (h *Hub) GetActiveUsersCount(roomID string) int {
    h.mu.RLock()
    defer h.mu.RUnlock()
    if clients, ok := h.rooms[roomID]; ok {
        return len(clients)
    }
    return 0
}

The count is a side-effect of the structure, not a separate counter that has to be kept in sync. When a client joins, it's added to rooms[roomID]. When it leaves, it's removed. len() is always correct.

The full presence shape includes enough for a UI to render a "who's here" list without extra queries:

type UserPresence struct {
    UserID   string    `json:"user_id"`
    Username string    `json:"username"`
    Rooms    []string  `json:"rooms"`
    LastSeen time.Time `json:"last_seen"`
    IsOnline bool      `json:"is_online"`
}
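One way the hub might assemble that shape from the rooms map alone, no extra bookkeeping. This is a sketch: I'm assuming Client carries UserID and Username, and that connected means online, so LastSeen is simply "now":

```go
package main

import (
	"sync"
	"time"
)

type Client struct {
	UserID   string
	Username string
}

type UserPresence struct {
	UserID   string    `json:"user_id"`
	Username string    `json:"username"`
	Rooms    []string  `json:"rooms"`
	LastSeen time.Time `json:"last_seen"`
	IsOnline bool      `json:"is_online"`
}

type Hub struct {
	mu    sync.RWMutex
	rooms map[string]map[*Client]bool
}

// GetPresence walks the rooms map once, grouping room memberships
// by user. Everything a "who's here" panel needs, in one pass.
func (h *Hub) GetPresence() []UserPresence {
	h.mu.RLock()
	defer h.mu.RUnlock()
	byUser := make(map[string]*UserPresence)
	for roomID, clients := range h.rooms {
		for c := range clients {
			p, ok := byUser[c.UserID]
			if !ok {
				p = &UserPresence{
					UserID:   c.UserID,
					Username: c.Username,
					LastSeen: time.Now(),
					IsOnline: true, // connected, by definition
				}
				byUser[c.UserID] = p
			}
			p.Rooms = append(p.Rooms, roomID)
		}
	}
	out := make([]UserPresence, 0, len(byUser))
	for _, p := range byUser {
		out = append(out, *p)
	}
	return out
}
```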

Memory Safety in Practice

A chat service with no cleanup is a memory leak. Rooms accumulate history indefinitely; abandoned rooms never disappear. Two goroutines handle this.

Room cleanup runs every five minutes and drops any room whose history is empty or whose oldest message is more than 24 hours old:

func (h *Hub) cleanupExpiredRooms() {
    ticker := time.NewTicker(5 * time.Minute)
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            h.mu.Lock()
            now := time.Now()
            for roomID, messages := range h.messages {
                if len(messages) == 0 ||
                   now.Sub(messages[0].CreatedAt) > 24*time.Hour {
                    delete(h.messages, roomID)
                }
            }
            h.mu.Unlock()
        case <-h.stopChan:
            return
        }
    }
}

History is bounded at read time, not write time. The slice grows freely, but GetHistory returns at most limit entries from the tail. This means you pay for storage up to 24 hours but never return more than the caller asked for.
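Bounding at read time is only a few lines. A sketch (the real signature may differ); the copy at the end matters so callers can't mutate the hub's backing array:

```go
package main

import "sync"

type Message struct {
	Body string
}

type Hub struct {
	mu       sync.RWMutex
	messages map[string][]Message
}

// GetHistory returns at most limit messages from the tail of a
// room's history. Storage grows until cleanup runs, but callers
// never receive more than they asked for.
func (h *Hub) GetHistory(roomID string, limit int) []Message {
	h.mu.RLock()
	defer h.mu.RUnlock()
	msgs := h.messages[roomID]
	if limit <= 0 || limit > len(msgs) {
		limit = len(msgs)
	}
	// Copy the tail so the caller holds no reference to hub state.
	out := make([]Message, limit)
	copy(out, msgs[len(msgs)-limit:])
	return out
}
```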

Slow clients are handled inline during broadcast:

select {
case client.Send <- payload:
default:
    close(client.Send)
}

If the send buffer is full—a client that's connected but not reading—we close its channel. The WritePump goroutine detects this and terminates the connection. No goroutine leak, no blocking the broadcast loop.


Heartbeats and Stale Connections

TCP connections can disappear silently. A phone loses wifi; a corporate firewall kills idle sockets; a process crashes. The server holds a *Client pointing at a conn that will never respond, and it doesn't know yet.

The WritePump sends a WebSocket ping every second:

ticker := time.NewTicker(time.Second)
defer ticker.Stop()
for {
    select {
    case <-ticker.C:
        c.mu.Lock()
        _ = c.conn.WriteMessage(websocket.PingMessage, nil)
        c.mu.Unlock()
    // ...
    }
}

If the write fails—because the conn is dead—WritePump returns, stopCh closes, ReadPump exits, and unregister is sent to the hub. The client disappears from the rooms map within one second of the dead connection being detected.

One second is aggressive. In practice you'd extend this to 10–30 seconds to avoid unnecessary load. But for a dev service, it surfaces stale connections immediately.


The Frontend in One File

The entire demo UI lives in public/index.html—no build step, no framework, no bundler. Vanilla JS with three responsibilities:

  1. Connect to ws://host/ws/{room}?username=... on load and reconnect after 3 seconds on drop.
  2. Send a JSON payload when the user hits Enter or the Send button.
  3. Render incoming messages into styled bubbles, distinguishing me, other, and [System] messages.

Room switching closes the current socket and opens a new one:

document.getElementById('room-select').addEventListener('change', () => {
    if (socket) socket.close();
    setTimeout(() => connect(false), 100);
});

The 100ms delay lets the close frame complete before the new handshake starts. Small detail, avoids a race where the new connection arrives before the server fully processes the close.

Presence counts poll /api/v1/chat/rooms/{room}/stats on open and every 5 seconds. This is fine for a demo—a production system would push counts over the WebSocket itself to avoid polling.


Deployment

The Dockerfile is a two-stage build:

FROM golang:1.22-alpine AS build
WORKDIR /src
COPY go.mod go.sum* ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

FROM alpine:3.20
WORKDIR /app
RUN apk add --no-cache ca-certificates
COPY --from=build /out/server ./server
COPY migrations ./migrations
COPY public ./public
EXPOSE 8080
ENTRYPOINT ["./server"]

The final image is Alpine plus a single static binary, with no Go toolchain. CGO_ENABLED=0 produces a binary with no glibc dependency, which is what makes the musl-based Alpine base workable.

compose.yaml wires Postgres and Redis with health checks so the app doesn't start until both dependencies are actually ready—not just "container running."
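The wiring looks roughly like this (service names, images, and credentials here are illustrative, not copied from the repo). The key is depends_on with condition: service_healthy, which gates app startup on the health checks rather than on container start:

```yaml
services:
  app:
    build: .
    ports: ["8080:8080"]
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: dev
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5
```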

For cloud deployment, the same image goes to AWS ECS Fargate with DB_AUTO_MIGRATE=true. The migration runner reads SQL files in filename order, so schema changes are a new numbered file in migrations/.


What I'd Do Differently

Persist messages to Postgres. Right now history is in-memory. A restart wipes it. The messages table exists in the schema; hooking it up is a few inserts in the broadcast handler and a query in GetHistory. I left it for day 27.

Push presence over WebSocket. Polling /stats every 5 seconds works but wastes requests. The hub could fan out a presence update to all room members whenever someone joins or leaves.

Add authentication. The WebSocket endpoint accepts any username from the query string—trivially spoofed. JWT validation in the upgrade handler would fix this. The auth service from week 3 already issues tokens; connecting them is mostly plumbing.

Rate-limit the broadcast. A single client can flood a room by sending messages in a tight loop. A token bucket per client (100 messages/minute is generous) would prevent this without blocking legitimate use.
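A per-client limiter could be as small as this. A stdlib-only sketch of the token-bucket idea (golang.org/x/time/rate is the production-grade version); 100 messages/minute is a capacity of 100 refilled at 100/60 tokens per second:

```go
package main

import (
	"sync"
	"time"
)

// bucket is a minimal token bucket: capacity tokens, refilled
// continuously at perSec tokens per second.
type bucket struct {
	mu       sync.Mutex
	tokens   float64
	capacity float64
	perSec   float64
	last     time.Time
}

func newBucket(capacity, perSec float64) *bucket {
	return &bucket{tokens: capacity, capacity: capacity, perSec: perSec, last: time.Now()}
}

// allow spends one token if available. A client whose bucket is
// empty gets its message dropped (or, less charitably, its
// connection closed).
func (b *bucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.perSec
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

The broadcast handler would call allow() before forwarding, so a flooding client is throttled without blocking the hub loop.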


Repository: golang-chat-service

Related (other posts from the same streak, with source):