Building a Real-time Transcript System with WebSocket

Introduction
Building a real-time transcript system requires careful consideration of architecture, state management, and performance optimization. In this article, we'll explore how to design and implement a robust transcript system that can handle real-time updates efficiently.
System Architecture
The transcript system uses a client-server architecture in which the backend is the single source of truth and each frontend keeps a synchronized copy:
graph TD
    Backend[Backend TranscriptService] --> |WebSocket Events| FrontendService[Frontend TranscriptsService]
    FrontendService --> |Notify| Store[Store Layer]
    Store --> |Render| UI[UI Components]
    UI --> |User Actions| FrontendService
    FrontendService --> |WebSocket Events| Backend
Backend Design
The backend serves as the source of truth for all transcript data. It maintains:
User-specific State Maps
- Last spoke time tracking
- Current text buffers
- Translation states
- Transcript history
Core Components
type UserId = string;
type TranscriptId = string;

class TranscriptService {
  /** Maximum time gap (ms) between speech segments */
  private readonly MAX_GAP = 3000;

  /** User-specific state maps, keyed by user ID */
  private readonly lastSpokeTime = new Map<UserId, {
    interviewer: number;
    candidate: number;
  }>();
  private readonly currentText = new Map<UserId, {
    interviewer: string;
    candidate: string;
  }>();
  private readonly transcripts = new Map<UserId, Map<TranscriptId, TranscriptType>>();
}
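To make the role of MAX_GAP concrete, here is a minimal sketch of how a text buffer might be closed after a long pause. The `SpeakerState` shape and the `appendSpeech` helper are assumptions for illustration and are not part of the service above; the real segmentation logic may differ.

```typescript
// Hypothetical types: the article does not define these shapes.
type Role = 'interviewer' | 'candidate';

interface SpeakerState {
  lastSpokeTime: Record<Role, number>;
  currentText: Record<Role, string>;
}

const MAX_GAP = 3000; // ms, same value as TranscriptService.MAX_GAP

/**
 * Appends a recognized chunk to the speaker's buffer.
 * Returns the finished segment when the pause exceeded MAX_GAP, otherwise null.
 */
function appendSpeech(
  state: SpeakerState,
  role: Role,
  chunk: string,
  now: number,
): string | null {
  const previous = state.currentText[role];
  const gap = now - state.lastSpokeTime[role];
  let finished: string | null = null;

  if (previous !== '' && gap > MAX_GAP) {
    // Long pause: close the old segment and start a new one.
    finished = previous;
    state.currentText[role] = chunk;
  } else {
    // Same utterance: keep extending the buffer.
    state.currentText[role] = previous === '' ? chunk : `${previous} ${chunk}`;
  }

  state.lastSpokeTime[role] = now;
  return finished;
}
```

In this sketch the caller would store the returned segment in the transcript history and broadcast it; keeping the function free of I/O makes the gap rule easy to test.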
Frontend Architecture
The frontend maintains a synchronized local cache and handles:
- State Management
class TranscriptsService {
  /** Local cache of transcripts */
  private transcripts: TranscriptType[];
  /** Feature configuration */
  private isAIEnabled: boolean;
  private isTranslateEnabled: boolean;
  private targetLanguage: 'en' | 'zh';
}
- Message Types
interface TranscriptMessage {
  type: 'transcript';
  transcript: TranscriptType;
  aiEnabled: boolean;
  role: Role;
  sessionId: string;
}
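Because messages arrive over the wire as untyped JSON, it can help to validate them before trusting the interface. The sketch below shows one way to do that with a type guard. The `TranscriptType` fields and the `Role` values are assumptions, since the article references both without defining them.

```typescript
// Assumed shapes: the article references TranscriptType and Role without defining them.
type Role = 'interviewer' | 'candidate';

interface TranscriptType {
  id: string;
  text: string;
}

interface TranscriptMessage {
  type: 'transcript';
  transcript: TranscriptType;
  aiEnabled: boolean;
  role: Role;
  sessionId: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

/** Narrows an untrusted payload to TranscriptMessage. */
function isTranscriptMessage(value: unknown): value is TranscriptMessage {
  if (!isRecord(value)) return false;
  const transcript = value.transcript;
  if (!isRecord(transcript)) return false;
  return (
    value.type === 'transcript' &&
    typeof transcript.id === 'string' &&
    typeof transcript.text === 'string' &&
    typeof value.aiEnabled === 'boolean' &&
    (value.role === 'interviewer' || value.role === 'candidate') &&
    typeof value.sessionId === 'string'
  );
}
```

A handler can then drop or log any payload for which `isTranscriptMessage` returns false instead of passing it on to the stores.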
Real-time Communication
WebSocket Events
Backend to Frontend
- transcript_update: New or updated transcript
- sync_transcripts: Initial state sync
- clear_transcripts: Reset notification
Frontend to Backend
- Configuration updates
- User actions
- State synchronization requests
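The three backend-to-frontend events can be modeled as a discriminated union and applied to the local cache with a pure function. This is a sketch under assumptions: the payload field names (`transcript`, `transcripts`) and the `id` field used for matching are hypothetical, as the article does not specify the payload shapes.

```typescript
// Assumed shape: the article does not define TranscriptType.
interface TranscriptType {
  id: string;
  text: string;
}

// The three backend-to-frontend events as a discriminated union.
type ServerEvent =
  | { type: 'transcript_update'; transcript: TranscriptType }
  | { type: 'sync_transcripts'; transcripts: TranscriptType[] }
  | { type: 'clear_transcripts' };

/** Returns the next local cache after applying one server event. */
function applyServerEvent(
  current: readonly TranscriptType[],
  serverEvent: ServerEvent,
): TranscriptType[] {
  switch (serverEvent.type) {
    case 'sync_transcripts':
      // Initial sync: the server state replaces the local cache.
      return [...serverEvent.transcripts];
    case 'clear_transcripts':
      return [];
    case 'transcript_update': {
      // Upsert: replace a transcript with the same id, otherwise append.
      const index = current.findIndex((t) => t.id === serverEvent.transcript.id);
      if (index === -1) return [...current, serverEvent.transcript];
      const next = [...current];
      next[index] = serverEvent.transcript;
      return next;
    }
  }
}
```

Returning a new array instead of mutating in place keeps change detection in the store layer simple, at the cost of a copy per event.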
Data Flow
- Initial Load
class TranscriptsService {
  private setupSocketListeners() {
    this.socket.on('sync_transcripts', (data) => {
      this.transcripts = data;
      this.notifyStores();
    });
  }
}
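The handler above calls `notifyStores()` without showing it. One plausible shape is a small observer: stores subscribe with a callback and are called on every cache change. The `TranscriptCache` class, its `subscribe` and `replaceAll` methods, and the `TranscriptType` fields are all assumptions made for this sketch.

```typescript
// Assumed shape: the article does not define TranscriptType.
interface TranscriptType {
  id: string;
  text: string;
}

type TranscriptListener = (transcripts: readonly TranscriptType[]) => void;

class TranscriptCache {
  private transcripts: TranscriptType[] = [];
  private readonly listeners = new Set<TranscriptListener>();

  /** Registers a store callback and returns an unsubscribe function. */
  subscribe(listener: TranscriptListener): () => void {
    this.listeners.add(listener);
    listener(this.transcripts); // push the current state immediately
    return () => {
      this.listeners.delete(listener);
    };
  }

  /** Replaces the cache, as the sync_transcripts handler does. */
  replaceAll(transcripts: TranscriptType[]): void {
    this.transcripts = [...transcripts];
    this.notifyStores();
  }

  private notifyStores(): void {
    this.listeners.forEach((listener) => listener(this.transcripts));
  }
}
```

Calling the listener once on subscribe means a store that mounts late still receives the current transcripts without waiting for the next event.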
- Real-time Updates
// Method of a gateway class; assuming NestJS, both decorators come from '@nestjs/websockets'
@SubscribeMessage('transcript')
async handleTranscript(@MessageBody() message: TranscriptMessage) {
  // Update source of truth
  const updated = await this.transcriptService.update(message);
  // Broadcast to all clients
  this.broadcast(updated);
}
Error Handling
Connection Management
- Automatic Reconnection
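class TranscriptsService {
  private reconnectAttempts = 0;

  private setupReconnection() {
    // Reset the counter only after a successful connection
    this.socket.on('connect', () => {
      this.reconnectAttempts = 0;
    });
    this.socket.on('disconnect', () => {
      this.scheduleReconnect();
    });
  }

  private scheduleReconnect() {
    if (this.reconnectAttempts >= MAX_RECONNECT_ATTEMPTS) {
      this.notifyError('Connection failed');
      return;
    }
    // Count this attempt so the limit above can be reached
    this.reconnectAttempts += 1;
    setTimeout(() => this.connect(), this.getBackoffDelay());
  }
}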
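The reconnection code relies on `getBackoffDelay()`, which is not shown. A common choice is exponential backoff with an upper bound, optionally with jitter so that many clients do not reconnect at the same instant. The sketch below is written as a standalone function that takes the attempt number; the constants `BASE_DELAY_MS` and `MAX_DELAY_MS` and the `jitter` and `random` parameters are assumptions, not values from the original system.

```typescript
// Hypothetical constants: the article does not give values for these.
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 30000;

/**
 * Exponential backoff with an upper bound and optional full jitter.
 * attempt is zero-based; random is injectable so tests can be deterministic.
 */
function getBackoffDelay(
  attempt: number,
  jitter: boolean = false,
  random: () => number = Math.random,
): number {
  // 500, 1000, 2000, ... capped at MAX_DELAY_MS
  const exponential = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
  // Full jitter picks a delay in [0, exponential)
  return jitter ? Math.floor(random() * exponential) : exponential;
}
```

Inside the service, the method version would pass `this.reconnectAttempts` as the attempt number.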
- State Recovery
class TranscriptService {
  async recoverState(userId: string) {
    // A user with no stored transcripts recovers to an empty list
    const transcripts = this.transcripts.get(userId);
    return transcripts ? Array.from(transcripts.values()) : [];
  }
}