# DataSet Binary Middleware Protocol – Cross-Language API Reference
## Overview
The protocol provides real-time binary access to `TDataSet` instances over TCP. Server and client exchange packets that share a fixed header, a command set, status codes, and structured payloads. Any language can interoperate by conforming to the packet layout defined in the reference Pascal unit.
---
## Packet Structure
Each packet consists of:
1. **Header (`TPacketHeader` equivalent)**
- `Version` (Word, 16-bit unsigned): Protocol version (`0x0100`).
- `Command` (Byte): Command identifier (see command codes).
- `Status` (Byte): Response status (OK, ERROR, etc.).
- `SessionId` (LongWord, 32-bit unsigned): Random session identifier generated when opening.
- `DataSize` (LongWord, 32-bit unsigned): Length in bytes of the payload following the header.
- `Checksum` (LongWord, 32-bit unsigned): Additive checksum of payload bytes.
2. **Payload**
- Binary stream structured by commands, including integers, strings with length prefixes, and raw blobs.
All integers are little-endian (Free Pascal uses the host's native byte order, which is little-endian on x86 and x86-64). Strings are sent as:
- 32-bit length (Integer)
- UTF-8/ANSI bytes (no null terminator)
Nullability is indicated per field:
- Field Null Flag: Boolean (`1` byte)
- If `True`, no additional bytes follow for that field.
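
The header and string layout above can be sketched in Python with the `struct` module. This is a minimal sketch, not the reference implementation: it assumes the header is packed with no padding (16 bytes) and that the additive checksum wraps at 32 bits to fit the `LongWord` field. The helper names `pack_header`, `unpack_header`, and `pack_string` are hypothetical.

```python
import struct

# Word, Byte, Byte, LongWord x3, little-endian.
# Assumption: the Pascal record is packed, so there is no padding (16 bytes).
HEADER = struct.Struct("<HBBIII")
PROTOCOL_VERSION = 0x0100


def pack_header(command, status, session_id, payload):
    """Build the header bytes that precede `payload`."""
    # Assumption: the additive checksum is reduced modulo 2**32.
    checksum = sum(payload) & 0xFFFFFFFF
    return HEADER.pack(PROTOCOL_VERSION, command, status,
                       session_id, len(payload), checksum)


def unpack_header(data):
    """Decode the first 16 bytes of `data` into a dict of header fields."""
    names = ("version", "command", "status", "session_id", "data_size", "checksum")
    return dict(zip(names, HEADER.unpack(data[:HEADER.size])))


def pack_string(text):
    """Length-prefixed string: 32-bit length, then UTF-8 bytes, no terminator."""
    raw = text.encode("utf-8")
    return struct.pack("<i", len(raw)) + raw
```

For example, `pack_header(0x07, 0x00, session_id, pack_string("id"))` yields a header whose `DataSize` is 6: four length bytes plus two name bytes.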
---
## Command Codes
| Command | Value | Description |
|--------|-------|-------------|
| `CMD_OPEN_DATASET` | `0x01` | Request to open dataset (reserved). |
| `CMD_CLOSE_DATASET` | `0x02` | Close dataset connection. |
| `CMD_FETCH_DATA` | `0x03` | Fetch dataset or record payload. |
| `CMD_UPDATE_DATA` | `0x04` | Update record (with serialized fields). |
| `CMD_DELETE_DATA` | `0x05` | Delete record request. |
| `CMD_INSERT_DATA` | `0x06` | Insert record. |
| `CMD_METADATA` | `0x07` | Exchange schema metadata. |
| `CMD_DELTA_UPDATE` | `0x08` | Server broadcast showing field change. |
| `CMD_PING` | `0x09` | Heartbeat. |
| `CMD_ACK` | `0x0A` | Acknowledgment. |
| `CMD_ERROR` | `0xFF` | Error notification. |
---
## Status Codes
| Name | Value | Meaning |
|------|-------|---------|
| `STATUS_OK` | `0x00` | Success |
| `STATUS_ERROR` | `0x01` | General failure |
| `STATUS_NO_DATA` | `0x02` | No dataset records |
| `STATUS_MORE_DATA` | `0x03` | Expect more segments |
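
A client might mirror the command and status tables as enums so that raw header bytes can be logged by name. The values below are copied from the tables. The class names and the `describe` helper are hypothetical.

```python
from enum import IntEnum


class Command(IntEnum):
    OPEN_DATASET = 0x01
    CLOSE_DATASET = 0x02
    FETCH_DATA = 0x03
    UPDATE_DATA = 0x04
    DELETE_DATA = 0x05
    INSERT_DATA = 0x06
    METADATA = 0x07
    DELTA_UPDATE = 0x08
    PING = 0x09
    ACK = 0x0A
    ERROR = 0xFF


class Status(IntEnum):
    OK = 0x00
    ERROR = 0x01
    NO_DATA = 0x02
    MORE_DATA = 0x03


def _name(enum_cls, value):
    """Return the enum member name, or a hex placeholder for unknown bytes."""
    try:
        return enum_cls(value).name
    except ValueError:
        return f"UNKNOWN(0x{value:02X})"


def describe(command_byte, status_byte):
    """Readable 'COMMAND/STATUS' label for logging."""
    return f"{_name(Command, command_byte)}/{_name(Status, status_byte)}"
```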
---
## Serialization Rules
### Fields
For each `TField`, the payload must include:
1. Null flag (`Boolean`)
2. Field-specific data:
- Strings (`ftString`, `ftMemo`, `ftWideString`): `[Length: Integer][Bytes]`
- Integers (`ftInteger`, `ftSmallint`, `ftWord`): `Integer`
- Large integers (`ftLargeint`): `Int64`
- Floats (`ftFloat`, `ftCurrency`, `ftBCD`): `Double`
- Boolean (`ftBoolean`): `Boolean`
- Date/Time (`ftDate`, `ftTime`, `ftDateTime`): `TDateTime` (`Double`)
- Blobs (`ftBlob`, `ftGraphic`): `[Size: LongInt][Raw bytes]`
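
The field rules above could be encoded as follows. This is a sketch under assumptions: the `kind` strings are hypothetical tags standing in for groups of `TFieldType` values, strings are encoded as UTF-8, and `Integer`/`LongInt` are treated as 32-bit signed values.

```python
import struct


def encode_field(kind, value):
    """Serialize one field: a 1-byte null flag, then the data unless null.

    `kind` is a hypothetical tag: "string", "int", "largeint", "float",
    "datetime", "bool", or "blob".
    """
    if value is None:
        return b"\x01"                      # null flag True, nothing follows
    flag = b"\x00"
    if kind == "string":                    # ftString, ftMemo, ftWideString
        raw = value.encode("utf-8")
        return flag + struct.pack("<i", len(raw)) + raw
    if kind == "int":                       # ftInteger, ftSmallint, ftWord
        return flag + struct.pack("<i", value)
    if kind == "largeint":                  # ftLargeint
        return flag + struct.pack("<q", value)
    if kind in ("float", "datetime"):       # Double; TDateTime is also a Double
        return flag + struct.pack("<d", value)
    if kind == "bool":                      # single-byte Boolean
        return flag + (b"\x01" if value else b"\x00")
    if kind == "blob":                      # ftBlob, ftGraphic
        return flag + struct.pack("<i", len(value)) + value
    raise ValueError(f"unsupported field kind: {kind}")
```

A record is then the concatenation of `encode_field` results in schema order.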
### Metadata Packet (`CMD_METADATA`)
- Field count (`Integer`)
- For each field:
- Field name length (`Integer`) + name bytes
- Field type (`TFieldType`, underlying `Integer`)
- Field size (`Integer`)
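
A `CMD_METADATA` payload with this layout could be parsed as shown below. The function name `parse_metadata` is hypothetical, field names are assumed to be UTF-8, and the type code is returned as a raw integer because the ordinal values of `TFieldType` should be taken from the reference Pascal unit rather than guessed.

```python
import struct


def parse_metadata(payload):
    """Return a list of (name, type_code, size) tuples from a metadata payload."""
    (count,) = struct.unpack_from("<i", payload, 0)
    offset = 4
    fields = []
    for _ in range(count):
        # Field name: 32-bit length followed by the name bytes.
        (name_len,) = struct.unpack_from("<i", payload, offset)
        offset += 4
        name = payload[offset:offset + name_len].decode("utf-8")
        offset += name_len
        # Field type and field size, each a 32-bit integer.
        type_code, size = struct.unpack_from("<ii", payload, offset)
        offset += 8
        fields.append((name, type_code, size))
    return fields
```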
### Data Packet (`CMD_FETCH_DATA`)
- Record count (`Integer`)
- For each record, fields serialized sequentially (see above)
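
Decoding a `CMD_FETCH_DATA` payload requires the schema, since field values carry no type tags of their own. The sketch below assumes a schema expressed as a list of hypothetical kind tags (`"int"`, `"largeint"`, `"float"`, `"bool"`, `"string"`, `"blob"`). In practice these would be derived from the metadata handshake.

```python
import struct

# Hypothetical kind tags mapped to struct formats for fixed-width values.
FIXED = {"int": "<i", "largeint": "<q", "float": "<d", "bool": "<?"}


def decode_records(payload, schema):
    """Decode a record count followed by sequentially serialized fields."""
    (count,) = struct.unpack_from("<i", payload, 0)
    offset = 4
    rows = []
    for _ in range(count):
        row = []
        for kind in schema:
            is_null = payload[offset] != 0      # 1-byte null flag
            offset += 1
            if is_null:
                row.append(None)                # no data bytes follow
            elif kind in FIXED:
                fmt = FIXED[kind]
                (value,) = struct.unpack_from(fmt, payload, offset)
                offset += struct.calcsize(fmt)
                row.append(value)
            else:                               # "string" or "blob"
                (n,) = struct.unpack_from("<i", payload, offset)
                offset += 4
                raw = payload[offset:offset + n]
                offset += n
                row.append(raw.decode("utf-8") if kind == "string" else raw)
        rows.append(row)
    return rows
```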
### Delta Packet (`CMD_DELTA_UPDATE`)
- Field name length/bytes
- Old value length/bytes (string)
- New value length/bytes (string)
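
A delta payload is three length-prefixed strings in a row, so a parser only needs one string reader applied three times. The names `read_string` and `parse_delta` are hypothetical, and UTF-8 is assumed for all three strings.

```python
import struct


def read_string(buf, offset):
    """Read a 32-bit length-prefixed string; return (text, next_offset)."""
    (n,) = struct.unpack_from("<i", buf, offset)
    start = offset + 4
    return buf[start:start + n].decode("utf-8"), start + n


def parse_delta(payload):
    """Return (field_name, old_value, new_value) from a delta payload."""
    name, pos = read_string(payload, 0)
    old, pos = read_string(payload, pos)
    new, pos = read_string(payload, pos)
    return name, old, new
```

A client holding a local row as a dict could then apply the change with `row[name] = new`, after converting the string to the field's type.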
---
## Example Sequence (Client Perspective)
1. **Connect** to server socket.
2. Send `CMD_METADATA` packet → Receive header with `CMD_METADATA` and payload describing schema.
3. Send `CMD_FETCH_DATA` → Receive dataset stream with record count + serialized fields.
4. For updates, send `CMD_UPDATE_DATA`, `CMD_INSERT_DATA`, or `CMD_DELETE_DATA` with serialized record.
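
The request/response steps above can be sketched as a small blocking client. Everything here is illustrative: the helper names are hypothetical, the header is assumed to be packed (16 bytes), the checksum is assumed to wrap at 32 bits, and how the client first learns its `SessionId` is not specified here, so it is passed in as a parameter.

```python
import socket
import struct

HEADER = struct.Struct("<HBBIII")   # assumption: packed, no padding
CMD_FETCH_DATA, CMD_METADATA = 0x03, 0x07


def recv_exact(sock, n):
    """Read exactly n bytes, raising if the peer closes early."""
    data = bytearray()
    while len(data) < n:
        part = sock.recv(n - len(data))
        if not part:
            raise ConnectionError("socket closed mid-packet")
        data += part
    return bytes(data)


def send_packet(sock, command, session_id, payload=b"", status=0):
    """Write header and payload in one call."""
    checksum = sum(payload) & 0xFFFFFFFF
    header = HEADER.pack(0x0100, command, status, session_id, len(payload), checksum)
    sock.sendall(header + payload)


def recv_packet(sock):
    """Read one packet and verify its checksum."""
    _version, command, status, session_id, size, checksum = HEADER.unpack(
        recv_exact(sock, HEADER.size))
    payload = recv_exact(sock, size)
    if sum(payload) & 0xFFFFFFFF != checksum:
        raise ValueError("checksum mismatch")
    return command, status, session_id, payload


def request(sock, command, session_id, payload=b""):
    """Send one command and wait for the next packet from the server."""
    send_packet(sock, command, session_id, payload)
    return recv_packet(sock)
```

With a connected socket `sock`, steps 2 and 3 become `request(sock, CMD_METADATA, session_id)` followed by `request(sock, CMD_FETCH_DATA, session_id)`. A real client would also need to handle `CMD_DELTA_UPDATE` packets arriving between a request and its response.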
---
## Cross-Language Implementation Notes
- **Checksum**: Simple additive sum of payload bytes (can be replaced with CRC). Sender writes checksum in header; receiver recalculates and compares.
- **Session Management**: `SessionId` ties requests/responses; reuse for all packets in same connection.
- **Endianness**: Assume little-endian. Convert explicitly on big-endian hosts and when using APIs that default to network (big-endian) byte order.
- **Null fields**: Always send the null flag; skip further data if `True`.
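- **Streaming**: Packets are capped at `MAX_PACKET_SIZE` (1 MB). TCP delivers a byte stream rather than discrete packets, so buffer incoming bytes until the full header and all `DataSize` payload bytes have arrived, similar to accumulating data in a Pascal `TMemoryStream`.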
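
The checksum and streaming notes can be combined in an incremental buffer that accepts whatever chunk sizes TCP delivers and emits only complete, verified packets. This is a sketch under assumptions: `PacketBuffer` is a hypothetical name, the 1 MB cap is taken as 1024 * 1024 bytes and applied to `DataSize`, and the checksum is assumed to wrap at 32 bits.

```python
import struct

HEADER = struct.Struct("<HBBIII")       # assumption: packed 16-byte header
MAX_PACKET_SIZE = 1024 * 1024           # assumption: 1 MB = 1024 * 1024 bytes


class PacketBuffer:
    """Accumulates raw bytes and returns complete, checksum-verified packets."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data):
        """Append received bytes; return a list of fully received packets."""
        self._buf += data
        packets = []
        while len(self._buf) >= HEADER.size:
            _version, command, status, session_id, size, checksum = \
                HEADER.unpack_from(self._buf)
            if size > MAX_PACKET_SIZE:
                raise ValueError("DataSize exceeds MAX_PACKET_SIZE")
            end = HEADER.size + size
            if len(self._buf) < end:
                break                   # payload incomplete, wait for more
            payload = bytes(self._buf[HEADER.size:end])
            del self._buf[:end]
            if sum(payload) & 0xFFFFFFFF != checksum:
                raise ValueError("checksum mismatch")
            packets.append((command, status, session_id, payload))
        return packets
```

Each `sock.recv(...)` result is passed to `feed`, which may return zero, one, or several packets depending on how the bytes were split in transit.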
---
## Language-Specific Tips
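- **C/C++/Go**: In C and C++, map `TPacketHeader` to a packed struct and use `write()`/`read()` for headers and payload. Go has no packed-struct attribute, so encode the header fields explicitly with `encoding/binary` and `binary.LittleEndian`. Implement the additive checksum in all three.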
- **Python**: Use the `struct` format string `'<HBBLLL'` for the header; with the `<` prefix each `L` is 4 bytes, giving 16 bytes in total. Strings can be managed with a length prefix + `.encode('utf-8')`.
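- **Java/Delphi**: Mirror the field serialization logic with `TStream` in Delphi. In Java, note that `DataOutputStream` writes multi-byte values in big-endian order, so use a `ByteBuffer` set to `ByteOrder.LITTLE_ENDIAN` or reverse the bytes explicitly. Ensure `Boolean` is a single byte.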
- **Rust**: Use byte buffers and `LittleEndian` helpers. Wrap packet handling in a struct similar to `TDataPacket`.
- **Node.js**: Use `Buffer` for header/field serialization; maintain a session ID per client.
---
## Integration Considerations
- Use `CMD_PING`/`CMD_ACK` for keep-alive.
- Handle `STATUS_MORE_DATA` if server splits responses.
- For real-time sync, listen for `CMD_DELTA_UPDATE` packets and update local data accordingly.
- Implement reconnection logic (FPC client sets `AutoReconnect`).
- Maintain field order; server and client must agree on schema or leverage metadata handshake.
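
How the server splits a response across `STATUS_MORE_DATA` packets is not spelled out above, so the following is only one plausible reading: each segment's payload is a byte-wise continuation of the previous one, and the final segment carries any status other than `STATUS_MORE_DATA`. The function name `collect_response` is hypothetical.

```python
STATUS_OK, STATUS_ERROR, STATUS_NO_DATA, STATUS_MORE_DATA = 0x00, 0x01, 0x02, 0x03


def collect_response(packets):
    """Join payload segments until a packet without STATUS_MORE_DATA arrives.

    `packets` is any iterable of (status, payload) pairs, for example the
    output of a packet reader filtered to one command.
    Assumption: segments are plain continuations of one logical payload.
    """
    parts = []
    for status, payload in packets:
        if status == STATUS_ERROR:
            raise RuntimeError("server reported STATUS_ERROR")
        parts.append(payload)
        if status != STATUS_MORE_DATA:
            return b"".join(parts)      # final segment reached
    raise ConnectionError("stream ended while more segments were expected")
```

If the reference implementation instead repeats the record count in every segment, each segment would need to be decoded separately and the rows concatenated.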
---
## Example Wire Diagram
- Client → Server : [Header CMD_METADATA] → Server responds with schema
- Client → Server : [Header CMD_FETCH_DATA] → Server responds with record count + rows
- Server → Client : [Header CMD_DELTA_UPDATE] whenever a field changes
- Client → Server : [Header CMD_UPDATE_DATA] when pushing changes
---
## Best Practices
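- Validate `DataSize` against `MAX_PACKET_SIZE` before allocating or reading the payload, so a corrupt or hostile header cannot cause an oversized allocation or a buffer overflow.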
- Confirm `Checksum` matches before trusting data.
- Reuse `SessionId` for correlated requests/responses within same connection.
- Serialize strings with explicit lengths to support ASCII/UTF-8 and avoid null-termination issues.
- For large blobs, consider chunking (not implemented but extendable with command variations).
---
By adhering to this structure, any language that can read/write raw TCP streams can participate in this real-time dataset protocol.