In the opening part of this series we outlined the basics of Socket.IO and discussed the importance of documenting Socket.IO APIs. Now it’s time to bring AsyncAPI into play.
In this post we’re going to cover:
- A modelling exercise, in which Socket.IO semantics are mapped to AsyncAPI structures
- A tutorial involving the creation of an AsyncAPI specification given an existing Socket.IO API
- Asynction, a Socket.IO server framework driven by the AsyncAPI specification
# Modelling the Socket.IO protocol using AsyncAPI
Don’t let the title of this section intimidate you. This modelling exercise ended up being relatively straightforward and I think it makes a great example of how AsyncAPI was designed to fit any event-driven protocol. If you are not interested in the thought process behind this exercise, you may jump straight to the Summary paragraph of this section, which presents the solution.
I will approach this problem by traversing the AsyncAPI object structure, attempting to map each of the objects to a semantic of the Socket.IO client API.
The root object of the specification is the AsyncAPI Object. The fields of this object that require special attention are `channels` and `servers`.
# Channels
The Channels Object is a map structure that relates a channel path (relative URI) to a Channel Item Object.

    channels:
      /: {} # Channel Item Object
      /admin: {} # Channel Item Object

Channels are addressable components through which messages/events flow. The specification suggests that a server may support multiple channel instances, enabling an application to separate its concerns. This sounds very much like the definition of the Socket.IO namespace. Namespaces are indeed addressable components that follow the relative URI convention. Since Socket.IO supports multiplexing, a client may emit messages to multiple namespaces over a single shared connection. However, it could also force a separate connection per namespace (using the `forceNew` option). Thus, a Socket.IO namespace could be either a virtual or a physical channel.
Given that connections are established on the namespace level, the Channel Item Object is the only object of the specification that MAY include `bindings`. For a Socket.IO API, the Channel Bindings Object should only contain the `ws` field, in which one can specify the handshake context (HTTP headers and query params) that a client should provide when connecting to that particular channel/namespace.

    channels:
      /:
        publish: {} # Operation object - Ignore this for now
        subscribe: {} # Operation object - Ignore this for now
        bindings:
          ws:
            query:
              type: object
              properties:
                token:
                  type: string
              required: [token]

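
To make the role of the `ws` binding more concrete, here is a minimal sketch of how a client could check its handshake query against the binding before connecting. The helper name `checkHandshakeQuery` is hypothetical, and the sketch only understands the `required` keyword and `type: string` properties used in the snippet above; it is not a full JSON Schema validator.

```javascript
// Hypothetical helper: checks a handshake query against the `query` schema
// of a ws channel binding. Only `required` and `type: string` are handled.
function checkHandshakeQuery(wsBinding, query) {
  const schema = wsBinding.query || {};
  const errors = [];
  for (const key of schema.required || []) {
    if (!(key in query)) errors.push(`missing query param: ${key}`);
  }
  for (const [key, prop] of Object.entries(schema.properties || {})) {
    if (key in query && prop.type === "string" && typeof query[key] !== "string") {
      errors.push(`query param ${key} must be a string`);
    }
  }
  return errors;
}

// The binding from the snippet above, written as a plain object
const wsBinding = {
  query: {
    type: "object",
    properties: { token: { type: "string" } },
    required: ["token"],
  },
};

console.log(checkHandshakeQuery(wsBinding, {}));
console.log(checkHandshakeQuery(wsBinding, { token: "admin" }));
```

With the Socket.IO JavaScript client, a query object that passes this check would typically be handed over through the `query` connection option.
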
Since a single connection (and thus binding) is going to be used across multiple channels, there is no need to repeat the same `bindings` object under each channel/namespace. We can introduce the convention of always including bindings under the main (`/`) namespace but omitting them under the custom ones. At this point I would also like to propose the following bonus semantic: if a custom namespace includes bindings, then the client should always force a new connection when connecting to it.
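
The convention and the bonus semantic proposed above can be sketched as a small function that decides, per namespace, whether a client should reuse the shared connection. The name `connectionPlan` and the `/chat` namespace are hypothetical and only serve the illustration; `forceNew` is the client option mentioned earlier.

```javascript
// Hypothetical helper implementing the proposed convention:
// the main namespace ("/") carries the shared bindings, and a custom
// namespace that declares its own bindings forces a new connection.
function connectionPlan(channels) {
  return Object.entries(channels).map(([namespace, channelItem]) => ({
    namespace,
    forceNew: namespace !== "/" && channelItem.bindings !== undefined,
  }));
}

// Channels Object written as a plain object ("/chat" is made up)
const channels = {
  "/": { bindings: { ws: {} } },
  "/chat": {},
  "/admin": { bindings: { ws: {} } },
};

console.log(connectionPlan(channels));
```

Here only `/admin` ends up with `forceNew: true`, because it is the only custom namespace that declares bindings of its own.
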
You have probably noticed that I chose to stick to the WebSockets Channel Binding as the only possible binding that a Socket.IO API may define. One could ask why not use an HTTP Channel Binding object alongside the WebSockets one, since the protocol could also be implemented via HTTP long-polling. There are two answers to this question:
- The latest version of the AsyncAPI bindings specifications does not allow HTTP bindings to be defined at the channel level.
- The HTTP long-polling implementation of Socket.IO is essentially a pseudo WebSocket. It is implemented in such a way as to resemble the WebSocket implementation. The same HTTP headers and query params are sent to the server no matter the transport mechanism.
Hence, it is safe to use the `ws` bindings even for the HTTP long-polling fallback. However, in an ideal world, AsyncAPI would support Socket.IO bindings through an explicit `socketio` field. In fact, I have created a GitHub issue to pitch this proposal.
Along with `bindings`, the Channel Item Object includes the `publish` and `subscribe` fields, in which one defines the operations that a namespace supports. The `publish` Operation Object lists all the possible events that the client may emit (`socket.emit`), while the `subscribe` operation defines the events that the client may listen to (`socket.on`).
A Socket.IO event can be expressed using the Message Object, where the `name` field describes the `eventName` and the `payload` field describes the schema of the `args` that the client passes as part of the `socket.emit` invocation: `socket.emit(eventName[, ...args][, ack])`. For `subscribe` events, `payload` defines the structure of the arguments that the event handler callback expects: `socket.on(eventName, (...args) => {})`.
The structure of the payload value depends on the number of arguments expected:

Scenario | Sender-side code | Payload value structure | AsyncAPI Message Object |
---|---|---|---|
No args expected | `socket.emit("hello")` | n/a (the `payload` field should be omitted) | `name: hello`, no `payload` |
Single arg expected | `socket.emit("hello", {foo: "bar"})` | Any type other than tuple | `name: hello`, `payload` of type `object` |
Multiple args expected | `socket.emit("hello", {foo: "bar"}, 1)` | Tuple type | `name: hello`, `payload` of type `array` (tuple) |

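
The three scenarios of the table can be summarised in a few lines of code. The sketch below uses a hypothetical helper, `payloadFromArgs`, that takes the arguments of an `emit` call (everything after the event name, excluding the optional ack callback) and returns the value that the `payload` schema is meant to describe.

```javascript
// Maps the arguments of an emit call to the value described by `payload`.
function payloadFromArgs(args) {
  if (args.length === 0) return undefined; // no args: the payload field is omitted
  if (args.length === 1) return args[0];   // single arg: any non-tuple type
  return args;                             // multiple args: a tuple (array)
}

console.log(payloadFromArgs([]));
console.log(payloadFromArgs([{ foo: "bar" }]));
console.log(payloadFromArgs([{ foo: "bar" }, 1]));
```
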
To account for multiple events (Message Objects) per namespace, the `message` field of each Operation Object allows the `oneOf` array structure. For example, in the message of the publish operation of the `/admin` namespace, the `oneOf` array lists all the available `eventName` and `args` payload pairs that a client can pass to the `adminNamespace.emit` call:

    channels:
      /admin:
        publish:
          message:
            oneOf:
              - $ref: "#/components/messages/MessageOne"
              - $ref: "#/components/messages/MessageTwo"

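
As a rough illustration of how tooling could consume this structure, the sketch below normalises the `message` field of an operation into a list and looks up a Message Object by event name. The function names and the inline messages are hypothetical, and the `$ref` entries are assumed to have been resolved already.

```javascript
// Returns the list of Message Objects an operation allows, whether the
// `message` field is a single object or a `oneOf` array.
function allowedMessages(operation) {
  const message = operation.message;
  if (message === undefined) return [];
  return Array.isArray(message.oneOf) ? message.oneOf : [message];
}

// Finds the Message Object describing a given event name, if any.
function findMessage(operation, eventName) {
  return allowedMessages(operation).find((m) => m.name === eventName);
}

// Hypothetical publish operation with its references already resolved
const publish = {
  message: {
    oneOf: [
      { name: "message one", payload: { type: "string" } },
      { name: "message two" },
    ],
  },
};

console.log(findMessage(publish, "message two"));
console.log(findMessage(publish, "unknown event"));
```

An event name that is not listed yields `undefined`, which a client or server could treat as an undocumented event.
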
Now, let’s move on to the acknowledgement semantics of the protocol. The basic unit of information in the Socket.IO protocol is the packet. There are 7 distinct packet types. The payloads of the publish and subscribe Message Objects described above correspond to the `EVENT` and `BINARY_EVENT` packet types. These are essentially the packets that are transmitted when the Socket.IO sender invokes the `emit` API function of the Socket.IO library (regardless of implementation). In turn, the Socket.IO event receiver handles the received event using the `on` API function of the Socket.IO library. As part of the `on` handler, the receiver may choose to return an acknowledgement of the received message. This acknowledgement is conveyed back to the sender via the `ACK` and `BINARY_ACK` packet types. The ack data is passed as input to the callback that the message sender has provided through the `emit` invocation.
In order to express the above semantics, the Message Object (eventName and args payload pair) should be linked to an optional acknowledgement object. Since the specification in its current form does not support such a structure, I am proposing the following Specification Extension:
- Message Objects MAY include the `x-ack` field. The value of this field SHOULD be a Message Ack Object.
- The Components Object MAY include the `x-messageAcks` field. The value of this field should be of type `Map[string, Message Ack Object | Reference Object]`.
# Message Ack Object
Field Name | Type | Description |
---|---|---|
args | Schema Object | Schema of the arguments that are passed as input to the acknowledgement callback function. In the case of multiple arguments, use the array type to express the tuple. |
In the case of a `publish` message, the `x-ack` field informs the client that it should expect an acknowledgement from the server, and that this acknowledgement should adhere to the agreed schema. Likewise, for `subscribe` messages, the `x-ack` field encourages the client to send a structured acknowledgement for each message it receives.
# Servers
The Servers Object is, surprise surprise, a map of Server Objects. Each Server Object contains a `url` field from which the client may infer the custom path to the Socket.IO server. This custom path should then be provided via the `path` option upon the initialisation of the Socket.IO connection manager, alongside the `url` arg. The `protocol` field of the Server Object is also required, and specifies the scheme part of that `url` arg. Its value should be one of `ws`, `wss`, `http` or `https`. For a Socket.IO client, it does not really matter whether the scheme is http or ws, due to the upgrade mechanism. Thus, for Socket.IO APIs, the only purpose of the `protocol` field is to indicate the use (or absence) of SSL.
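
The mapping from a Server Object to the arguments of a client connection could look roughly like the sketch below. `toClientArgs` is a hypothetical name, `example.com` is a placeholder host, and the sketch assumes that the `url` field carries no scheme of its own (just `host/custom-path`).

```javascript
// Derives the `url` argument and the `path` option of a Socket.IO client
// connection from a Server Object. Assumes `url` is of the form host/path.
function toClientArgs(server) {
  // The protocol field only tells us whether TLS is in use
  const secure = server.protocol === "wss" || server.protocol === "https";
  const parsed = new URL(`${secure ? "https" : "http"}://${server.url}`);
  return {
    url: parsed.origin,
    // No custom path in the url: leave `path` unset so the default applies
    path: parsed.pathname === "/" ? undefined : parsed.pathname,
  };
}

console.log(toClientArgs({ url: "example.com/custom/socket.io", protocol: "wss" }));
console.log(toClientArgs({ url: "example.com", protocol: "ws" }));
```

The resulting `url` and `path` values are what one would hand to the client when opening the connection, with `path` passed as a connection option.
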
# Summary
We made it to the end of the modelling exercise, the outcome of which is the following table, relating Socket.IO semantics to AsyncAPI structures.

Socket.IO | AsyncAPI |
---|---|
Namespace | Channel (described through the Channel Item Object) |
IO options | WebSockets Channel Binding |
`namespaceSocket.emit(eventName[, ...args][, ack])` | Operation Object defined under the `publish` field of a Channel Item Object. The available `eventName` and `args` pairs for this `emit` invocation are listed under the `message` field, through the `oneOf` array structure. |
`namespaceSocket.on(eventName, callback)` | Operation Object defined under the `subscribe` field of a Channel Item Object. The available `eventName` and callback argument pairs for this `on` invocation are listed under the `message` field, through the `oneOf` array structure. |
Event | Message (described through the Message Object) |
`eventName` | The `name` field of the Message Object |
Event args | The `payload` field of the Message Object |
`ack` | The `x-ack` field of the Message Object. Requires an extension of the specification. The field may be populated for both publish and subscribe messages. |
Custom path (`path` option) | The `url` field of the Server Object |
Use of TLS (regardless of transport mechanism) | The `protocol` field of the Server Object |

# In practice
With the modelling exercise out of the way, I’m now going to guide you through the process of creating an AsyncAPI spec from scratch, given an existing Socket.IO API. For the purposes of this simple tutorial, let’s use this minimal chat application, which is one of the get-started demos featured on the Socket.IO website.
Below is the source of our Socket.IO server:

    // Setup basic express server
    const express = require("express");
    const app = express();
    const path = require("path");
    const server = require("http").createServer(app);
    const io = require("socket.io")(server);
    const port = process.env.PORT || 3000;

    server.listen(port, () => {
      console.log("Server listening at port %d", port);
    });

    // Chatroom
    let numUsers = 0;

    io.on("connection", (socket) => {
      let addedUser = false;

      // when the client emits 'new message', this listens and executes
      socket.on("new message", (data) => {
        // we tell the client to execute 'new message'
        socket.broadcast.emit("new message", {
          username: socket.username,
          message: data,
        });
      });

      // when the client emits 'add user', this listens and executes
      socket.on("add user", (username, cb) => {
        if (addedUser) {
          cb({ error: "User is already added" });
          return;
        }

        // we store the username in the socket session for this client
        socket.username = username;
        ++numUsers;
        addedUser = true;
        socket.emit("login", {
          numUsers: numUsers,
        });
        // echo globally (all clients) that a person has connected
        socket.broadcast.emit("user joined", {
          username: socket.username,
          numUsers: numUsers,
        });
        cb({ error: null });
      });

      // when the client emits 'typing', we broadcast it to others
      socket.on("typing", () => {
        socket.broadcast.emit("typing", {
          username: socket.username,
        });
      });

      // when the client emits 'stop typing', we broadcast it to others
      socket.on("stop typing", () => {
        socket.broadcast.emit("stop typing", {
          username: socket.username,
        });
      });

      // when the user disconnects.. perform this
      socket.on("disconnect", () => {
        if (addedUser) {
          --numUsers;

          // echo globally that this client has left
          socket.broadcast.emit("user left", {
            username: socket.username,
            numUsers: numUsers,
          });
        }
      });
    });

    // Admin
    io.of("/admin").on("connection", (socket) => {
      let token = socket.handshake.query.token;
      if (token !== "admin") socket.disconnect();

      socket.emit("server metric", {
        name: "CPU_COUNT",
        value: require("os").cpus().length,
      });
    });

I’ve slightly tweaked the original source located at https://github.com/socketio/socket.io/tree/master/examples/chat to include acknowledgments and bindings, so that I can showcase the full spectrum of the AsyncAPI specification.
Let’s start by defining the version of the specification as well as the info object which provides metadata about the service:

    asyncapi: 2.3.0
    info:
      title: Socket.IO chat service
      version: 1.0.0
      description: |
        This is one of the get-started demos listed in the socket.io website: https://socket.io/demos/chat/

Let’s move on to the servers section, where one should provide connectivity information for all the instances of their service. In the case of our simple chat application, there is only one demo server, accessible at socketio-chat-h9jt.herokuapp.com:

    servers:
      demo:
        url: socketio-chat-h9jt.herokuapp.com/socket.io
        protocol: wss

Things get a bit more interesting when it comes to channels. Skimming through the server code we find 2 namespace instances (default and /admin), which means that the channel mapping should consist of 2 entries:

    channels:
      /: {}
      /admin: {}

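Within each namespace connection block, there are multiple `socket.on` and `socket.emit` references. For each unique reference, we need to append a Message Object under the publish and subscribe operations respectively. Events that the server handles via `socket.on` are the ones a client may emit (publish), while events that the server sends via `socket.emit` are the ones a client may listen to (subscribe):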

    channels:
      /:
        publish:
          message:
            oneOf:
              - $ref: "#/components/messages/NewMessage"
              - $ref: "#/components/messages/Typing"
              - $ref: "#/components/messages/StopTyping"
              - $ref: "#/components/messages/AddUser"
        subscribe:
          message:
            oneOf:
              - $ref: "#/components/messages/NewMessageReceived"
              - $ref: "#/components/messages/UserTyping"
              - $ref: "#/components/messages/UserStopTyping"
              - $ref: "#/components/messages/UserJoined"
              - $ref: "#/components/messages/UserLeft"
              - $ref: "#/components/messages/LogIn"
      /admin:
        subscribe:
          message: # No need to use `oneOf` since there is only a single event
            $ref: "#/components/messages/ServerMetric"

From the server code, we can also see that the connection handler of the admin namespace applies some very sophisticated authorization based on the `token` query parameter. The spec should hence document that the API requires the presence of a valid token query param upon the handshake:

    channels:
      /:
        publish:
          # ...
        subscribe:
          # ...
      /admin:
        subscribe:
          # ...
        bindings:
          $ref: "#/components/channelBindings/AuthenticatedWsBindings"

Putting everything together into a single document:

    asyncapi: 2.3.0
    info:
      title: Socket.IO chat demo service
      version: 1.0.0
      description: |
        This is one of the get-started demos presented in the socket.io website: https://socket.io/demos/chat/
    servers:
      demo:
        url: socketio-chat-h9jt.herokuapp.com/socket.io
        protocol: wss
    channels:
      /:
        publish:
          message:
            oneOf:
              - $ref: "#/components/messages/NewMessage"
              - $ref: "#/components/messages/Typing"
              - $ref: "#/components/messages/StopTyping"
              - $ref: "#/components/messages/AddUser"
        subscribe:
          message:
            oneOf:
              - $ref: "#/components/messages/NewMessageReceived"
              - $ref: "#/components/messages/UserTyping"
              - $ref: "#/components/messages/UserStopTyping"
              - $ref: "#/components/messages/UserJoined"
              - $ref: "#/components/messages/UserLeft"
              - $ref: "#/components/messages/LogIn"
      /admin:
        subscribe:
          message: # No need to use `oneOf` since there is only a single event
            $ref: "#/components/messages/ServerMetric"
        bindings:
          $ref: "#/components/channelBindings/AuthenticatedWsBindings"
    components:
      messages:
        NewMessage:
          name: new message
          payload:
            type: string
        Typing:
          name: typing
        StopTyping:
          name: stop typing
        AddUser:
          name: add user
          payload:
            type: string
          x-ack: # Documents that this event is always acknowledged by the receiver
            args:
              type: object
              properties:
                error:
                  type: [string, "null"]
        NewMessageReceived:
          name: new message
          payload:
            type: object
            properties:
              username:
                type: string
              message:
                type: string
        UserTyping:
          name: typing
          payload:
            type: object
            properties:
              username:
                type: string
        UserStopTyping:
          name: stop typing
          payload:
            type: object
            properties:
              username:
                type: string
        UserJoined:
          name: user joined
          payload:
            type: object
            properties:
              username:
                type: string
              numUsers:
                type: integer
        UserLeft:
          name: user left
          payload:
            type: object
            properties:
              username:
                type: string
              numUsers:
                type: integer
        LogIn:
          name: login
          payload:
            type: object
            properties:
              numUsers:
                type: integer
        ServerMetric:
          name: server metric
          payload:
            type: object
            properties:
              name:
                type: string
              value:
                type: number
      channelBindings:
        AuthenticatedWsBindings:
          ws:
            query:
              type: object
              properties:
                token:
                  type: string
              required: [token]

The modified server source code is available at https://github.com/dedoussis/asyncapi-socket.io-example, along with the above AsyncAPI spec, which can be viewed using the AsyncAPI playground.
Note that there is no point in documenting the reserved events since all Socket.IO APIs support these by default.
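
As a quick sanity check of a document like the one above, one could list the events that each namespace exposes. The sketch below is only an illustration: `resolveRef` and `listEvents` are hypothetical helpers, the spec is a trimmed-down copy written as a plain object, and reference resolution is limited to local `#/...` pointers without escaped characters.

```javascript
// Resolves a local reference such as "#/components/messages/Typing".
function resolveRef(spec, node) {
  if (node === undefined || node.$ref === undefined) return node;
  return node.$ref
    .replace(/^#\//, "")
    .split("/")
    .reduce((current, key) => current[key], spec);
}

// Lists, per namespace, the event names a client may emit and listen to.
function listEvents(spec) {
  const namesOf = (operation) => {
    if (operation === undefined || operation.message === undefined) return [];
    const message = resolveRef(spec, operation.message);
    const candidates = message.oneOf || [message];
    return candidates.map((m) => resolveRef(spec, m).name);
  };
  const result = {};
  for (const [namespace, item] of Object.entries(spec.channels)) {
    result[namespace] = { emit: namesOf(item.publish), on: namesOf(item.subscribe) };
  }
  return result;
}

// A trimmed-down copy of the chat spec, as a plain object
const spec = {
  channels: {
    "/": {
      publish: { message: { oneOf: [{ $ref: "#/components/messages/Typing" }] } },
      subscribe: { message: { oneOf: [{ $ref: "#/components/messages/UserTyping" }] } },
    },
    "/admin": {
      subscribe: { message: { $ref: "#/components/messages/ServerMetric" } },
    },
  },
  components: {
    messages: {
      Typing: { name: "typing" },
      UserTyping: { name: "typing" },
      ServerMetric: { name: "server metric" },
    },
  },
};

console.log(listEvents(spec));
```

For the `/admin` namespace the `emit` list comes out empty, which matches the spec: that channel only defines a `subscribe` operation.
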
# Asynction
In parallel to this exercise I have been developing Asynction, a Socket.IO Python framework that is driven by the AsyncAPI specification. Asynction is built on top of Flask-SocketIO and inspired by Connexion. It guarantees that your API will work in accordance with its documentation. In essence, Asynction is to AsyncAPI and Flask-SocketIO what Connexion is to OpenAPI and Flask.
In this example, I forked the minimal chat application that we documented above and re-implemented the server in Python, using Asynction. Be mindful of the `x-handler` and `x-handlers` extensions that have been introduced to relate AsyncAPI entities (such as message or channel objects) to Python callables (event handlers).
You may find extensive documentation of Asynction at: https://asynction.dedouss.is
The framework is still at a beta stage, so please get in touch before using it in a production setup.
Any piece of feedback would be much appreciated.
# The end
Hopefully, you found this series of blog posts useful and motivating enough to start documenting your own Socket.IO API(s). For any questions, comments, or corrections, feel free to reach out to me at dimitrios@dedouss.is.
A special shout out to derberq and the wider AsyncAPI community for being particularly helpful and responsive. 🙇