TEXT+ PROTOCOL
INTRODUCTION
This project is an attempt to create something between the Gopher protocol and the classic BBS (Bulletin Board System).
I'm trying to keep what is good about Gopher (like its simplicity) while adding what, in my opinion, is missing: a more flexible system and ANSI support.
The main goal is to create something that vintage machines can access too, and by vintage I mean at least the Commodore 64.
That's why TLS and advanced encryption algorithms are not supported.
As for privacy, the idea is to provide a simple obfuscation system or very light encryption routines, so that even a weak system like the Commodore 64 can access the resources.
WHY
I started to work on this project because of the current state of Internet enshittification.
I've tried to go back to Gopher, but it's really too limited, while Gemini is too strict because it requires TLS.
I think we need something flexible that allows people some degree of customization and creativity while still using plain text (and ANSI escape sequences).
STATELESS, LIKE GOPHER
The communication protocol is very simple:
- The client connects to a server
- The server replies with a welcome message
- The client sends a request
- The server replies and closes the connection.
PROTOCOL DETAILS
STEP 1 : Client connects to the server
(*) Nothing to say here, it's just a TCP connection on a specific port (not yet defined).
STEP 2 : Server replies with a welcome message
(*) The server sends a welcome message with the server version and some options that may be turned on by default.
The welcome message is a plain text line terminated by \n.
Any options must be specified before the \n and must be enclosed in curly brackets, comma separated.
Here is an example:
Welcome to this server! {compression=1,encoding=UTF8}\n
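As a rough illustration, a client could split the welcome line into its greeting and its options as sketched below. The helper name `parse_welcome` is hypothetical, and the sketch assumes that the options block is the last `{...}` group on the line and that every option has the `key=value` form shown in the example.

```python
import re

# Assumption: the options block is the last {...} group before the newline.
_OPTIONS_RE = re.compile(r"\{([^{}]*)\}\s*$")


def parse_welcome(line: str):
    """Return (greeting, options) for a welcome line (hypothetical helper)."""
    text = line.rstrip("\r\n")
    options = {}
    match = _OPTIONS_RE.search(text)
    if match:
        # Options are comma separated key=value pairs.
        for item in match.group(1).split(","):
            if "=" in item:
                key, value = item.split("=", 1)
                options[key.strip()] = value.strip()
        text = text[:match.start()]
    return text.rstrip(), options


greeting, options = parse_welcome(
    "Welcome to this server! {compression=1,encoding=UTF8}\n"
)
print(greeting, options)
```

Option values are kept as strings here, since the protocol does not yet say how they should be typed.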
STEP 3 : The client send a request
Here is where the real communication begins: the client sends a request using one of the available commands. The client must send 4 prefix bytes followed by the request, which may carry options:
- 3 Bytes for the message size in hexadecimal format.
- 1 Byte for the xor checksum of the payload (request)
- n Bytes for the request
The request is a plain text line made of a command, optional arguments and optional options, like this:
REQUEST INDEX {encoding=ASCII,compression=0}\n
With the above command the client has requested the site's index using the ASCII encoding and no compression.
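The request framing described in this step could be sketched as follows. This is only an illustration under a few assumptions that the text does not settle: the 3 size bytes are read as three ASCII hexadecimal digits counting the payload bytes only, the checksum is sent as a single raw byte, and the request line is ASCII. The names `xor_checksum` and `build_request` are hypothetical.

```python
def xor_checksum(payload: bytes) -> int:
    """XOR all payload bytes together into a single byte value."""
    checksum = 0
    for byte in payload:
        checksum ^= byte
    return checksum


def build_request(line: str) -> bytes:
    """Frame a request line: 3 hex digits, 1 checksum byte, then the payload."""
    payload = line.encode("ascii")  # assumption: request lines are ASCII
    if len(payload) > 0xFFF:
        # Three hexadecimal digits cannot describe more than 4095 bytes.
        raise ValueError("request too long for a 3-digit hexadecimal size")
    size = f"{len(payload):03X}".encode("ascii")
    return size + bytes([xor_checksum(payload)]) + payload


frame = build_request("REQUEST INDEX {encoding=ASCII,compression=0}\n")
print(frame[:3], frame[3], frame[4:])
```

With three hexadecimal digits the largest request is 4095 bytes, which keeps the request buffer small on the client side.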
STEP 4 : The server sends the reply
The server sends the following data to the client:
- 1 Byte for the response code
- 1 Byte for the XOR checksum of the payload
- 8 Bytes for the payload size (in hexadecimal format)
- n Bytes for the requested data or an error message, depending on the response code.
After this step the server will close the connection.
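On the client side, the reply could be decoded along these lines. The sketch assumes that the 8 size bytes are eight ASCII hexadecimal digits and that the code and the checksum are single raw bytes. The response code values are not defined yet, so the function simply returns the code as a number. `parse_response` is a hypothetical name.

```python
def xor_checksum(payload: bytes) -> int:
    """XOR all payload bytes together into a single byte value."""
    checksum = 0
    for byte in payload:
        checksum ^= byte
    return checksum


def parse_response(frame: bytes):
    """Return (response_code, payload) after validating size and checksum."""
    if len(frame) < 10:
        raise ValueError("frame shorter than the 10-byte header")
    code = frame[0]
    checksum = frame[1]
    # Assumption: the size is sent as 8 ASCII hexadecimal digits.
    size = int(frame[2:10].decode("ascii"), 16)
    payload = frame[10:10 + size]
    if len(payload) != size:
        raise ValueError("payload shorter than the announced size")
    if xor_checksum(payload) != checksum:
        raise ValueError("checksum mismatch")
    return code, payload
```

A client with little memory would read the 10 header bytes first and then stream the payload, rather than buffering the whole frame as this sketch does.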
SIMPLE CHUNK BASED COMMUNICATION
To reach the goal of targeting vintage machines too, I thought of providing a way to retrieve data in chunks.
Let's say there is a link that points to an 80 KB text file: an 80s home computer cannot handle that in a single shot.
The idea is to provide a mechanism that lets these machines request chunks of data in flexible sizes that they can handle.
In the above example the vintage machine could request a chunk of variable size (let's say from byte 0 to byte 8000) and handle it as best it can, then request the next chunk, and so on. The client controls the chunk size and can adapt the request size to the host display size.
Since the protocol is line based, almost as Gopher was, the server can adjust the requested chunk size so that lines are not broken.
Of course more capable machines can request the entire content in one shot.
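The line-preserving adjustment could work roughly as in the sketch below: the server trims the requested window back to the last complete line, and the client asks for the next chunk starting where the previous one ended. The function name `line_aligned_chunk` and the inclusive `end` offset are assumptions made for the example.

```python
def line_aligned_chunk(data: bytes, start: int, end: int) -> bytes:
    """Return bytes start..end (inclusive), trimmed to end on a full line."""
    window = data[start:end + 1]
    if start + len(window) >= len(data):
        return window  # the window reaches the end of the document
    cut = window.rfind(b"\n")
    if cut == -1:
        return window  # one line longer than the chunk: send it unchanged
    return window[:cut + 1]


page = b"txt|one\ntxt|two\ntxt|three\n"
offset, chunk_size = 0, 12
while offset < len(page):
    chunk = line_aligned_chunk(page, offset, offset + chunk_size - 1)
    print(chunk)
    offset += len(chunk)  # the next request starts where this chunk ended
```

Because the server may return fewer bytes than requested, the client advances by the length it actually received, not by the size it asked for.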
TEXT+ FORMAT
The page you serve must use this format.
Here are some specifications about what I called TEXT+ format.
Each line must be prefixed by a 4 character prefix that is used to determine the content's type.
The first 3 characters are the actual identifier, while the 4th character is the pipe character (|), which is used only to make the page contents readable.
Here is an example:
txt|This is a plain text line.
txt|This is another line of plain text.
ans|This is a line with §[4mANSI§[0m support (showing underlined text).
mrk|This is a line with **Markdown** support.
mon|This is a line in plain text that the client must render using a mono-spaced font.
As you can see the escape character has been indicated using the § character for formatting needs.
We will see more about it in the dedicated section.
Each line must be terminated by a newline character \n; in a \r\n sequence, the \r character is discarded.
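A client could split each received line into its type and its content with something like the following sketch. `parse_line` is a hypothetical helper, and returning an empty type for blank lines is an assumption made for the example.

```python
def parse_line(raw: str):
    """Split one TEXT+ line into (line_type, content)."""
    line = raw[:-1] if raw.endswith("\n") else raw
    if line.endswith("\r"):
        line = line[:-1]  # in a \r\n sequence the \r is discarded
    if line == "":
        return "", ""  # empty line: rendered as a blank line
    if len(line) < 4 or line[3] != "|":
        raise ValueError("missing 4 character prefix")
    return line[:3], line[4:]


print(parse_line("txt|This is a plain text line.\n"))
print(parse_line("mrk|This is a line with **Markdown** support.\r\n"))
```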
SUPPORTED LINE TYPES
Here is a detailed list of all supported line types.
EMPTY LINES
An empty line is just an empty line and should be rendered as is: a blank line.
SERVER : Sends a blank line (a single \n character).
CLIENT : Renders an empty line
COMMENTS
To increase the readability of your source pages, you can use comment lines. A comment line is prefixed by ###|, for example:
###| *** THIS IS A COMMENT ***
SERVER : Does not send comment lines.
CLIENT : Will never see any comment lines.
PLAIN TEXT
The plain text line type does not support ANSI escape sequences, so you should not insert them into these lines.
Use the txt| prefix to send plain text, like this:
txt|My very nice line of plain text!
txt|...and here is another one!
There is a second syntax for this type:
txt|->/path/to/file.txt
With this syntax you can include a text file into the current page. If the lines of the included file are not prefixed by txt|, the server will add the required headers.
SERVER : `txt|some text` -> Sends the line normally.
SERVER : `txt|->/path/to/file.txt` -> Includes the `file.txt` contents into the page.
CLIENT : Renders the text using a proportional font if it wants to.
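On the server side, the include syntax could be expanded roughly as follows. This is a sketch, not the project's actual implementation: `expand_txt_line` is a hypothetical name, and `read_file` stands for whatever function the server uses to load a file as text.

```python
def expand_txt_line(line: str, read_file) -> list:
    """Expand a txt|->path include; other lines are returned unchanged."""
    prefix, marker = "txt|", "->"
    if not line.startswith(prefix + marker):
        return [line]
    path = line[len(prefix) + len(marker):].strip()
    expanded = []
    for included in read_file(path).splitlines():
        # Add the txt| header only where the included line lacks it.
        if included.startswith(prefix):
            expanded.append(included)
        else:
            expanded.append(prefix + included)
    return expanded


# Usage with an in-memory stand-in for the file system.
files = {"/path/to/file.txt": "first line\ntxt|second line\n"}
print(expand_txt_line("txt|->/path/to/file.txt", files.__getitem__))
```

A real server would also need to restrict `path` to the served directory, which this sketch does not attempt.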
MARKDOWN
Markdown lines are supported and must be prefixed with the mrk| header, like this:
mrk|Markdown **text** in action.
You could encounter situations where tags are opened across lines (which is legal and must be supported by the clients):
mrk|This is a very long line I'm using as a **working
mrk|example**
...but it's better to close each tag before the end of line, like this:
mrk|This is a very long line I'm using as a **working**
mrk|**example**
To include a file formatted using markdown you can do:
mrk|->/path/to/file.md
With this syntax you can include a markdown file into the current page. If the lines of the included file are not prefixed by mrk|, the server will add the required headers.
SERVER : `mrk|some **text**` -> Sends the line normally.
SERVER : `mrk|->/path/to/file.md` -> Includes the `file.md` contents into the page.
CLIENT : Renders the text using a proportional font if it wants to, while respecting code sections that require mono-spaced fonts.
The following markdown features are not supported:
- LINKS
TEXT+ has its own way to handle links, which must be provided in a single line.
- IMAGES
TEXT+ has its own way to handle images, there is no need to overload the client with duplicated features.
If a markdown file to include contains these tags they should be converted by the server.
ANSI TEXT
With the ans| header you can use ANSI escape sequences in text lines, and the client does its best to render them on the host machine.
This functionality is restricted to a sub-set of the available commands (still a work in progress).
SUPPORTED SEQUENCES
Note that this list is temporary and not all the escape codes will be mandatory. There will also be a fallback rendering mode for sequences that the client does not support.
| Name | Sequence | Notes | Fallback |
|---|---|---|---|
| Reset | ␛[0m | Clear all styles | Mandatory |
| Bold | ␛[1m | Bold style | Client predefined color |
| Shadow | ␛[1:2m | Shadow | Bold |
| Dim | ␛[2m | Half-brightness | Default color |
| Italic | ␛[3m | Italic style | Client predefined color |
| Underline | ␛[4m | Underline style | Client predefined color |
| Underline Off | ␛[4:0m | Switch off underline | |
| Slow Blink | ␛[5m | | Client predefined color |
| Fast Blink | ␛[6m | | Client predefined color |
| Reverse | ␛[7m | Swap BG & FG colors | Mandatory |
| Strk/through | ␛[9m | Strikethrough | Client predefined color |
| Reset Bold | ␛[22m | Switch off Bold, Dim | Mandatory |
| Reset Italic | ␛[23m | Switch off italic | Mandatory |
| Reset U/line | ␛[24m | Switch off underline | Mandatory |
| Reset Blink | ␛[25m | Switch off blink | Mandatory |
| Reset Reverse | ␛[27m | Switch off reverse | Mandatory |
| Reset S/throu | ␛[29m | Switch off strikethr. | Mandatory |
| --- COLORS --- | Can be remapped by the client | ||
| Black FG | ␛[30m | ||
| Red FG | ␛[31m | ||
| Green FG | ␛[32m | ||
| Yellow FG | ␛[33m | ||
| Blue FG | ␛[34m | ||
| Magenta FG | ␛[35m | ||
| Cyan FG | ␛[36m | ||
| White FG | ␛[37m | ||
| RGB FG | ␛[38;2;<r>;<g>;<b>m | RGB 8 bit color | Client can remap the color |
| Color FG | ␛[38;5;<n>m | Color from 256 color palette | Client can remap the color |
| Default FG | ␛[39m | Restores default f/ground color | Mandatory |
| Black BG | ␛[40m | ||
| Red BG | ␛[41m | ||
| Green BG | ␛[42m | ||
| Yellow BG | ␛[43m | ||
| Blue BG | ␛[44m | ||
| Magenta BG | ␛[45m | ||
| Cyan BG | ␛[46m | ||
| White BG | ␛[47m | ||
| RGB BG | ␛[48;2;<r>;<g>;<b>m | RGB 8 bit color | Client can remap the color |
| Color BG | ␛[48;5;<n>m | Color from 256 color palette | Client can remap the color |
| Default BG | ␛[49m | Restores default b/ground color | |
| Br Black FG | ␛[90m | Bright colors | Client can remap the color |
| Br Red FG | ␛[91m | | Client can remap the color |
| Br Green FG | ␛[92m | | Client can remap the color |
| Br Yellow FG | ␛[93m | | Client can remap the color |
| Br Blue FG | ␛[94m | | Client can remap the color |
| Br Magenta FG | ␛[95m | | Client can remap the color |
| Br Cyan FG | ␛[96m | | Client can remap the color |
| Br White FG | ␛[97m | | Client can remap the color |
| Br Black BG | ␛[100m | Bright colors | Client can remap the color |
| Br Red BG | ␛[101m | | Client can remap the color |
| Br Green BG | ␛[102m | | Client can remap the color |
| Br Yellow BG | ␛[103m | | Client can remap the color |
| Br Blue BG | ␛[104m | | Client can remap the color |
| Br Magenta BG | ␛[105m | | Client can remap the color |
| Br Cyan BG | ␛[106m | | Client can remap the color |
| Br White BG | ␛[107m | | Client can remap the color |
| ---CURSOR--- | |||
| H Curs Pos Abs | ␛[<col>` | Absolute horizontal position | Mandatory |
| H Curs Pos Rel | ␛[<cols>a | Relative horizontal movement | Mandatory |
| V Curs Pos Abs | ␛[<row>d | Absolute vertical position | Mandatory |
| V Curs Pos Rel | ␛[<rows>e | Relative vertical movement | Mandatory |
| Set Curs Pos | ␛[<row>;<col>f | Set cursor position | Mandatory |
| Erase CURS->EOS | ␛[0J | Erase cursor->End Of Screen | Mandatory |
| Erase BOS->CURS | ␛[1J | Erase Beginning Of Screen->cursor | Mandatory |
| Erase Screen | ␛[2J | Erase screen | Mandatory |
| Erase CURS Line | ␛[2K | Erase cursor's line | Mandatory |
REVIEW THE TEXT BELOW
SYNTAX
ans|contents
ans|->file.ans
The second syntax makes the server insert the specified file into the page at the given position. All inserted lines are automatically prefixed with ans|.
DIRECTORY
This type is replaced by the server with a directory listing, each item is of the LINK type.
SYNTAX
dir|path|include|exclude|options
Here path is the local directory to list, include is a filter that keeps only the matching files, and exclude is a filter that drops all the matching files.
Include filter is applied before the exclude filter.
Options is a string that is used to further filter the directory contents and it accepts the following characters:
D : List sub directories
F : List files
For example to list all the text files in the directory my_blog you need to use:
dir|my_blog|*.txt||F
That generates something like this:
lnk|about|doc:txt|L|12004|my_blog/about.txt
lnk|projects|doc:txt|L|45221|my_blog/projects.txt
...
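The filtering rules could be sketched as follows. The sketch assumes glob-style filters (as the `*.txt` example suggests), treats an empty filter as "no filter", and takes the directory entries as `(name, is_dir)` pairs so that it does not depend on a real file system. `filter_entries` is a hypothetical name, and the generation of the lnk lines is left out.

```python
from fnmatch import fnmatchcase


def filter_entries(entries, include="", exclude="", options="DF"):
    """Return the names a dir| line would list, in their original order."""
    selected = []
    for name, is_dir in entries:
        if is_dir and "D" not in options:
            continue  # sub directories were not requested
        if not is_dir and "F" not in options:
            continue  # files were not requested
        if include and not fnmatchcase(name, include):
            continue  # the include filter is applied first
        if exclude and fnmatchcase(name, exclude):
            continue  # then the exclude filter
        selected.append(name)
    return selected


entries = [
    ("about.txt", False),
    ("projects.txt", False),
    ("logo.png", False),
    ("drafts", True),
]
print(filter_entries(entries, "*.txt", "", "F"))
```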
USER INPUT
Whenever you want the user to input some text, for example to perform a search or to leave a message, you have to use this line type:
inp|caption|service
For example:
inp|Type your name|user_name
Since this format has no scripting of its own to process user input, you have to rely on server-side scripts. To avoid exploits, the line never names a script directly: in this example, user_name is the alias of a service that will process the user input.
When the client performs an ASK request, as in this example, the following happens:
CLIENT-SIDE
ASK user_name|user_input[|options]
SERVER-SIDE
When the server receives such a request, it checks which script is associated with the user_name alias, executes it passing the user's input as a parameter, and sends the reply to the client.
These scripts are Lua scripts loaded into the server when the application is started.
options is an optional field with further options like checksum, encryption, obfuscation, compression, etc.
FORMS
If you need to let the user fill in multiple fields (forms) you can use the following syntax:
inp|form_caption|{caption_1}{caption_2}...{caption_n}|service
For example:
inp|Insert your data|{Name}{Surname}{Age}|verify_user_data
The data sent by the client must have this format:
ASK verify_user_data|{John}{Doe}{34}
Captions can optionally support ANSI colors.
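A server could take an ASK request apart along these lines before handing the values to the service script. This is a sketch with hypothetical names, and it assumes that user input never contains the | character or curly brackets, since the format does not define any escaping yet.

```python
import re


def parse_ask(request: str):
    """Return (service, fields, options) for an ASK request line."""
    line = request.rstrip("\r\n")
    if not line.startswith("ASK "):
        raise ValueError("not an ASK request")
    parts = line[4:].split("|", 2)
    if len(parts) < 2:
        raise ValueError("missing user input")
    service, data = parts[0], parts[1]
    options = parts[2] if len(parts) > 2 else ""
    # Form data arrives as {value_1}{value_2}...; a single input has no braces.
    fields = re.findall(r"\{([^{}]*)\}", data)
    if not fields:
        fields = [data]
    return service, fields, options


print(parse_ask("ASK verify_user_data|{John}{Doe}{34}\n"))
```

The server would then look up `service` among the loaded aliases and refuse the request if no script is registered under that name.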
SYNTAX
For a single input:
inp|caption|service
For multiple fields:
inp|form_caption|{caption_1}{caption_2}...{caption_n}|service
LINKS
Links point to resources, either local or remote.
The client can request these resources using the REQUEST set of commands.
When a client requests data it can also specify an alternative format. The server can provide it or reply that it is not available.
For example, a link pointing to a PNG file could be requested in a different format, let's say IFF.
The preferred format can be specified in the REQUEST options field.
Suppose the client receives this link in a page:
lnk|Nice sunset|image:png|34000|/mysite/images/sunset.png
Normally the client should ask for it with:
REQUEST /mysite/images/sunset.png
But it can also try to receive the content in another format that it can handle:
REQUEST /mysite/images/sunset.png [format:iff]
In this scenario the server can try to convert the image and send it to the client or it can reply that the format is not available.
Options could cover several scenarios, for example, if the client is running in a low res device, it could ask for a max size, like this:
REQUEST /mysite/images/sunset.png [max-size:640x480]
...and the server should try to resize the image, then send the image if its size is under the specified maximum, or send an error message otherwise.
SYNTAX
lnk|caption|type|size|link
| | | +---> Address of the resource without the protocol
| | +---------> Resource's size in bytes (only local links) or -
| +--------------> page Another local page
| +--------------> protocol:type (telnet,http,https,gemini,gopher,text+,...)
| +--------------> image:type (iff,png,jpg,tiff,...)
| +--------------> sound:type (8svx,iff,mp3,wav,ogg,...)
| +--------------> music:type (mod,mp3,wav,midi,...)
| +--------------> animation:type (iff,apng,mp4,avi,...)
| +--------------> doc:type (text,ansi,pdf,markdown,source,...)
| +--------------> archive:type (zip,lha,rar,tar,gzip,...)
+---------------------> caption:link's caption
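A client could decode a lnk line that follows the five-field syntax above roughly like this. The `Link` tuple and `parse_link` are hypothetical names, and the sketch assumes that captions never contain the | character.

```python
from typing import NamedTuple, Optional


class Link(NamedTuple):
    caption: str
    category: str        # page, protocol, image, sound, doc, ...
    subtype: str         # the part after the colon, empty for "page"
    size: Optional[int]  # None when the size field is "-"
    target: str


def parse_link(line: str) -> Link:
    """Decode lnk|caption|type|size|link into a Link tuple."""
    fields = line.rstrip("\r\n").split("|", 4)
    if len(fields) != 5 or fields[0] != "lnk":
        raise ValueError("not a lnk line")
    _, caption, kind, size, target = fields
    category, _, subtype = kind.partition(":")
    return Link(caption, category, subtype,
                None if size == "-" else int(size), target)


print(parse_link("lnk|Nice sunset|image:png|34000|/mysite/images/sunset.png"))
```

Knowing the size before issuing the request lets a small machine decide whether to fetch the resource in one shot or in chunks.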
SPECIAL LINES
These lines can also be rendered as plain text if the client host cannot use special coloring modes. They are meant to indicate information types:
- Informative line
inf|<text>
- Important line
imp|<text>
- Warning line
wrn|<text>
- Error line
err|<text>
These lines can be used when embedding content into the served page. For example, including a markdown file could generate errors, and the server could then insert an information line using the above format.
SERVER COMMANDS AND OPTIONS
The client can request the documents using the following template:
REQUEST[-VARIANT] document [options]
For example, to request the main page (the index):
REQUEST INDEX
To request a part of the index:
REQUEST-CHUNK INDEX [from-byte=0,to-byte=512]
To request the size of the index:
REQUEST-SIZE INDEX
and so on...
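The request template could be parsed with something like the sketch below. It only covers the form used in this section (options in square brackets, written as comma separated `key=value` pairs) and assumes that document names contain no spaces. `parse_request` is a hypothetical name.

```python
import re

# REQUEST[-VARIANT] document [options]
_REQUEST_RE = re.compile(
    r"^REQUEST(?:-([A-Z]+))?\s+(\S+)(?:\s+\[([^\]]*)\])?$"
)


def parse_request(line: str):
    """Return (variant, document, options) for a REQUEST line."""
    match = _REQUEST_RE.match(line.rstrip("\r\n"))
    if not match:
        raise ValueError("malformed REQUEST line")
    variant, document, raw_options = match.groups()
    options = {}
    for item in (raw_options or "").split(","):
        key, separator, value = item.partition("=")
        if separator:
            options[key.strip()] = value.strip()
    return variant or "", document, options


print(parse_request("REQUEST-CHUNK INDEX [from-byte=0,to-byte=512]"))
```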
COMMON OPTIONS
DOCUMENT RANGE AND SIZE
Clients running on weak machines can make use of the range and size options.
The client should first request the document size with:
REQUEST-SIZE INDEX
*** STUFF TO FIX ***
Document size query
Command: REQUEST-SIZE <document>
Response: server MUST return both total bytes and total lines for the requested document in a single, simple response body (not in the binary framing). Suggested plain-text format (CR/LF-normalized, terminated by \n):
SIZE: <bytes> <lines>\n
Example: SIZE: 80000 245\n (80000 bytes, 245 lines)
Semantics:
bytes = total raw byte length of the resource.
lines = number of TEXT+ lines after server-side includes/expansions and canonical line normalization (every line ends with \n; \r discarded).
If resource is binary/has no meaningful lines, server must return lines = 0.
Error handling:
If the document does not exist or is inaccessible, return the normal error code (e.g., NOT_FOUND) and no SIZE line.
Backward compatibility:
Servers that cannot compute lines MAY return SIZE: <bytes> 0\n and advertise lack of line-count support via welcome option {line-count=0}; clients receiving lines=0 should treat line-range requests as unsupported.
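A minimal sketch of both sides of the size query, under some assumptions: the resource is already expanded and available as bytes, a final line without its terminator still counts as a line, and the server knows whether the resource is text. The names `size_response` and `parse_size` are hypothetical.

```python
def size_response(data: bytes, is_text: bool = True) -> str:
    """Build the SIZE line for a resource (server side)."""
    lines = 0
    if is_text and data:
        normalized = data.replace(b"\r\n", b"\n")  # \r is discarded
        lines = normalized.count(b"\n")
        if not normalized.endswith(b"\n"):
            lines += 1  # assumption: count an unterminated last line
    # bytes is the raw length; binary resources report 0 lines
    return f"SIZE: {len(data)} {lines}\n"


def parse_size(line: str):
    """Return (bytes, lines) from a SIZE line (client side)."""
    label, byte_count, line_count = line.split()
    if label != "SIZE:":
        raise ValueError("not a SIZE line")
    return int(byte_count), int(line_count)


print(parse_size(size_response(b"txt|one\ntxt|two\n")))
```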
Then it can request the document using ranges, as described below.
Range options (byte and line)
Range options (mutually exclusive):
{start-byte:n,end-byte:n} — byte-range, inclusive.
Example: {start-byte:0,end-byte:1023} returns bytes 0..1023.
{start-line:n,end-line:n} — line-range, inclusive, lines counted from 0.
Example: {start-line:0,end-line:24} returns the first 25 lines.
Semantics:
Clients MAY supply either a byte-range or a line-range; servers MUST reject requests
that include both with error code RANGE_INVALID (numeric code to be defined).
start and end are non-negative integers; servers MUST return RANGE_INVALID if
start > end.
If end exceeds resource size or line-count, server returns RANGE_INVALID.
Framing and integrity:
The server’s existing payload-length and checksum fields reflect the actual bytes
sent; clients must use these to detect shorter-than-requested responses.
Error handling:
Define error code RANGE_INVALID (numeric code to be defined) for malformed or unsupported
range requests.
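The range rules could be checked on the server roughly as in the following sketch. Since RANGE_INVALID has no numeric code yet, an exception stands in for it. Two points are assumptions of the example, not part of the text above: a request that supplies only one of the two bounds is treated as malformed, and an end index counts as exceeding the resource when it is past the last valid index (total - 1). `resolve_range` is a hypothetical name.

```python
class RangeInvalid(Exception):
    """Stands in for the RANGE_INVALID error code (number not yet defined)."""


def resolve_range(options: dict, total_bytes: int, total_lines: int):
    """Return (unit, start, end), or None when no range was requested."""
    has_bytes = "start-byte" in options or "end-byte" in options
    has_lines = "start-line" in options or "end-line" in options
    if has_bytes and has_lines:
        raise RangeInvalid("byte-range and line-range are mutually exclusive")
    if not has_bytes and not has_lines:
        return None
    unit, total = ("byte", total_bytes) if has_bytes else ("line", total_lines)
    try:
        start = int(options[f"start-{unit}"])
        end = int(options[f"end-{unit}"])
    except (KeyError, ValueError):
        # Assumption: a missing or non-numeric bound is a malformed range.
        raise RangeInvalid("both bounds must be present and numeric") from None
    if start < 0 or start > end or end >= total:
        raise RangeInvalid("range outside the resource")
    return unit, start, end


print(resolve_range({"start-line": "0", "end-line": "24"}, 80000, 245))
```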
---
# ADVANCED FEATURES & IDEAS
These are just ideas to develop further:
* Bot & AI Protection
* Auto-banning and resources usage limits
* Protected documents
* Data encryption
* Draw primitives (box and lines at least)
* Request commands options
* In-text hyperlinks idea:
txt|This is a button [1:click me!]
ans|This is another button [2:Push me now!]
###|The following line template is used to define the button links.
hyp|id|... link