# Introduction and motivation

"Features" is a set of feature names supported by a specific version of
a Unison implementation. Over time, each incompatible change -- whether
mandatory or an optional add-on -- is assigned a unique feature name.

Features allow a client to connect to and properly work with a server of
a different version, older or newer. When setting up the connection, both
server and client negotiate a commonly supported set of features.

Using features instead of a version makes the implementation agnostic of any
versioning schemes, forks and third-party implementations. It also allows
for more flexible code changes over time, without the code being polluted by
more and more conditionals for various version combinations, such as
"if version < X then", "if version >= Y and version < Z then", and so on.

# Negotiation

Feature negotiation takes place immediately after the RPC connection has
been fully set up. See `negotiate.ml`.

1. Client sends its full feature set to the server.
2. Server validates the intersection of its own and the client's feature sets.
   - On error, the server sends NOK to the client and the client closes the
     connection.
3. If OK then server sends the intersection of feature sets to the client.
4. Client validates the intersection.
   - On error, the client closes the connection.
5. If OK then the negotiation is complete and both server and client will
   use only features fully supported by both.

## Feature registration

A feature is added to the set by registering it. This can be done by any
part of the code that "owns" a feature, similar to how user preferences are
registered. See `features.mli` and `features.ml`.

Registering a feature requires a unique feature name and an optional
validation function.

## Feature validation

Each feature can provide a separate validation function.
When validating
the intersection of the client's and server's feature sets, validation
functions for each included feature are run in arbitrary order.

A validation function will be able to see the entire intersection and can
freely decide whether the intersection is ok or not. Examples of possible
validation scenarios:

- A mandatory feature is not in the intersection
  - This typically means that the counterparty is too old, but could also
    mean that the counterparty is too new and the feature has been removed.
- User preference enabled for a feature not in the intersection
  (the preferences have not been sent to the server yet, so this
  validation is not carried out by the server)
- A feature depends on another feature not in the intersection

Some features in the intersection can conflict with each other. This can
happen for example when two different implementations of a function are
both supported but must not be used simultaneously. All such conflicts
are benign in nature and will not cause feature intersection validation
to fail. (Since the intersection is a subset of the entire feature set,
failing validation on a conflict would mean that the full set of features
is conflicting to begin with.)
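
The registration, validation and negotiation steps described above can be modelled in a few lines of OCaml. This is only a minimal sketch under stated assumptions: the names `register`, `validate_intersection`, `negotiate` and `mandatory` are hypothetical and are not taken from `features.mli` or `negotiate.ml`; there is no RPC, and both sides share one in-process registry. It also assumes that every locally registered validator is run, so that a mandatory feature can reject an intersection it is missing from.

```ocaml
(* Toy model of feature registration, validation and negotiation.
   All names are hypothetical; this is NOT Unison's Features API. *)
module S = Set.Make (String)

(* A validator sees the whole intersection and accepts or rejects it. *)
type validator = S.t -> (unit, string) result

(* Registry: unique feature name -> optional validation function. *)
let registry : (string, validator option) Hashtbl.t = Hashtbl.create 16

let register ?validate name =
  if Hashtbl.mem registry name then
    invalid_arg ("feature registered twice: " ^ name);
  Hashtbl.replace registry name validate

(* The full local feature set, as sent by the client in step 1. *)
let all_features () =
  Hashtbl.fold (fun name _ acc -> S.add name acc) registry S.empty

(* Run the registered validators against the intersection. The order is
   whatever the hash table yields, i.e. arbitrary; the first error wins. *)
let validate_intersection inter =
  Hashtbl.fold
    (fun _ v acc ->
      match acc, v with
      | Error _, _ | _, None -> acc
      | Ok (), Some f -> f inter)
    registry (Ok ())

(* One side's view of steps 2-5: intersect with the peer's set, validate,
   and return the agreed set on success. *)
let negotiate ~peer =
  let inter = S.inter (all_features ()) peer in
  Result.map (fun () -> inter) (validate_intersection inter)

(* Validator for a mandatory feature: reject if it is not in the set. *)
let mandatory name : validator =
 fun inter ->
  if S.mem name inter then Ok ()
  else Error ("missing mandatory feature: " ^ name)

let () =
  register "hash-1";
  register "hash-2";
  register ~validate:(mandatory "rpc-v2") "rpc-v2";
  let show peer =
    match negotiate ~peer:(S.of_list peer) with
    | Ok inter ->
      print_endline ("agreed: " ^ String.concat ", " (S.elements inter))
    | Error msg -> print_endline ("rejected: " ^ msg)
  in
  show [ "hash-1"; "hash-2"; "rpc-v2"; "future-x" ];
  show [ "hash-1" ]
```

The first peer is newer than the local side: its unknown `future-x` feature simply drops out of the intersection. The second peer lacks the hypothetical mandatory feature `rpc-v2`, so the validator rejects the intersection and the connection would be closed.
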

# Development

Every incompatible code change must result in a change to the set of
features:

- New code that is mandatory to use (effectively breaks compatibility
  with older versions despite feature negotiation) ->
  - Register a new feature with a validation function that rejects any
    feature intersection that does not include this feature
- New code that is optional to use ->
  - Register a new feature
- New code that replaces existing code ->
  - Register a new feature and remove one or more features
- Remove existing code ->
  - Remove one or more features
- Code is not removed but it can be deprecated ->
  - Add or change a validation function to output a deprecation warning

## User preferences

User preferences are sent from client to server after establishing a
connection. The server must know all preferences received from the client,
otherwise the connection fails.

When new preferences are created with a new feature, it is possible (and in
most cases required) to add a guard function that determines whether the
preference is sent to the server or not. Typically, this guard function
will take the form `fun () -> Features.enabled somefeature`, meaning that
the preference is sent to the server if and only if 'somefeature' is known
by the server.

## Code evolution, conflicting features

With features, new code does not have to replace existing code even if
they seemingly conflict. Both an existing feature and a new feature can
co-exist. The code must be guarded by checking which features are enabled
at runtime for each remote connection.

For example:

- Existing code implements feature hash-1.
- New code implements a new hashing algorithm and adds feature hash-2.
- Even though two different hashing algorithms must not be used at the
  same time, both implementations can co-exist as in the following
  pseudocode example.

```
function hash
  if (feature hash-2 enabled) then
    new algorithm
  else if (feature hash-1 enabled) then
    previous algorithm
end
```

- If both server and client support hash-2 then the new implementation
  will always be used, even if both server and client also support hash-1
  at the same time.
- If either server or client does not support hash-2 then the feature
  intersection will only contain hash-1 and the previous implementation
  will be used.

Now imagine that, in addition to the hashing algorithm, the result type
also changes with the new feature. This is trickier to implement but
clearly not impossible.

There are multiple ways of handling parallel implementations of conflicting
types. These are not the topic of this document, but a few possibilities
are provided for inspiration:

- Abstract types and type variables
- Variant types (aka sum types)
- Extensible variant types
- First class modules
- GADTs
- Classes/objects

### Archive file

Most changes will ultimately result in type changes. This will directly
impact data encoded in wire format and stored in archive file format.

Data on the wire is transient. As both client and server have agreed on
a common feature set, they know how to marshal and unmarshal data on the
wire without any issues.

Data in the archive file is persistent and could have been written while
a different set of features was agreed upon. There are a couple of ways
to read and write archive files in this scenario:

- Not even attempt to read an incompatible archive file. The exact feature
  set used is written into the archive file. As long as both client and
  server keep negotiating the same feature set, they can read existing
  archive files.
  When the negotiated feature set changes (due to upgrades),
  the previous archive files can be ignored (requires a complete rescan).
  This may be acceptable, as such upgrades are assumed to be quite rare.

- A subset of the feature set used is written into the archive file. Only
  features that change the data structures written in the archive file are
  stored in the file. The reading can work in two ways: either as a slightly
  more forgiving variant of the point above, or by actually reading and
  unmarshaling the archive according to the features used to write it --
  even if not all the same features are included in the currently negotiated
  feature set. The latter is the currently chosen approach. It does require
  types and code to be tailored for this, the same as with the next point
  below, but to a lesser degree.

- The archive file on-disk format includes information about the types and
  structure of the written data (think of it as a DB with a relatively
  dynamic but still typed schema). The data can be read back selectively,
  and even converted as necessary (for example, a stored int32 can be read
  into an in-memory int64).
  The selective reading can mean two things. First, the archive was written
  with a feature that is no longer enabled. The data that was only relevant
  to that feature is just skipped. Second, the archive was written without
  a feature that is now enabled. For the newly-enabled feature there is no
  data in the archive but this does not break reading the file, as long as
  the new feature can deal with default or "empty" values for its data
  structures.
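
To make the second approach more concrete, here is a minimal sketch of an archive whose header records only the layout-affecting features, and whose reader decodes by that header rather than by the currently negotiated set. Everything in it is an assumption made for illustration: the text format, the names `layout_features`, `encode` and `decode`, the stand-in hash functions and the extra feature name `props` are hypothetical and do not describe Unison's real archive format. The two hash result types are modelled with a variant type, one of the options listed earlier.

```ocaml
(* Toy archive codec; format and names are hypothetical, not Unison's. *)

(* Results of two co-existing hash implementations, as a variant type. *)
type hash =
  | Hash1 of int     (* written under feature "hash-1" *)
  | Hash2 of string  (* written under feature "hash-2" *)

(* Stand-in algorithms, only here to produce differently typed results. *)
let hash1 s = Hash1 (String.length s)
let hash2 s = Hash2 (Digest.to_hex (Digest.string s))

(* Runtime guard on the negotiated features, as in the pseudocode. *)
let hash ~enabled s =
  if List.mem "hash-2" enabled then hash2 s
  else if List.mem "hash-1" enabled then hash1 s
  else failwith "no hashing feature enabled"

(* Only these features change the stored layout and go into the header. *)
let layout_features = [ "hash-1"; "hash-2" ]

(* Header line "features:a,b", then one "path<TAB>hash" line per file.
   Paths containing tabs or newlines are not handled in this sketch. *)
let encode ~enabled files =
  let stored = List.filter (fun f -> List.mem f layout_features) enabled in
  let line (path, content) =
    match hash ~enabled content with
    | Hash1 n -> Printf.sprintf "%s\t%d" path n
    | Hash2 d -> Printf.sprintf "%s\t%s" path d
  in
  String.concat "\n"
    (("features:" ^ String.concat "," stored) :: List.map line files)

(* Decode according to the features stored in the file itself. *)
let decode text =
  match String.split_on_char '\n' text with
  | [] -> failwith "empty archive"
  | header :: lines ->
    let prefix = "features:" in
    let plen = String.length prefix in
    if String.length header < plen || String.sub header 0 plen <> prefix
    then failwith "bad header";
    let stored =
      String.split_on_char ','
        (String.sub header plen (String.length header - plen))
    in
    let parse_hash v =
      if List.mem "hash-2" stored then Hash2 v else Hash1 (int_of_string v)
    in
    List.map
      (fun l ->
        match String.split_on_char '\t' l with
        | [ path; v ] -> (path, parse_hash v)
        | _ -> failwith "bad entry")
      lines

let () =
  (* Written while hash-2 was negotiated; "props" does not affect layout. *)
  let text =
    encode ~enabled:[ "hash-1"; "hash-2"; "props" ] [ ("a.txt", "hello") ]
  in
  (* A later session may have negotiated only hash-1; the file still
     decodes, because the reader follows the stored header. *)
  match decode text with
  | [ (path, Hash2 _) ] -> Printf.printf "%s: read as a hash-2 entry\n" path
  | _ -> print_endline "unexpected archive contents"
```

The design point is that `decode` never consults the negotiated feature set. What the caller then does with a `Hash2` value in a session where only hash-1 is enabled (for example, recompute it) is a separate decision that this sketch leaves open.
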