unison

Fork of Unison, a bi-directional file synchronization tool
git clone git://git.laack.co/unison.git

FEATURES.md (8241B)


# Introduction and motivation

"Features" is the set of feature names supported by a specific version of
the Unison implementation. Over time, each incompatible change -- whether
mandatory or an optional add-on -- is assigned a unique feature name.

Features allow a client to connect to and work properly with a server of
a different version, older or newer. When setting up the connection, the
server and the client negotiate a commonly supported set of features.

Using features instead of a version number makes the implementation
agnostic of versioning schemes, forks and third-party implementations. It
also allows for more flexible code changes over time, without polluting the
code with more and more conditionals for various version combinations, such
as "if version < X then", "if version >= Y and version < Z then", and so on.

# Negotiation

Feature negotiation takes place immediately after the RPC connection has
been fully set up. See `negotiate.ml`.

1. The client sends its full feature set to the server.
2. The server validates the intersection of its own and the client's
   feature sets.
   - On error, the server sends NOK to the client and the client closes
     the connection.
3. If OK, the server sends the intersection of the feature sets to the
   client.
4. The client validates the intersection.
   - On error, the client closes the connection.
5. If OK, the negotiation is complete and both server and client will use
   only features fully supported by both.
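The exchange above can be sketched in a few lines of OCaml. This is only an
illustration under stated assumptions, not the code in `negotiate.ml`: the
names `reply`, `server_reply` and `client_accept` are hypothetical, feature
sets are modelled as plain string sets, and the feature names `xattr` and
`acl` are made up.

```ocaml
(* Hypothetical sketch of the negotiation steps; not the real negotiate.ml. *)
module S = Set.Make (String)

(* What the server sends back: NOK, or the common feature set. *)
type reply = Nok | Common of S.t

(* Steps 2 and 3: the server intersects, validates and answers. *)
let server_reply ~validate ~server_features client_features =
  let common = S.inter server_features client_features in
  match validate common with
  | Ok () -> Common common
  | Error _ -> Nok

(* Steps 4 and 5: the client validates what the server proposed. *)
let client_accept ~validate = function
  | Nok -> Error "server rejected the feature set"
  | Common common ->
    (match validate common with
     | Ok () -> Ok common
     | Error msg -> Error msg)

let () =
  let server = S.of_list [ "hash-1"; "hash-2"; "xattr" ] in
  let client = S.of_list [ "hash-1"; "acl" ] in
  let validate _ = Ok () in (* placeholder: accepts any intersection *)
  match
    client_accept ~validate
      (server_reply ~validate ~server_features:server client)
  with
  | Ok common -> print_endline (String.concat "," (S.elements common))
  | Error msg -> prerr_endline msg
```

In this sketch both sides end up with the single common feature `hash-1`;
everything else is supported by only one side and is therefore not used.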

## Feature registration

A feature is added to the set by registering it. This can be done by any
part of the code that "owns" a feature, similar to how user preferences are
registered. See `features.mli` and `features.ml`.

Registering a feature requires a unique feature name and, optionally, a
validation function.
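As a rough illustration of what registration involves, here is a minimal
registry sketch. It does not reproduce the interface in `features.mli`; the
names `registry`, `register` and `all`, and the choice of a hash table, are
assumptions made for this example.

```ocaml
(* Hypothetical registry; the real interface lives in features.mli. *)
type validator = string list -> (unit, string) result

let registry : (string, validator option) Hashtbl.t = Hashtbl.create 16

(* Register a feature under a unique name, optionally with a validator. *)
let register ?validate name =
  if Hashtbl.mem registry name then
    invalid_arg ("feature already registered: " ^ name);
  Hashtbl.add registry name validate

(* The full local feature set, in a stable order. *)
let all () =
  List.sort compare (Hashtbl.fold (fun name _ acc -> name :: acc) registry [])

let () =
  register "hash-1";
  register ~validate:(fun _ -> Ok ()) "hash-2";
  print_endline (String.concat "," (all ()))
```

Rejecting duplicate names at registration time is one simple way to enforce
the uniqueness requirement.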

## Feature validation

Each feature can provide its own validation function. When validating
the intersection of the client's and the server's feature sets, the
validation functions of all included features are run in arbitrary order.

A validation function sees the entire intersection and can freely decide
whether the intersection is acceptable. Examples of possible validation
scenarios:

- A mandatory feature is not in the intersection.
  - This typically means that the counterparty is too old, but could also
    mean that the counterparty is too new and the feature has been removed.
- A user preference is enabled for a feature that is not in the
  intersection. (The preferences have not been sent to the server yet, so
  this validation is not carried out by the server.)
- A feature depends on another feature that is not in the intersection.

Some features in the intersection can conflict with each other. This can
happen, for example, when two different implementations of a function are
both supported but must not be used simultaneously. All such conflicts
are benign and will not cause validation of the feature intersection to
fail. (Since the intersection is a subset of the entire feature set,
failing on a conflict would mean that the full set of features is
conflicting to begin with.)
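Two of the scenarios above, a missing mandatory feature and a missing
dependency, could be expressed as validation functions along these lines.
This is a sketch only: the helper names `mandatory`, `depends_on` and
`validate_all` are hypothetical, as are the feature names `rpc-base`,
`xattr-sync` and `props-v2`.

```ocaml
(* Hypothetical validators; all names here are made up for illustration. *)
module S = Set.Make (String)

(* A validator sees the whole intersection and accepts or rejects it. *)
type validator = S.t -> (unit, string) result

(* A mandatory feature must be present in the intersection. *)
let mandatory name : validator = fun common ->
  if S.mem name common then Ok ()
  else Error (name ^ " is required but not supported by the other side")

(* A feature that needs another feature whenever it is itself present. *)
let depends_on name dep : validator = fun common ->
  if S.mem name common && not (S.mem dep common)
  then Error (name ^ " requires " ^ dep)
  else Ok ()

(* Run the validators against the intersection; the first error wins. *)
let validate_all (validators : validator list) common =
  List.fold_left
    (fun acc v -> match acc with Ok () -> v common | Error _ -> acc)
    (Ok ()) validators

let () =
  let validators =
    [ mandatory "rpc-base"; depends_on "xattr-sync" "props-v2" ] in
  let common = S.of_list [ "rpc-base"; "xattr-sync" ] in
  match validate_all validators common with
  | Ok () -> print_endline "intersection accepted"
  | Error msg -> print_endline ("intersection rejected: " ^ msg)
```

Because each validator only inspects the intersection, the result does not
depend on the order in which the validators are run, only the choice of
which error is reported first does.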

# Development

Every incompatible code change must result in a change to the set of
features:

- New code that is mandatory to use (effectively breaks compatibility
  with older versions despite feature negotiation) ->
  - Register a new feature with a validation function that rejects any
    feature intersection that does not include this feature
- New code that is optional to use ->
  - Register a new feature
- New code that replaces existing code ->
  - Register a new feature and remove one or more features
- Existing code is removed ->
  - Remove one or more features
- Existing code is kept but deprecated ->
  - Add or change a validation function to output a deprecation warning
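The deprecation case is the only one in the list where validation reports
something without rejecting the intersection. A minimal sketch, assuming a
hypothetical helper `warn_deprecated` and a feature set modelled as a plain
string list:

```ocaml
(* Hypothetical deprecation validator: it warns but never rejects. *)
let warn_deprecated ~old ~replacement (common : string list) =
  if List.mem old common && not (List.mem replacement common) then
    prerr_endline
      ("Warning: feature " ^ old ^ " is deprecated; the other side does not"
       ^ " support " ^ replacement);
  Ok ()

let () =
  match warn_deprecated ~old:"hash-1" ~replacement:"hash-2" [ "hash-1" ] with
  | Ok () -> print_endline "intersection accepted"
  | Error msg -> print_endline ("intersection rejected: " ^ msg)
```

The warning goes to standard error and the function still returns `Ok ()`,
so the connection proceeds with the deprecated feature.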

## User preferences

User preferences are sent from the client to the server after the
connection is established. The server must know every preference it
receives from the client, otherwise the connection fails.

When new preferences are created together with a new feature, it is
possible (and in most cases required) to add a guard function that
determines whether the preference is sent to the server. Typically, this
guard function takes the form `fun () -> Features.enabled somefeature`,
meaning that the preference is sent to the server if and only if
`somefeature` is known by the server.
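The effect of such a guard can be sketched as follows. The `Features` module
below is a stand-in written for this example, not the real one, and the
`pref` record, `prefs_to_send` and the preference names are likewise
hypothetical.

```ocaml
(* Hypothetical stand-in for the real Features module. *)
module Features = struct
  let negotiated : string list ref = ref []
  let enabled name = List.mem name !negotiated
end

(* A preference together with the guard that decides if it is sent. *)
type pref = { name : string; value : string; send : unit -> bool }

(* Only preferences whose guard holds are sent to the server. *)
let prefs_to_send prefs =
  List.filter (fun p -> p.send ()) prefs
  |> List.map (fun p -> (p.name, p.value))

let prefs =
  [ { name = "oldpref"; value = "true"; send = (fun () -> true) };
    { name = "newpref"; value = "on";
      send = (fun () -> Features.enabled "newfeature") } ]

let () =
  (* The server does not know "newfeature", so "newpref" is withheld. *)
  Features.negotiated := [ "hash-1" ];
  List.iter (fun (n, v) -> Printf.printf "%s=%s\n" n v) (prefs_to_send prefs)
```

The guard is a function rather than a boolean because the negotiated feature
set is only known after the preference has been defined.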

## Code evolution, conflicting features

With features, new code does not have to replace existing code even if
the two seemingly conflict. An existing feature and a new feature can
co-exist. The code must be guarded by checking which features are enabled
at runtime for each remote connection.

For example:

- Existing code implements feature hash-1.
- New code implements a new hashing algorithm and adds feature hash-2.
- Even though two different hashing algorithms must not be used at the
  same time, both implementations can co-exist as in the following
  pseudocode example.

```
function hash
  if (feature hash-2 enabled) then
    new algorithm
  else if (feature hash-1 enabled) then
    previous algorithm
  end
```

- If both server and client support hash-2 then the new implementation
  will always be used, even if both also support hash-1.
- If either server or client does not support hash-2 then the feature
  intersection will only contain hash-1 and the previous implementation
  will be used.

Now imagine that the new feature changes not only the hashing algorithm
but also the result type. This is trickier to implement but clearly not
impossible.

There are multiple ways of handling parallel implementations of conflicting
types. They are not the topic of this document, but a few possibilities
are listed for inspiration:

- Abstract types and type variables
- Variant types (aka sum types)
- Extensible variant types
- First-class modules
- GADTs
- Classes/objects
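To make one of these options concrete, here is a sketch of the variant type
approach applied to the hash-1/hash-2 example. The two "algorithms" are
placeholders taken from the OCaml standard library and have nothing to do
with the hashes Unison actually uses; `digest`, `enabled_features` and
`equal` are hypothetical names.

```ocaml
(* Hypothetical sketch: both algorithms are placeholders. *)
type digest =
  | Hash1 of int       (* result type of the previous algorithm *)
  | Hash2 of string    (* result type of the new algorithm *)

(* Stand-in for the negotiated feature set of the current connection. *)
let enabled_features = ref [ "hash-1"; "hash-2" ]
let enabled name = List.mem name !enabled_features

(* Prefer hash-2 when negotiated, otherwise fall back to hash-1. *)
let hash (data : string) : digest =
  if enabled "hash-2" then Hash2 (Digest.to_hex (Digest.string data))
  else if enabled "hash-1" then Hash1 (Hashtbl.hash data)
  else failwith "no hashing feature negotiated"

(* Digests are comparable only when produced by the same algorithm. *)
let equal a b =
  match a, b with
  | Hash1 x, Hash1 y -> x = y
  | Hash2 x, Hash2 y -> String.equal x y
  | Hash1 _, Hash2 _ | Hash2 _, Hash1 _ -> false

let () =
  match hash "some file contents" with
  | Hash1 _ -> print_endline "hash-1 digest"
  | Hash2 _ -> print_endline "hash-2 digest"
```

The variant makes the mismatch explicit: code that compares or stores a
digest has to handle both constructors, so a value produced under one
feature cannot silently be treated as the other.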

### Archive file

Most changes will ultimately result in type changes. This directly
affects data encoded in the wire format and data stored in the archive
file format.

Data on the wire is transient. As both client and server have agreed on
a common feature set, they know how to marshal and unmarshal data on the
wire without any issues.

Data in the archive file is persistent and could have been written while
a different set of features was agreed upon. There are a few ways to read
and write archive files in this scenario:

- Do not even attempt to read an incompatible archive file. The exact
  feature set used is written into the archive file. As long as client and
  server keep negotiating the same feature set, they can read existing
  archive files. When the negotiated feature set changes (due to upgrades),
  the previous archive files are ignored (this requires a complete rescan).
  This may be acceptable, as such upgrades are assumed to be quite rare.

- Write a subset of the feature set used into the archive file. Only
  features that change the data structures written in the archive file are
  stored in the file. Reading can work in two ways: either as a slightly
  more forgiving variant of the point above, or by actually reading and
  unmarshaling the archive according to the features used to write it --
  even if not all of those features are included in the currently
  negotiated feature set. The latter is the currently chosen approach. It
  does require types and code to be tailored for this, the same as with the
  next point below, but to a lesser degree.

- The archive file on-disk format includes information about the types and
  structure of the written data (think of a DB with a relatively dynamic
  but still typed schema). The data can be read back selectively, and even
  converted as necessary (for example, a stored int32 can be read into an
  in-memory int64).
  Selective reading can mean two things. First, the archive was written
  with a feature that is no longer enabled. The data that was only relevant
  to that feature is just skipped. Second, the archive was written without
  a feature that is now enabled. For the newly enabled feature there is no
  data in the archive, but this does not break reading the file as long as
  the new feature can deal with default or "empty" values for its data
  structures.
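The second approach, which the text names as the chosen one, can be
illustrated with a toy model. This is a sketch under heavy assumptions and
says nothing about the real archive format: the `archive` record, the
`archive_features` list, and the feature names `props-v2` and `compress` are
all made up.

```ocaml
(* Hypothetical sketch; the format and feature names are illustrative. *)
type archive = { written_with : string list; payload : string }

(* Only features that change the archive layout are recorded. *)
let archive_features = [ "hash-2"; "props-v2" ]

(* Record the layout-relevant subset of the negotiated features. *)
let write ~negotiated payload =
  { written_with =
      List.filter (fun f -> List.mem f negotiated) archive_features;
    payload }

(* Decode according to how the archive was written, not according to the
   currently negotiated set; fail only if the local code cannot decode it. *)
let read ~supported archive =
  match
    List.filter (fun f -> not (List.mem f supported)) archive.written_with
  with
  | [] ->
    let layout =
      if List.mem "hash-2" archive.written_with then "hash-2" else "hash-1" in
    Ok (layout, archive.payload)
  | missing ->
    Error ("archive needs unsupported features: " ^ String.concat ", " missing)

let () =
  let a = write ~negotiated:[ "hash-2"; "compress" ] "data" in
  match read ~supported:[ "hash-1"; "hash-2"; "props-v2" ] a with
  | Ok (layout, _) -> print_endline ("reading archive with layout " ^ layout)
  | Error msg -> print_endline msg
```

Here `supported` stands for what the local code is able to decode, which may
be a larger set than what was negotiated for the current connection.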