cart-elc

Source code for CART-ELC
git clone git://git.laack.co/cart-elc.git

README.md (62365B)


      1 # Eigen Tensors {#eigen_tensors}
      2 
      3 Tensors are multidimensional arrays of elements. Elements are typically scalars,
      4 but more complex types such as strings are also supported.
      5 
      6 ## Tensor Classes
      7 
      8 You can manipulate a tensor with one of the following classes.  They all are in
      9 the namespace `::Eigen`.
     10 
     11 
     12 ### Class Tensor<data_type, rank>
     13 
     14 This is the class to use to create a tensor and allocate memory for it.  The
     15 class is templatized with the tensor datatype, such as float or int, and the
     16 tensor rank.  The rank is the number of dimensions, for example rank 2 is a
     17 matrix.
     18 
     19 Tensors of this class are resizable.  For example, if you assign a tensor of a
     20 different size to a Tensor, that tensor is resized to match its new value.
     21 
     22 #### Constructor Tensor<data_type, rank>(size0, size1, ...)
     23 
     24 Constructor for a Tensor.  The constructor must be passed `rank` integers
     25 indicating the sizes of the instance along each of the `rank`
     26 dimensions.
     27 
     28     // Create a tensor of rank 3 of sizes 2, 3, 4.  This tensor owns
     29     // memory to hold 24 floating point values (24 = 2 x 3 x 4).
     30     Tensor<float, 3> t_3d(2, 3, 4);
     31 
     32     // Resize t_3d by assigning a tensor of different sizes, but same rank.
     33     t_3d = Tensor<float, 3>(3, 4, 3);
     34 
     35 #### Constructor Tensor<data_type, rank>(size_array)
     36 
     37 Constructor where the sizes for the constructor are specified as an array of
     38 values instead of an explicit list of parameters.  The array type to use is
     39 `Eigen::array<Eigen::Index, rank>`.  The array can be constructed automatically
     40 from an initializer list.
     41 
     42     // Create a tensor of strings of rank 2 with sizes 5, 7.
     43     Tensor<string, 2> t_2d({5, 7});
     44 
     45 
     46 ### Class TensorFixedSize<data_type, Sizes<size0, size1, ...>>
     47 
     48 Class to use for tensors of fixed size, where the size is known at compile
     49 time.  Fixed sized tensors can provide very fast computations because all their
     50 dimensions are known by the compiler.  FixedSize tensors are not resizable.
     51 
     52 If the total number of elements in a fixed size tensor is small enough, the
     53 tensor data is held on the stack and does not cause heap allocation and free.
     54 
     55     // Create a 4 x 3 tensor of floats.
     56     TensorFixedSize<float, Sizes<4, 3>> t_4x3;
     57 
     58 ### Class TensorMap<Tensor<data_type, rank>>
     59 
     60 This is the class to use to create a tensor on top of memory allocated and
     61 owned by another part of your code.  It lets you view any piece of allocated
     62 memory as a Tensor.  Instances of this class do not own the memory where the
     63 data are stored.
     64 
     65 A TensorMap is not resizable because it does not own the memory where its data
     66 are stored.
     67 
     68 #### Constructor TensorMap<Tensor<data_type, rank>>(data, size0, size1, ...)
     69 
     70 Constructor for a TensorMap.  The constructor must be passed a pointer to the
     71 storage for the data, and `rank` size attributes.  The storage has to be
     72 large enough to hold all the data.
     73 
     74     // Map a tensor of ints on top of stack-allocated storage.
     75     int storage[128];  // 2 x 4 x 2 x 8 = 128
     76     TensorMap<Tensor<int, 4>> t_4d(storage, 2, 4, 2, 8);
     77 
     78     // The same storage can be viewed as a different tensor.
     79     // The sizes (here 16 x 8) can also be passed as an array.
     80     TensorMap<Tensor<int, 2>> t_2d(storage, 16, 8);
     81 
     82     // You can also map fixed-size tensors.  Here we get a 1d view of
     83     // the 2d fixed-size tensor.
     84     TensorFixedSize<float, Sizes<4, 3>> t_4x3;
     85     TensorMap<Tensor<float, 1>> t_12(t_4x3.data(), 12);
     86 
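The non-owning behaviour described above can be sketched in plain C++ without Eigen. This is only an illustration, not how `TensorMap` is implemented: `View2D` and `write_then_read_through_other_view` are hypothetical names, and column-major addressing is assumed because it is the default layout in this document.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning 2-D view over caller-owned storage (column-major).
// It stores only a pointer and the sizes, so it never allocates or frees.
template <typename T>
struct View2D {
  T* data;
  std::size_t rows;
  std::size_t cols;

  // Column-major addressing: consecutive rows of one column are adjacent.
  T& operator()(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return data[r + c * rows];
  }
};

// Writes through one view and reads the same memory back through another.
inline int write_then_read_through_other_view() {
  int storage[128] = {};
  View2D<int> wide{storage, 16, 8};  // 16 x 8 = 128 elements
  View2D<int> tall{storage, 64, 2};  // same memory, different shape
  wide(1, 2) = 42;                   // linear offset 1 + 2 * 16 = 33
  return tall(33, 0);                // linear offset 33 + 0 * 64 = 33
}
```

Because both views share `storage`, the value written through `wide` is visible through `tall`. Neither view can be resized, since neither owns the buffer.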
     87 
     88 ### Class TensorRef
     89 
     90 See Assigning to a TensorRef below.
     91 
     92 ## Accessing Tensor Elements
     93 
     94 #### <data_type> tensor(index0, index1...)
     95 
     96 Return the element at position `(index0, index1...)` in tensor
     97 `tensor`.  You must pass as many parameters as the rank of `tensor`.
     98 The expression can be used as an l-value to set the value of the element at the
     99 specified position.  The value returned is of the datatype of the tensor.
    100 
    101     // Set the value of the element at position (0, 1, 0).
    102     Tensor<float, 3> t_3d(2, 3, 4);
    103     t_3d(0, 1, 0) = 12.0f;
    104 
    105     // Initialize all elements to random values.
    106     for (int i = 0; i < 2; ++i) {
    107       for (int j = 0; j < 3; ++j) {
    108         for (int k = 0; k < 4; ++k) {
    109           t_3d(i, j, k) = ...some random value...;
    110         }
    111       }
    112     }
    113 
    114     // Print elements of a tensor.
    115     for (int i = 0; i < 2; ++i) {
    116       LOG(INFO) << t_3d(i, 0, 0);
    117     }
    118 
    119 
    120 ## Tensor Layout
    121 
    122 The tensor library supports 2 layouts: `ColMajor` (the default) and
    123 `RowMajor`.  Only the default column major layout is currently fully
    124 supported, and it is therefore not recommended to attempt to use the row major
    125 layout at the moment.
    126 
    127 The layout of a tensor is optionally specified as part of its type. If not
    128 specified explicitly column major is assumed.
    129 
    130     Tensor<float, 3, ColMajor> col_major;  // equivalent to Tensor<float, 3>
    131     TensorMap<Tensor<float, 3, RowMajor> > row_major(data, ...);
    132 
    133 All the arguments to an expression must use the same layout. Attempting to mix
    134 different layouts will result in a compilation error.
    135 
    136 It is possible to change the layout of a tensor or an expression using the
    137 `swap_layout()` method.  Note that this will also reverse the order of the
    138 dimensions.
    139 
    140     Tensor<float, 2, ColMajor> col_major(2, 4);
    141     Tensor<float, 2, RowMajor> row_major(2, 4);
    142 
    143     Tensor<float, 2> col_major_result = col_major;  // ok, layouts match
    144     Tensor<float, 2> col_major_result = row_major;  // will not compile
    145 
    146     // Simple layout swap
    147     col_major_result = row_major.swap_layout();
    148     eigen_assert(col_major_result.dimension(0) == 4);
    149     eigen_assert(col_major_result.dimension(1) == 2);
    150 
    151     // Swap the layout and preserve the order of the dimensions
    152     array<int, 2> shuffle{1, 0};
    153     col_major_result = row_major.swap_layout().shuffle(shuffle);
    154     eigen_assert(col_major_result.dimension(0) == 2);
    155     eigen_assert(col_major_result.dimension(1) == 4);
    156 
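A short plain-C++ sketch may help explain why `swap_layout()` reverses the order of the dimensions. The two helper functions below are hypothetical and are not part of Eigen. They compute the linear offset of an element under each layout, on the assumption that elements are stored contiguously with no padding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Linear offset of `idx` in a column-major tensor: the first index varies fastest.
template <std::size_t N>
std::size_t col_major_offset(const std::array<std::size_t, N>& dims,
                             const std::array<std::size_t, N>& idx) {
  std::size_t offset = 0;
  std::size_t stride = 1;
  for (std::size_t d = 0; d < N; ++d) {
    assert(idx[d] < dims[d]);
    offset += idx[d] * stride;
    stride *= dims[d];
  }
  return offset;
}

// Linear offset of `idx` in a row-major tensor: the last index varies fastest.
template <std::size_t N>
std::size_t row_major_offset(const std::array<std::size_t, N>& dims,
                             const std::array<std::size_t, N>& idx) {
  std::size_t offset = 0;
  std::size_t stride = 1;
  for (std::size_t d = N; d-- > 0;) {
    assert(idx[d] < dims[d]);
    offset += idx[d] * stride;
    stride *= dims[d];
  }
  return offset;
}
```

Element `(i, j)` of a row-major 2 x 4 tensor and element `(j, i)` of a column-major 4 x 2 tensor land on the same offset. Reinterpreting a buffer under the other layout without moving any data therefore yields the dimensions in reverse order.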
    157 
    158 ## Tensor Operations
    159 
    160 The Eigen Tensor library provides a vast library of operations on Tensors:
    161 numerical operations such as addition and multiplication, geometry operations
    162 such as slicing and shuffling, etc.  These operations are available as methods
    163 of the Tensor classes, and in some cases as operator overloads.  For example
    164 the following code computes the elementwise addition of two tensors:
    165 
    166     Tensor<float, 3> t1(2, 3, 4);
    167     ...set some values in t1...
    168     Tensor<float, 3> t2(2, 3, 4);
    169     ...set some values in t2...
    170     // Set t3 to the element wise sum of t1 and t2
    171     Tensor<float, 3> t3 = t1 + t2;
    172 
    173 While the code above looks easy enough, it is important to understand that the
    174 expression `t1 + t2` is not actually adding the values of the tensors.  The
    175 expression instead constructs a "tensor operator" object of the class
    176 TensorCwiseBinaryOp<scalar_sum>, which has references to the tensors
    177 `t1` and `t2`.  This is a small C++ object that knows how to add
    178 `t1` and `t2`.  It is only when the value of the expression is assigned
    179 to the tensor `t3` that the addition is actually performed.  Technically,
    180 this happens through the overloading of `operator=()` in the Tensor class.
    181 
    182 This mechanism for computing tensor expressions allows for lazy evaluation and
    183 optimizations which are what make the tensor library very fast.
    184 
    185 Of course, the tensor operators do nest, and the expression `t1 + t2 * 0.3f`
    186 is actually represented with the (approximate) tree of operators:
    187 
    188     TensorCwiseBinaryOp<scalar_sum>(t1, TensorCwiseUnaryOp<scalar_mul>(t2, 0.3f))
    189 
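The idea of a small operator object that holds references and computes nothing until assignment can be sketched in plain C++. This is a deliberately simplified stand-in, not Eigen's code: `Vec` and `SumExpr` are hypothetical names, only addition is modelled, and there is no vectorization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical expression node: holds references to its operands and
// computes one element of the sum only when asked.
template <typename Lhs, typename Rhs>
struct SumExpr {
  const Lhs& lhs;
  const Rhs& rhs;
  float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
  std::size_t size() const { return lhs.size(); }
};

// Hypothetical owning container standing in for a tensor.
struct Vec {
  std::vector<float> values;

  explicit Vec(std::size_t n, float fill = 0.0f) : values(n, fill) {}

  // Constructing from an expression is where the element loop actually runs.
  template <typename Lhs, typename Rhs>
  Vec(const SumExpr<Lhs, Rhs>& expr) : values(expr.size()) {
    for (std::size_t i = 0; i < values.size(); ++i) values[i] = expr[i];
  }

  float operator[](std::size_t i) const { return values[i]; }
  std::size_t size() const { return values.size(); }
};

// operator+ only builds a node; it does not touch any element.
inline SumExpr<Vec, Vec> operator+(const Vec& a, const Vec& b) {
  assert(a.size() == b.size());
  return SumExpr<Vec, Vec>{a, b};
}

// Nodes nest: adding a Vec to an existing node builds a two-level tree.
template <typename Lhs, typename Rhs>
SumExpr<SumExpr<Lhs, Rhs>, Vec> operator+(const SumExpr<Lhs, Rhs>& a,
                                          const Vec& b) {
  return SumExpr<SumExpr<Lhs, Rhs>, Vec>{a, b};
}
```

With these definitions, `auto e = a + b;` stores only two references, and `Vec r = e;` performs the additions. A change made to `a` between those two statements is visible in `r`, which shows that no values were computed when `e` was created.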
    190 
    191 ### Tensor Operations and C++ "auto"
    192 
    193 Because Tensor operations create tensor operators, the C++ `auto` keyword
    194 does not have its intuitive meaning.  Consider these 2 lines of code:
    195 
    196     Tensor<float, 3> t3 = t1 + t2;
    197     auto t4 = t1 + t2;
    198 
    199 In the first line we allocate the tensor `t3` and it will contain the
    200 result of the addition of `t1` and `t2`.  In the second line, `t4`
    201 is actually the tree of tensor operators that will compute the addition of
    202 `t1` and `t2`.  In fact, `t4` is *not* a tensor and you cannot get
    203 the values of its elements:
    204 
    205     Tensor<float, 3> t3 = t1 + t2;
    206     cout << t3(0, 0, 0);  // OK prints the value of t1(0, 0, 0) + t2(0, 0, 0)
    207 
    208     auto t4 = t1 + t2;
    209     cout << t4(0, 0, 0);  // Compilation error!
    210 
    211 When you use `auto` you do not get a Tensor as a result but instead a
    212 non-evaluated expression.  So only use `auto` to delay evaluation.
    213 
    214 Unfortunately, there is no single underlying concrete type for holding
    215 non-evaluated expressions, hence you have to use auto in the case when you do
    216 want to hold non-evaluated expressions.
    217 
    218 When you need the results of a set of tensor computations you have to assign the
    219 result to a Tensor that will be capable of holding onto them.  This can be
    220 either a normal Tensor, a fixed size Tensor, or a TensorMap on an existing
    221 piece of memory.  All the following will work:
    222 
    223     auto t4 = t1 + t2;
    224 
    225     Tensor<float, 3> result = t4;  // Could also be: result(t4);
    226     cout << result(0, 0, 0);
    227 
    228     TensorMap<Tensor<float, 3>> result(<a float* with enough space>, <size0>, ...);  result = t4;
    229     cout << result(0, 0, 0);
    230 
    231     TensorFixedSize<float, Sizes<size0, ...>> result = t4;
    232     cout << result(0, 0, 0);
    233 
    234 Until you need the results, you can keep the operation around, and even reuse
    235 it for additional operations.  As long as you keep the expression as an
    236 operation, no computation is performed.
    237 
    238     // One way to compute exp((t1 + t2) * 0.2f);
    239     auto t3 = t1 + t2;
    240     auto t4 = t3 * 0.2f;
    241     auto t5 = t4.exp();
    242     Tensor<float, 3> result = t5;
    243 
    244     // Another way, exactly as efficient as the previous one:
    245     Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();
    246 
    247 ### Controlling When Expressions Are Evaluated
    248 
    249 There are several ways to control when expressions are evaluated:
    250 
    251 *   Assignment to a Tensor, TensorFixedSize, or TensorMap.
    252 *   Use of the eval() method.
    253 *   Assignment to a TensorRef.
    254 
    255 #### Assigning to a Tensor, TensorFixedSize, or TensorMap
    256 
    257 The most common way to evaluate an expression is to assign it to a Tensor.  In
    258 the example below, the `auto` declarations make the intermediate values
    259 "Operations", not Tensors, and do not cause the expressions to be evaluated.
    260 The assignment to the Tensor `result` causes the evaluation of all the
    261 operations.
    262 
    263     auto t3 = t1 + t2;             // t3 is an Operation.
    264     auto t4 = t3 * 0.2f;           // t4 is an Operation.
    265     auto t5 = t4.exp();            // t5 is an Operation.
    266     Tensor<float, 3> result = t5;  // The operations are evaluated.
    267 
    268 If you know the ranks and sizes of the Operation value you can assign the
    269 Operation to a TensorFixedSize instead of a Tensor, which is a bit more
    270 efficient.
    271 
    272     // We know that the result is a 4x4x2 tensor!
    273     TensorFixedSize<float, Sizes<4, 4, 2>> result = t5;
    274 
    275 Similarly, assigning an expression to a TensorMap causes its evaluation.  Like
    276 tensors of type TensorFixedSize, TensorMaps cannot be resized so they have to
    277 have the rank and sizes of the expressions that are assigned to them.
    278 
    279 #### Calling eval()
    280 
    281 When you compute large composite expressions, you sometimes want to tell Eigen
    282 that an intermediate value in the expression tree is worth evaluating ahead of
    283 time.  This is done by inserting a call to the `eval()` method of the
    284 expression Operation.
    285 
    286     // The previous example could have been written:
    287     Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();
    288 
    289     // If you want to compute (t1 + t2) once ahead of time you can write:
    290     Tensor<float, 3> result = ((t1 + t2).eval() * 0.2f).exp();
    291 
    292 Semantically, calling `eval()` is equivalent to materializing the value of
    293 the expression in a temporary Tensor of the right size.  The code above in
    294 effect does:
    295 
    296     // .eval() knows the size!
    297     TensorFixedSize<float, Sizes<4, 4, 2>> tmp = t1 + t2;
    298     Tensor<float, 3> result = (tmp * 0.2f).exp();
    299 
    300 Note that the return value of `eval()` is itself an Operation, so the
    301 following code does not do what you may think:
    302 
    303     // Here t3 is an evaluation Operation.  t3 has not been evaluated yet.
    304     auto t3 = (t1 + t2).eval();
    305 
    306     // You can use t3 in another expression.  Still no evaluation.
    307     auto t4 = (t3 * 0.2f).exp();
    308 
    309     // The value is evaluated when you assign the Operation to a Tensor, using
    310     // an intermediate tensor to represent t3.
    311     Tensor<float, 3> result = t4;
    312 
    313 While in the examples above calling `eval()` does not make a difference in
    314 performance, in other cases it can make a huge difference.  In the expression
    315 below the `broadcast()` expression causes the `X.maximum()` expression
    316 to be evaluated many times:
    317 
    318     Tensor<...> X ...;
    319     Tensor<...> Y = ((X - X.maximum(depth_dim).reshape(dims2d).broadcast(bcast))
    320                      * beta).exp();
    321 
    322 Inserting a call to `eval()` between the `maximum()` and
    323 `reshape()` calls guarantees that `maximum()` is only computed once and
    324 greatly speeds up execution:
    325 
    326     Tensor<...> Y =
    327       ((X - X.maximum(depth_dim).eval().reshape(dims2d).broadcast(bcast))
    328         * beta).exp();
    329 
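The cost pattern described above can be imitated in plain C++. This sketch is not Eigen's evaluator, and all names in it are hypothetical. It only counts how often a reduction runs when it sits inside the per-element expression, compared with when it is materialized once beforehand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the reduction is recomputed.
struct Counter {
  std::size_t max_calls = 0;
};

inline float max_of(const std::vector<float>& x, Counter& counter) {
  assert(!x.empty());
  ++counter.max_calls;
  return *std::max_element(x.begin(), x.end());
}

// Lazy form: the reduction is part of the per-element expression,
// so it is recomputed for every output element.
inline std::vector<float> shift_lazy(const std::vector<float>& x,
                                     Counter& counter) {
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] - max_of(x, counter);
  return y;
}

// Materialized form: compute the reduction once into a temporary and reuse it.
inline std::vector<float> shift_materialized(const std::vector<float>& x,
                                             Counter& counter) {
  const float m = max_of(x, counter);
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] - m;
  return y;
}
```

Both functions return the same values, but for an input of `n` elements the lazy form runs the reduction `n` times and the materialized form runs it once.
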
    330 In the other example below, the tensor `Y` is both used in the expression
    331 and its assignment.  This is an aliasing problem and if the evaluation is not
    332 done in the right order Y will be updated incrementally during the evaluation
    333 resulting in bogus results:
    334 
    335      Tensor<...> Y ...;
    336      Y = Y / (Y.sum(depth_dim).reshape(dims2d).broadcast(bcast));
    337 
    338 Inserting a call to `eval()` between the `sum()` and `reshape()`
    339 expressions ensures that the sum is computed before any updates to `Y` are
    340 done.
    341 
    342      Y = Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast));
    343 
    344 Note that an eval around the full right hand side expression is not needed
    345 because the generated code has to compute the i-th value of the right hand side
    346 before assigning it to the left hand side.
    347 
    348 However, if you were assigning the expression value to a shuffle of `Y`
    349 then you would need to force an eval for correctness by adding an `eval()`
    350 call for the right hand side:
    351 
    352      Y.shuffle(...) =
    353         (Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast))).eval();
    354 
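The aliasing hazard described above for `Y = Y / Y.sum(...)` can be reproduced with ordinary vectors. The two functions below are hypothetical illustrations in plain C++, not Eigen code. The first re-reads the sum from data that is being overwritten, and the second computes the sum into a temporary first, which is the role `eval()` plays in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// In-place normalization where the sum is re-read from partially updated data.
inline void normalize_aliased(std::vector<float>& y) {
  for (std::size_t i = 0; i < y.size(); ++i) {
    // This sum already includes the elements overwritten in earlier iterations.
    const float sum = std::accumulate(y.begin(), y.end(), 0.0f);
    y[i] = y[i] / sum;
  }
}

// In-place normalization with the sum computed once, before any update.
inline void normalize_with_temporary(std::vector<float>& y) {
  const float sum = std::accumulate(y.begin(), y.end(), 0.0f);
  for (std::size_t i = 0; i < y.size(); ++i) y[i] = y[i] / sum;
}
```

For the input `{1, 1, 2}` the second function yields `{0.25, 0.25, 0.5}`. The first yields a correct first element only; every later element is divided by a sum that has already been changed.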
    355 
    356 #### Assigning to a TensorRef
    357 
    358 If you need to access only a few elements from the value of an expression you
    359 can avoid materializing the value in a full tensor by using a TensorRef.
    360 
    361 A TensorRef is a small wrapper class for any Eigen Operation.  It provides
    362 overloads for the `()` operator that let you access individual values in
    363 the expression.  TensorRef is convenient, because the Operations themselves do
    364 not provide a way to access individual elements.
    365 
    366     // Create a TensorRef for the expression.  The expression is not
    367     // evaluated yet.
    368     TensorRef<Tensor<float, 3> > ref = ((t1 + t2) * 0.2f).exp();
    369 
    370     // Use "ref" to access individual elements.  The expression is evaluated
    371     // on the fly.
    372     float at_0 = ref(0, 0, 0);
    373     cout << ref(0, 1, 0);
    374 
    375 Only use TensorRef when you need a subset of the values of the expression.
    376 TensorRef only computes the values you access.  However note that if you are
    377 going to access all the values it will be much faster to materialize the
    378 results in a Tensor first.
    379 
    380 In some cases, if the full Tensor result would be very large, you may save
    381 memory by accessing it as a TensorRef.  But not always.  So don't count on it.
    382 
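The trade-off described in this section can be illustrated with a small plain-C++ wrapper that evaluates one element per access. `LazyRef` is a hypothetical name and this is not how `TensorRef` is implemented. It only shows that on-demand access costs one evaluation per call and stores no results.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical element-on-demand wrapper: stores a callable that computes
// element i and invokes it only for the indices that are requested.
class LazyRef {
 public:
  LazyRef(std::function<float(std::size_t)> element, std::size_t size)
      : element_(std::move(element)), size_(size), evaluations_(0) {}

  // Computes the requested element on the fly; nothing is cached.
  float operator()(std::size_t i) const {
    assert(i < size_);
    ++evaluations_;
    return element_(i);
  }

  std::size_t evaluations() const { return evaluations_; }

 private:
  std::function<float(std::size_t)> element_;
  std::size_t size_;
  mutable std::size_t evaluations_;
};
```

Reading a few elements through such a wrapper avoids computing the rest. Reading every element, or the same element repeatedly, pays the per-access cost each time, which is why materializing into a tensor is preferable in that case.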
    383 
    384 ### Controlling How Expressions Are Evaluated
    385 
    386 The tensor library provides several implementations of the various operations
    387 such as contractions and convolutions.  The implementations are optimized for
    388 different environments: single threaded on CPU, multi threaded on CPU, or on a
    389 GPU using CUDA.  Additional implementations may be added later.
    390 
    391 You can choose which implementation to use with the `device()` call.  If
    392 you do not choose an implementation explicitly the default implementation that
    393 uses a single thread on the CPU is used.
    394 
    395 The default implementation has been optimized for recent Intel CPUs, taking
    396 advantage of SSE, AVX, and FMA instructions.  Work is ongoing to tune the
    397 library on ARM CPUs.  Note that you need to pass compiler-dependent flags
    398 to enable the use of SSE, AVX, and other instructions.
    399 
    400 For example, the following code adds two tensors using the default
    401 single-threaded CPU implementation:
    402 
    403     Tensor<float, 2> a(30, 40);
    404     Tensor<float, 2> b(30, 40);
    405     Tensor<float, 2> c = a + b;
    406 
    407 To choose a different implementation you have to insert a `device()` call
    408 before the assignment of the result.  For technical C++ reasons this requires
    409 that the Tensor for the result be declared on its own.  This means that you
    410 have to know the size of the result.
    411 
    412     Eigen::Tensor<float, 2> c(30, 40);
    413     c.device(...) = a + b;
    414 
    415 The call to `device()` must be the last call on the left of the operator=.
    416 
    417 You must pass to the `device()` call an Eigen device object.  There are
    418 presently three devices you can use: DefaultDevice, ThreadPoolDevice and
    419 GpuDevice.
    420 
    421 
    422 #### Evaluating With the DefaultDevice
    423 
    424 This is exactly the same as not inserting a `device()` call.
    425 
    426     DefaultDevice my_device;
    427     c.device(my_device) = a + b;
    428 
    429 #### Evaluating with a Thread Pool
    430 
    431     // Create the Eigen ThreadPool
    432     Eigen::ThreadPool pool(8 /* number of threads in pool */);
    433 
    434     // Create the Eigen ThreadPoolDevice.
    435     Eigen::ThreadPoolDevice my_device(&pool, 4 /* number of threads to use */);
    436 
    437     // Now just use the device when evaluating expressions.
    438     Eigen::Tensor<float, 2> c(30, 50);
    439     c.device(my_device) = a.contract(b, dot_product_dims);
    440 
    441 
    442 #### Evaluating On GPU
    443 
    444 This is presently a bit more complicated than just using a thread pool device.
    445 You need to create a GPU device but you also need to explicitly allocate the
    446 memory for tensors with CUDA.
    447 
    448 
    449 ## API Reference
    450 
    451 ### Datatypes
    452 
    453 In the documentation of the tensor methods and Operation we mention datatypes
    454 that are tensor-type specific:
    455 
    456 #### <Tensor-Type>::Dimensions
    457 
    458 Acts like an array of ints.  Has a `size()` method, and can be
    459 indexed like an array to access individual values.  Used to represent the
    460 dimensions of a tensor.  See `dimensions()`.
    461 
    462 #### <Tensor-Type>::Index
    463 
    464 Acts like an `int`.  Used for indexing tensors along their dimensions.  See
    465 `operator()`, `dimension()`, and `size()`.
    466 
    467 #### <Tensor-Type>::Scalar
    468 
    469 Represents the datatype of individual tensor elements.  For example, for a
    470 `Tensor<float>`, `Scalar` is the type `float`.  See
    471 `setConstant()`.
    472 
    473 #### <Operation>
    474 
    475 We use this pseudo type to indicate that a tensor Operation is returned by a
    476 method.  We indicate in the text the type and dimensions of the tensor that the
    477 Operation returns after evaluation.
    478 
    479 The Operation will have to be evaluated, for example by assigning it to a
    480 tensor, before you can access the values of the resulting tensor.  You can also
    481 access the values through a TensorRef.
    482 
    483 
    484 ## Built-in Tensor Methods
    485 
    486 These are usual C++ methods that act on tensors immediately.  They are not
    487 Operations which provide delayed evaluation of their results.  Unless specified
    488 otherwise, all the methods listed below are available on all tensor classes:
    489 Tensor, TensorFixedSize, and TensorMap.
    490 
    491 ## Metadata
    492 
    493 ### int NumDimensions
    494 
    495 Constant value indicating the number of dimensions of a Tensor.  This is also
    496 known as the tensor "rank".
    497 
    498       Eigen::Tensor<float, 2> a(3, 4);
    499       cout << "Dims " << a.NumDimensions;
    500       => Dims 2
    501 
    502 ### Dimensions dimensions()
    503 
    504 Returns an array-like object representing the dimensions of the tensor.
    505 The actual type of the `dimensions()` result is `<Tensor-Type>::Dimensions`.
    506 
    507     Eigen::Tensor<float, 2> a(3, 4);
    508     const Eigen::Tensor<float, 2>::Dimensions& d = a.dimensions();
    509     cout << "Dim size: " << d.size() << ", dim 0: " << d[0]
    510          << ", dim 1: " << d[1];
    511     => Dim size: 2, dim 0: 3, dim 1: 4
    512 
    513 If you use a C++11 compiler, you can use `auto` to simplify the code:
    514 
    515     const auto& d = a.dimensions();
    516     cout << "Dim size: " << d.size() << ", dim 0: " << d[0]
    517          << ", dim 1: " << d[1];
    518     => Dim size: 2, dim 0: 3, dim 1: 4
    519 
    520 ### Index dimension(Index n)
    521 
    522 Returns the n-th dimension of the tensor.  The actual type of the
    523 `dimension()` result is `<Tensor-Type>::Index`, but you can
    524 always use it like an int.
    525 
    526       Eigen::Tensor<float, 2> a(3, 4);
    527       int dim1 = a.dimension(1);
    528       cout << "Dim 1: " << dim1;
    529       => Dim 1: 4
    530 
    531 ### Index size()
    532 
    533 Returns the total number of elements in the tensor.  This is the product of all
    534 the tensor dimensions.  The actual type of the `size()` result is
    535 `<Tensor-Type>::Index`, but you can always use it like an int.
    536 
    537     Eigen::Tensor<float, 2> a(3, 4);
    538     cout << "Size: " << a.size();
    539     => Size: 12
    540 
    541 
    542 ### Getting Dimensions From An Operation
    543 
    544 A few operations provide `dimensions()` directly,
    545 e.g. `TensorSlicingOp`.  Most operations defer calculating dimensions
    546 until the operation is being evaluated.  If you need access to the dimensions
    547 of a deferred operation, you can wrap it in a TensorRef (see Assigning to a
    548 TensorRef above), which provides `dimensions()` and `dimension()` as
    549 above.
    550 
    551 TensorRef can also wrap the plain Tensor types, so this is a useful idiom in
    552 templated contexts where the underlying object could be either a raw Tensor
    553 or some deferred operation (e.g. a slice of a Tensor).  In this case, the
    554 template code can wrap the object in a TensorRef and reason about its
    555 dimensionality while remaining agnostic to the underlying type.
    556 
    557 
    558 ## Constructors
    559 
    560 ### Tensor
    561 
    562 Creates a tensor of the specified size. The number of arguments must be equal
    563 to the rank of the tensor. The content of the tensor is not initialized.
    564 
    565     Eigen::Tensor<float, 2> a(3, 4);
    566     cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    567     => NumRows: 3 NumCols: 4
    568 
    569 ### TensorFixedSize
    570 
    571 Creates a tensor of the specified size. The number of arguments in the Sizes<>
    572 template parameter determines the rank of the tensor. The content of the tensor
    573 is not initialized.
    574 
    575     Eigen::TensorFixedSize<float, Sizes<3, 4>> a;
    576     cout << "Rank: " << a.rank() << endl;
    577     => Rank: 2
    578     cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    579     => NumRows: 3 NumCols: 4
    580 
    581 ### TensorMap
    582 
    583 Creates a tensor mapping an existing array of data. The data must not be freed
    584 until the TensorMap is discarded, and the size of the data must be large enough
    585 to accommodate the coefficients of the tensor.
    586 
    587     float data[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
    588     Eigen::TensorMap<Tensor<float, 2>> a(data, 3, 4);
    589     cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    590     => NumRows: 3 NumCols: 4
    591     cout << "a(1, 2): " << a(1, 2) << endl;
    592     => a(1, 2): 7
    593 
    594 
    595 ## Contents Initialization
    596 
    597 When a new Tensor or a new TensorFixedSize is created, memory is allocated to
    598 hold all the tensor elements, but the memory is not initialized.  Similarly,
    599 when a new TensorMap is created on top of non-initialized memory, its
    600 contents are not initialized.
    601 
    602 You can use one of the methods below to initialize the tensor memory.  These
    603 have an immediate effect on the tensor and return the tensor itself as a
    604 result.  These are not tensor Operations which delay evaluation.
    605 
    606 ### <Tensor-Type> setConstant(const Scalar& val)
    607 
    608 Sets all elements of the tensor to the constant value `val`.  `Scalar`
    609 is the type of data stored in the tensor.  You can pass any value that is
    610 convertible to that type.
    611 
    612 Returns the tensor itself in case you want to chain another call.
    613 
    614     a.setConstant(12.3f);
    615     cout << "Constant: " << endl << a << endl << endl;
    616     =>
    617     Constant:
    618     12.3 12.3 12.3 12.3
    619     12.3 12.3 12.3 12.3
    620     12.3 12.3 12.3 12.3
    621 
    622 Note that `setConstant()` can be used on any tensor where the element type
    623 has a copy constructor and an `operator=()`:
    624 
    625     Eigen::Tensor<string, 2> a(2, 3);
    626     a.setConstant("yolo");
    627     cout << "String tensor: " << endl << a << endl << endl;
    628     =>
    629     String tensor:
    630     yolo yolo yolo
    631     yolo yolo yolo
    632 
    633 
    634 ### <Tensor-Type> setZero()
    635 
    636 Fills the tensor with zeros.  Equivalent to `setConstant(Scalar(0))`.
    637 Returns the tensor itself in case you want to chain another call.
    638 
    639     a.setZero();
    640     cout << "Zeros: " << endl << a << endl << endl;
    641     =>
    642     Zeros:
    643     0 0 0 0
    644     0 0 0 0
    645     0 0 0 0
    646 
    647 
    648 ### <Tensor-Type> setValues({..initializer_list})
    649 
    650 Fills the tensor with explicit values specified in a std::initializer_list.
    651 The type of the initializer list depends on the type and rank of the tensor.
    652 
    653 If the tensor has rank N, the initializer list must be nested N times.  The
    654 most deeply nested lists must contain P scalars of the Tensor type where P is
    655 the size of the last dimension of the Tensor.
    656 
    657 For example, for a `TensorFixedSize<float, Sizes<2, 3>>` the initializer list must
    658 contain 2 lists of 3 floats each.
    659 
    660 `setValues()` returns the tensor itself in case you want to chain another
    661 call.
    662 
    663     Eigen::Tensor<float, 2> a(2, 3);
    664     a.setValues({{0.0f, 1.0f, 2.0f}, {3.0f, 4.0f, 5.0f}});
    665     cout << "a" << endl << a << endl << endl;
    666     =>
    667     a
    668     0 1 2
    669     3 4 5
    670 
    671 If a list is too short, the corresponding elements of the tensor will not be
    672 changed.  This is valid at each level of nesting.  For example the following
    673 code only sets the values of the first row of the tensor.
    674 
    675     Eigen::Tensor<int, 2> a(2, 3);
    676     a.setConstant(1000);
    677     a.setValues({{10, 20, 30}});
    678     cout << "a" << endl << a << endl << endl;
    679     =>
    680     a
    681     10   20   30
    682     1000 1000 1000
    683 
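The "too short" rule can be sketched with a small plain-C++ container. `Grid` and `set_values` are hypothetical names, and this is not Eigen's `setValues()` implementation. The sketch handles rank 2 only and assumes column-major storage, as elsewhere in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical rank-2 container with column-major storage.
struct Grid {
  std::size_t rows;
  std::size_t cols;
  std::vector<int> values;

  Grid(std::size_t r, std::size_t c, int fill)
      : rows(r), cols(c), values(r * c, fill) {}

  int& at(std::size_t r, std::size_t c) {
    assert(r < rows && c < cols);
    return values[r + c * rows];
  }

  // Copies only the elements present in the nested list.  Entries that are
  // missing at either nesting level keep their previous value.
  Grid& set_values(std::initializer_list<std::initializer_list<int>> list) {
    assert(list.size() <= rows);
    std::size_t r = 0;
    for (const auto& row : list) {
      assert(row.size() <= cols);
      std::size_t c = 0;
      for (int v : row) at(r, c++) = v;
      ++r;
    }
    return *this;  // returned so that another call can be chained
  }
};
```

Calling `set_values({{10, 20, 30}})` on a 2 x 3 grid filled with 1000 changes only the first row, which matches the output shown above.
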
    684 ### <Tensor-Type> setRandom()
    685 
    686 Fills the tensor with random values.  Returns the tensor itself in case you
    687 want to chain another call.
    688 
    689     a.setRandom();
    690     cout << "Random: " << endl << a << endl << endl;
    691     =>
    692     Random:
    693       0.680375    0.59688  -0.329554    0.10794
    694      -0.211234   0.823295   0.536459 -0.0452059
    695       0.566198  -0.604897  -0.444451   0.257742
    696 
    697 You can customize `setRandom()` by providing your own random number
    698 generator as a template argument:
    699 
    700     a.setRandom<MyRandomGenerator>();
    701 
    702 Here, `MyRandomGenerator` must be a struct with the following member
    703 functions, where Scalar and Index are the same as `<Tensor-Type>::Scalar`
    704 and `<Tensor-Type>::Index`.
    705 
    706 See `struct UniformRandomGenerator` in TensorFunctors.h for an example.
    707 
    // Custom number generator for use with setRandom().
    struct MyRandomGenerator {
      // Default and copy constructors.  Both are needed.
      MyRandomGenerator() { }
      MyRandomGenerator(const MyRandomGenerator& ) { }

      // Return a random value to be used.  "element_location" is the
      // location of the entry to set in the tensor; it can typically
      // be ignored.
      Scalar operator()(Eigen::DenseIndex element_location,
                        Eigen::DenseIndex /*unused*/ = 0) const {
        return <randomly generated value of type Scalar>;
      }

      // Same as above but generates several numbers at a time.
      typename internal::packet_traits<Scalar>::type packetOp(
          Eigen::DenseIndex packet_location, Eigen::DenseIndex /*unused*/ = 0) const {
        return <a packet of randomly generated values>;
      }
    };
    728 
You can also use one of the two random number generators that are part of the
tensor library:

*   UniformRandomGenerator
*   NormalRandomGenerator
    733 
    734 
    735 ## Data Access
    736 
    737 The Tensor, TensorFixedSize, and TensorRef classes provide the following
    738 accessors to access the tensor coefficients:
    739 
    740     const Scalar& operator()(const array<Index, NumIndices>& indices)
    741     const Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)
    742     Scalar& operator()(const array<Index, NumIndices>& indices)
    743     Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)
    744 
    745 The number of indices must be equal to the rank of the tensor. Moreover, these
    746 accessors are not available on tensor expressions. In order to access the
    747 values of a tensor expression, the expression must either be evaluated or
    748 wrapped in a TensorRef.
    749 
    750 
    751 ### Scalar* data() and const Scalar* data() const
    752 
    753 Returns a pointer to the storage for the tensor.  The pointer is const if the
    754 tensor was const.  This allows direct access to the data.  The layout of the
    755 data depends on the tensor layout: RowMajor or ColMajor.
    756 
    757 This access is usually only needed for special cases, for example when mixing
    758 Eigen Tensor code with other libraries.
    759 
    760 Scalar is the type of data stored in the tensor.
    761 
    762     Eigen::Tensor<float, 2> a(3, 4);
    763     float* a_data = a.data();
    764     a_data[0] = 123.45f;
    765     cout << "a(0, 0): " << a(0, 0);
    766     => a(0, 0): 123.45
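
How an index maps to a position in the buffer returned by `data()` can be sketched in plain C++. This is only an illustration of the usual column-major and row-major conventions, not Eigen's implementation; the helper names `col_major_offset` and `row_major_offset` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: offset of element `idx` in a column-major buffer.
// The first index varies fastest.
std::size_t col_major_offset(const std::vector<std::size_t>& dims,
                             const std::vector<std::size_t>& idx) {
  assert(dims.size() == idx.size());
  std::size_t offset = 0;
  std::size_t stride = 1;
  for (std::size_t d = 0; d < dims.size(); ++d) {
    assert(idx[d] < dims[d]);
    offset += idx[d] * stride;
    stride *= dims[d];
  }
  return offset;
}

// Hypothetical helper: offset of element `idx` in a row-major buffer.
// The last index varies fastest.
std::size_t row_major_offset(const std::vector<std::size_t>& dims,
                             const std::vector<std::size_t>& idx) {
  assert(dims.size() == idx.size());
  std::size_t offset = 0;
  std::size_t stride = 1;
  for (std::size_t d = dims.size(); d-- > 0;) {
    assert(idx[d] < dims[d]);
    offset += idx[d] * stride;
    stride *= dims[d];
  }
  return offset;
}
```

Under these conventions, element (1, 2) of a tensor of sizes (3, 4) sits at offset 7 in a ColMajor buffer and at offset 6 in a RowMajor buffer.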
    767 
    768 
    769 ## Tensor Operations
    770 
All the methods documented below return non-evaluated tensor `Operations`.
These can be chained: you can apply another Tensor Operation to the value
returned by the method.

A chain of Operations is evaluated lazily, typically when it is assigned to a
tensor.  See "Controlling when Expression are Evaluated" for more details about
their evaluation.
    778 
    779 ### <Operation> constant(const Scalar& val)
    780 
    781 Returns a tensor of the same type and dimensions as the original tensor but
    782 where all elements have the value `val`.
    783 
    784 This is useful, for example, when you want to add or subtract a constant from a
    785 tensor, or multiply every element of a tensor by a scalar.
    786 
    787     Eigen::Tensor<float, 2> a(2, 3);
    788     a.setConstant(1.0f);
    789     Eigen::Tensor<float, 2> b = a + a.constant(2.0f);
    790     Eigen::Tensor<float, 2> c = b * b.constant(0.2f);
    791     cout << "a" << endl << a << endl << endl;
    792     cout << "b" << endl << b << endl << endl;
    793     cout << "c" << endl << c << endl << endl;
    794     =>
    795     a
    796     1 1 1
    797     1 1 1
    798 
    799     b
    800     3 3 3
    801     3 3 3
    802 
    803     c
    804     0.6 0.6 0.6
    805     0.6 0.6 0.6
    806 
    807 ### <Operation> random()
    808 
    809 Returns a tensor of the same type and dimensions as the current tensor
    810 but where all elements have random values.
    811 
This is useful, for example, to add random values to an existing tensor.
The generation of random values can be customized in the same manner
as for `setRandom()`.
    815 
    816     Eigen::Tensor<float, 2> a(2, 3);
    817     a.setConstant(1.0f);
    818     Eigen::Tensor<float, 2> b = a + a.random();
    819     cout << "a" << endl << a << endl << endl;
    820     cout << "b" << endl << b << endl << endl;
    821     =>
    822     a
    823     1 1 1
    824     1 1 1
    825 
    826     b
    827     1.68038   1.5662  1.82329
    828     0.788766  1.59688 0.395103
    829 
    830 
    831 ## Unary Element Wise Operations
    832 
    833 All these operations take a single input tensor as argument and return a tensor
    834 of the same type and dimensions as the tensor to which they are applied.  The
    835 requested operations are applied to each element independently.
    836 
    837 ### <Operation> operator-()
    838 
    839 Returns a tensor of the same type and dimensions as the original tensor
    840 containing the opposite values of the original tensor.
    841 
    842     Eigen::Tensor<float, 2> a(2, 3);
    843     a.setConstant(1.0f);
    844     Eigen::Tensor<float, 2> b = -a;
    845     cout << "a" << endl << a << endl << endl;
    846     cout << "b" << endl << b << endl << endl;
    847     =>
    848     a
    849     1 1 1
    850     1 1 1
    851 
    852     b
    853     -1 -1 -1
    854     -1 -1 -1
    855 
    856 ### <Operation> sqrt()
    857 
    858 Returns a tensor of the same type and dimensions as the original tensor
    859 containing the square roots of the original tensor.
    860 
    861 ### <Operation> rsqrt()
    862 
Returns a tensor of the same type and dimensions as the original tensor
containing the reciprocal square roots (1/sqrt(x)) of the original tensor
values.
    865 
    866 ### <Operation> square()
    867 
    868 Returns a tensor of the same type and dimensions as the original tensor
    869 containing the squares of the original tensor values.
    870 
    871 ### <Operation> inverse()
    872 
Returns a tensor of the same type and dimensions as the original tensor
containing the reciprocals (1/x) of the original tensor values.
    875 
    876 ### <Operation> exp()
    877 
    878 Returns a tensor of the same type and dimensions as the original tensor
    879 containing the exponential of the original tensor.
    880 
    881 ### <Operation> log()
    882 
    883 Returns a tensor of the same type and dimensions as the original tensor
    884 containing the natural logarithms of the original tensor.
    885 
    886 ### <Operation> abs()
    887 
    888 Returns a tensor of the same type and dimensions as the original tensor
    889 containing the absolute values of the original tensor.
    890 
    891 ### <Operation> pow(Scalar exponent)
    892 
    893 Returns a tensor of the same type and dimensions as the original tensor
    894 containing the coefficients of the original tensor to the power of the
    895 exponent.
    896 
The type of the exponent, Scalar, is always the same as the type of the
tensor coefficients.  For example, only integer exponents can be used in
conjunction with tensors of integer values.

You can use cast() to lift this restriction.  For example, this computes
the cube roots of an int Tensor:
    903 
    904     Eigen::Tensor<int, 2> a(2, 3);
    905     a.setValues({{0, 1, 8}, {27, 64, 125}});
    906     Eigen::Tensor<double, 2> b = a.cast<double>().pow(1.0 / 3.0);
    907     cout << "a" << endl << a << endl << endl;
    908     cout << "b" << endl << b << endl << endl;
    909     =>
    910     a
    911     0   1   8
    912     27  64 125
    913 
    914     b
    915     0 1 2
    916     3 4 5
    917 
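### <Operation> operator*(Scalar scale)

Multiplies all the coefficients of the input tensor by the provided scale.

### <Operation> cwiseMax(Scalar threshold)

Returns a tensor of the same type and dimensions as the original tensor in
which each coefficient is the maximum of the original coefficient and
`threshold`.

### <Operation> cwiseMin(Scalar threshold)

Returns a tensor of the same type and dimensions as the original tensor in
which each coefficient is the minimum of the original coefficient and
`threshold`.

### <Operation> unaryExpr(const CustomUnaryOp& func)

Returns a tensor of the same dimensions as the original tensor in which each
coefficient is the result of applying `func` to the corresponding coefficient
of the original tensor.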
    930 
    931 
    932 ## Binary Element Wise Operations
    933 
    934 These operations take two input tensors as arguments. The 2 input tensors should
    935 be of the same type and dimensions. The result is a tensor of the same
    936 dimensions as the tensors to which they are applied, and unless otherwise
    937 specified it is also of the same type. The requested operations are applied to
    938 each pair of elements independently.
    939 
    940 ### <Operation> operator+(const OtherDerived& other)
    941 
    942 Returns a tensor of the same type and dimensions as the input tensors
    943 containing the coefficient wise sums of the inputs.
    944 
    945 ### <Operation> operator-(const OtherDerived& other)
    946 
    947 Returns a tensor of the same type and dimensions as the input tensors
    948 containing the coefficient wise differences of the inputs.
    949 
    950 ### <Operation> operator*(const OtherDerived& other)
    951 
    952 Returns a tensor of the same type and dimensions as the input tensors
    953 containing the coefficient wise products of the inputs.
    954 
    955 ### <Operation> operator/(const OtherDerived& other)
    956 
    957 Returns a tensor of the same type and dimensions as the input tensors
    958 containing the coefficient wise quotients of the inputs.
    959 
    960 This operator is not supported for integer types.
    961 
    962 ### <Operation> cwiseMax(const OtherDerived& other)
    963 
    964 Returns a tensor of the same type and dimensions as the input tensors
    965 containing the coefficient wise maximums of the inputs.
    966 
    967 ### <Operation> cwiseMin(const OtherDerived& other)
    968 
Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise minimums of the inputs.
    971 
    972 ### <Operation> Logical operators
    973 
    974 The following logical operators are supported as well:
    975 
    976 *   operator&&(const OtherDerived& other)
    977 *   operator||(const OtherDerived& other)
    978 *   operator<(const OtherDerived& other)
    979 *   operator<=(const OtherDerived& other)
    980 *   operator>(const OtherDerived& other)
    981 *   operator>=(const OtherDerived& other)
    982 *   operator==(const OtherDerived& other)
    983 *   operator!=(const OtherDerived& other)
    984 
    985 They all return a tensor of boolean values.
    986 
    987 
## Selection (select(const ThenDerived& thenTensor, const ElseDerived& elseTensor))

Selection is a coefficient-wise ternary operator that is the tensor equivalent
to the if-then-else operation.

    Tensor<bool, 3> if_tensor = ...;
    Tensor<float, 3> then_tensor = ...;
    Tensor<float, 3> else_tensor = ...;
    Tensor<float, 3> result = if_tensor.select(then_tensor, else_tensor);
    997 
    998 The 3 arguments must be of the same dimensions, which will also be the dimension
    999 of the result.  The 'if' tensor must be of type boolean, the 'then' and the
   1000 'else' tensor must be of the same type, which will also be the type of the
   1001 result.
   1002 
   1003 Each coefficient in the result is equal to the corresponding coefficient in the
   1004 'then' tensor if the corresponding value in the 'if' tensor is true. If not, the
   1005 resulting coefficient will come from the 'else' tensor.
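
The coefficient-wise rule described above can be written as a plain loop over flat buffers. This is a sketch of the semantics only, not of how Eigen evaluates the expression; `select_ref` is a hypothetical name used for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference version of a coefficient-wise select:
// result[i] = cond[i] ? then_vals[i] : else_vals[i].
std::vector<float> select_ref(const std::vector<bool>& cond,
                              const std::vector<float>& then_vals,
                              const std::vector<float>& else_vals) {
  // All three inputs must have the same number of coefficients.
  assert(cond.size() == then_vals.size());
  assert(cond.size() == else_vals.size());
  std::vector<float> result(cond.size());
  for (std::size_t i = 0; i < cond.size(); ++i) {
    result[i] = cond[i] ? then_vals[i] : else_vals[i];
  }
  return result;
}
```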
   1006 
   1007 
   1008 ## Contraction
   1009 
   1010 Tensor *contractions* are a generalization of the matrix product to the
   1011 multidimensional case.
   1012 
    // Create 2 matrices using tensors of rank 2
    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{1, 2, 3}, {6, 5, 4}});
    Eigen::Tensor<int, 2> b(3, 2);
    b.setValues({{1, 2}, {4, 5}, {5, 6}});

    // Compute the traditional matrix product
    Eigen::array<Eigen::IndexPair<int>, 1> product_dims = { Eigen::IndexPair<int>(1, 0) };
    Eigen::Tensor<int, 2> AB = a.contract(b, product_dims);

    // Compute the product of the transpose of the matrices
    Eigen::array<Eigen::IndexPair<int>, 1> transposed_product_dims = { Eigen::IndexPair<int>(0, 1) };
    Eigen::Tensor<int, 2> AtBt = a.contract(b, transposed_product_dims);

    // Contraction to a scalar value using a double contraction.
    // The first coordinates of both tensors are contracted, as are both
    // second coordinates, i.e., this computes the sum of the squares of
    // the elements.
    Eigen::array<Eigen::IndexPair<int>, 2> double_contraction_product_dims = {
        Eigen::IndexPair<int>(0, 0), Eigen::IndexPair<int>(1, 1) };
    Eigen::Tensor<int, 0> AdoubleContractedA = a.contract(a, double_contraction_product_dims);

    // Extracting the scalar value of the tensor contraction for further usage
    int value = AdoubleContractedA(0);
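
To make the index pairs concrete, the two contractions of `a` used above can be spelled out with explicit loops. This is a plain C++ sketch of what the contractions compute, assuming nested `std::vector`s in place of tensors; `contract_1_0` and `contract_all` are hypothetical helper names, not part of Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical helper: contract dimension 1 of `a` with dimension 0 of `b`.
// out[i][j] = sum over k of a[i][k] * b[k][j], i.e. the matrix product.
Matrix contract_1_0(const Matrix& a, const Matrix& b) {
  assert(!a.empty() && !b.empty() && a[0].size() == b.size());
  Matrix out(a.size(), std::vector<int>(b[0].size(), 0));
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t j = 0; j < b[0].size(); ++j)
      for (std::size_t k = 0; k < b.size(); ++k)
        out[i][j] += a[i][k] * b[k][j];
  return out;
}

// Hypothetical helper: contract both dimensions of `a` with the matching
// dimensions of `b`. The result is the scalar sum of a[i][j] * b[i][j].
int contract_all(const Matrix& a, const Matrix& b) {
  assert(a.size() == b.size());
  int sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    assert(a[i].size() == b[i].size());
    for (std::size_t j = 0; j < a[i].size(); ++j) sum += a[i][j] * b[i][j];
  }
  return sum;
}
```

With the values of `a` and `b` from the example, the first helper yields the 2 x 2 matrix {{24, 30}, {46, 61}} and `contract_all(a, a)` yields 91, the sum of the squares of the elements of `a`.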
   1034 
   1035 ## Reduction Operations
   1036 
   1037 A *Reduction* operation returns a tensor with fewer dimensions than the
   1038 original tensor.  The values in the returned tensor are computed by applying a
   1039 *reduction operator* to slices of values from the original tensor.  You specify
   1040 the dimensions along which the slices are made.
   1041 
   1042 The Eigen Tensor library provides a set of predefined reduction operators such
   1043 as `maximum()` and `sum()` and lets you define additional operators by
   1044 implementing a few methods from a reductor template.
   1045 
   1046 ### Reduction Dimensions
   1047 
All reduction operations take a single parameter of type
`<TensorType>::Dimensions` which can always be specified as an array of
ints.  These are called the "reduction dimensions."  The values are the indices
of the dimensions of the input tensor over which the reduction is done.  The
parameter can have at most as many elements as the rank of the input tensor;
each element must be less than the tensor rank, as it indicates one of the
dimensions to reduce.
   1055 
   1056 Each dimension of the input tensor should occur at most once in the reduction
   1057 dimensions as the implementation does not remove duplicates.
   1058 
   1059 The order of the values in the reduction dimensions does not affect the
   1060 results, but the code may execute faster if you list the dimensions in
   1061 increasing order.
   1062 
   1063 Example: Reduction along one dimension.
   1064 
   1065     // Create a tensor of 2 dimensions
   1066     Eigen::Tensor<int, 2> a(2, 3);
   1067     a.setValues({{1, 2, 3}, {6, 5, 4}});
   1068     // Reduce it along the second dimension (1)...
   1069     Eigen::array<int, 1> dims({1 /* dimension to reduce */});
   1070     // ...using the "maximum" operator.
   1071     // The result is a tensor with one dimension.  The size of
   1072     // that dimension is the same as the first (non-reduced) dimension of a.
   1073     Eigen::Tensor<int, 1> b = a.maximum(dims);
   1074     cout << "a" << endl << a << endl << endl;
   1075     cout << "b" << endl << b << endl << endl;
   1076     =>
   1077     a
   1078     1 2 3
   1079     6 5 4
   1080 
   1081     b
   1082     3
   1083     6
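
The effect of choosing a reduction dimension can be sketched with plain loops over a 2-D array. The helpers `max_along_dim1` and `max_along_dim0` are hypothetical names for this illustration and only mirror what the `maximum()` reduction computes for a rank-2 input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maximum along dimension 1 of a 2-D array.
// Dimension 1 disappears, leaving one value per row.
std::vector<int> max_along_dim1(const std::vector<std::vector<int>>& a) {
  std::vector<int> out;
  out.reserve(a.size());
  for (const auto& row : a) {
    assert(!row.empty());
    out.push_back(*std::max_element(row.begin(), row.end()));
  }
  return out;
}

// Hypothetical helper: maximum along dimension 0 of a 2-D array.
// Dimension 0 disappears, leaving one value per column.
std::vector<int> max_along_dim0(const std::vector<std::vector<int>>& a) {
  assert(!a.empty());
  std::vector<int> out(a[0]);
  for (const auto& row : a) {
    assert(row.size() == out.size());
    for (std::size_t j = 0; j < row.size(); ++j) out[j] = std::max(out[j], row[j]);
  }
  return out;
}
```

For the tensor `a` above, reducing along dimension 1 gives {3, 6}, matching the output shown, while reducing along dimension 0 would give {6, 5, 4}.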
   1084 
   1085 Example: Reduction along two dimensions.
   1086 
   1087     Eigen::Tensor<float, 3, Eigen::ColMajor> a(2, 3, 4);
   1088     a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},
   1089                   {7.0f, 6.0f, 5.0f, 4.0f},
   1090                   {8.0f, 9.0f, 10.0f, 11.0f}},
   1091                  {{12.0f, 13.0f, 14.0f, 15.0f},
   1092                   {19.0f, 18.0f, 17.0f, 16.0f},
   1093                   {20.0f, 21.0f, 22.0f, 23.0f}}});
   1094     // The tensor a has 3 dimensions.  We reduce along the
   1095     // first 2, resulting in a tensor with a single dimension
   1096     // of size 4 (the last dimension of a.)
   1097     // Note that we pass the array of reduction dimensions
   1098     // directly to the maximum() call.
   1099     Eigen::Tensor<float, 1, Eigen::ColMajor> b =
   1100         a.maximum(Eigen::array<int, 2>({0, 1}));
   1101     cout << "b" << endl << b << endl << endl;
   1102     =>
   1103     b
   1104     20
   1105     21
   1106     22
   1107     23
   1108 
   1109 #### Reduction along all dimensions
   1110 
   1111 As a special case, if you pass no parameter to a reduction operation the
   1112 original tensor is reduced along *all* its dimensions.  The result is a
   1113 scalar, represented as a zero-dimension tensor.
   1114 
   1115     Eigen::Tensor<float, 3> a(2, 3, 4);
   1116     a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},
   1117                   {7.0f, 6.0f, 5.0f, 4.0f},
   1118                   {8.0f, 9.0f, 10.0f, 11.0f}},
   1119                  {{12.0f, 13.0f, 14.0f, 15.0f},
   1120                   {19.0f, 18.0f, 17.0f, 16.0f},
   1121                   {20.0f, 21.0f, 22.0f, 23.0f}}});
   1122     // Reduce along all dimensions using the sum() operator.
   1123     Eigen::Tensor<float, 0> b = a.sum();
   1124     cout << "b" << endl << b << endl << endl;
   1125     =>
   1126     b
   1127     276
   1128 
   1129 
   1130 ### <Operation> sum(const Dimensions& new_dims)
   1131 ### <Operation> sum()
   1132 
   1133 Reduce a tensor using the sum() operator.  The resulting values
   1134 are the sum of the reduced values.
   1135 
   1136 ### <Operation> mean(const Dimensions& new_dims)
   1137 ### <Operation> mean()
   1138 
   1139 Reduce a tensor using the mean() operator.  The resulting values
   1140 are the mean of the reduced values.
   1141 
   1142 ### <Operation> maximum(const Dimensions& new_dims)
   1143 ### <Operation> maximum()
   1144 
   1145 Reduce a tensor using the maximum() operator.  The resulting values are the
   1146 largest of the reduced values.
   1147 
   1148 ### <Operation> minimum(const Dimensions& new_dims)
   1149 ### <Operation> minimum()
   1150 
   1151 Reduce a tensor using the minimum() operator.  The resulting values
   1152 are the smallest of the reduced values.
   1153 
   1154 ### <Operation> prod(const Dimensions& new_dims)
   1155 ### <Operation> prod()
   1156 
   1157 Reduce a tensor using the prod() operator.  The resulting values
   1158 are the product of the reduced values.
   1159 
### <Operation> all(const Dimensions& new_dims)
### <Operation> all()

Reduce a tensor using the all() operator.  Casts the tensor to bool and then
checks whether all elements are true.  Runs through all elements rather than
short-circuiting, so it may be significantly inefficient.

### <Operation> any(const Dimensions& new_dims)
### <Operation> any()

Reduce a tensor using the any() operator.  Casts the tensor to bool and then
checks whether any element is true.  Runs through all elements rather than
short-circuiting, so it may be significantly inefficient.
   1171 
   1172 
   1173 ### <Operation> reduce(const Dimensions& new_dims, const Reducer& reducer)
   1174 
   1175 Reduce a tensor using a user-defined reduction operator.  See `SumReducer`
   1176 in TensorFunctors.h for information on how to implement a reduction operator.
   1177 
   1178 
   1179 ## Trace
   1180 
A *Trace* operation returns a tensor with fewer dimensions than the original
tensor. It returns a tensor whose elements are the sum of the elements of the
original tensor along the main diagonal for a list of specified dimensions, the
"trace dimensions". Similar to the `Reduction Dimensions`, the trace dimensions
are passed as an input parameter to the operation, are of type
`<TensorType>::Dimensions`, and have the same requirements when passed as an
input parameter. In addition, the trace dimensions must have the same size.
   1188 
   1189 Example: Trace along 2 dimensions.
   1190 
   1191     // Create a tensor of 3 dimensions
   1192     Eigen::Tensor<int, 3> a(2, 2, 3);
   1193     a.setValues({{{1, 2, 3}, {4, 5, 6}}, {{7, 8, 9}, {10, 11, 12}}});
   1194     // Specify the dimensions along which the trace will be computed.
   1195     // In this example, the trace can only be computed along the dimensions
   1196     // with indices 0 and 1
   1197     Eigen::array<int, 2> dims({0, 1});
   1198     // The output tensor contains all but the trace dimensions.
   1199     Tensor<int, 1> a_trace = a.trace(dims);
   1200     cout << "a_trace:" << endl;
   1201     cout << a_trace << endl;
   1202     =>
   1203     a_trace:
   1204     11
   1205     13
   1206     15
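
Written as explicit loops, the trace over dimensions 0 and 1 sums the entries whose first two indices are equal. The sketch below uses nested `std::vector`s rather than Eigen tensors, and `trace_dims_0_1` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Tensor3 = std::vector<std::vector<std::vector<int>>>;

// Hypothetical helper: trace over dimensions 0 and 1 of a rank-3 array.
// out[k] = sum over i of a[i][i][k]; dimensions 0 and 1 must have equal size.
std::vector<int> trace_dims_0_1(const Tensor3& a) {
  assert(!a.empty());
  const std::size_t n = a.size();
  assert(a[0].size() == n);
  std::vector<int> out(a[0][0].size(), 0);
  for (std::size_t i = 0; i < n; ++i) {
    assert(a[i].size() == n);
    for (std::size_t k = 0; k < out.size(); ++k) out[k] += a[i][i][k];
  }
  return out;
}
```

For the tensor in the example this gives {1 + 10, 2 + 11, 3 + 12}, that is {11, 13, 15}.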
   1207 
   1208 
   1209 ### <Operation> trace(const Dimensions& new_dims)
   1210 ### <Operation> trace()
   1211 
   1212 As a special case, if no parameter is passed to the operation, trace is computed
   1213 along *all* dimensions of the input tensor.
   1214 
   1215 Example: Trace along all dimensions.
   1216 
   1217     // Create a tensor of 3 dimensions, with all dimensions having the same size.
   1218     Eigen::Tensor<int, 3> a(3, 3, 3);
   1219     a.setValues({{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}},
   1220                 {{10, 11, 12}, {13, 14, 15}, {16, 17, 18}},
   1221                 {{19, 20, 21}, {22, 23, 24}, {25, 26, 27}}});
   1222     // Result is a zero dimension tensor
   1223     Tensor<int, 0> a_trace = a.trace();
   1224     cout<<"a_trace:"<<endl;
   1225     cout<<a_trace<<endl;
   1226     =>
   1227     a_trace:
   1228     42
   1229 
   1230 
   1231 ## Scan Operations
   1232 
   1233 A *Scan* operation returns a tensor with the same dimensions as the original
   1234 tensor. The operation performs an inclusive scan along the specified
   1235 axis, which means it computes a running total along the axis for a given
   1236 reduction operation.
   1237 If the reduction operation corresponds to summation, then this computes the
   1238 prefix sum of the tensor along the given axis.
   1239 
   1240 Example:
   1242 
   1243     // Create a tensor of 2 dimensions
   1244     Eigen::Tensor<int, 2> a(2, 3);
   1245     a.setValues({{1, 2, 3}, {4, 5, 6}});
   1246     // Scan it along the second dimension (1) using summation
   1247     Eigen::Tensor<int, 2> b = a.cumsum(1);
   1248     // The result is a tensor with the same size as the input
   1249     cout << "a" << endl << a << endl << endl;
   1250     cout << "b" << endl << b << endl << endl;
   1251     =>
   1252     a
   1253     1 2 3
   1254     4 5 6
   1255 
   1256     b
   1257     1  3  6
   1258     4  9 15
   1259 
   1260 ### <Operation> cumsum(const Index& axis)
   1261 
   1262 Perform a scan by summing consecutive entries.
   1263 
   1264 ### <Operation> cumprod(const Index& axis)
   1265 
   1266 Perform a scan by multiplying consecutive entries.
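
An inclusive scan along one axis can be sketched with `std::partial_sum` from the standard library. The helpers `scan_axis1` and `scan_axis0` are hypothetical and operate on nested vectors; they illustrate what `cumsum` and `cumprod` compute for a rank-2 input, not how Eigen implements them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical helper: inclusive scan along axis 1.
// Each row is scanned independently with the binary operation `op`.
template <typename Op>
Matrix scan_axis1(const Matrix& a, Op op) {
  Matrix out = a;
  for (auto& row : out) std::partial_sum(row.begin(), row.end(), row.begin(), op);
  return out;
}

// Hypothetical helper: inclusive scan along axis 0.
// Each column is scanned independently with the binary operation `op`.
template <typename Op>
Matrix scan_axis0(const Matrix& a, Op op) {
  Matrix out = a;
  for (std::size_t i = 1; i < out.size(); ++i) {
    assert(out[i].size() == out[0].size());
    for (std::size_t j = 0; j < out[i].size(); ++j)
      out[i][j] = op(out[i - 1][j], out[i][j]);
  }
  return out;
}
```

Passing `std::plus<int>()` corresponds to a cumulative sum and `std::multiplies<int>()` to a cumulative product.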
   1267 
   1268 
   1269 ## Convolutions
   1270 
   1271 ### <Operation> convolve(const Kernel& kernel, const Dimensions& dims)
   1272 
Returns a tensor that is the output of the convolution of the input tensor with
the kernel, along the specified dimensions of the input tensor. The size of each
output dimension that was part of the convolution is reduced according to the
formula `output_dim_size = input_dim_size - kernel_dim_size + 1`, which requires
`input_dim_size >= kernel_dim_size`. The sizes of dimensions that were not part
of the convolution remain the same.

Performance of the convolution can depend on the length of the stride(s) of the
input tensor dimension(s) along which the convolution is computed (the first
dimension has the shortest stride for ColMajor, whereas RowMajor's shortest
stride is for the last dimension).
   1281 
   1282     // Compute convolution along the second and third dimension.
    // Compute convolution along the second and third dimension.
    // DataLayout is either Eigen::ColMajor or Eigen::RowMajor.
    Tensor<float, 4, DataLayout> input(3, 3, 7, 11);
    Tensor<float, 2, DataLayout> kernel(2, 2);
    Tensor<float, 4, DataLayout> output(3, 2, 6, 11);
    input.setRandom();
    kernel.setRandom();

    Eigen::array<ptrdiff_t, 2> dims({1, 2});  // Specify second and third dimension for convolution.
    output = input.convolve(kernel, dims);

    // Check each output coefficient against the explicit sum.
    // VERIFY_IS_APPROX is the approximate-comparison macro from Eigen's test suite.
    for (int i = 0; i < 3; ++i) {
      for (int j = 0; j < 2; ++j) {
        for (int k = 0; k < 6; ++k) {
          for (int l = 0; l < 11; ++l) {
            const float result = output(i,j,k,l);
            const float expected = input(i,j+0,k+0,l) * kernel(0,0) +
                                   input(i,j+1,k+0,l) * kernel(1,0) +
                                   input(i,j+0,k+1,l) * kernel(0,1) +
                                   input(i,j+1,k+1,l) * kernel(1,1);
            VERIFY_IS_APPROX(result, expected);
          }
        }
      }
    }
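
The output-size rule is easiest to see in one dimension. The sketch below is a plain C++ version restricted to a single dimension; `convolve_1d` is a hypothetical name. It follows the index pattern of the `expected` value in the example above, which applies the kernel without flipping it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: convolution along one dimension with no padding.
// The kernel is applied without flipping, as in the `expected` sum above.
// The output has input.size() - kernel.size() + 1 entries.
std::vector<float> convolve_1d(const std::vector<float>& input,
                               const std::vector<float>& kernel) {
  assert(!kernel.empty() && input.size() >= kernel.size());
  std::vector<float> out(input.size() - kernel.size() + 1, 0.0f);
  for (std::size_t i = 0; i < out.size(); ++i)
    for (std::size_t k = 0; k < kernel.size(); ++k)
      out[i] += input[i + k] * kernel[k];
  return out;
}
```

An input of size 7 and a kernel of size 2 give an output of size 6, which is why the third dimension of `output` in the example is 6.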
   1306 
   1307 
   1308 ## Geometrical Operations
   1309 
   1310 These operations return a Tensor with different dimensions than the original
   1311 Tensor.  They can be used to access slices of tensors, see them with different
   1312 dimensions, or pad tensors with additional data.
   1313 
   1314 ### <Operation> reshape(const Dimensions& new_dims)
   1315 
   1316 Returns a view of the input tensor that has been reshaped to the specified
   1317 new dimensions.  The argument new_dims is an array of Index values.  The
   1318 rank of the resulting tensor is equal to the number of elements in new_dims.
   1319 
   1320 The product of all the sizes in the new dimension array must be equal to
   1321 the number of elements in the input tensor.
   1322 
    // Increase the rank of the input tensor by introducing a new dimension
    // of size 1.
    Tensor<float, 2> input(7, 11);
    array<int, 3> three_dims{{7, 11, 1}};
    Tensor<float, 3> result_3d = input.reshape(three_dims);

    // Decrease the rank of the input tensor by merging 2 dimensions.
    array<int, 1> one_dim{{7 * 11}};
    Tensor<float, 1> result_1d = input.reshape(one_dim);
   1332 
   1333 This operation does not move any data in the input tensor, so the resulting
   1334 contents of a reshaped Tensor depend on the data layout of the original Tensor.
   1335 
   1336 For example this is what happens when you `reshape()` a 2D ColMajor tensor
   1337 to one dimension:
   1338 
   1339     Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);
   1340     a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
   1341     Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});
   1342     Eigen::Tensor<float, 1, Eigen::ColMajor> b = a.reshape(one_dim);
   1343     cout << "b" << endl << b << endl;
   1344     =>
   1345     b
   1346       0
   1347     300
   1348     100
   1349     400
   1350     200
   1351     500
   1352 
   1353 This is what happens when the 2D Tensor is RowMajor:
   1354 
   1355     Eigen::Tensor<float, 2, Eigen::RowMajor> a(2, 3);
   1356     a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
   1357     Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});
   1358     Eigen::Tensor<float, 1, Eigen::RowMajor> b = a.reshape(one_dim);
   1359     cout << "b" << endl << b << endl;
   1360     =>
   1361     b
   1362       0
   1363     100
   1364     200
   1365     300
   1366     400
   1367     500
   1368 
The reshape operation is an lvalue. In other words, it can be used on the left
side of the assignment operator.

The previous example can be rewritten as follows:
   1373 
   1374     Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);
   1375     a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
   1376     Eigen::array<Eigen::DenseIndex, 2> two_dim({2, 3});
   1377     Eigen::Tensor<float, 1, Eigen::ColMajor> b(6);
   1378     b.reshape(two_dim) = a;
   1379     cout << "b" << endl << b << endl;
   1380     =>
   1381     b
   1382       0
   1383     300
   1384     100
   1385     400
   1386     200
   1387     500
   1388 
   1389 Note that "b" itself was not reshaped but that instead the assignment is done to
   1390 the reshape view of b.
   1391 
   1392 
   1393 ### <Operation> shuffle(const Shuffle& shuffle)
   1394 
Returns a copy of the input tensor whose dimensions have been
reordered according to the specified permutation. The argument shuffle
is an array of Index values. Its size is the rank of the input
tensor. It must contain a permutation of 0, 1, ..., rank - 1. The size of
the i-th dimension of the output tensor equals the size of the shuffle[i]-th
dimension of the input tensor. For example:

    // Shuffle all dimensions to the left by 1.
    Tensor<float, 3> input(20, 30, 50);
    // ... set some values in input.
    Eigen::array<int, 3> shuffling({1, 2, 0});
    Tensor<float, 3> output = input.shuffle(shuffling);

    eigen_assert(output.dimension(0) == 30);
    eigen_assert(output.dimension(1) == 50);
    eigen_assert(output.dimension(2) == 20);
   1410 
   1411 Indices into the output tensor are shuffled accordingly to formulate
   1412 indices into the input tensor. For example, one can assert in the above
   1413 code snippet that:
   1414 
   1415     eigen_assert(output(3, 7, 11) == input(11, 3, 7));
   1416 
   1417 In general, one can assert that
   1418 
   1419     eigen_assert(output(..., indices[shuffle[i]], ...) ==
   1420                  input(..., indices[i], ...))
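
The index bookkeeping can be checked with a small stand-alone sketch for rank 3. The helpers `shuffled_dims` and `input_index_for` are hypothetical names; they only reproduce the mapping described above and do not touch any tensor data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Index3 = std::array<std::size_t, 3>;

// Hypothetical helper: sizes of the shuffled tensor,
// out_dims[i] = in_dims[shuffle[i]].
Index3 shuffled_dims(const Index3& in_dims, const Index3& shuffle) {
  Index3 out{};
  for (std::size_t i = 0; i < 3; ++i) {
    assert(shuffle[i] < 3);
    out[i] = in_dims[shuffle[i]];
  }
  return out;
}

// Hypothetical helper: the input index that output index `out_idx` reads from,
// in_idx[shuffle[i]] = out_idx[i].
Index3 input_index_for(const Index3& out_idx, const Index3& shuffle) {
  Index3 in{};
  for (std::size_t i = 0; i < 3; ++i) {
    assert(shuffle[i] < 3);
    in[shuffle[i]] = out_idx[i];
  }
  return in;
}
```

With the shuffle (1, 2, 0), input sizes (20, 30, 50) become (30, 50, 20), and output index (3, 7, 11) maps back to input index (11, 3, 7), in agreement with the assertions above.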
   1421 
The shuffle operation results in an lvalue, which means that it can be assigned
to. In other words, it can be used on the left side of the assignment operator.

Let's rewrite the previous example to take advantage of this feature:

    // Shuffle all dimensions to the left by 1.
    Tensor<float, 3> input(20, 30, 50);
    // ... set some values in input.
    Tensor<float, 3> output(30, 50, 20);
    Eigen::array<int, 3> shuffling({2, 0, 1});
    output.shuffle(shuffling) = input;
   1432 
   1433 
   1434 ### <Operation> stride(const Strides& strides)
   1435 
   1436 Returns a view of the input tensor that strides (skips stride-1
   1437 elements) along each of the dimensions.  The argument strides is an
   1438 array of Index values.  The dimensions of the resulting tensor are
   1439 ceil(input_dimensions[i] / strides[i]).
   1440 
   1441 For example this is what happens when you `stride()` a 2D tensor:
   1442 
   1443     Eigen::Tensor<int, 2> a(4, 3);
   1444     a.setValues({{0, 100, 200}, {300, 400, 500}, {600, 700, 800}, {900, 1000, 1100}});
   1445     Eigen::array<Eigen::DenseIndex, 2> strides({3, 2});
   1446     Eigen::Tensor<int, 2> b = a.stride(strides);
   1447     cout << "b" << endl << b << endl;
   1448     =>
   1449     b
   1450        0   200
   1451      900  1100
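
The selection rule and the `ceil` in the output size can be sketched for a rank-2 input as follows. `stride_2d` is a hypothetical helper working on nested vectors; unlike the Eigen operation it copies the selected elements instead of returning a view.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical helper: keep every row_stride-th row and every col_stride-th
// column, starting at index 0. The result has ceil(rows / row_stride) rows
// and ceil(cols / col_stride) columns.
Matrix stride_2d(const Matrix& a, std::size_t row_stride, std::size_t col_stride) {
  assert(row_stride > 0 && col_stride > 0);
  Matrix out;
  for (std::size_t i = 0; i < a.size(); i += row_stride) {
    std::vector<int> row;
    for (std::size_t j = 0; j < a[i].size(); j += col_stride) row.push_back(a[i][j]);
    out.push_back(row);
  }
  return out;
}
```

For the 4 x 3 input above and strides (3, 2), rows 0 and 3 and columns 0 and 2 are kept, so the result is 2 x 2.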
   1452 
It is possible to assign a tensor to a stride:

    Tensor<float, 3> input(20, 30, 50);
    // ... set some values in input.
    Tensor<float, 3> output(40, 90, 200);
    Eigen::array<Eigen::DenseIndex, 3> strides({2, 3, 4});
    output.stride(strides) = input;
   1458 
   1459 
   1460 ### <Operation> slice(const StartIndices& offsets, const Sizes& extents)
   1461 
   1462 Returns a sub-tensor of the given tensor. For each dimension i, the slice is
   1463 made of the coefficients stored between offset[i] and offset[i] + extents[i] in
   1464 the input tensor.
   1465 
   1466     Eigen::Tensor<int, 2> a(4, 3);
   1467     a.setValues({{0, 100, 200}, {300, 400, 500},
   1468                  {600, 700, 800}, {900, 1000, 1100}});
   1469     Eigen::array<int, 2> offsets = {1, 0};
   1470     Eigen::array<int, 2> extents = {2, 2};
   1471     Eigen::Tensor<int, 2> slice = a.slice(offsets, extents);
   1472     cout << "a" << endl << a << endl;
   1473     =>
   1474     a
   1475        0   100   200
   1476      300   400   500
   1477      600   700   800
   1478      900  1000  1100
   1479     cout << "slice" << endl << slice << endl;
   1480     =>
   1481     slice
   1482      300   400
   1483      600   700
   1484 
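To make the index range concrete, here is a minimal model of a rank-2 slice in plain C++. It is a sketch, not Eigen code: `Matrix` and `slice2d` are hypothetical names, and the input is assumed to be a rectangular vector of rows.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical stand-in for slice() on a rank-2 tensor: copies extents[d]
// coefficients starting at offsets[d] along each dimension d.
Matrix slice2d(const Matrix& input,
               std::array<std::size_t, 2> offsets,
               std::array<std::size_t, 2> extents) {
  assert(offsets[0] + extents[0] <= input.size());
  Matrix result(extents[0], std::vector<int>(extents[1]));
  for (std::size_t i = 0; i < extents[0]; ++i) {
    const std::vector<int>& row = input[offsets[0] + i];
    assert(offsets[1] + extents[1] <= row.size());
    for (std::size_t j = 0; j < extents[1]; ++j) {
      result[i][j] = row[offsets[1] + j];
    }
  }
  return result;
}
```

Called with offsets (1, 0) and extents (2, 2) on the 4x3 matrix above, this returns the rows {300, 400} and {600, 700}.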
   1485 
   1486 ### <Operation> chip(const Index offset, const Index dim)
   1487 
   1488 A chip is a special kind of slice. It is the subtensor at the given offset in
   1489 the dimension dim. The returned tensor has one fewer dimension than the input
   1490 tensor: the dimension dim is removed.
   1491 
   1492 For example, a matrix chip would be either a row or a column of the input
   1493 matrix.
   1494 
   1495     Eigen::Tensor<int, 2> a(4, 3);
   1496     a.setValues({{0, 100, 200}, {300, 400, 500},
   1497                  {600, 700, 800}, {900, 1000, 1100}});
   1498     Eigen::Tensor<int, 1> row_3 = a.chip(2, 0);
   1499     Eigen::Tensor<int, 1> col_2 = a.chip(1, 1);
   1500     cout << "a" << endl << a << endl;
   1501     =>
   1502     a
   1503        0   100   200
   1504      300   400   500
   1505      600   700   800
   1506      900  1000  1100
   1507     cout << "row_3" << endl << row_3 << endl;
   1508     =>
   1509     row_3
   1510        600   700   800
   1511     cout << "col_2" << endl << col_2 << endl;
   1512     =>
   1513     col_2
   1514        100   400   700    1000
   1515 
   1516 It is possible to assign values to a tensor chip since the chip operation is an
   1517 lvalue. For example:
   1518 
   1519     Eigen::Tensor<int, 1> a(3);
   1520     a.setValues({100, 200, 300});
   1521     Eigen::Tensor<int, 2> b(2, 3);
   1522     b.setZero();
   1523     b.chip(0, 0) = a;
   1524     cout << "a" << endl << a << endl;
   1525     =>
   1526     a
   1527      100
   1528      200
   1529      300
   1530     cout << "b" << endl << b << endl;
   1531     =>
   1532     b
   1533        100   200   300
   1534          0     0     0
   1535 
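The row and column cases can be modelled in plain C++ as follows. This is an illustrative sketch only: `Matrix` and `chip2d` are hypothetical names, not Eigen API, and the input is assumed to be a rectangular vector of rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical stand-in for chip() on a rank-2 tensor: dim == 0 selects the
// row at `offset`, dim == 1 selects the column at `offset`.
std::vector<int> chip2d(const Matrix& input, std::size_t offset,
                        std::size_t dim) {
  assert(dim < 2);
  std::vector<int> result;
  if (dim == 0) {
    assert(offset < input.size());
    result = input[offset];
  } else {
    result.reserve(input.size());
    for (const std::vector<int>& row : input) {
      assert(offset < row.size());
      result.push_back(row[offset]);
    }
  }
  return result;
}
```

For the 4x3 matrix above, `chip2d(a, 2, 0)` yields {600, 700, 800} and `chip2d(a, 1, 1)` yields {100, 400, 700, 1000}, matching `row_3` and `col_2`.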
   1536 
   1537 ### <Operation> reverse(const ReverseDimensions& reverse)
   1538 
   1539 Returns a view of the input tensor that reverses the order of the coefficients
   1540 along a subset of the dimensions.  The argument reverse is an array of boolean
   1541 values that indicates whether or not the order of the coefficients should be
   1542 reversed along each of the dimensions.  This operation preserves the dimensions
   1543 of the input tensor.
   1544 
   1545 For example, this is what happens when you `reverse()` the first dimension
   1546 of a 2D tensor:
   1547 
   1548     Eigen::Tensor<int, 2> a(4, 3);
   1549     a.setValues({{0, 100, 200}, {300, 400, 500},
   1550                 {600, 700, 800}, {900, 1000, 1100}});
   1551     Eigen::array<bool, 2> reverse({true, false});
   1552     Eigen::Tensor<int, 2> b = a.reverse(reverse);
   1553     cout << "a" << endl << a << endl << "b" << endl << b << endl;
   1554     =>
   1555     a
   1556        0   100   200
   1557      300   400   500
   1558      600   700   800
   1559      900  1000  1100
   1560     b
   1561      900  1000  1100
   1562      600   700   800
   1563      300   400   500
   1564        0   100   200
   1565 
   1566 
   1567 ### <Operation> broadcast(const Broadcast& broadcast)
   1568 
   1569 Returns a view of the input tensor in which the input is replicated one or
   1570 more times.
   1571 The broadcast argument specifies how many copies of the input tensor are
   1572 made along each of the dimensions.
   1573 
   1574     Eigen::Tensor<int, 2> a(2, 3);
   1575     a.setValues({{0, 100, 200}, {300, 400, 500}});
   1576     Eigen::array<int, 2> bcast({3, 2});
   1577     Eigen::Tensor<int, 2> b = a.broadcast(bcast);
   1578     cout << "a" << endl << a << endl << "b" << endl << b << endl;
   1579     =>
   1580     a
   1581        0   100   200
   1582      300   400   500
   1583     b
   1584        0   100   200    0   100   200
   1585      300   400   500  300   400   500
   1586        0   100   200    0   100   200
   1587      300   400   500  300   400   500
   1588        0   100   200    0   100   200
   1589      300   400   500  300   400   500
   1590 
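The replication can be described as an index mapping: each output coefficient reads the input at the output index modulo the input size. The sketch below models this for rank 2 in plain C++. `Matrix` and `broadcast2d` are hypothetical names, not Eigen API, and the input is assumed to be non-empty and rectangular.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical stand-in for broadcast() on a rank-2 tensor: the output has
// rows * bcast[0] rows and cols * bcast[1] columns.
Matrix broadcast2d(const Matrix& input, std::array<std::size_t, 2> bcast) {
  assert(!input.empty() && !input[0].empty());
  const std::size_t rows = input.size();
  const std::size_t cols = input[0].size();
  Matrix result(rows * bcast[0], std::vector<int>(cols * bcast[1]));
  for (std::size_t i = 0; i < result.size(); ++i) {
    for (std::size_t j = 0; j < result[i].size(); ++j) {
      // Wrap the output index back into the input's index range.
      result[i][j] = input[i % rows][j % cols];
    }
  }
  return result;
}
```

With the 2x3 input above and a broadcast of (3, 2), the result is 6x6, as in the printed `b`.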
   1591 ### <Operation> concatenate(const OtherDerived& other, Axis axis)
   1592 
   1593 TODO
   1594 
   1595 ### <Operation> pad(const PaddingDimensions& padding)
   1596 
   1597 Returns a view of the input tensor padded with zeros; paddings[i] is the (before, after) count for dimension i.
   1598 
   1599     Eigen::Tensor<int, 2> a(2, 3);
   1600     a.setValues({{0, 100, 200}, {300, 400, 500}});
   1601     Eigen::array<pair<int, int>, 2> paddings;
   1602     paddings[0] = make_pair(2, 3);
   1603     paddings[1] = make_pair(0, 1);
   1604     Eigen::Tensor<int, 2> b = a.pad(paddings);
   1605     cout << "a" << endl << a << endl << "b" << endl << b << endl;
   1606     =>
   1607     a
   1608        0   100   200
   1609      300   400   500
   1610     b
   1611        0     0     0    0
   1612        0     0     0    0
   1613        0   100   200    0
   1614      300   400   500    0
   1615        0     0     0    0
   1616        0     0     0    0
   1617        0     0     0    0
   1618 
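The resulting sizes follow from adding the leading and trailing padding to each input dimension. A minimal plain C++ model for rank 2 is sketched below. `Matrix` and `pad2d` are hypothetical names, not Eigen API, and the input is assumed to be non-empty and rectangular.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>;
using Padding = std::pair<std::size_t, std::size_t>;  // (before, after)

// Hypothetical stand-in for pad() on a rank-2 tensor: paddings[0] applies to
// rows and paddings[1] to columns; new coefficients are zero.
Matrix pad2d(const Matrix& input, const std::array<Padding, 2>& paddings) {
  assert(!input.empty());
  const std::size_t rows = input.size();
  const std::size_t cols = input[0].size();
  Matrix result(paddings[0].first + rows + paddings[0].second,
                std::vector<int>(paddings[1].first + cols + paddings[1].second, 0));
  for (std::size_t i = 0; i < rows; ++i) {
    assert(input[i].size() == cols);
    for (std::size_t j = 0; j < cols; ++j) {
      result[paddings[0].first + i][paddings[1].first + j] = input[i][j];
    }
  }
  return result;
}
```

Padding a 2x3 input with (2, 3) on rows and (0, 1) on columns gives a 7x4 result in which the original values occupy rows 2 and 3.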
   1619 
   1620 ### <Operation> extract_patches(const PatchDims& patch_dims)
   1621 
   1622 Returns a tensor of coefficient patches extracted from the input tensor, where
   1623 each patch is of dimension specified by 'patch_dims'. The returned tensor has
   1624 one greater dimension than the input tensor, which is used to index each patch.
   1625 The patch index in the output tensor depends on the data layout of the input
   1626 tensor: the patch index is the last dimension in ColMajor layout, and the first
   1627 dimension in RowMajor layout.
   1628 
   1629 For example, given the following input tensor:
   1630 
   1631     Eigen::Tensor<float, 2, DataLayout> tensor(3,4);
   1632     tensor.setValues({{0.0f, 1.0f, 2.0f, 3.0f},
   1633                       {4.0f, 5.0f, 6.0f, 7.0f},
   1634                       {8.0f, 9.0f, 10.0f, 11.0f}});
   1635 
   1636     cout << "tensor: " << endl << tensor << endl;
   1637     =>
   1638     tensor:
   1639      0   1   2   3
   1640      4   5   6   7
   1641      8   9  10  11
   1642 
   1643 Six 2x2 patches can be extracted and indexed using the following code:
   1644 
   1645     Eigen::Tensor<float, 3, DataLayout> patch;
   1646     Eigen::array<ptrdiff_t, 2> patch_dims;
   1647     patch_dims[0] = 2;
   1648     patch_dims[1] = 2;
   1649     patch = tensor.extract_patches(patch_dims);
   1650     for (int k = 0; k < 6; ++k) {
   1651       cout << "patch index: " << k << endl;
   1652       for (int i = 0; i < 2; ++i) {
   1653         for (int j = 0; j < 2; ++j) {
   1654           if (DataLayout == ColMajor) {
   1655             cout << patch(i, j, k) << " ";
   1656           } else {
   1657             cout << patch(k, i, j) << " ";
   1658           }
   1659         }
   1660         cout << endl;
   1661       }
   1662     }
   1663 
   1664 This code results in the following output when the data layout is ColMajor:
   1665 
   1666     patch index: 0
   1667     0 1
   1668     4 5
   1669     patch index: 1
   1670     4 5
   1671     8 9
   1672     patch index: 2
   1673     1 2
   1674     5 6
   1675     patch index: 3
   1676     5 6
   1677     9 10
   1678     patch index: 4
   1679     2 3
   1680     6 7
   1681     patch index: 5
   1682     6 7
   1683     10 11
   1684 
   1685 This code results in the following output when the data layout is RowMajor:
   1686 (NOTE: the set of patches is the same as in ColMajor, but they are indexed differently).
   1687 
   1688     patch index: 0
   1689     0 1
   1690     4 5
   1691     patch index: 1
   1692     1 2
   1693     5 6
   1694     patch index: 2
   1695     2 3
   1696     6 7
   1697     patch index: 3
   1698     4 5
   1699     8 9
   1700     patch index: 4
   1701     5 6
   1702     9 10
   1703     patch index: 5
   1704     6 7
   1705     10 11
   1706 
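The two orderings can be reproduced by enumerating the top-left corner of every patch. The sketch below is inferred from the outputs listed above rather than taken from Eigen's implementation, and `patch_origins` is a hypothetical helper. In the ColMajor ordering the row offset varies fastest; in the RowMajor ordering the column offset varies fastest.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Origin = std::pair<std::size_t, std::size_t>;  // (row, column)

// Hypothetical helper: top-left corners of all patch_rows x patch_cols
// patches of a rows x cols matrix, in patch-index order.
std::vector<Origin> patch_origins(std::size_t rows, std::size_t cols,
                                  std::size_t patch_rows,
                                  std::size_t patch_cols, bool col_major) {
  assert(patch_rows >= 1 && patch_rows <= rows);
  assert(patch_cols >= 1 && patch_cols <= cols);
  const std::size_t row_positions = rows - patch_rows + 1;
  const std::size_t col_positions = cols - patch_cols + 1;
  std::vector<Origin> origins;
  origins.reserve(row_positions * col_positions);
  if (col_major) {
    for (std::size_t c = 0; c < col_positions; ++c)
      for (std::size_t r = 0; r < row_positions; ++r)
        origins.emplace_back(r, c);
  } else {
    for (std::size_t r = 0; r < row_positions; ++r)
      for (std::size_t c = 0; c < col_positions; ++c)
        origins.emplace_back(r, c);
  }
  return origins;
}
```

For the 3x4 tensor and 2x2 patches this gives (3 - 2 + 1) x (4 - 2 + 1) = 6 origins. Patch index 1 starts at (1, 0) in the ColMajor ordering and at (0, 1) in the RowMajor ordering, in line with the two listings.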
   1707 ### <Operation> extract_image_patches(const Index patch_rows, const Index patch_cols, const Index row_stride, const Index col_stride, const PaddingType padding_type)
   1708 
   1709 Returns a tensor of coefficient image patches extracted from the input tensor,
   1710 which is expected to have dimensions ordered as follows (depending on the data
   1711 layout of the input tensor, and the number of additional dimensions 'N'):
   1712 
   1713 *   ColMajor
   1714     *   1st dimension: channels (of size d)
   1715     *   2nd dimension: rows (of size r)
   1716     *   3rd dimension: columns (of size c)
   1717     *   4th-Nth dimension: time (for video) or batch (for bulk processing).
   1718 
   1719 *   RowMajor (reverse order of ColMajor)
   1720     *   1st-Nth dimension: time (for video) or batch (for bulk processing).
   1721     *   N+1'th dimension: columns (of size c)
   1722     *   N+2'th dimension: rows (of size r)
   1723     *   N+3'th dimension: channels (of size d)
   1724 
   1725 The returned tensor has one greater dimension than the input tensor, which is
   1726 used to index each patch. The patch index in the output tensor depends on the
   1727 data layout of the input tensor: the patch index is the 4'th dimension in
   1728 ColMajor layout, and the 4'th from the last dimension in RowMajor layout.
   1729 
   1730 For example, given an input tensor with the following dimension
   1731 sizes:
   1732 *   depth (channels): 2
   1733 *   rows:    3
   1734 *   columns: 5
   1735 *   batch:   7
   1736 
   1737     Tensor<float, 4> tensor(2,3,5,7);
   1738     Tensor<float, 4, RowMajor> tensor_row_major = tensor.swap_layout();
   1739 
   1740 2x2 image patches can be extracted and indexed using the following code:
   1741 
   1742 2D patch, ColMajor (patch indexed by the second-to-last dimension):
   1743 
   1744     Tensor<float, 5> twod_patch;
   1745     twod_patch = tensor.extract_image_patches(2, 2);
   1746     // twod_patch.dimension(0) == 2
   1747     // twod_patch.dimension(1) == 2
   1748     // twod_patch.dimension(2) == 2
   1749     // twod_patch.dimension(3) == 3*5
   1750     // twod_patch.dimension(4) == 7
   1751 
   1752 2D patch, RowMajor (patch indexed by the second dimension):
   1753 
   1754     Tensor<float, 5, RowMajor> twod_patch_row_major;
   1755     twod_patch_row_major = tensor_row_major.extract_image_patches(2, 2);
   1756     // twod_patch_row_major.dimension(0) == 7
   1757     // twod_patch_row_major.dimension(1) == 3*5
   1758     // twod_patch_row_major.dimension(2) == 2
   1759     // twod_patch_row_major.dimension(3) == 2
   1760     // twod_patch_row_major.dimension(4) == 2
   1761 
   1762 ## Special Operations
   1763 
   1764 ### <Operation> cast<T>()
   1765 
   1766 Returns a tensor of type T with the same dimensions as the original tensor.
   1767 The returned tensor contains the values of the original tensor converted to
   1768 type T.
   1769 
   1770     Eigen::Tensor<float, 2> a(2, 3);
   1771     Eigen::Tensor<int, 2> b = a.cast<int>();
   1772 
   1773 This can be useful for example if you need to do element-wise division of
   1774 Tensors of integers.  This is not currently supported by the Tensor library
   1775 but you can easily cast the tensors to floats to do the division:
   1776 
   1777     Eigen::Tensor<int, 2> a(2, 3);
   1778     a.setValues({{0, 1, 2}, {3, 4, 5}});
   1779     Eigen::Tensor<int, 2> b =
   1780         (a.cast<float>() / a.constant(2).cast<float>()).cast<int>();
   1781     cout << "a" << endl << a << endl << endl;
   1782     cout << "b" << endl << b << endl << endl;
   1783     =>
   1784     a
   1785     0 1 2
   1786     3 4 5
   1787 
   1788     b
   1789     0 0 1
   1790     1 2 2
   1791 
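The rounding in `b` comes from the final cast back to `int`, which truncates toward zero. The same element-wise arithmetic can be sketched in plain C++. `divide_via_float` is a hypothetical helper for illustration, not part of the Tensor library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: mirrors cast<float>() / constant.cast<float>()
// followed by cast<int>() on each element.
std::vector<int> divide_via_float(const std::vector<int>& values,
                                  int divisor) {
  assert(divisor != 0);
  std::vector<int> result;
  result.reserve(values.size());
  for (int value : values) {
    const float quotient =
        static_cast<float>(value) / static_cast<float>(divisor);
    // Converting float to int discards the fractional part.
    result.push_back(static_cast<int>(quotient));
  }
  return result;
}
```

Applied to the values 0 through 5 with a divisor of 2, this produces 0, 0, 1, 1, 2, 2, the same coefficients as `b` above.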
   1792 
   1793 ### <Operation> eval()
   1794 
   1795 TODO
   1796 
   1797 
   1798 ## Representation of scalar values
   1799 
   1800 Scalar values are often represented by tensors of size 1 and rank 0. For example,
   1801 `Tensor<T, N>::maximum()` currently returns a `Tensor<T, 0>`. Similarly, the inner
   1802 product of two 1D tensors (through contractions) returns a 0D tensor.
   1803 
   1804 ## Limitations
   1805 
   1806 *   The number of tensor dimensions is currently limited to 250 when using a
   1807     compiler that supports C++11. It is limited to only 5 for older compilers.
   1808 *   The IndexList class requires a C++11-compliant compiler. You can use an
   1809     array of indices instead if you don't have access to a modern compiler.
   1810 *   On GPUs, only floating point values are properly tested and optimized for.
   1811 *   Complex and integer values are known to be broken on GPUs. If you try to use
   1812     them, you'll most likely trigger a static assertion failure such as
   1813     `EIGEN_STATIC_ASSERT(packetSize > 1, YOU_MADE_A_PROGRAMMING_MISTAKE)`.
   1814 
   1815