# Transformers

**Source:** [https://arxiv.org/abs/1706.03762](https://arxiv.org/abs/1706.03762)

**Definition:** Transformers (as originally introduced) are a neural network architecture consisting of an encoder and a decoder that use [attention](Attention.md).

## Attention is All You Need

Existing approaches for sequence transduction (input sequence -> output sequence) used RNNs and CNNs with encoders and decoders. The best models connected these encoders and decoders with attention. The Transformer is an architecture that relies on attention alone, without the recurrence or convolutions of existing approaches.
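
To make "attention without recurrence" concrete, here is a minimal sketch of the scaled dot-product attention from the source paper, softmax(QK^T / sqrt(d_k))V, in plain Python. It leaves out everything else a real Transformer has (learned projections, multiple heads, masking, positional encodings), and the function names and toy input are made up for illustration. The point is that every output position is computed directly from all input positions, with no step-by-step hidden state.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors."""
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return outputs

# Toy sequence: three positions, each a 2-dimensional vector (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: queries, keys, and values all come from the same sequence.
y = scaled_dot_product_attention(x, x, x)
```

Each row of `y` depends on all rows of `x` at once, so the loop over queries could run in any order or in parallel, unlike an RNN where position t must wait for position t-1.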