The GPT-3 Architecture, on a Napkin (dugas.ch)
I eat words@group.lt to AI@lemmy.ml · 2 years ago
Behohippy@lemmy.world · 1 year ago
I’ve got a background in deep learning and I still struggle to understand the attention mechanism. I know it’s a key/value store, but I’m not sure what it’s doing to the tensor when it passes through different layers.
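For what it's worth, here is a minimal sketch of the "key/value store" idea as single-head scaled dot-product attention, written with plain Python lists so the shapes are easy to follow. It is an illustration under simplifying assumptions, not the code from the linked article: the function names (`softmax`, `attention`) and the toy input `x` are made up for this example, and the queries, keys, and values are passed in directly, whereas in a real layer they are learned linear projections of the same input tensor.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values, causal=True):
    """Single-head scaled dot-product attention (illustrative sketch).

    queries, keys: seq_len x d_k lists; values: seq_len x d_v list.
    Returns (output, weights): seq_len x d_v and seq_len x seq_len.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    output, weights = [], []
    for i, q in enumerate(queries):
        # Score this token's query against every token's key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        if causal:
            # GPT-style masking: a token cannot attend to later positions.
            scores = [s if j <= i else -math.inf for j, s in enumerate(scores)]
        w = softmax(scores)
        # The "soft lookup": a weighted average of the value vectors.
        out = [sum(w[j] * values[j][c] for j in range(len(values))) for c in range(d_v)]
        weights.append(w)
        output.append(out)
    return output, weights

# Toy input: 3 tokens, 2 features each (assumed values, for illustration only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, w = attention(x, x, x)
```

The point for the "what happens to the tensor" question: the output has one row per input token, just like the input, but each row is now a mix of the value vectors of the tokens it was allowed to look at. Unlike a hard dictionary lookup, every key matches to some degree, and the softmax weights decide how much of each value gets blended in.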