Post Aes88LKiOoaTGjjvA8 by [email protected]
Post #AersWwQYXLmrzwBAau by [email protected]
0 likes, 0 repeats
RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048…
Post #Aeru1v11w3paYDZggq by [email protected]
0 likes, 0 repeats
i'm a huge fan of RNNs as a concept (it's a neural network with a …
Post #AervCiZSe1uz5tOPQ0 by [email protected]
0 likes, 0 repeats
What's really interesting is that you can sort of "finetune" RWKV…
Post #Aes6uMeKbm5JLLgZBg by [email protected]
0 likes, 0 repeats
@lritter there are some really interesting implications if they have solved som…
Post #Aes7gdM3nkrF3P9jk0 by [email protected]
0 likes, 0 repeats
@DirtyPunk yeah. there's an autoencoder in there (though the encoded state …
Post #Aes88LKiOoaTGjjvA8 by [email protected]
0 likes, 0 repeats
@lritter Stuff like this is vital to get to engineering readiness for these tec…
Post #Aes92MIK0nGiT2KY6a by [email protected]
0 likes, 0 repeats
@DirtyPunk funny enough though, a lot of OSS models are trained from ChatGPT ge…
Post #Aes9ZRnmYY6P6Xp8OO by [email protected]
0 likes, 0 repeats
@DirtyPunk it seems the project also really cares about at-home reproducibility…
Post #AesnEm2Q7VZRx1MqSe by [email protected]
0 likes, 0 repeats
@lritter Difficult to parallelize for GPUs? Dedicated neural net hardware have…
Post #AesnEm6Jt0gq97BxXU by [email protected]
0 likes, 0 repeats
@njrabit no, difficult to parallelize anywhere. and by difficult i mean impossi…
You are viewing proxied material from pleroma.anduin.net. The copyright of proxied material belongs to its original authors. Any comments or complaints in relation to proxied material should be directed to the original authors of the content concerned. Please see the disclaimer for more details.