Post Aes88LKiOoaTGjjvA8 by [email protected]
Post #AersWwQYXLmrzwBAau by [email protected]
0 likes, 0 repeats
RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048…
Post #Aeru1v11w3paYDZggq by [email protected]
0 likes, 0 repeats
i'm a huge fan of RNNs as a concept (it's a neural network with a …
Post #AervCiZSe1uz5tOPQ0 by [email protected]
0 likes, 0 repeats
What's really interesting is that you can sort of "finetune" RWKV…
Post #Aes6uMeKbm5JLLgZBg by [email protected]
0 likes, 0 repeats
@lritter there are some really interesting implications if they have solved som…
Post #Aes7gdM3nkrF3P9jk0 by [email protected]
0 likes, 0 repeats
@DirtyPunk yeah. there's an autoencoder in there (though the encoded state …
Post #Aes88LKiOoaTGjjvA8 by [email protected]
0 likes, 0 repeats
@lritter Stuff like this is vital to get to engineering readiness for these tec…
Post #Aes92MIK0nGiT2KY6a by [email protected]
0 likes, 0 repeats
@DirtyPunk funny enough though, a lot of OSS models are trained from ChatGPT ge…
Post #Aes9ZRnmYY6P6Xp8OO by [email protected]
0 likes, 0 repeats
@DirtyPunk it seems the project also really cares about at-home reproducibility…
Post #AesnEm2Q7VZRx1MqSe by [email protected]
0 likes, 0 repeats
@lritter Difficult to parallelize for GPUs? Dedicated neural net hardware have…
Post #AesnEm6Jt0gq97BxXU by [email protected]
0 likes, 0 repeats
@njrabit no, difficult to parallelize anywhere. and by difficult i mean impossi…