TinyStories Diffusion LM

A non-autoregressive language model that uses parallel block decoding and SwiGLU feed-forward networks.
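
For readers unfamiliar with the term, a SwiGLU network is a gated feed-forward block: one projection passes through the SiLU (swish) activation and multiplies a second projection elementwise, and the product is projected back down. The sketch below is a minimal, dependency-free illustration of that general formulation only. The function names, the weight names (W_gate, W_up, W_down), the absence of bias terms, and the dimensions are all assumptions made for illustration; this page does not state how the model's own layers are sized or implemented.

```python
import math


def silu(z):
    """SiLU (swish) activation, z * sigmoid(z), written to avoid overflow."""
    if z >= 0:
        return z / (1.0 + math.exp(-z))
    e = math.exp(z)
    return z * e / (1.0 + e)


def matvec(x, W):
    """Multiply a row vector x (length n) by a matrix W (n rows, m columns)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def swiglu_ffn(x, W_gate, W_up, W_down):
    """Hypothetical bias-free SwiGLU block: (SiLU(x W_gate) * (x W_up)) W_down."""
    gate = [silu(g) for g in matvec(x, W_gate)]   # gating branch
    up = matvec(x, W_up)                          # linear branch
    hidden = [g * u for g, u in zip(gate, up)]    # elementwise gate
    return matvec(hidden, W_down)                 # project back to model width


# Toy usage: model width 2, hidden width 3 (illustrative values only).
x = [1.0, -0.5]
W_gate = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
W_up = [[0.3, 0.2, 0.1], [0.6, 0.5, 0.4]]
W_down = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y = swiglu_ffn(x, W_gate, W_up, W_down)
```

In practice such a block would be written with a tensor library and applied to every token position at once; the list-based version here only makes the arithmetic explicit.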

Model Architecture