Data bits are prefetched from memory cells in parallel and are read out serially. The memory includes multiple stages (1710) of latches through which the parallel data is transferred before being read out serially. The multiple stages provide suitable delays to satisfy variable latency requirements (e.g....

http://www.google.com/patents/US7054215?utm_source=gb-gplus-share
Patent US7054215 - Multistage parallel-to-serial conversion of read data in memories, with the first serial bit skipping at least one stage
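
The timing idea in the abstract and title (a prefetched word passing through several latch stages, with the first serial bit skipping at least one of them) can be illustrated with a small behavioral sketch. This is a toy model under stated assumptions, not the circuit the patent claims: it assumes each latch stage costs exactly one clock tick and that one bit is driven out per tick, and the names `serial_schedule`, `stages`, and `first_bit_skips` are hypothetical.

```python
from typing import List, Tuple


def serial_schedule(
    word: List[int], stages: int, first_bit_skips: int = 0
) -> List[Tuple[int, int]]:
    """Return (tick, bit) pairs for a prefetched word read out serially.

    Assumptions (not taken from the patent text):
      * every latch stage delays the data by one clock tick;
      * at most one bit is driven out per tick, in word order;
      * only the first bit may bypass `first_bit_skips` stages.
    """
    if stages < 1:
        raise ValueError("need at least one latch stage")
    if not 0 <= first_bit_skips < stages:
        raise ValueError("first bit must pass through at least one stage")

    schedule: List[Tuple[int, int]] = []
    previous_tick = -1
    for index, bit in enumerate(word):
        # Tick at which this bit has cleared its latch path.
        ready = stages - first_bit_skips if index == 0 else stages
        # A bit goes out once it is ready and the output slot is free.
        tick = max(ready, previous_tick + 1)
        schedule.append((tick, bit))
        previous_tick = tick
    return schedule


if __name__ == "__main__":
    word = [1, 0, 1, 1]  # four bits prefetched in parallel
    print(serial_schedule(word, stages=3))
    print(serial_schedule(word, stages=3, first_bit_skips=1))
```

In this model, letting the first bit bypass one stage starts the serial stream one tick earlier, while the remaining bits still traverse every stage and arrive just as their output slots open. Changing `stages` mimics selecting a different read latency.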