spi: Don't cross 16MiB boundaries with long writes

The 16MiB issue still bites us.  Originally, the core of flashprog never
sent more than an erase block at once to write. Now that we write bigger
chunks at once, after all necessary erasure, it can happen that we cross
16MiB boundaries. This is an issue with programmer drivers that can only
send 3-byte addresses.  We use the extended address register with these,
to select which 16MiB area is currently accessed. Should we try to write
across a 16MiB boundary, we'd write with stale extended-address register
contents (basically wrapping around).
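
As a rough illustration of the wrap-around, here is a minimal sketch.
It assumes a simplified model in which the chip combines the stale
extended-address register with the low 24 bits sent on the wire;
effective_addr() is a hypothetical helper, not part of flashprog:

```c
#include <stdint.h>

/*
 * Hypothetical model: the chip takes the upper address byte from the
 * extended-address register (ear) and the lower three bytes from the
 * 3-byte address sent by the programmer.
 */
static uint32_t effective_addr(uint8_t ear, uint32_t addr)
{
	return ((uint32_t)ear << 24) | (addr & 0x00ffffff);
}
```

Under this model, a byte meant for 0x01000000 that is sent while the
register still holds 0 would land at 0x00000000.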

This once more troubles old V1 Dediprog SF100s, where we can send huge
chunks at once and leave the sequencing to the programmer. The
programmer, however, is unaware of the state of the extended-address
register. Other programmer drivers do the sequencing with
spi_write_chunked() and shouldn't be affected.

To settle this issue, copy the loop logic that we already use to avoid
the problem for long reads.
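
The length computation at the heart of that loop can be sketched in
isolation. This is only an illustration: next_chunk() is a hypothetical
name, and the modulo form below is assumed to be equivalent to the
ALIGN_DOWN() expression used in spi.c for addresses that don't overflow:

```c
#include <stddef.h>
#include <stdint.h>

#define MiB (1024u * 1024u)

/*
 * Hypothetical helper: length of the next transfer starting at `start`
 * that ends at the next 16MiB boundary at the latest.
 */
static size_t next_chunk(uint32_t start, size_t len)
{
	/* Bytes left until the next 16MiB boundary (1..16MiB). */
	const uint32_t to_boundary = 16 * MiB - (start % (16 * MiB));

	return len < to_boundary ? len : to_boundary;
}
```

For example, a 256-byte write starting 16 bytes below a boundary would
be split into a 16-byte and a 240-byte transfer.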

Tested with Dediprog "SF100   V:5.1.9".

Change-Id: I5b9d6779eff5224fb9981fd478dbc94262cd3262
Signed-off-by: Nico Huber <nico.h@gmx.de>
Reviewed-on: https://review.sourcearcade.org/c/flashprog/+/115
Reviewed-by: Urja Rannikko <urjaman@gmail.com>
diff --git a/spi.c b/spi.c
index a6b5124..ac51d87 100644
--- a/spi.c
+++ b/spi.c
@@ -103,6 +103,7 @@
 		/* Do not cross 16MiB boundaries in a single transfer.
 		   This helps with
 		   o multi-die 4-byte-addressing chips,
+		   o 4-byte-addressing chips that use an extended address reg,
 		   o dediprog that has a protocol limit of 32MiB-512B. */
 		to_read = min(ALIGN_DOWN(start + 16*MiB, 16*MiB) - start, len);
 		ret = flash->mst.spi->read(flash, buf, start, to_read);
@@ -121,7 +122,19 @@
 /* real chunksize is up to 256, logical chunksize is 256 */
 int spi_chip_write_256(struct flashctx *flash, const uint8_t *buf, unsigned int start, unsigned int len)
 {
-	return flash->mst.spi->write_256(flash, buf, start, len);
+	int ret;
+	size_t to_write;
+	for (; len; len -= to_write, buf += to_write, start += to_write) {
+		/* Do not cross 16MiB boundaries in a single transfer.
+		   This helps with 4-byte-addressing chips using an
+		   extended-address register that has to match the
+		   current 16MiB area. */
+		to_write = min(ALIGN_DOWN(start + 16*MiB, 16*MiB) - start, len);
+		ret = flash->mst.spi->write_256(flash, buf, start, to_write);
+		if (ret)
+			return ret;
+	}
+	return 0;
 }
 
 int spi_aai_write(struct flashctx *flash, const uint8_t *buf, unsigned int start, unsigned int len)