Refine and fix the delay loop calculation

The current delay loop calculation dates back to revision 1 of flashrom
and has had a logic bug ever since which caused all delays to be twice
as long as intended.

Fix the delay duration.

Protect against delay loop overflows.

Detect a non-working delay loop.

Change the delay loop itself to ensure clever compiler optimizers won't
eliminate it (as happens with clang/llvm in the current code). Some
people suggested machine-specific asm, but the empty asm statement with
the loop counter as register/memory input has the benefit of being
perfectly cross-platform and working in gcc and clang.
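
A minimal sketch of the empty-asm technique described above, isolated
from flashrom. The function name spin() and its return value are
hypothetical and exist only so the loop can be checked. It assumes a
compiler with GNU-style extended asm (gcc or clang).

```c
/* Hypothetical helper, not part of flashrom: busy-loop `loops` times. */
__attribute__((noinline)) unsigned long spin(unsigned long loops)
{
	unsigned long i;

	for (i = 0; i < loops; i++) {
		/* The asm statement emits no instructions, but it takes i
		 * as an input operand, so the compiler has to produce the
		 * value of i on every iteration and cannot drop the loop.
		 */
		__asm__ volatile ("" : : "rm" (i));
	}
	/* Returned only to make the iteration count observable. */
	return i;
}
```

The "rm" constraint lets the compiler keep the counter in a register or
in memory, which is what keeps the statement architecture-independent.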

If time goes backwards (catastrophic NTP time difference, manual time
change), timing measurements were shot because the new-old time
subtraction yielded a negative number, which was not handled correctly
because the variable is unsigned. Work around that issue (a real fix is
mathematically impossible).

If time jumps forward too far, clamp the timing measurement to the
biggest possible value that is guaranteed to avoid overflows in all
timing calculations.
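
The handling of both clock jumps can be sketched as a standalone
function. This is an illustration, not the code in this patch: the name
elapsed_usecs() is hypothetical, and unlike measure_delay() it checks
the limits before multiplying instead of correcting the result
afterwards. The fallback values (1 us for a backwards jump, LONG_MAX
for a huge forward jump) and the overflow threshold are taken from the
patch.

```c
#include <limits.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds between two gettimeofday() samples. */
unsigned long elapsed_usecs(const struct timeval *start,
			    const struct timeval *end)
{
	long secs, usecs;

	/* Time went backwards: no meaningful result exists, report 1 us. */
	if (end->tv_sec < start->tv_sec ||
	    (end->tv_sec == start->tv_sec && end->tv_usec < start->tv_usec))
		return 1;

	/* Time went forward so far that secs * 1000000 could overflow. */
	if (end->tv_sec - start->tv_sec >= LONG_MAX / 1000000 - 1)
		return LONG_MAX;

	secs = (long)(end->tv_sec - start->tv_sec);
	/* usecs may be negative here, but only when secs is at least 1. */
	usecs = (long)end->tv_usec - (long)start->tv_usec;

	return (unsigned long)(1000000 * secs + usecs);
}
```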

Check four times that the calibrated delay is at most 10% too short.
This addresses OS scheduler interactions, e.g. being scheduled out
during the calibration measurement, which inflates the measured time
and therefore makes the resulting delay loop too fast.

If the timing looks like garbage, recalculate the timer values up to
four times before giving up.

Avoid division by zero in rare cases where timing measurements for a 250
ms delay returned 0 us elapsed.
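
For illustration, the loops-per-microsecond computation with the
zero guard can be written as a pure function. The function name
loops_per_usec() and its parameter list are hypothetical; the
expression mirrors the one used in myusec_calibrate_delay() in this
patch.

```c
/* Hypothetical helper: loops per microsecond from one calibration run.
 * count:    delay requested during calibration, in usecs
 * micro:    loops-per-usec value that was in effect during the run
 * timeusec: measured wall-clock duration of the run, in usecs
 */
unsigned long loops_per_usec(unsigned long count, unsigned long micro,
			     unsigned long timeusec)
{
	/* A measurement of 0 us is useless, but it must not crash. */
	if (!timeusec)
		timeusec = 1;

	/* The "+ 1" biases the result upwards, so the integer division
	 * cannot make delays shorter than requested.
	 */
	return (count * micro) / timeusec + 1;
}
```

For example, 1024000 loops measured at 256000 us give 4 loops per
microsecond, which the final increment raises to 5. The increment is
applied even when the division is exact, so this is not a strict
ceiling.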

Corresponding to flashrom svn r990.

Signed-off-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>
Acked-by: Maciej Pijanka <maciej.pijanka@gmail.com>
diff --git a/udelay.c b/udelay.c
index ac58017..ff6620a 100644
--- a/udelay.c
+++ b/udelay.c
@@ -24,13 +24,16 @@
 #include <limits.h>
 #include "flash.h"
 
-// count to a billion. Time it. If it's < 1 sec, count to 10B, etc.
+/* loops per microsecond */
 unsigned long micro = 1;
 
-void myusec_delay(int usecs)
+__attribute__ ((noinline)) void myusec_delay(int usecs)
 {
-	volatile unsigned long i;
-	for (i = 0; i < usecs * micro; i++) ;
+	unsigned long i;
+	for (i = 0; i < usecs * micro; i++) {
+		/* Make sure the compiler doesn't optimize the loop away. */
+		asm volatile ("" : : "rm" (i) );
+	}
 }
 
 unsigned long measure_delay(int usecs)
@@ -43,29 +46,61 @@
 	gettimeofday(&end, 0);
 	timeusec = 1000000 * (end.tv_sec - start.tv_sec) +
 		   (end.tv_usec - start.tv_usec);
+	/* Protect against time going forward too much. */
+	if ((end.tv_sec > start.tv_sec) &&
+	    ((end.tv_sec - start.tv_sec) >= LONG_MAX / 1000000 - 1))
+		timeusec = LONG_MAX;
+	/* Protect against time going backwards during leap seconds. */
+	if ((end.tv_sec < start.tv_sec) || (timeusec > LONG_MAX))
+		timeusec = 1;
 
 	return timeusec;
 }
 
 void myusec_calibrate_delay(void)
 {
-	int count = 1000;
+	unsigned long count = 1000;
 	unsigned long timeusec;
-	int ok = 0;
+	int i, tries = 0;
 
 	printf("Calibrating delay loop... ");
 
-	while (!ok) {
+recalibrate:
+	while (1) {
 		timeusec = measure_delay(count);
+		if (timeusec > 1000000 / 4)
+			break;
+		if (count >= ULONG_MAX / 2) {
+			msg_pinfo("timer loop overflow, reduced precision. ");
+			break;
+		}
 		count *= 2;
-		if (timeusec < 1000000 / 4)
-			continue;
-		ok = 1;
 	}
+	tries ++;
 
-	// compute one microsecond. That will be count / time
-	micro = count / timeusec;
-	msg_pdbg("%ldM loops per second, ", micro);
+	/* Avoid division by zero, but in that case the loop is shot anyway. */
+	if (!timeusec)
+		timeusec = 1;
+	
+	/* Compute rounded up number of loops per microsecond. */
+	micro = (count * micro) / timeusec + 1;
+	msg_pdbg("%luM loops per second, ", micro);
+
+	/* Did we try to recalibrate less than 5 times? */
+	if (tries < 5) {
+		/* Recheck our timing to make sure we weren't just hitting
+		 * a scheduler delay or something similar.
+		 */
+		for (i = 0; i < 4; i++) {
+			if (measure_delay(100) < 90) {
+				msg_pdbg("delay more than 10% too short, "
+					 "recalculating... ");
+				goto recalibrate;
+			}
+		}
+	} else {
+		msg_perr("delay loop is unreliable, trying to continue ");
+	}
 
 	/* We're interested in the actual precision. */
 	timeusec = measure_delay(10);