Properly support two padding machines per circuit

In circpad_add_matching_machines() in circuitpadding.c, the macros FOR_EACH_CIRCUIT_MACHINE_BEGIN and SMARTLIST_FOREACH_REVERSE_BEGIN each expand to a for loop. Below is a slightly cropped (...) version of the function:

circpad_add_matching_machines(origin_circuit_t *on_circ,
                              smartlist_t *machines_sl)
{
  ...
  FOR_EACH_CIRCUIT_MACHINE_BEGIN(i) {
    ...
    SMARTLIST_FOREACH_REVERSE_BEGIN(machines_sl,
                                    circpad_machine_spec_t *,
                                    machine) {
        ...
        if (circpad_negotiate_padding(on_circ, machine->machine_num,
                                  machine->target_hopnum,
                                  CIRCPAD_COMMAND_START) < 0) {
          log_info(LD_CIRC, "Padding not negotiated. Cleaning machine");
          circpad_circuit_machineinfo_free_idx(circ, i);
          circ->padding_machine[i] = NULL;
          on_circ->padding_negotiation_failed = 1;
        } else {
          /* Success. Don't try any more machines */
          return;
        }
      }
    } SMARTLIST_FOREACH_END(machine);
  } FOR_EACH_CIRCUIT_MACHINE_END;
}

The outer loop iterates over each machine index (currently 2 of them, set by CIRCPAD_MAX_MACHINES), while the inner loop looks for a suitable machine to negotiate for that index. Currently, as soon as one machine is found and negotiated, the function returns without looking for machines for the remaining indices of the outer loop. The return should be replaced by a break, which leaves only the inner loop, so that the outer loop goes on to look for a machine for the next index.
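To illustrate the difference outside of tor, here is a minimal standalone sketch of the same nested-loop shape. Everything in it is hypothetical: toy_machine_t, toy_add_matching_machines() and MAX_MACHINES are stand-ins invented for this example, not tor code, and "negotiation" is assumed to always succeed. The stop_after_first flag switches between the current behavior (return) and the proposed one (break).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MACHINES 2 /* hypothetical stand-in for CIRCPAD_MAX_MACHINES */

/* Hypothetical, simplified machine spec: only the index it targets. */
typedef struct {
  int machine_index;
} toy_machine_t;

/* Try to fill every machine index. slots[i] is set to 1 when a machine
 * was installed at index i. If stop_after_first is nonzero, mimic the
 * current behavior and return on the first success; otherwise break out
 * of the inner loop only. Returns the number of indices filled. */
static int
toy_add_matching_machines(const toy_machine_t *machines, size_t n,
                          int slots[MAX_MACHINES], int stop_after_first)
{
  int filled = 0;
  assert(machines != NULL || n == 0);

  for (int i = 0; i < MAX_MACHINES; i++)
    slots[i] = 0;

  /* Outer loop: one pass per machine index. */
  for (int i = 0; i < MAX_MACHINES; i++) {
    /* Inner loop: walk the list in reverse, looking for a match. */
    for (size_t j = n; j-- > 0; ) {
      if (machines[j].machine_index != i)
        continue;
      /* Assumption: negotiation always succeeds in this sketch. */
      slots[i] = 1;
      filled++;
      if (stop_after_first)
        return filled; /* current behavior: later indices never tried */
      break;           /* proposed behavior: move on to the next index */
    }
  }
  return filled;
}
```

With one machine registered for index 0 and one for index 1, the return variant fills only index 0, while the break variant fills both.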

See https://github.com/torproject/tor/pull/1168 for a PR.