I'd imagine that in a lot of situations the attackers would take the defenders by surprise in everything but a bridge or castle battle, and that in an open-field battle both sides would probably be about equally prepared. In a lot of these villages, it doesn't really seem like the townsfolk would get much warning of a night raid unless the attackers decided to go in beating drums and thudding their spears like orcs. A forest or a mountain would almost assuredly give a decisive advantage to the attackers, as those are typical ambush locations.
This could be explained away by the fact that the attackers have all the time they want to consolidate their forces on the Strategus map before launching the attack, while defenders only have a limited window unless they've anticipated it. However, I doubt any real thought was put into explaining the inexplicable omniscience all defenders seem to have about exactly when and where their enemies are going to attack, which is what lets them always be in formation, even when visibility is 10 feet due to heavy fog at night. Clearly someone played with Borcha at 10 spotting ability!
Edit: I'm in favor of 1 spawn per second, but the initial spawns should probably all happen at the same time, except in sieges (if simultaneous spawns really were an issue for sieges, as I've heard).