That could work in some cases, where nodes are scattered and scarce enough that a loss of service can be attributed to a single node.
When UMESH had just three nodes, that seemed an easy enough call. Then we later discovered the middle node of the first three wasn’t absolutely critical. In many cases, the two end nodes could see each other without resorting to the middle one. Saving a hop was a very good thing when there were only three.
The fact that the middle node was down was only discovered during other routine maintenance. The behavior of the system pretty well masked that something was amiss. Admittedly, if you knew the system well and operated regularly in the rather restricted zones where this would have been noticeable, it could have come up obviously enough to attribute to this single node. Or maybe not.
Now we have six nodes. UMESH 5 had been down for some time - in retrospect - when I went to update the firmware to 5.0. Upon review, the old coverage map showed a constriction of the mesh in the southwest quadrant. The mesh carried right over and past the node, but didn’t extend further than three blocks past it. That would have been a hard call, too, because building height and density grow sharply in that direction, so I attributed the degraded performance to those factors rather than DNS (Dead Node Syndrome).
OK, but what about when there’s a third line of nodes? It’s advantageous if the first and third lines can see each other at certain points, thus skipping a hop that can then be used further down the line. But once you get solid signals coming from multiple angles, which is what happens everywhere except around the edges of the mesh, it becomes more difficult than it already is to spot a dead node at all, and it will only get harder as the mesh thickens.
What to do? Maybe something that aggregates observations in an area and then forwards them in a consolidated manner to whoever might have an affected node.
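To make that idea a bit more concrete, here is a rough Python sketch of what such an aggregator might look like. Everything in it is hypothetical: the observation format, the function names, the area labels, the owner table, and the timeout are assumptions for illustration, not anything UMESH or its firmware actually provides. It also assumes an observation means a node was heard directly rather than relayed, since the whole problem is that the mesh quietly routes around a dead node.

```python
from collections import defaultdict


def last_heard(observations):
    """Return the newest timestamp at which each node was heard directly."""
    newest = {}
    for obs in observations:
        node, ts = obs["node"], obs["ts"]
        if node not in newest or ts > newest[node]:
            newest[node] = ts
    return newest


def suspect_nodes(observations, expected, now, max_age):
    """Group nodes by area if nobody has heard them within max_age seconds.

    expected maps a (hypothetical) area label to the nodes that should be
    audible there. A node that was never heard counts as silent.
    """
    newest = last_heard(observations)
    suspects = defaultdict(list)
    for area, nodes in expected.items():
        for node in nodes:
            if now - newest.get(node, float("-inf")) > max_age:
                suspects[area].append(node)
    return dict(suspects)


def consolidate(suspects, owners):
    """Build one combined notice per owner instead of one per observation."""
    notices = defaultdict(list)
    for area in sorted(suspects):
        for node in suspects[area]:
            owner = owners.get(node, "unknown")
            notices[owner].append(f"{node} silent in {area}")
    return dict(notices)


# Made-up example data: UMESH 5 was last heard long ago, UMESH 4 recently.
observations = [
    {"area": "southwest", "node": "UMESH 4", "ts": 900},
    {"area": "southwest", "node": "UMESH 5", "ts": 100},
]
expected = {"southwest": ["UMESH 4", "UMESH 5"]}
owners = {"UMESH 5": "owner-a"}  # placeholder contact, not a real person

suspects = suspect_nodes(observations, expected, now=1000, max_age=600)
notices = consolidate(suspects, owners)
```

The one design choice worth noting is that silence is judged per node across all reports, not per reporter, so a single user who simply can't see a node from where they stand doesn't trigger a notice by themselves.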