Please be patient. Rich and I are discussing the details.
A sneak peek... Basically what I want to do is convert all scores to a 0 to 100 scale. This is what both the Army and the Marines do for their respective tests, which are similar to what we did at camp. (But our test was more complete.) They each have a slightly different way of doing it.
It isn't as easy as you think. As I suspected, there isn't a linear relationship between the count for an event and the score they give someone on the 100 point scale. And when I look at their tables, it's obvious to me that someone came in and "eyeballed" the places where the slope changes in the conversion. It's not very elegant, and it's difficult to twiddle with later on.
What I am doing is setting up a spreadsheet where I can twiddle with the parameters and see what I want on the back end. This is particularly important for categories such as women's pull-ups, where everyone is clustered around either a hang or 1 or 2 pull-ups except for this wonderful genetic freak who did 12. If I made 12 a "100" on a linear scale, the rest of the women would be screwed. There are ways to do "caps" and "nonlinear" relationships. It just takes some twiddling.
Rich and I have already had conversations over what is reasonable for the "100 score" caps on these events. I also have an idea of what I want for median scores and such. It's just a matter now of twiddling with the formulas.
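For anyone who wants to see the shape of a capped, nonlinear conversion, here is a minimal sketch in Python. It is not the spreadsheet formula, just one possible way to do it: two straight-line segments that meet at the median, with everything at or above the cap scored as 100. The function name scale_score, the choice of 50 as the score for a median performance, and the pull-up numbers in the usage lines are all assumptions made up for illustration.

```python
def scale_score(count, median, cap, median_score=50.0):
    """Map a raw event count onto a 0 to 100 scale.

    Assumed shape (not taken from any official table):
      0 reps      -> 0
      median reps -> median_score
      cap reps    -> 100, and anything above the cap also gets 100
    with a straight line between each pair of points.
    """
    if not 0 < median < cap:
        raise ValueError("need 0 < median < cap")
    if count <= 0:
        return 0.0
    if count >= cap:
        return 100.0  # the cap keeps one outlier from stretching the scale
    if count <= median:
        return median_score * count / median
    # Second segment: from (median, median_score) up to (cap, 100)
    return median_score + (100.0 - median_score) * (count - median) / (cap - median)


# Hypothetical parameters for women's pull-ups: median of 1, cap of 5.
print(scale_score(1, median=1, cap=5))   # a median performance
print(scale_score(2, median=1, cap=5))   # a bit above the median
print(scale_score(12, median=1, cap=5))  # the outlier is capped at 100
```

With made-up parameters like these, the person who did 12 still gets a 100, but someone with 1 or 2 pull-ups lands in the middle of the scale instead of near the bottom. Changing median, cap, or median_score is the "twiddling" part.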
I told you it wasn't as easy as it looks, Fedele!

The way I plan on doing it is the way they do it in the Marines. Once I get the main scale down, we'll set different cut-offs for different age bands. Just as with the Marines, these cut-offs will separate the real men from the men from the boys, and the same for the women.
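As a rough illustration of the age-band idea, here is a small Python sketch that looks up a cut-off from a person's age and checks a 0 to 100 score against it. The band boundaries and cut-off values are placeholders invented for this example; they are not the Marines' numbers and not the ones being worked out for camp. The names AGE_BANDS, cutoff_for_age, and meets_cutoff are likewise hypothetical.

```python
from bisect import bisect_right

# (lowest age in the band, minimum score on the 0 to 100 scale).
# Placeholder values for illustration only.
AGE_BANDS = [(18, 60.0), (30, 55.0), (40, 50.0), (50, 45.0)]
BAND_STARTS = [start for start, _ in AGE_BANDS]


def cutoff_for_age(age):
    """Return the cut-off for the band that contains this age."""
    i = bisect_right(BAND_STARTS, age) - 1
    if i < 0:
        raise ValueError("age is below the youngest band")
    return AGE_BANDS[i][1]


def meets_cutoff(score, age):
    """True if a scaled score reaches the cut-off for that age."""
    return score >= cutoff_for_age(age)


print(meets_cutoff(52.0, age=45))  # compared against the 40+ band
print(meets_cutoff(52.0, age=25))  # compared against the 18+ band
```

Keeping the bands in one small table means the main scale stays the same for everyone and only the cut-offs change with age.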

Lots of progress already, and almost done.

- Bill
P.S. I already know of one change we will make next time. We will do max number of sit-ups in 2 minutes rather than the 1 minute we did at camp. Abdominals are more endurance muscles, and we need to push people into the glycolytic range of energy production to see what they are made of. That takes longer than the simple power events (phosphocreatine system) but less time than endurance events like the mile run (aerobic metabolism). The Marines use that same time limit for their test.
Translation - we will make it burn.

But this will not affect our ability to differentiate people for this first run of the test. It's just an improvement next time around.