So it appears this "Objective Personality" group only has two people ('operators') trained to do the interviews, which means the test is already loaded with observer bias. Their methods explanation doesn't make clear how many operators were used over time, but it appears to be just the two.
To reduce observer bias further, you would need a group of at least 30 observers, trained and untrained, for every interview subject, because you are also testing the test, not just the subjects. The questions should be identical across all interviews, and some should be designed to pick up contradictions: for example, if one question asks whether someone prefers to deal with abstract theory or factual data, another question should probe the same preference subliminally. It is pretty obvious when someone avoids abstract theorising. Or present two different problems to solve, one built around abstraction and one built around factual data, see which one they prefer to answer, and scatter several of these throughout the test, masked as standard questions. Even then, someone well versed in personality theory would pick up on it and could potentially skew the data.
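Just to make that concrete: here is a minimal sketch, entirely my own invention and not anything they do, of how such paired overt/masked items could be scored for contradictions. The item names, the -2..+2 coding and the cut-off are all hypothetical.

```python
# Hypothetical sketch: flag subjects whose answer to an overt "abstract vs.
# factual" item contradicts a masked item probing the same preference.
# Item names, the -2..+2 coding and the cut-off are invented for illustration.

ITEM_PAIRS = [
    ("q03_overt_abstract", "q17_masked_abstract"),
    ("q08_overt_abstract", "q24_masked_abstract"),
]

def contradiction_score(responses: dict) -> float:
    """Mean absolute gap across paired items (0 = perfectly consistent)."""
    gaps = [abs(responses[a] - responses[b]) for a, b in ITEM_PAIRS]
    return sum(gaps) / len(gaps)

subject = {"q03_overt_abstract": 2, "q17_masked_abstract": -2,
           "q08_overt_abstract": 1, "q24_masked_abstract": 1}

if contradiction_score(subject) > 1.5:   # arbitrary threshold
    print("Flag for review: overt and masked answers disagree")
```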
However, how do we know that a certain function is indeed reflective of a particular neurological process? Unless you can test this neurologically, you cannot differentiate the functions objectively - you are merely assuming a correlation, not demonstrating it.
Also, the whole idea of seeing emergent patterns in their results is equally questionable. How do we know, with the same two operators used the whole time, that there wasn't some bias that caused them to select functions based on a subject's looks, rather than the other way around? There's a small group of bearded guys and a small group of blonde, smiling ladies who all look very similar. Chances are very high that the already-trained operators hold more or less unconscious, preconceived ideas linking gender, looks and personality traits to certain stereotypes. Even in their voice-only tests, you would still pick up on cues that trigger involuntary associations with those stereotypes.
They call their method "double-blind", but I don't think they understand what double-blind means. All they did was put two operators in two different rooms. Their results became increasingly 'accurate', i.e. the two operators agreed with each other more often, which is exactly what you would expect when both were trained by the same people. All that is happening is that the confirmation bias is cemented further. A truly unbiased test should produce the same results regardless of who administers it, trained or untrained.
If thirty untrained operators could achieve a statistically significant result after only a brief introduction to the test and its interpretations, then you would have a test that at least seems reliable enough for anyone to use. But the underlying assumption, that the functions map onto real neurological processes, would still have to be established first.
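To pin down what "statistically significant" would even mean here, a minimal sketch (my own illustration, not their procedure): test whether an untrained operator's agreement with the trained operators' consensus beats pure chance, using a one-sided binomial test. The subject count, match count and number of types are all made up.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float) -> float:
    """One-sided P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_subjects = 50      # hypothetical number of typed subjects
n_matches = 12       # hypothetical matches with the consensus type
chance = 1 / 16      # assuming 16 possible types; adjust to the real taxonomy

p_value = binomial_tail(n_matches, n_subjects, chance)
print(f"P(>= {n_matches} matches by luck) = {p_value:.4g}")
# Small p-values across many untrained operators would suggest the test
# carries signal that doesn't depend on who administers it.
```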
A friend of mine tested this. He was becoming increasingly frustrated with the highly unreliable method of assessing abrasion on fossil bones, so he put thirty people, professionals and non-professionals alike, in a room and had them assess abrasion on the same 50 bones of varying abrasion intensity, using a method that has been blindly trusted for more than two decades.
His results showed significant variability within both the professional and the non-professional group. And when he tested the two groups against each other, there was as much variability and bias among the professionals as among the non-professionals.
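For what it's worth, this is exactly the kind of thing you can quantify. A rough sketch (not my friend's actual analysis, purely illustrative random data) of Fleiss' kappa computed separately for each group: values near 0 mean the raters agree no better than chance.

```python
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Fleiss' kappa. counts[i, j] = number of raters putting item i in category j."""
    n_items, _ = counts.shape
    n_raters = counts[0].sum()
    p_j = counts.sum(axis=0) / (n_items * n_raters)      # category prevalence
    P_i = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_bar, P_e = P_i.mean(), np.sum(p_j**2)
    return (P_bar - P_e) / (1 - P_e)

rng = np.random.default_rng(0)
n_bones, n_raters, n_grades = 50, 15, 4        # hypothetical: 4 abrasion grades

def simulate_group() -> np.ndarray:
    """Every rater grades every bone at random; tally grades per bone."""
    grades = rng.integers(0, n_grades, size=(n_bones, n_raters))
    return np.stack([np.bincount(row, minlength=n_grades) for row in grades])

for name in ("professionals", "non-professionals"):
    print(name, round(fleiss_kappa(simulate_group()), 3))
# Random grading lands near kappa = 0; a trustworthy method should push both
# groups well above that, and the two groups should not differ much.
```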
His point: trusted scientific tools (of a qualitative nature) are not necessarily trustworthy due to human bias. A more reliable method would be quantitative.
My comment: even if you develop a quantitative tool that supposedly eliminates this bias - what is the true nature of the objective reality that you are measuring against? Every serious scientist should be aware of this problem.
...anyway,
A double-blind study would ideally include both operators who are familiar with the test's hypotheses and objectives and operators who are not. The same goes for the test subjects. You would then run statistical tests comparing the two groups against each other, as well as within each group, and then repeat the exact same study in different social and cultural settings.
In other words, you would have several tests: thirty trained observers test n subjects who are familiar with the test's aims; thirty untrained observers test the same subjects. Then reverse that. Finally, mix everyone up and re-run the test on the same subjects as well as on a completely new group of subjects.
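And to compare the trained and untrained groups in that design, the simplest honest test I can think of is a permutation test on inter-observer agreement: if training adds nothing, the group labels should be interchangeable. Everything below (group sizes, number of types, the random ratings) is invented purely to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_per_group, n_types = 40, 30, 16     # all hypothetical

# ratings[o, s] = type assigned to subject s by observer o
trained   = rng.integers(0, n_types, size=(n_per_group, n_subjects))
untrained = rng.integers(0, n_types, size=(n_per_group, n_subjects))

def mean_pairwise_agreement(ratings: np.ndarray) -> float:
    """Average fraction of observer pairs that assign a subject the same type."""
    n_obs = ratings.shape[0]
    total = sum((ratings[i] == ratings[j]).mean()
                for i in range(n_obs) for j in range(i + 1, n_obs))
    return total / (n_obs * (n_obs - 1) / 2)

observed = mean_pairwise_agreement(trained) - mean_pairwise_agreement(untrained)

# Shuffle observers between groups and see how often a difference this big
# shows up by chance alone.
pooled = np.vstack([trained, untrained])
diffs = []
for _ in range(1000):
    perm = rng.permutation(len(pooled))
    a, b = pooled[perm[:n_per_group]], pooled[perm[n_per_group:]]
    diffs.append(mean_pairwise_agreement(a) - mean_pairwise_agreement(b))

p_value = np.mean(np.abs(diffs) >= abs(observed))
print(f"trained - untrained agreement = {observed:.3f}, permutation p = {p_value:.3f}")
```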
On that last point, they have also not clarified how their "random" subjects were chosen. What is their definition of random? How do we know how many of the subjects had a priori knowledge of the test's objectives and, more importantly, whether the subjects were already familiar with personality theory? For those test subjects who were, their answers would naturally be biased towards a personally favourable outcome, so the accuracy in those cases would be questionable.
Even at that point, the test could still be biased due to the questions themselves. Are they leading questions? Are the questions accurately reflective of the emerging traits? How were these traits defined, and by whom?
Finally, what gets the charlatan warning bells ringing loudly enough to break the sound barrier: single 'coaching' session: $89/hour; monthly: $249/week; weekly class: $19/month.
...
I am not discounting this method completely. In fact, I think it is a welcome improvement. But their claim to empirical science is questionable, for the reasons outlined above. It really bothers me that they are already profiteering from a method that is not scientifically recognised.
Ugh...tl;dr....I've obviously lost my faith in typology....