We compared the predictive validity of two types of Frame-of-Reference personality measures with each other and with a baseline generic measure. Each version of the measures used a unique response format, referred to as frequency-based estimation, that allowed the behavioral consistency of responses to be gauged. Generic personality scales, scales tagged with “at school”, and completely modified scales were compared in their prediction of academic performance, counterproductive academic behavior, and participant reactions. Results showed that completely contextualized measures had the highest predictive validity and that, contrary to our expectations, behavioral consistency did not moderate the relationships. Face validity and, to a lesser extent, perceived predictive validity improved with increasing contextualization. We discuss the implications of our results for personality assessment in applied settings.