Personality Assessment

University of Notre Dame

This module provides a basic overview of the assessment of personality. It discusses objective personality tests (based on both self-report and informant ratings), projective and implicit tests, and behavioral/performance measures. It describes the basic features of each method and reviews the strengths, weaknesses, and overall validity of each approach.

Learning Objectives

Appreciate the diversity of methods that are used to measure personality characteristics.
Understand the logic, strengths, and weaknesses of each approach.
Gain a better sense of the overall validity and range of applications of personality tests.

Introduction

Personality is the field within psychology that studies the thoughts, feelings, behaviors, goals, and interests of normal individuals. It therefore covers a very wide range of important psychological characteristics. Moreover, different theoretical models have generated very different strategies for measuring these characteristics. For example, humanistically oriented models argue that people have clear, well-defined goals and are actively striving to achieve them (McGregor, McAdams, & Little, 2006). It, therefore, makes sense to ask them directly about themselves and their goals. In contrast, psychodynamically oriented theories propose that people lack insight into their feelings and motives, such that their behavior is influenced by processes that operate outside of their awareness (e.g., McClelland, Koestner, & Weinberger, 1989; Meyer & Kurtz, 2006). Given that people are unaware of these processes, it does not make sense to ask directly about them. One, therefore, needs to adopt an entirely different approach to identify these nonconscious factors. Not surprisingly, researchers have adopted a wide range of approaches to measure important personality characteristics. The most widely used strategies will be summarized in the following sections.

A pencil sketch self-portrait of a young man. — Do people possess the necessary awareness to see themselves as they are and provide accurate insights into their own personalities? [Image: fotEK10, https://goo.gl/GCBDJL, CC BY-NC-SA 2.0, https://goo.gl/Toc0ZF]

Objective Tests

Definition

Objective tests (Loevinger, 1957; Meyer & Kurtz, 2006) represent the most familiar and widely used approach to assessing personality. Objective tests involve administering a standard set of items, each of which is answered using a limited set of response options (e.g., true or false; strongly disagree, slightly disagree, slightly agree, strongly agree). Responses to these items then are scored in a standardized, predetermined way. For example, self-ratings on items assessing talkativeness, assertiveness, sociability, adventurousness, and energy can be summed up to create an overall score on the personality trait of extraversion.
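The scoring step described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the 1-4 numeric coding of the response options, the item labels, and the function name score_scale are hypothetical and are not taken from any published instrument.

```python
# Hypothetical numeric key for the four response options (an assumption,
# not the scoring key of any published test).
LIKERT = {
    "strongly disagree": 1,
    "slightly disagree": 2,
    "slightly agree": 3,
    "strongly agree": 4,
}

# Hypothetical item labels for the extraversion example in the text.
EXTRAVERSION_ITEMS = ["talkative", "assertive", "sociable", "adventurous", "energetic"]


def score_scale(responses, items, key=LIKERT):
    """Sum the numeric values of a respondent's answers to the listed items."""
    missing = [item for item in items if item not in responses]
    if missing:
        raise ValueError(f"missing responses for: {missing}")
    return sum(key[responses[item]] for item in items)


# One respondent's self-ratings (made-up data).
responses = {
    "talkative": "strongly agree",
    "assertive": "slightly agree",
    "sociable": "strongly agree",
    "adventurous": "slightly disagree",
    "energetic": "slightly agree",
}

extraversion = score_scale(responses, EXTRAVERSION_ITEMS)
print(extraversion)  # 4 + 3 + 4 + 2 + 3 = 16
```

Because the key is fixed in advance, any two scorers applying it to the same responses arrive at the same total; this is the sense in which the scoring is standardized and predetermined.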

It must be emphasized that the term “objective” refers to the method that is used to score a person’s responses, rather than to the responses themselves. As noted by Meyer and Kurtz (2006, p. 233), “What is objective about such a procedure is that the psychologist administering the test does not need to rely on judgment to classify or interpret the test-taker’s response; the intended response is clearly indicated and scored according to a pre-existing key.” In fact, as we will see, a person’s test responses may be highly subjective and can be influenced by a number of different rating biases.

Basic Types of Objective Tests

Self-report measures

Objective personality tests can be further subdivided into two basic types. The first type—which easily is the most widely used in modern personality research—asks people to describe themselves. This approach offers two key advantages. First, self-raters have access to an unparalleled wealth of information: After all, who knows more about you than you yourself? In particular, self-raters have direct access to their own thoughts, feelings, and motives, which may not be readily available to others (Oh, Wang, & Mount, 2011; Watson, Hubbard, & Weise, 2000). Second, asking people to describe themselves is the simplest, easiest, and most cost-effective approach to assessing personality. Countless studies, for instance, have involved administering self-report measures to college students, who are provided some relatively simple incentive (e.g., extra course credit) to participate.

The items included in self-report measures may consist of single words (e.g., assertive), short phrases (e.g., am full of energy), or complete sentences (e.g., I like to spend time with others). Table 1 presents a sample self-report measure assessing the general traits comprising the influential five-factor model (FFM) of personality: neuroticism, extraversion, openness, agreeableness, and conscientiousness (John & Srivastava, 1999; McCrae, Costa, & Martin, 2005). The sentences shown in Table 1 are modified versions of items included in the International Personality Item Pool (IPIP) (Goldberg et al., 2006), which is a rich source of personality-related content in the public domain (for more information about IPIP, go to: http://ipip.ori.org/).

A sample survey measuring the Big 5 personality traits. The survey uses a 1-5 scale for agreement with 15 items. Each of the Big 5 is measured by three items. For example, one of the neuroticism items reads, "I get upset easily." — Table 1: Sample Self-Report Personality Measure

Self-report personality tests show impressive validity in relation to a wide range of important outcomes. For example, self-ratings of conscientiousness are significant predictors of both overall academic performance (e.g., cumulative grade point average; Poropat, 2009) and job performance (Oh, Wang, & Mount, 2011). Roberts, Kuncel, Shiner, Caspi, and Goldberg (2007) reported that self-rated personality predicted occupational attainment, divorce, and mortality. Similarly, Friedman, Kern, and Reynolds (2010) showed that personality ratings collected early in life were related to happiness/well-being, physical health, and mortality risk assessed several decades later. Finally, self-reported personality has important and pervasive links to psychopathology. Most notably, self-ratings of neuroticism are associated with a wide array of clinical syndromes, including anxiety disorders, depressive disorders, substance use disorders, somatoform disorders, eating disorders, personality and conduct disorders, and schizophrenia/schizotypy (Kotov, Gamez, Schmidt, & Watson, 2010; Mineka, Watson, & Clark, 1998).

At the same time, however, it is clear that this method is limited in a number of ways. First, raters may be motivated to present themselves in an overly favorable, socially desirable way (Paunonen & LeBel, 2012). This is a particular concern in “high-stakes testing,” that is, situations in which test scores are used to make important decisions about individuals (e.g., when applying for a job). Second, personality ratings reflect a self-enhancement bias (Vazire & Carlson, 2011); in other words, people are motivated to ignore (or at least downplay) some of their less desirable characteristics and to focus instead on their more positive attributes. Third, self-ratings are subject to the reference group effect (Heine, Buchtel, & Norenzayan, 2008); that is, we base our self-perceptions, in part, on how we compare to others in our sociocultural reference group. For instance, if you tend to work harder than most of your friends, you will see yourself as someone who is relatively conscientious, even if you are not particularly conscientious in any absolute sense.

Informant ratings

Another approach is to ask someone who knows a person well to describe his or her personality characteristics. In the case of children or adolescents, the informant is most likely to be a parent or teacher. In studies of older participants, informants may be friends, roommates, dating partners, spouses, children, or bosses (Oh et al., 2011; Vazire & Carlson, 2011; Watson et al., 2000).

Generally speaking, informant ratings are similar in format to self-ratings. As was the case with self-report, items may consist of single words, short phrases, or complete sentences. Indeed, many popular instruments include parallel self- and informant-rating versions, and it often is relatively easy to convert a self-report measure so that it can be used to obtain informant ratings. Table 2 illustrates how the self-report instrument shown in Table 1 can be converted to obtain spouse-ratings (in this case, having a husband describe the personality characteristics of his wife).

This survey is a variation of the earlier 15 item survey of the Big 5 personality traits. In this version, however, the ratings are not for the person filling out the survey. Instead, the person is rating his or her wife on the various items. This is an example of a spouse-rating form, also called an informant rating. — Table 2: Sample Spouse-Report Personality Measure

Informant ratings are particularly valuable when self-ratings are impossible to collect (e.g., when studying young children or cognitively impaired adults) or when their validity is suspect (e.g., as noted earlier, people may not be entirely honest in high-stakes testing situations). They also may be combined with self-ratings of the same characteristics to produce more reliable and valid measures of these attributes (McCrae, 1994).
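One simple way to combine the two sources is to average the self-rating and the informant rating for each trait. The sketch below assumes that both ratings are scored on the same scale; the unweighted average, the function name composite_scores, and the data are illustrative assumptions rather than a procedure prescribed by McCrae (1994).

```python
from statistics import mean


def composite_scores(self_ratings, informant_ratings):
    """Average self- and informant ratings for every trait rated by both sources.

    Assumes both sets of ratings use the same scale (an assumption of this sketch).
    """
    shared_traits = self_ratings.keys() & informant_ratings.keys()
    return {
        trait: mean([self_ratings[trait], informant_ratings[trait]])
        for trait in sorted(shared_traits)
    }


# Made-up scale scores for one person, rated by herself and by her spouse.
self_ratings = {"extraversion": 12, "conscientiousness": 14}
spouse_ratings = {"extraversion": 10, "conscientiousness": 11}

combined = composite_scores(self_ratings, spouse_ratings)
print(combined)
```

Averaging is only one option. If the two sources used different response formats, the scores would first need to be placed on a common metric.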

Informant ratings offer several advantages in comparison to other approaches to assessing personality. A well-acquainted informant presumably has had the opportunity to observe large samples of behavior in the person he or she is rating. Moreover, these judgments presumably are not subject to the types of defensiveness that potentially can distort self-ratings (Vazire & Carlson, 2011). Indeed, informants typically have strong incentives for being accurate in their judgments. As Funder and Dobroth (1987, p. 409) put it, “Evaluations of the people in our social environment are central to our decisions about who to befriend and avoid, trust and distrust, hire and fire, and so on.”

Informant personality ratings have demonstrated a level of validity in relation to important life outcomes that is comparable to that discussed earlier for self-ratings. Indeed, they outperform self-ratings in certain circumstances, particularly when the assessed traits are highly evaluative in nature (e.g., intelligence, charm, creativity; see Vazire & Carlson, 2011). For example, Oh et al. (2011) found that informant ratings were more strongly related to job performance than were self-ratings. Similarly, Oltmanns and Turkheimer (2009) summarized evidence indicating that informant ratings of Air Force cadets predicted early, involuntary discharge from the military better than self-ratings.

Nevertheless, informant ratings also are subject to certain problems and limitations. One general issue is the level of relevant information that is available to the rater (Funder, 2012). For instance, even under the best of circumstances, informants lack full access to the thoughts, feelings, and motives of the person they are rating. This problem is magnified when the informant does not know the person particularly well and/or only sees him or her in a limited range of situations (Funder, 2012; Beer & Watson, 2010).

A bride and groom happily posing for the camera on their wedding day. — Informant personality ratings are generally a reliable and valid assessment instrument; in certain cases, however, the informant may have significant biases that make the ratings less reliable. Newly married individuals, for example, are likely to rate their partners in an unrealistically positive way. [Image: Sociales El Heraldo de Saltillo, https://goo.gl/3g3Qhh, CC BY-NC-SA 2.0, https://goo.gl/Toc0ZF]

Informant ratings also are subject to some of the same response biases noted earlier for self-ratings. For instance, they are not immune to the reference group effect. Indeed, it is well-established that parent ratings often are subject to a sibling contrast effect, such that parents exaggerate the true magnitude of differences between their children (Pinto, Rijsdijk, Frazier-Wood, Asherson, & Kuntsi, 2012). Furthermore, in many studies, individuals are allowed to nominate (or even recruit) the informants who will rate them. Because of this, it most often is the case that informants (who, as noted earlier, may be friends, relatives, or romantic partners) like the people they are rating. This, in turn, means that informants may produce overly favorable personality ratings. Indeed, their ratings actually can be more favorable than the corresponding self-ratings (Watson & Humrichouse, 2006). This tendency for informants to produce unrealistically positive ratings has been termed the letter of recommendation effect (Leising, Erbs, & Fritz, 2010) and the honeymoon effect when applied to newlyweds (Watson & Humrichouse, 2006).

Other Ways of Classifying Objective Tests

Comprehensiveness

In addition to the source of the scores, there are at least two other important dimensions on which personality tests differ. The first such dimension concerns the extent to which an instrument seeks to assess personality in a reasonably comprehensive manner. At one extreme, many widely used measures are designed to assess a single core attribute. Examples of these types of measures include the Toronto Alexithymia Scale (Bagby, Parker, & Taylor, 1994), the Rosenberg Self-Esteem Scale (Rosenberg, 1965), and the Multidimensional Experiential Avoidance Questionnaire (Gamez, Chmielewski, Kotov, Ruggero, & Watson, 2011). At the other extreme, a number of omnibus inventories contain a large number of specific scales and purport to measure personality in a reasonably comprehensive manner. These instruments include the California Psychological Inventory (Gough, 1987), the Revised HEXACO Personality Inventory (HEXACO-PI-R) (Lee & Ashton, 2006), the Multidimensional Personality Questionnaire (Patrick, Curtin, & Tellegen, 2002), the NEO Personality Inventory-3 (NEO-PI-3) (McCrae et al., 2005), the Personality Research Form (Jackson, 1984), and the Sixteen Personality Factor Questionnaire (Cattell, Eber, & Tatsuoka, 1980).

Breadth of the target characteristics

Second, personality characteristics can be classified at different levels of breadth or generality. For example, many models emphasize broad, “big” traits such as neuroticism and extraversion. These general dimensions can be divided up into several distinct yet empirically correlated component traits. For example, the broad dimension of extraversion contains such specific component traits as dominance (extraverts are assertive, persuasive, and exhibitionistic), sociability (extraverts seek out and enjoy the company of others), positive emotionality (extraverts are active, energetic, cheerful, and enthusiastic), and adventurousness (extraverts enjoy intense, exciting experiences).

Some popular personality instruments are designed to assess only the broad, general traits. For example, similar to the sample instrument displayed in Table 1, the Big Five Inventory (John & Srivastava, 1999) contains brief scales assessing the broad traits of neuroticism, extraversion, openness, agreeableness, and conscientiousness. In contrast, many instruments—including several of the omnibus inventories mentioned earlier—were designed primarily to assess a large number of more specific characteristics. Finally, some inventories—including the HEXACO-PI-R and the NEO-PI-3—were explicitly designed to provide coverage of both general and specific trait characteristics. For instance, the NEO-PI-3 contains six specific facet scales (e.g., Gregariousness, Assertiveness, Positive Emotions, Excitement Seeking) that then can be combined to assess the broad trait of extraversion.

Projective and Implicit Tests

Projective Tests

An example of a Rorschach inkblot — Projective tests, such as the famous Rorschach inkblot test, require a person to give spontaneous answers that "project" their unique personality onto an ambiguous stimulus. [Image: CC0 Public Domain, https://goo.gl/m25gce]

As noted earlier, some approaches to personality assessment are based on the belief that important thoughts, feelings, and motives operate outside of conscious awareness. Projective tests represent influential early examples of this approach. Projective tests originally were based on the projective hypothesis (Frank, 1939; Lilienfeld, Wood, & Garb, 2000): If a person is asked to describe or interpret ambiguous stimuli (that is, things that can be understood in a number of different ways), their responses will be influenced by nonconscious needs, feelings, and experiences. Note, however, that the theoretical rationale underlying these measures has evolved over time (see, for example, Spangler, 1992). Two prominent examples of projective tests are the Rorschach Inkblot Test (Rorschach, 1921) and the Thematic Apperception Test (TAT) (Morgan & Murray, 1935). The former asks respondents to interpret symmetrical blots of ink, whereas the latter asks them to generate stories about a series of pictures.

For instance, one TAT picture depicts an elderly woman with her back turned to a young man; the latter looks downward with a somewhat perplexed expression. Another picture displays a man clutched from behind by three mysterious hands. What stories could you generate in response to these pictures?

In comparison to objective tests, projective tests tend to be somewhat cumbersome and labor intensive to administer. The biggest challenge, however, has been to develop a reliable and valid scheme to score the extensive set of responses generated by each respondent. The most widely used Rorschach scoring scheme is the Comprehensive System developed by Exner (2003). The most influential TAT scoring system, which can be used to assess motives such as the need for achievement, was developed by McClelland, Atkinson, and colleagues between 1947 and 1953 (McClelland et al., 1989; see also Winter, 1998).

The validity of the Rorschach has been a matter of considerable controversy (Lilienfeld et al., 2000; Mihura, Meyer, Dumitrascu, & Bombel, 2012; Society for Personality Assessment, 2005). Most reviews acknowledge that Rorschach scores do show some ability to predict important outcomes. Its critics, however, argue that it fails to provide important incremental information beyond other, more easily acquired information, such as that obtained from standard self-report measures (Lilienfeld et al., 2000).

Validity evidence is more impressive for the TAT. In particular, reviews have concluded that TAT-based measures of the need for achievement (a) show significant validity in predicting important criteria and (b) provide important information beyond that obtained from objective measures of this motive (McClelland et al., 1989; Spangler, 1992). Furthermore, given the relatively weak associations between objective and projective measures of motives, McClelland et al. (1989) argue that they tap somewhat different processes, with the latter assessing implicit motives (Schultheiss, 2008).

Implicit Tests

In recent years, researchers have begun to use implicit measures of personality (Back, Schmukle, & Egloff, 2009; Vazire & Carlson, 2011). These tests are based on the assumption that people form automatic or implicit associations between certain concepts based on their previous experience and behavior. If two concepts (e.g., me and assertive) are strongly associated with each other, then they should be sorted together more quickly and easily than two concepts (e.g., me and shy) that are less strongly associated. Although validity evidence for these measures still is relatively sparse, the results to date are encouraging: Back et al. (2009), for example, showed that implicit measures of the FFM personality traits predicted behavior even after controlling for scores on objective measures of these same characteristics.
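The sorting logic behind implicit measures can be illustrated with a simple latency comparison. The sketch below is a deliberately simplified index (a difference in mean response times) and not the scoring algorithm of any published implicit test; the function name association_index and the response times are hypothetical.

```python
from statistics import mean


def association_index(rt_pairing_a_ms, rt_pairing_b_ms):
    """Mean latency for pairing B minus mean latency for pairing A, in milliseconds.

    A positive value means pairing A was sorted faster, which under the
    assumption described in the text suggests a stronger association.
    """
    return mean(rt_pairing_b_ms) - mean(rt_pairing_a_ms)


# Made-up response times (ms) for trials in which "me" shared a response
# key with "assertive" versus trials in which it shared a key with "shy".
me_assertive = [620, 580, 640, 600]
me_shy = [760, 720, 800, 740]

index = association_index(me_assertive, me_shy)
print(index)  # 755 - 610 = 145 ms faster for the me/assertive pairing
```

Operational implicit tests typically add steps this sketch omits, such as handling error trials and standardizing the latency difference.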

Behavioral and Performance Measures

Two college students sit on bunk beds in a very clean and orderly dorm room. — Observing real world behavior is one way to assess personality. Tendencies such as messiness and neatness are clues to personality. [Image: Crumley Roberts, https://goo.gl/6Ahn8q, CC BY 2.0, https://goo.gl/BRvSA7]

A final approach is to infer important personality characteristics from direct samples of behavior. For example, Funder and Colvin (1988) brought opposite-sex pairs of participants into the laboratory and had them engage in a five-minute “getting acquainted” conversation; raters watched videotapes of these interactions and then scored the participants on various personality characteristics. Mehl, Gosling, and Pennebaker (2006) used the electronically activated recorder (EAR) to obtain samples of ambient sounds in participants’ natural environments over a period of two days; EAR-based scores then were related to self- and observer-rated measures of personality. For instance, more frequent talking over this two-day period was significantly related to both self- and observer-ratings of extraversion. As a final example, Gosling, Ko, Mannarelli, and Morris (2002) sent observers into college students’ bedrooms and then had them rate the students’ personality characteristics on the Big Five traits. The averaged observer ratings correlated significantly with participants’ self-ratings on all five traits. Follow-up analyses indicated that conscientious students had neater rooms, whereas those who were high in openness to experience had a wider variety of books and magazines.

Behavioral measures offer several advantages over other approaches to assessing personality. First, because behavior is sampled directly, this approach is not subject to the types of response biases (e.g., self-enhancement bias, reference group effect) that can distort scores on objective tests. Second, as is illustrated by the Mehl et al. (2006) and Gosling et al. (2002) studies, this approach allows people to be studied in their daily lives and in their natural environments, thereby avoiding the artificiality of other methods (Mehl et al., 2006). Finally, this is the only approach that actually assesses what people do, as opposed to what they think or feel (see Baumeister, Vohs, & Funder, 2007).

At the same time, however, this approach also has some disadvantages. This assessment strategy clearly is much more cumbersome and labor intensive than using objective tests, particularly self-report. Moreover, similar to projective tests, behavioral measures generate a rich set of data that then need to be scored in a reliable and valid way. Finally, even the most ambitious study only obtains relatively small samples of behavior that may provide a somewhat distorted view of a person’s true characteristics. For example, your behavior during a “getting acquainted” conversation on a single given day inevitably will reflect a number of transient influences (e.g., level of stress, quality of sleep the previous night) that are idiosyncratic to that day.

Conclusion

No single method of assessing personality is perfect or infallible; each of the major methods has both strengths and limitations. By using a diversity of approaches, researchers can overcome the limitations of any single method and develop a more complete and integrative view of personality.

Discussion Questions

Under what conditions would you expect self-ratings to be most similar to informant ratings? When would you expect these two sets of ratings to be most different from each other?
The findings of Gosling et al. (2002) demonstrate that we can obtain important clues about students’ personalities from their dorm rooms. What other aspects of people’s lives might give us important information about their personalities?
Suppose that you were planning to conduct a study examining the personality trait of honesty. What method or methods might you use to measure it?

Vocabulary

Big Five: Five broad, general traits that are included in many prominent models of personality. The five traits are neuroticism (those high on this trait are prone to feeling sad, worried, anxious, and dissatisfied with themselves), extraversion (high scorers are friendly, assertive, outgoing, cheerful, and energetic), openness to experience (those high on this trait are tolerant, intellectually curious, imaginative, and artistic), agreeableness (high scorers are polite, considerate, cooperative, honest, and trusting), and conscientiousness (those high on this trait are responsible, cautious, organized, disciplined, and achievement-oriented).
High-stakes testing: Settings in which test scores are used to make important decisions about individuals. For example, test scores may be used to determine which individuals are admitted into a college or graduate school, or who should be hired for a job. Tests also are used in forensic settings to help determine whether a person is competent to stand trial or fits the legal definition of sanity.
Honeymoon effect: The tendency for newly married individuals to rate their spouses in an unrealistically positive manner. This represents a specific manifestation of the letter of recommendation effect when applied to ratings made by current romantic partners. Moreover, it illustrates the very important role played by relationship satisfaction in ratings made by romantic partners: As marital satisfaction declines (i.e., when the “honeymoon is over”), this effect disappears.
Implicit motives: These are goals that are important to a person, but that he/she cannot consciously express. Because the individual cannot verbalize these goals directly, they cannot be easily assessed via self-report. However, they can be measured using projective devices such as the Thematic Apperception Test (TAT).
Letter of recommendation effect: The general tendency for informants in personality studies to rate others in an unrealistically positive manner. This tendency is due to a pervasive bias in personality assessment: In the large majority of published studies, informants are individuals who like the person they are rating (e.g., they often are friends or family members) and, therefore, are motivated to depict them in a socially desirable way. The term reflects a similar tendency for academic letters of recommendation to be overly positive and to present the referent in an unrealistically desirable manner.
Projective hypothesis: The theory that when people are confronted with ambiguous stimuli (that is, stimuli that can be interpreted in more than one way), their responses will be influenced by their unconscious thoughts, needs, wishes, and impulses. This, in turn, is based on the Freudian notion of projection, which is the idea that people attribute their own undesirable/unacceptable characteristics to other people or objects.
Reference group effect: The tendency of people to base their self-concept on comparisons with others. For example, if your friends tend to be very smart and successful, you may come to see yourself as less intelligent and successful than you actually are. Informants also are prone to these types of effects. For instance, the sibling contrast effect refers to the tendency of parents to exaggerate the true extent of differences between their children.
Reliability: The consistency of test scores across repeated assessments. For example, test-retest reliability examines the extent to which scores change over time.
Self-enhancement bias: The tendency for people to see and/or present themselves in an overly favorable way. This tendency can take two basic forms: defensiveness (when individuals actually believe they are better than they really are) and impression management (when people intentionally distort their responses to try to convince others that they are better than they really are). Informants also can show enhancement biases. The general form of this bias has been called the letter-of-recommendation effect, which is the tendency of informants who like the person they are rating (e.g., friends, relatives, romantic partners) to describe them in an overly favorable way. In the case of newlyweds, this tendency has been termed the honeymoon effect.
Sibling contrast effect: The tendency of parents to use their perceptions of all of their children as a frame of reference for rating the characteristics of each of them. For example, suppose that a mother has three children; two of these children are very sociable and outgoing, whereas the third is relatively average in sociability. Because of the operation of this effect, the mother will rate this third child as less sociable and outgoing than he/she actually is. More generally, this effect causes parents to exaggerate the true extent of differences between their children. This effect represents a specific manifestation of the more general reference group effect when applied to ratings made by parents.
Validity: Evidence related to the interpretation and use of test scores. A particularly important type of evidence is criterion validity, which involves the ability of a test to predict theoretically relevant outcomes. For example, a presumed measure of conscientiousness should be related to academic achievement (such as overall grade point average).
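
The reliability and validity entries above both come down to asking how strongly two sets of scores are related, and a correlation coefficient is one common way to quantify this. The following sketch computes a Pearson correlation by hand; the helper name pearson_r and all of the scores are made up for illustration, and the Pearson coefficient is an assumed choice of index rather than one specified in this module.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return cross / sqrt(ss_x * ss_y)


# Made-up conscientiousness scores for five people tested on two occasions,
# plus their grade point averages (hypothetical data).
time1 = [10, 14, 8, 16, 12]
time2 = [11, 13, 9, 15, 12]
gpa = [2.8, 3.4, 2.5, 3.6, 3.1]

retest_r = pearson_r(time1, time2)   # test-retest reliability
validity_r = pearson_r(time1, gpa)   # criterion validity against GPA
print(round(retest_r, 2), round(validity_r, 2))
```

Values near 1 indicate that people keep roughly the same rank order across the two sets of scores; real data rarely show correlations as high as these fabricated numbers do.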

References

Back, M. D., Schmukle, S. C., & Egloff, B. (2009). Predicting actual behavior from the explicit and implicit self-concept of personality. Journal of Personality and Social Psychology, 97, 533–548.
Bagby, R. M., Parker, J. D. A., & Taylor, G. J. (1994). The Twenty-Item Toronto Alexithymia Scale: I. Item selection and cross-validation of the factor structure. Journal of Psychosomatic Research, 38, 23–32.
Baumeister, R. F., Vohs, K. D., & Funder, D. C. (2007). Psychology as the science of self-reports and finger movements: Whatever happened to actual behavior? Perspectives on Psychological Science, 2, 396–403.
Beer, A., & Watson, D. (2010). The effects of information and exposure on self-other agreement. Journal of Research in Personality, 44, 38–45.
Cattell, R. B., Eber, H. W., & Tatsuoka, M. M. (1980). Handbook for the Sixteen Personality Factor Questionnaire (16PF). Champaign, IL: Institute for Personality and Ability Testing.
Exner, J. E. (2003). The Rorschach: A comprehensive system (4th ed.). New York, NY: Wiley.
Frank, L. K. (1939). Projective methods for the study of personality. Journal of Psychology: Interdisciplinary and Applied, 8, 389–413.
Friedman, H. S., Kern, M. L., & Reynolds, C. A. (2010). Personality and health, subjective well-being, and longevity. Journal of Personality, 78, 179–215.
Funder, D. C. (2012). Accurate personality judgment. Current Directions in Psychological Science, 21, 177–182.
Funder, D. C., & Colvin, C. R. (1988). Friends and strangers: Acquaintanceship, agreement, and the accuracy of personality judgment. Journal of Personality and Social Psychology, 55, 149–158.
Funder, D. C., & Dobroth, K. M. (1987). Differences between traits: Properties associated with interjudge agreement. Journal of Personality and Social Psychology, 52, 409–418.
Gamez, W., Chmielewski, M., Kotov, R., Ruggero, C., & Watson, D. (2011). Development of a measure of experiential avoidance: The Multidimensional Experiential Avoidance Questionnaire. Psychological Assessment, 23, 692–713.
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The International Personality Item Pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96.
Gosling, S. D., Ko, S. J., Mannarelli, T., & Morris, M. E. (2002). A room with a cue: Personality judgments based on offices and bedrooms. Journal of Personality and Social Psychology, 82, 379–388.
Gough, H. G. (1987). California Psychological Inventory [Administrator’s guide]. Palo Alto, CA: Consulting Psychologists Press.
Heine, S. J., Buchtel, E. E., & Norenzayan, A. (2008). What do cross-national comparisons of personality traits tell us? The case of conscientiousness. Psychological Science, 19, 309–313.
Jackson, D. N. (1984). Personality Research Form manual (3rd ed.). Port Huron, MI: Research Psychologists Press.
John, O. P., & Srivastava, S. (1999). The big five trait taxonomy: History, measurement, and theoretical perspectives. In L. A. Pervin & O. P. John (Eds.), Handbook of personality: Theory and research (2nd ed., pp. 102–138). New York, NY: The Guilford Press.
Kotov, R., Gamez, W., Schmidt, F., & Watson, D. (2010). Linking “big” personality traits to anxiety, depressive, and substance use disorders: A meta-analysis. Psychological Bulletin, 136, 768–821.
Lee, K., & Ashton, M. C. (2006). Further assessment of the HEXACO Personality Inventory: Two new facet scales and an observer report form. Psychological Assessment, 18, 182–191.
Leising, D., Erbs, J., & Fritz, U. (2010). The letter of recommendation effect in informant ratings of personality. Journal of Personality and Social Psychology, 98, 668–682.
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27–66.
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635–694.
McClelland, D. C., Koestner, R., & Weinberger, J. (1989). How do self-attributed and implicit motives differ? Psychological Review, 96, 690–702.
McCrae, R. R. (1994). The counterpoint of personality assessment: Self-reports and observer ratings. Assessment, 1, 159–172.
McCrae, R. R., Costa, P. T., Jr., & Martin, T. A. (2005). The NEO-PI-3: A more readable Revised NEO Personality Inventory. Journal of Personality Assessment, 84, 261–270.
McGregor, I., McAdams, D. P., & Little, B. R. (2006). Personal projects, life stories, and happiness: On being true to traits. Journal of Research in Personality, 40, 551–572.
Mehl, M. R., Gosling, S. D., & Pennebaker, J. W. (2006). Personality in its natural habitat: Manifestations and implicit folk theories of personality in daily life. Journal of Personality and Social Psychology, 90, 862–877.
Meyer, G. J., & Kurtz, J. E. (2006). Advancing personality assessment terminology: Time to retire “objective” and “projective” as personality test descriptors. Journal of Personality Assessment, 87, 223–225.
Mihura, J. L., Meyer, G. J., Dumitrascu, N., & Bombel, G. (2012). The validity of individual Rorschach variables: Systematic reviews and meta-analyses of the Comprehensive System. Psychological Bulletin. (Advance online publication.) doi:10.1037/a0029406
Mineka, S., Watson, D., & Clark, L. A. (1998). Comorbidity of anxiety and unipolar mood disorders. Annual Review of Psychology, 49, 377–412.
Morgan, C. D., & Murray, H. A. (1935). A method for investigating fantasies. The Archives of Neurology and Psychiatry, 34, 389–406.
Oh, I.-S., Wang, G., & Mount, M. K. (2011). Validity of observer ratings of the five-factor model of personality traits: A meta-analysis. Journal of Applied Psychology, 96, 762–773.
Oltmanns, T. F., & Turkheimer, E. (2009). Person perception and personality pathology. Current Directions in Psychological Science, 18, 32–36.
Patrick, C. J., Curtin, J. J., & Tellegen, A. (2002). Development and validation of a brief form of the Multidimensional Personality Questionnaire. Psychological Assessment, 14, 150–163.
Paunonen, S. V., & LeBel, E. P. (2012). Socially desirable responding and its elusive effects on the validity of personality assessments. Journal of Personality and Social Psychology, 103, 158–175.
Pinto, R., Rijsdijk, F., Frazier-Wood, A. C., Asherson, P., & Kuntsi, J. (2012). Bigger families fare better: A novel method to estimate rater contrast effects in parental ratings on ADHD symptoms. Behavior Genetics, 42, 875–885.
Poropat, A. E. (2009). A meta-analysis of the five-factor model of personality and academic performance. Psychological Bulletin, 135, 322–338.
Roberts, B. W., Kuncel, N. R., Shiner, R., Caspi, A., & Goldberg, L. R. (2007). The power of personality: The comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspectives on Psychological Science, 2, 313–345.
Rorschach, H. (1942). Psychodiagnostik [Psychodiagnostics]. Bern, Switzerland: Bircher. (Original work published 1921)
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.
Schultheiss, O. C. (2008). Implicit motives. In O. P. John, R. W. Robins, & L. A. Pervin (Eds.), Handbook of personality: Theory and research (3rd ed., pp. 603–633). New York, NY: Guilford Press.
Society for Personality Assessment. (2005). The status of the Rorschach in clinical and forensic practice: An official statement by the Board of Trustees of the Society for Personality Assessment. Journal of Personality Assessment, 85, 219–237.
Spangler, W. D. (1992). Validity of questionnaire and TAT measures of need for achievement: Two meta-analyses. Psychological Bulletin, 112, 140–154.
Vazire, S., & Carlson, E. N. (2011). Others sometimes know us better than we know ourselves. Current Directions in Psychological Science, 20, 104–108.
Watson, D., & Humrichouse, J. (2006). Personality development in emerging adulthood: Integrating evidence from self- and spouse-ratings. Journal of Personality and Social Psychology, 91, 959–974.
Watson, D., Hubbard, B., & Wiese, D. (2000). Self-other agreement in personality and affectivity: The role of acquaintanceship, trait visibility, and assumed similarity. Journal of Personality and Social Psychology, 78, 546–558.
Winter, D. G. (1998). Toward a science of personality psychology: David McClelland’s development of empirically derived TAT measures. History of Psychology, 1, 130–153.

Authors

David Watson
David Watson is the Andrew J. McKenna Professor of Psychology at the University of Notre Dame. He is well known for his work in personality, clinical, and mood assessment. He and his colleagues have created a number of widely used instruments, including the Positive and Negative Affect Schedule (PANAS).

Creative Commons License

Personality Assessment by David Watson is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available in our Licensing Agreement.

How to cite this Noba module using APA Style

Watson, D. (2026). Personality assessment. In R. Biswas-Diener & E. Diener (Eds.), Noba textbook series: Psychology. Champaign, IL: DEF publishers. Retrieved from http://noba.to/eac2pyv7