Conducting Psychology Research in the Real World


University of Arizona

Because of its ability to determine cause-and-effect relationships, the laboratory experiment is traditionally considered the method of choice for psychological science. One downside, however, is that as it carefully controls conditions and their effects, it can yield findings that are out of touch with reality and have limited use when trying to understand real-world behavior. This module highlights the importance of also conducting research outside the psychology laboratory, within participants’ natural, everyday environments, and reviews existing methodologies for studying daily life.

Learning Objectives

  • Identify limitations of the traditional laboratory experiment.
  • Explain ways in which daily life research can further psychological science.
  • Know what methods exist for conducting psychological research in the real world.


The laboratory experiment is traditionally considered the “gold standard” in psychology research. This is because only laboratory experiments can clearly separate cause from effect and therefore establish causality. Despite this unique strength, it is also clear that a scientific field that is mainly based on controlled laboratory studies ends up lopsided. Specifically, it accumulates a lot of knowledge on what can happen—under carefully isolated and controlled circumstances—but it has little to say about what actually does happen under the circumstances that people actually encounter in their daily lives.

Do the research results obtained in isolated, carefully controlled laboratory conditions generalize into the real world? (Photo: nessen marshall)

For example, imagine you are a participant in an experiment that looks at the effect of being in a good mood on generosity, a topic that may have a good deal of practical application. Researchers create an internally-valid, carefully-controlled experiment where they randomly assign you to watch either a happy movie or a neutral movie, and then you are given the opportunity to help the researcher out by staying longer and participating in another study. If people in a good mood are more willing to stay and help out, the researchers can feel confident that – since everything else was held constant – your positive mood led you to be more helpful. However, what does this tell us about helping behaviors in the real world? Does it generalize to other kinds of helping, such as donating money to a charitable cause? Would all kinds of happy movies produce this behavior, or only this one? What about other positive experiences that might boost mood, like receiving a compliment or a good grade? And what if you were watching the movie with friends, in a crowded theatre, rather than in a sterile research lab? Taking research out into the real world can help answer some of these sorts of important questions.

As one of the founding fathers of social psychology remarked, “Experimentation in the laboratory occurs, socially speaking, on an island quite isolated from the life of society” (Lewin, 1944, p. 286). This module highlights the importance of going beyond experimentation and also conducting research outside the laboratory (Reis & Gosling, 2010), directly within participants’ natural environments, and reviews existing methodologies for studying daily life.

Rationale for Conducting Psychology Research in the Real World

One important challenge researchers face when designing a study is to find the right balance between ensuring internal validity, or the degree to which a study allows unambiguous causal inferences, and external validity, or the degree to which a study ensures that potential findings apply to settings and samples other than the ones being studied (Brewer, 2000). Unfortunately, these two kinds of validity tend to be difficult to achieve at the same time, in one study. This is because creating a controlled setting, in which all potentially influential factors (other than the experimentally-manipulated variable) are controlled, is bound to create an environment that is quite different from what people naturally encounter (e.g., using a happy movie clip to promote helpful behavior). However, it is the degree to which an experimental situation is comparable to the corresponding real-world situation of interest that determines how generalizable potential findings will be. In other words, if an experiment is very far-off from what a person might normally experience in everyday life, you might reasonably question just how useful its findings are.

Because of the incompatibility of the two types of validity, one is often—by design—prioritized over the other. Due to the importance of identifying true causal relationships, psychology has traditionally emphasized internal over external validity. However, in order to make claims about human behavior that apply across populations and environments, researchers complement traditional laboratory research, where participants are brought into the lab, with field research where, in essence, the psychological laboratory is brought to participants. Field studies allow for the important test of how psychological variables and processes of interest “behave” under real-world circumstances (i.e., what actually does happen rather than what can happen). They can also facilitate “downstream” operationalizations of constructs that measure life outcomes of interest directly rather than indirectly.

Take, for example, the fascinating field of psychoneuroimmunology, where the goal is to understand the interplay of psychological factors, such as personality traits or one’s stress level, and the immune system. Highly sophisticated and carefully controlled experiments offer ways to isolate the variety of neural, hormonal, and cellular mechanisms that link psychological variables such as chronic stress to biological outcomes such as immunosuppression (a state of impaired immune functioning; Sapolsky, 2004). Although these studies demonstrate impressively how psychological factors can affect health-relevant biological processes, their research design leaves them mute about the degree to which these factors actually do undermine people’s everyday health in real life. It is certainly important to show that laboratory stress can alter the number of natural killer cells in the blood. But it is equally important to test to what extent the levels of stress that people experience on a day-to-day basis result in them catching a cold more often or taking longer to recover from one. The goal for researchers, therefore, must be to complement traditional laboratory experiments with less controlled studies under real-world circumstances. The term ecological validity is used to refer to the degree to which an effect has been obtained under conditions that are typical for what happens in everyday life (Brewer, 2000). In this example, then, people might keep a careful daily log of how much stress they are under as well as noting physical symptoms such as headaches or nausea. Although many factors beyond stress level may be responsible for these symptoms, this more correlational approach can shed light on how the relationship between stress and health plays out outside of the laboratory.

An Overview of Research Methods for Studying Daily Life

Capturing “life as it is lived” has long been a goal for some researchers. Wilhelm and his colleagues recently published a comprehensive review of early attempts to systematically document daily life (Wilhelm, Perrez, & Pawlik, 2012). Building on these original methods, researchers have, over the past decades, developed a broad toolbox for measuring experiences, behavior, and physiology directly in participants’ daily lives (Mehl & Conner, 2012). Figure 1 provides a schematic overview of the methodologies described below.

Figure 1. Schematic Overview of Research Methods for Studying Daily Life

Studying Daily Experiences

Starting in the mid-1970s, motivated by a growing skepticism toward highly controlled laboratory studies, a few groups of researchers developed a set of new methods that are now commonly known as the experience-sampling method (Hektner, Schmidt, & Csikszentmihalyi, 2007), ecological momentary assessment (Stone & Shiffman, 1994), or the diary method (Bolger & Rafaeli, 2003). Although variations within this set of methods exist, the basic idea behind all of them is to collect in-the-moment (or close-to-the-moment) self-report data directly from people as they go about their daily lives. This is typically accomplished by asking participants repeatedly (e.g., five times per day) over a period of time (e.g., a week) to report on their current thoughts and feelings. The momentary questionnaires often ask about their location (e.g., “Where are you now?”), social environment (e.g., “With whom are you now?”), activity (e.g., “What are you currently doing?”), and experiences (e.g., “How are you feeling?”). That way, researchers get a snapshot of what was going on in participants’ lives at the time at which they were asked to report.

Technology has made this sort of research possible, and recent technological advances have changed the tools researchers can easily use. Initially, participants wore electronic wristwatches that beeped at preprogrammed but seemingly random times, at which they completed one of a stack of provided paper questionnaires. With the mobile computing revolution, both the prompting and the questionnaire completion gradually moved to handheld devices such as smartphones. Being able to collect the momentary questionnaires digitally and time-stamped (i.e., having a record of exactly when participants responded) had major methodological and practical advantages and contributed to experience sampling going mainstream (Conner, Tennen, Fleeson, & Barrett, 2009).
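
The prompting scheme described above (for example, five prompts per day for a week, at preprogrammed but seemingly random times) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of any particular study: the function name, the 9:00 to 21:00 waking window, and the choice to draw one random time inside each of five equal blocks of the day are all assumptions made for the example.

```python
import random
from datetime import datetime, timedelta

def make_prompt_schedule(start_day, n_days=7, prompts_per_day=5,
                         wake_hour=9, sleep_hour=21, seed=None):
    """Return prompt times: one random time inside each of
    `prompts_per_day` equal blocks of the (assumed) waking day."""
    rng = random.Random(seed)  # seeded so the schedule is "preprogrammed"
    block_minutes = (sleep_hour - wake_hour) * 60 // prompts_per_day
    schedule = []
    for day in range(n_days):
        day_start = start_day + timedelta(days=day, hours=wake_hour)
        for block in range(prompts_per_day):
            # Random offset within this block keeps prompts unpredictable
            # while spreading them across the whole day.
            offset = block * block_minutes + rng.randrange(block_minutes)
            schedule.append(day_start + timedelta(minutes=offset))
    return schedule

# Hypothetical study starting at midnight on an arbitrary date.
schedule = make_prompt_schedule(datetime(2024, 1, 1), seed=42)
for prompt_time in schedule[:5]:
    print(prompt_time.strftime("%a %H:%M"))
```

Drawing one prompt per block is a design choice for the sketch: it prevents two prompts from landing back to back while still keeping the exact times unpredictable to the participant.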

Using modern technology like smartphones allows for more widespread experience sampling of research participants. (Photo: Ed Yourdon)

Over time, experience sampling and related momentary self-report methods have become very popular, and, by now, they are effectively the gold standard for studying daily life. They have helped make progress in almost all areas of psychology (Mehl & Conner, 2012). These methods yield many measurements from many participants and have further inspired the development of novel statistical methods (Bolger & Laurenceau, 2013). Finally, and maybe most importantly, they accomplished what they set out to accomplish: to bring attention to what psychology ultimately wants and needs to know about, namely “what people actually do, think, and feel in the various contexts of their lives” (Funder, 2001, p. 213). In short, these approaches have allowed researchers to do research that is more externally valid, or more generalizable to real life, than the traditional laboratory experiment.

To illustrate these techniques, consider a classic study by Stone, Reed, and Neale (1987), who tracked positive and negative experiences surrounding a respiratory infection using daily experience sampling. They found that undesirable experiences peaked and desirable ones dipped about four to five days prior to participants coming down with the cold. More recently, Killingsworth and Gilbert (2010) collected momentary self-reports from more than 2,000 participants via a smartphone app. They found that participants were less happy when their mind was in an idling, mind-wandering state, such as surfing the Internet or multitasking at work, than when it was in an engaged, task-focused one, such as working diligently on a paper. These are just two examples that illustrate how experience-sampling studies have yielded findings that could not be obtained with traditional laboratory methods.

Recently, the day reconstruction method (DRM) (Kahneman, Krueger, Schkade, Schwarz, & Stone, 2004) has been developed to obtain information about a person’s daily experiences without going through the burden of collecting momentary experience-sampling data. In the DRM, participants report their experiences of a given day retrospectively after engaging in a systematic, experiential reconstruction of the day on the following day. As a participant in this type of study, you might look back on yesterday, divide it up into a series of episodes such as “made breakfast,” “drove to work,” “had a meeting,” etc. You might then report who you were with in each episode and how you felt in each. This approach has shed light on what situations lead to moments of positive and negative mood throughout the course of a normal day.

Studying Daily Behavior

Experience sampling is often used to study everyday behavior (i.e., daily social interactions and activities). In the laboratory, behavior is best studied using direct behavioral observation (e.g., video recordings). In the real world, this is, of course, much more difficult. As Funder put it, it seems it would require a “detective’s report [that] would specify in exact detail everything the participant said and did, and with whom, in all of the contexts of the participant’s life” (Funder, 2007, p. 41).

As difficult as this may seem, Mehl and colleagues have developed a naturalistic observation methodology that is similar in spirit. Rather than following participants—like a detective—with a video camera (see Craik, 2000), they equip participants with a portable audio recorder that is programmed to periodically record brief snippets of ambient sounds (e.g., 30 seconds every 12 minutes). Participants carry the recorder (originally a microcassette recorder, now a smartphone app) on them as they go about their days and return it at the end of the study. The recorder provides researchers with a series of sound bites that, together, amount to an acoustic diary of participants’ days as they naturally unfold—and that constitute a representative sample of their daily activities and social encounters. Because it is somewhat similar to having the researcher’s ear at the participant’s lapel, they called their method the electronically activated recorder, or EAR (Mehl, Pennebaker, Crow, Dabbs, & Price, 2001). The ambient sound recordings can be coded for many things, including participants’ locations (e.g., at school, in a coffee shop), activities (e.g., watching TV, eating), interactions (e.g., in a group, on the phone), and emotional expressions (e.g., laughing, sighing). As unnatural or intrusive as it might seem, participants report that they quickly grow accustomed to the EAR and say they soon find themselves behaving as they normally would.
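
To get a feel for how sparse this kind of sampling is, the recording example above (30 seconds every 12 minutes) can be turned into a quick calculation. The sketch below assumes that each 12-minute cycle contains exactly one 30-second snippet and that the recorder runs for 16 waking hours; both the function name and the 16-hour figure are assumptions for illustration, not details from the EAR studies.

```python
def ear_coverage(snippet_s=30, cycle_min=12, hours=16):
    """Return (number of snippets, recorded minutes, fraction of time recorded)
    for a recorder that captures one snippet per cycle."""
    snippets = hours * 60 // cycle_min           # cycles that fit in the period
    recorded_min = snippets * snippet_s / 60     # total audio, in minutes
    fraction = snippet_s / (cycle_min * 60)      # share of each cycle recorded
    return snippets, recorded_min, fraction

snippets, recorded_min, fraction = ear_coverage()
print(f"{snippets} snippets, {recorded_min:.0f} min of audio, "
      f"{fraction:.1%} of the time recorded")
```

Under these assumptions the recorder captures only about 4 percent of the day, which is part of why the method can be tolerable for participants while still yielding a broad sample of their activities.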

In a cross-cultural study, Ramírez-Esparza and her colleagues used the EAR method to study sociability in the United States and Mexico. Interestingly, they found that although American participants rated themselves significantly higher than Mexicans on the question, “I see myself as a person who is talkative,” they actually spent almost 10 percent less time talking than Mexicans did (Ramírez-Esparza, Mehl, Álvarez Bermúdez, & Pennebaker, 2009). In a similar way, Mehl and his colleagues used the EAR method to debunk the long-standing myth that women are considerably more talkative than men. Using data from six different studies, they showed that both sexes use on average about 16,000 words per day. The estimated sex difference of 546 words was trivial compared to the immense range of more than 46,000 words between the least and most talkative individuals (695 versus 47,016 words; Mehl, Vazire, Ramírez-Esparza, Slatcher, & Pennebaker, 2007). Together, these studies demonstrate how naturalistic observation can be used to study objective aspects of daily behavior and how it can yield findings quite different from what other methods yield (Mehl, Robbins, & Deters, 2012).
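
The claim that the sex difference is trivial relative to individual variation can be checked with simple arithmetic on the figures quoted above. The short Python sketch below uses only those numbers; the variable names are chosen for the example.

```python
# Figures quoted in the text (Mehl et al., 2007).
least_talkative = 695       # words per day
most_talkative = 47_016     # words per day
sex_difference = 546        # estimated difference between women and men
daily_average = 16_000      # approximate average for both sexes

word_range = most_talkative - least_talkative
share_of_range = sex_difference / word_range
share_of_average = sex_difference / daily_average

print(f"Range between individuals: {word_range:,} words")
print(f"Sex difference as share of that range: {share_of_range:.1%}")
print(f"Sex difference as share of the daily average: {share_of_average:.1%}")
```

The difference between the sexes amounts to roughly 1 percent of the spread between the least and most talkative person, and about 3 percent of the average daily word count.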

A series of other methods and creative ways for assessing behavior directly and unobtrusively in the real world are described in a seminal book on real-world, subtle measures (Webb, Campbell, Schwartz, Sechrest, & Grove, 1981). For example, researchers have used time-lapse photography to study the flow of people and the use of space in urban public places (Whyte, 1980). More recently, they have observed people’s personal (e.g., dorm rooms) and professional (e.g., offices) spaces to understand how personality is expressed and detected in everyday environments (Gosling, Ko, Mannarelli, & Morris, 2002). They have even systematically collected and analyzed people’s garbage to measure what people actually consume (e.g., empty alcohol bottles or cigarette boxes) rather than what they say they consume (Rathje & Murphy, 2001). Because people often cannot and sometimes may not want to accurately report what they do, the direct—and ideally nonreactive—assessment of real-world behavior is of high importance for psychological research (Baumeister, Vohs, & Funder, 2007).

Studying Daily Physiology

In addition to studying how people think, feel, and behave in the real world, researchers are also interested in how our bodies respond to the fluctuating demands of our lives. What are the daily experiences that make our “blood boil”? How do our neurotransmitters and hormones respond to the stressors we encounter in our lives? What physiological reactions do we show to being loved or getting ostracized? You can see how studying these powerful experiences in real life, as they actually happen, may provide richer and more informative data than one might obtain in an artificial laboratory setting that merely mimics these experiences.

Real-world stressors may result in very different physiological responses than the same stressors simulated in a lab environment. (Photo: State Farm)

Also, in pursuing these questions, it is important to keep in mind that what is stressful, engaging, or boring for one person might not be so for another. It is, in part, for this reason that researchers have found only limited correspondence between how people respond physiologically to a standardized laboratory stressor (e.g., giving a speech) and how they respond to stressful experiences in their lives. To give an example, Wilhelm and Grossman (2010) describe a participant who showed rather minimal heart rate increases in response to a laboratory stressor (about five to 10 beats per minute) but quite dramatic increases (almost 50 beats per minute) later in the afternoon while watching a soccer game. Of course, the reverse pattern can happen as well, such as when patients have high blood pressure in the doctor’s office but not in their home environment—the so-called white coat hypertension (White, Schulman, McCabe, & Dey, 1989).

Ambulatory physiological monitoring, that is, monitoring physiological reactions as people go about their daily lives, has a long history in biomedical research, and an array of monitoring devices exists (Fahrenberg & Myrtek, 1996). Among the biological signals that can now be measured in daily life with portable signal recording devices are the electrocardiogram (ECG), blood pressure, electrodermal activity (or “sweat response”), body temperature, and even the electroencephalogram (EEG) (Wilhelm & Grossman, 2010). Most recently, researchers have added ambulatory assessment of hormones (e.g., cortisol) and other biomarkers (e.g., immune markers) to the list (Schlotz, 2012). The development of ever more sophisticated ways to track what goes on underneath our skins as we go about our lives is a fascinating and rapidly advancing field.

In a recent study, Lane, Zareba, Reis, Peterson, and Moss (2011) used experience sampling combined with ambulatory electrocardiography (a so-called Holter monitor) to study how emotional experiences can alter cardiac function in patients with a congenital heart abnormality (e.g., long QT syndrome). Consistent with the idea that emotions may, in some cases, be able to trigger a cardiac event, they found that typical daily emotions, in most cases even of relatively low intensity, had a measurable effect on ventricular repolarization, an important cardiac indicator that, in these patients, is linked to risk of a cardiac event. In another study, Smyth and colleagues (1998) combined experience sampling with momentary assessment of cortisol, a stress hormone. They found that momentary reports of current or even anticipated stress predicted increased cortisol secretion 20 minutes later. Further, and independent of that, the experience of other kinds of negative affect (e.g., anger, frustration) also predicted higher levels of cortisol, and the experience of positive affect (e.g., happy, joyful) predicted lower levels of this important stress hormone. Taken together, these studies illustrate how researchers can use ambulatory physiological monitoring to study how the small and seemingly trivial or inconsequential experiences in our lives leave objective, measurable traces in our bodily systems.

Studying Online Behavior

Another domain of daily life that has only recently emerged is virtual daily behavior or how people act and interact with others on the Internet. Irrespective of whether social media will turn out to be humanity’s blessing or curse (both scientists and laypeople are currently divided over this question), the fact is that people are spending an ever increasing amount of time online. In light of that, researchers are beginning to think of virtual behavior as being as serious as “actual” behavior and seek to make it a legitimate target of their investigations (Gosling & Johnson, 2010).

Online activity reveals a lot of psychological information to researchers. (Photo: SarahCFrey)

One way to study virtual behavior is to make use of the fact that most of what people do on the Web (emailing, chatting, tweeting, blogging, posting) leaves direct (and permanent) verbal traces. For example, differences in the ways in which people use words (e.g., subtle preferences in word choice) have been found to carry a lot of psychological information (Pennebaker, Mehl, & Niederhoffer, 2003). Therefore, a good way to study virtual social behavior is to study virtual language behavior. Researchers can download people’s (often public) verbal expressions and communications and analyze them using modern text analysis programs (e.g., Pennebaker, Booth, & Francis, 2007).

For example, Cohn, Mehl, and Pennebaker (2004) downloaded blogs of more than a thousand users of one of the first Internet blogging sites to study how people responded socially and emotionally to the attacks of September 11, 2001. In going “the online route,” they could bypass a critical limitation of coping research, the inability to obtain baseline information; that is, how people were doing before the traumatic event occurred. Through access to the database of public blogs, they downloaded entries from two months prior to two months after the attacks. Their linguistic analyses revealed that in the first days after the attacks, participants expectedly expressed more negative emotions and were more cognitively and socially engaged, asking questions and sending messages of support. After only two weeks, though, their moods and social engagement returned to baseline, and, interestingly, their use of cognitive-analytic words (e.g., “think,” “question”) even dropped below their normal level. Over the next six weeks, their mood hovered around their pre-9/11 baseline, but both their social engagement and cognitive-analytic processing stayed remarkably low. This suggests a social and cognitive weariness in the aftermath of the attacks. In using virtual verbal behavior as a marker of psychological functioning, this study was able to draw a fine timeline of how humans cope with disasters.
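
The basic logic of this kind of text analysis, counting how often words from psychologically meaningful categories appear in a text, can be illustrated with a minimal Python sketch. The three tiny word lists below are made up for the example and are not the dictionaries of any real text analysis program; the function name and the sample entry are likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny category dictionaries (illustration only).
CATEGORIES = {
    "negative_emotion": {"sad", "afraid", "angry"},
    "cognitive": {"think", "question", "because"},
    "social": {"we", "friend", "talk"},
}

def category_rates(text):
    """Return, per category, the percentage of words in `text`
    that belong to that category's word list."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for name, vocabulary in CATEGORIES.items():
            if word in vocabulary:
                counts[name] += 1
    total = len(words)
    return {name: (100 * counts[name] / total if total else 0.0)
            for name in CATEGORIES}

entry = "I think we are all sad and afraid. I question everything."
rates = category_rates(entry)
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}% of words")
```

Expressing each category as a percentage of all words, rather than a raw count, makes entries of different lengths comparable, which matters when tracking the same writers over weeks or months.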

Reflecting the rapidly growing real-world importance of social networking sites such as Facebook, researchers are now beginning to investigate behavior on these sites (Wilson, Gosling, & Graham, 2012). Most research looks at psychological correlates of online behavior such as personality traits and the quality of one’s social life but, importantly, there are also first attempts to export traditional experimental research designs into an online setting. In a pioneering study of online social influence, Bond and colleagues (2012) experimentally tested the effects that peer feedback has on voting behavior. Remarkably, their sample consisted of 61 million (!) Facebook users. They found that online political-mobilization messages (e.g., “I voted” accompanied by selected pictures of their Facebook friends) influenced real-world voting behavior. This was true not just for users who saw the messages but also for their friends and friends of their friends. Although the intervention effect on a single user was very small, through the enormous number of users and indirect social contagion effects, it resulted cumulatively in an estimated 340,000 additional votes, enough to tilt a close election. In short, although still in its infancy, research on virtual daily behavior is bound to change social science, and it has already helped us better understand both virtual and “actual” behavior.

“Smartphone Psychology”?

A review of research methods for studying daily life would not be complete without a vision of “what’s next.” Given how common they have become, it is safe to predict that smartphones will not just remain devices for everyday online communication but will also become devices for scientific data collection and intervention (Kaplan & Stone, 2013; Yarkoni, 2012). These devices automatically store vast amounts of real-world user interaction data, and, in addition, they are equipped with sensors to track the physical (e.g., location, position) and social (e.g., wireless connections around the phone) context of these interactions. Miller (2012, p. 234) states, “The question is not whether smartphones will revolutionize psychology but how, when, and where the revolution will happen.” Obviously, their immense potential for data collection also brings with it big new challenges for researchers (e.g., privacy protection, data analysis, and synthesis). Yet it is clear that many of the methods described in this module, along with many ways of collecting real-world data that are still to be developed, will, in the future, become integrated into the devices that people naturally and happily carry with them from the moment they get up in the morning to the moment they go to bed.

Conclusion


This module sought to make a case for psychology research conducted outside the lab. If the ultimate goal of the social and behavioral sciences is to explain human behavior, then researchers must also—in addition to conducting carefully controlled lab studies—deal with the “messy” real world and find ways to capture life as it naturally happens.

Mortensen and Cialdini (2010) refer to the dynamic give-and-take between laboratory and field research as “full-cycle psychology”. Going full cycle, they suggest, means that “researchers use naturalistic observation to determine an effect’s presence in the real world, theory to determine what processes underlie the effect, experimentation to verify the effect and its underlying processes, and a return to the natural environment to corroborate the experimental findings” (Mortensen & Cialdini, 2010, p. 53). To accomplish this, researchers have access to a toolbox of research methods for studying daily life that is now more diverse and more versatile than it has ever been before. So, all it takes is to go ahead and—literally—bring science to life.

Discussion Questions

  1. What do you think about the tradeoff between unambiguously establishing cause and effect (internal validity) and ensuring that research findings apply to people’s everyday lives (external validity)? Which one of these would you prioritize as a researcher? Why?
  2. What challenges do you see that daily-life researchers may face in their studies? How can they be overcome?
  3. What ethical issues can come up in daily-life studies? How can (or should) they be addressed?
  4. How do you think smartphones and other mobile electronic devices will change psychological research? What are their promises for the field? And what are their pitfalls?


Ambulatory assessment
An overarching term to describe methodologies that assess the behavior, physiology, experience, and environments of humans in naturalistic settings.
Daily diary method
A methodology where participants complete a questionnaire about their thoughts, feelings, and behavior of the day at the end of the day.
Day reconstruction method (DRM)
A methodology where participants describe their experiences and behavior of a given day retrospectively upon a systematic reconstruction on the following day.
Ecological momentary assessment
An overarching term to describe methodologies that repeatedly sample participants’ real-world experiences, behavior, and physiology in real time.
Ecological validity
The degree to which a study finding has been obtained under conditions that are typical for what happens in everyday life.
Electronically activated recorder, or EAR
A methodology where participants wear a small, portable audio recorder that intermittently records snippets of ambient sounds around them.
Experience-sampling method
A methodology where participants report on their momentary thoughts, feelings, and behaviors at different points in time over the course of a day.
External validity
The degree to which a finding generalizes from the specific sample and context of a study to some larger population and broader settings.
Full-cycle psychology
A scientific approach whereby researchers start with an observational field study to identify an effect in the real world, follow up with laboratory experimentation to verify the effect and isolate the causal mechanisms, and return to field research to corroborate their experimental findings.
Generalize
Generalizing, in science, refers to the ability to arrive at broad conclusions based on a smaller sample of observations. For these conclusions to be true, the sample should accurately represent the larger population from which it is drawn.
Internal validity
The degree to which a cause-effect relationship between two variables has been unambiguously established.
Linguistic inquiry and word count
A quantitative text analysis methodology that automatically extracts grammatical and psychological information from a text by counting word frequencies.
Lived day analysis
A methodology where a research team follows an individual around with a video camera to objectively document a person’s daily life as it is lived.
White coat hypertension
A phenomenon in which patients exhibit elevated blood pressure in the hospital or doctor’s office but not in their everyday lives.


  • Matthias R. Mehl
    Matthias Mehl is an Associate Professor of Psychology at the University of Arizona where he also holds adjunct appointments in the Department of Communication, the Arizona Cancer Center, and the Evelyn F. McKnight Brain Institute. He currently serves as Vice President of the Society for Ambulatory Assessment.

