Face Validity – A Simple Guide with Definition & Examples

Validity refers to the degree to which a test measures what it claims to measure. A test needs to be valid if its results are to be applied and interpreted appropriately.1

Face Validity – In a Nutshell

  • Face validity contrasts with content validity, which assesses how fully a test represents what it is attempting to measure.
  • The distinction between content validity and face validity is that content validity is assessed thoroughly, whereas the latter is a more general metric that frequently incorporates the subjects’ input.
  • Face validity counts as only weak evidence of construct validity. This does not mean it is wrong, but it should be interpreted with caution.

Definition: Face validity

Face validity refers to whether or not a test seems to measure what it is intended to measure. This sort of validity examines if a measure appears relevant and suitable for what it is assessing.

The other forms of measurement validity are:2

  • Construct validity: Does the test accurately measure the concept it is designed to measure?
  • Content validity: Is the test's content a true reflection of the construct it seeks to measure?
  • Criterion validity: Do the results correspond to the concrete outcome they are designed to measure?

Importance of face validity

Face validity only indicates that the test appears to be effective. It does not imply that the test’s efficacy has been demonstrated. Nonetheless, if the measure appears valid at this stage, researchers may conduct additional research to determine whether the test is genuinely valid and should be used in the future.

A measure has good face validity if anybody who reviews it concludes that it appears to measure what it is intended to measure. If your measure has poor face validity, a reviewer may be confused about what you’re trying to measure and why you’re using this particular approach.

To achieve face validity, your measurement should be:

  • Applicable to what is being measured.
  • Suitable for the participants.
  • Relevant to its function.

Example: Good vs. poor face validity

In a health study, you wish to determine the participants’ ages. There are two ways to record age:

  • Requesting participants’ self-reported birthdates and then determining their ages.
  • Estimating participants’ ages by counting the number of gray hairs on their heads and extrapolating from there.

These two techniques have radically different levels of face validity:

  • Face validity is high for the first method since it directly measures age.
  • Face validity is low for the second method since it does not measure age in a meaningful or acceptable way.
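
The first method is also simple to carry out. The following is a minimal Python sketch, assuming birthdates are recorded as calendar dates; the function name `age_on` and the sample dates are hypothetical and only serve as illustration.

```python
from datetime import date


def age_on(birthdate: date, reference: date) -> int:
    """Return the age in completed years on the reference date."""
    years = reference.year - birthdate.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


# Hypothetical participant born on 15 June 1990.
print(age_on(date(1990, 6, 15), date(2024, 6, 14)))  # 33 (day before the birthday)
print(age_on(date(1990, 6, 15), date(2024, 6, 15)))  # 34 (on the birthday)
```

No comparable calculation exists for the gray-hair method, which is part of why it looks unsuitable at face value.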

Face validity does not guarantee good overall validity or reliability of measurement. It is a weak type of validity because it is evaluated subjectively, without rigorous testing or statistical analysis.

However, checking a test’s face validity is a crucial first step in evaluating its overall validity. Once it is established, you can move on to more rigorous types of validity, such as criterion or content validity.

Assessment of face validity

You can test the face validity of your measurement method and items by asking others to review them and judge whether they are appropriate for measuring your target variable.

Pose the following questions:

  • Are the measure’s components pertinent to what is being measured?
  • Does the measurement technique appear appropriate for determining the value of the variable?
  • Does the measure appear suitable for capturing the variable?

You can send your test reviewers a short questionnaire, or you can ask them informally if the test appears to measure what it is intended to.
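
If you use a short questionnaire, the answers can be summarized per question. The following is a minimal Python sketch, assuming each reviewer answers the three questions above with yes or no; the question keys, the function name `share_yes`, and the sample responses are all hypothetical.

```python
# Hypothetical short labels for the three review questions listed above.
QUESTIONS = ("relevant_components", "appropriate_method", "suitable_for_variable")


def share_yes(responses: list[dict[str, bool]]) -> dict[str, float]:
    """Return, for each question, the share of reviewers who answered yes."""
    return {
        question: sum(r[question] for r in responses) / len(responses)
        for question in QUESTIONS
    }


# Made-up answers from four reviewers (True means "yes").
responses = [
    {"relevant_components": True, "appropriate_method": True, "suitable_for_variable": True},
    {"relevant_components": True, "appropriate_method": False, "suitable_for_variable": True},
    {"relevant_components": False, "appropriate_method": False, "suitable_for_variable": True},
    {"relevant_components": True, "appropriate_method": True, "suitable_for_variable": True},
]

for question, share in share_yes(responses).items():
    print(f"{question}: {share:.2f}")
```

A low share on one question points to the part of the measure that reviewers found unconvincing. What counts as low is a judgment call that this guide does not specify.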

Who should assess face validity?

It is essential to pick suitable individuals to evaluate a test. For instance, the people who take the test would be in the best position to evaluate its face validity.

Others who work with the test, such as university administrators, could also provide feedback. Lastly, the researcher could consult members of the general public who have an interest in the test, such as parents of test subjects or teachers.

A test’s face validity can only be regarded as robust if the raters show a sufficient level of agreement.
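
One simple way to quantify agreement is the share of rater pairs that give the same judgment. The following is a minimal Python sketch, assuming every rater gives a yes/no judgment on each item; the function name `pairwise_agreement` and the ratings are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(ratings: dict[str, list[bool]]) -> float:
    """Share of (rater pair, item) comparisons in which both raters agree."""
    agree = 0
    total = 0
    for rater_a, rater_b in combinations(ratings, 2):
        for judgment_a, judgment_b in zip(ratings[rater_a], ratings[rater_b]):
            agree += judgment_a == judgment_b
            total += 1
    return agree / total


# Made-up judgments on four items.
# True means "this item appears to measure the target variable".
ratings = {
    "rater_1": [True, True, False, True],
    "rater_2": [True, True, True, True],
    "rater_3": [True, False, False, True],
}

print(round(pairwise_agreement(ratings), 2))  # 0.67
```

Raw percent agreement does not correct for agreement that occurs by chance. Chance-corrected statistics such as Cohen's kappa or Fleiss' kappa are common alternatives, and the level that counts as "sufficient" depends on the study.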

Example: Assessing face validity

You discover a questionnaire that assesses the emotional states of adolescents and intend to use it in a study. Before beginning the study, you distribute the questionnaire to both fellow researchers and potential participants.

Your fellow researchers give you positive feedback, stating that it has good face validity. However, potential participants report that they are unsure of the purpose of specific questions because of the jargon used. They also tell you that some questions appear outdated and make no sense. From their standpoint, the inventory has poor face validity.3

When face validity is best tested

Obtaining an early indicator of the test’s validity is critical, whether you’re conducting a new study or using an established test in a new context.

Here are three instances where (re)evaluating face validity is crucial:

Developing a brand new measure or test

Example: Developing a new test

You construct a personality assessment for job candidates. Your survey asks respondents how they would react in various job scenarios.

You solicit feedback on the face validity of your test from employers, employees, and jobseekers. While employers agree it has high face validity, the other two groups say they cannot always answer questions of this nature properly without a thorough understanding of the position and the firm. It has poor face validity for them.

Using an existing test for a population the test wasn’t designed for

Example: Repurposing an existing test for a new population

You choose to evaluate math and language skills for a study. You intend to administer an IQ exam designed for American high school students to Indian high school students.

Teachers, potential participants, and researchers in India evaluate the face validity of your test. They are all of the opinion that the verbal component lacks face validity because some questions are heavily culture-bound to the U.S. The math component, however, has high face validity.

Using an existing test in a context it wasn’t designed for

Example: Repurposing an existing test for a new context

In a diary study, participants record their daily calorie consumption and moods. You condense an older questionnaire into a short form so that you can collect daily data for two weeks. The original questionnaire consists of twenty questions, but the short form contains only three.

You ask prospective participants and coworkers about the face validity of your short-form questionnaire. Their responses indicate that it is clear, concise, and has high face validity.


