Cross-tabulation analysis: what it is and how to use it

Rebecca Noori

Businesses are built on numbers. That’s why they run so many surveys to collect data on everything from market research to employee satisfaction. But survey results mean absolutely zilch without proper analysis.

Cross-tabulation is one of the most popular data analysis methods to read the relationships between variables in your survey results. Sure, it sounds geeky, but it's pretty cool once you know how to use it.

By the end of this article, you’ll know exactly what cross-tabulation is, how it works, and how to use this technique to improve your interpretation of survey data.

What is cross-tabulation?

Cross-tabulation is used to analyse the categories in a data table and confirm if there's a link between two or more of those categorical variables. It’s easier to explain with an example.

The cross-tab or contingency table below details the survey results of an employer who wants to find out if employees would prefer to work remotely, in the office, or switch to a hybrid model. The results are displayed in a two-dimensional data table, using age and work location preference.

Survey results displayed in Google Sheets (Image Source: Google Sheets)

But what do these results mean? Is any connection between age and workplace a coincidence, or is there a hidden relationship between these category variables?

From looking at them, it seems like the youngest employees aren’t as keen on working remotely as their 25-35-year-old colleagues. Similarly, it would appear as if those aged 35 to 55 are the least likely to favour a return to the office. But how can we know if these observations hold water?

Cross-tabulation lays out how responses are distributed across two (or more) categories, and a follow-up statistical test such as Chi-square estimates how likely it is that a pattern this strong would show up by chance. Together, they help you make an informed decision about what to do next and find the most value in your data.
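
To make the idea concrete, here's a minimal Python sketch of how a cross-tab counts responses. The responses and the cross_tabulate helper are hypothetical and only there for illustration; they aren't the figures from the survey above.

```python
from collections import Counter

# Hypothetical survey responses as (age band, preferred work location) pairs.
# These values are made up for illustration; they are not the article's data.
responses = [
    ("Under 25", "Office"), ("Under 25", "Hybrid"), ("Under 25", "Office"),
    ("25-35", "Remote"), ("25-35", "Remote"), ("25-35", "Hybrid"),
    ("35-55", "Remote"), ("35-55", "Hybrid"), ("35-55", "Remote"),
]


def cross_tabulate(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = {
        row: {col: counts[(row, col)] for col in col_labels}
        for row in row_labels
    }
    return table, row_labels, col_labels


table, row_labels, col_labels = cross_tabulate(responses)

# Print the contingency table with one row per age band.
print("Age".ljust(10) + "".join(col.rjust(8) for col in col_labels))
for row in row_labels:
    print(row.ljust(10) + "".join(str(table[row][col]).rjust(8) for col in col_labels))
```

Each cell is simply the number of respondents who fall into both categories, which is all a contingency table is.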

When to use cross-tabulation

A cross-tabulation analysis is useful when you have information that you can divide into categories or subgroups, such as age, gender, product type, or region.

For example, you might want to analyse:

  • Newsletter signups by age
  • Preference by gender
  • Sales by region
  • Job seniority by education level
  • Product categories by payment method

But there are a couple of rules to bear in mind.

Rule 1: Cross-tabulation doesn't work as a statistical tool for raw numerical data, such as a list of heights or weights. You'd need to group this numerical data into categories for cross-tabulation to be effective.
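
As a rough sketch of what grouping numerical data looks like, the Python snippet below sorts a list of heights into bands. The heights, the band boundaries, and the to_band helper are all assumptions chosen for illustration; pick boundaries that make sense for your own data.

```python
from collections import Counter

# Hypothetical heights in centimetres; made up for illustration.
heights_cm = [158, 162, 167, 171, 174, 176, 180, 183, 189, 194]

# Hypothetical bands: (label, lower bound inclusive, upper bound exclusive).
HEIGHT_BANDS = [
    ("Under 165 cm", 0, 165),
    ("165-179 cm", 165, 180),
    ("180 cm and over", 180, float("inf")),
]


def to_band(value, bands=HEIGHT_BANDS):
    """Return the label of the first band whose range contains the value."""
    for label, lower, upper in bands:
        if lower <= value < upper:
            return label
    raise ValueError(f"{value} does not fall into any band")


# Once every number has a category label, it can be cross-tabulated.
band_counts = Counter(to_band(height) for height in heights_cm)
print(band_counts)
```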

Rule 2: You must have enough data in your sample for accurate crosstab reports. The rule of thumb is that each cell in your data table should have a count of at least 5 (statisticians usually apply this to the expected counts used in a Chi-square test).

Not sure how to use cross-tab in your business? If you’re looking for a little inspiration, here are some popular ways to put cross-tab (also known as contingency table analysis) into action.

1. Employee engagement surveys

Distributing pulse surveys to employees can help you track workforce engagement and satisfaction levels and then filter the results into categories.

For example, you might sort the responses based on department, job roles, workplace location, gender, or compensation rate.

Cross-tabulation analysis of the results can identify any issues early on and alert the managers responsible so they can nip potential conflict in the bud.

Use cross-tabs to spot any disparities between on-site and remote worker engagement or if specific teams have varying levels of job satisfaction.

2. Exit interview surveys

Use surveys during your exit interview process to reveal why employees leave your company and potentially reduce churn in the future.

Is there a correlation between quitting and compensation? Or dissatisfaction ratings and lack of training? These impactful insights will help you build an action plan to keep your most talented staff on your side.

3. Student evaluation surveys

An instructor evaluation survey collects feedback from students to decipher how different groups feel about their learning experience.

This includes evaluating whether there are differences between those who study full-time or part-time, preferred class subjects, and the length of the course.

4. Product research

Test how well your target audience has received a new product or service. You might want to find out if there's a difference in response between men and women, or between those who are familiar with your brand and those who aren't.

What are the benefits of cross-tabulation?

Cross-tab isn’t just helpful for understanding your data. There are tons of business benefits you can enjoy when you know how to use this helpful tool.

1. Reduce errors

Large data sets can be daunting, but grouping them into representative subgroups for crosstab analysis makes the task more palatable. The bonus? You’ll get a better handle on the numbers with fewer errors creeping into your reports.

2. Quickly compare relationships

Creating a cross-tab report is a relatively swift process if you use pivot tables to analyse your data. Pivot tables are available in spreadsheet tools like Excel or Google Sheets and they make light work of cross-tab analysis.

They'll even do the maths for you, so your only role is to interpret relationships and receive actionable insights.

3. Get more granular insights

The overall results of a survey give you the big picture of what's going on. But they don't zoom in on the granular details. In business, this could be the subgroups of employees you need to pay more attention to or the products and services that aren’t hitting the mark with your target market.

A cross-tabulation table will deliver new perspectives on your data to help you make strategic decisions about where to focus your attention in the future.

4. Works with a variety of survey question types

Versatility is the name of the game when it comes to cross-tabulation. This analytical tool works with a wide variety of question types so long as you can split the answers into distinct categories or subgroups. This includes:

  • Multiple choice: where respondents answer from a predetermined list of options.
  • Likert scale: where respondents rate something on a scale from agree to disagree or give your service or product a rating of 1 to 5 stars.
  • Closed-ended questions: such as Yes/No, True/False, or Agree/Disagree.

Important: Cross-tab isn't a great fit for open-ended questions, where you encourage the respondent to provide more information about their opinion. These responses are more emotive and harder to categorise, so they don't suit cross-tab analysis.

Form a better life now.

How to do cross-tabulation analysis in Google Sheets

Google Sheets makes it easy to analyse complicated form data by populating a pivot table which delivers a comprehensive breakdown of the results.

The cross-tabulation example below displays a list of RSVPs for a global sales conference. In the screenshot, the columns tell us the sales rep's name, region, and whether they've RSVP'd as a ‘Yes’ or ‘No’ to the conference invitation.

Survey results displayed in Google Sheets (Image Source: Google Sheets)

To read these responses more closely, you need a pivot table. Open Google Sheets, navigate to the top menu bar, and hit Insert > Pivot table.

Highlight the data range you want to include, then decide if you're going to create the table in a new or existing sheet. Remember to include the column headers in your data range.

Making a pivot table from survey data in Google Sheets (Image Source: Google Sheets)

This will generate a blank pivot table, and from here you can decide where to focus your analysis. In our example, we want to find out how many sales reps from each region will attend the sales conference.

To do this, click on the Rows button, choose the Region category to analyse, and repeat with the Columns button for Attendance. Select the Values you want to display (either as numerical values or row and column percentages) and the pivot table will populate.

The cross-tabulated pivot table shows exactly how many sales reps from each region will attend the sales conference. This presents the data in a more manageable form and allows you to make a hypothesis for further statistical analysis, or test an alternative hypothesis.

A pivot table created in Google Sheets (Image Source: Google Sheets)
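
If you'd rather script the same breakdown than click through a spreadsheet, the Python sketch below mirrors the pivot table settings: Region as rows, Attendance as columns, and counts or row percentages as values. The names, regions, and answers are hypothetical stand-ins for the RSVP list, not the data in the screenshots.

```python
from collections import Counter

# Hypothetical RSVP rows as (sales rep, region, attending); made up for illustration.
rsvps = [
    ("Ana", "EMEA", "Yes"), ("Ben", "EMEA", "No"), ("Chen", "APAC", "Yes"),
    ("Dee", "APAC", "Yes"), ("Eli", "Americas", "No"), ("Fay", "Americas", "Yes"),
    ("Gus", "EMEA", "Yes"), ("Hana", "APAC", "No"),
]

# Rows = Region, Columns = Attendance, Values = number of reps.
counts = Counter((region, attending) for _, region, attending in rsvps)
regions = sorted({region for _, region, _ in rsvps})
answers = ["Yes", "No"]


def row_percentages(region):
    """Share of each answer within one region, as a percentage of that row's total."""
    row_total = sum(counts[(region, answer)] for answer in answers)
    return {answer: 100 * counts[(region, answer)] / row_total for answer in answers}


for region in regions:
    shares = row_percentages(region)
    cells = ", ".join(
        f"{answer}: {counts[(region, answer)]} ({shares[answer]:.0f}%)"
        for answer in answers
    )
    print(f"{region}: {cells}")
```

Row percentages are often easier to compare than raw counts when regions have different numbers of reps.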

How to use the Chi-square test to check the statistical significance

Let's say you have a hunch that there's a relationship between two or more data categories in your survey responses. You can use the Pearson Chi-square test to check whether the data supports your theory or whether the pattern could plausibly be down to chance. Follow these five steps to test whether the link is statistically significant.

Step 1: Create your statistical hypothesis

After you've noticed a potential correlation between factors, the first step is to state your hypotheses. You need two: a null hypothesis and an alternative hypothesis. For example:

  • Null hypothesis: gender and product preference are independent (meaning they’re not connected)
  • Alternative hypothesis: gender and product preference are dependent (meaning they are connected)

Step 2: Set your p-value threshold

Chi-square analysis hinges on the "p-value": the probability of seeing a pattern at least as strong as the one in your data if the variables you're examining were actually independent of each other. Before you run the test, choose a threshold (the significance level) to compare it against. You can set this threshold wherever you like, but the most common is 0.05, or 5%.

But what does this mean? After calculating your Chi-square statistic, if the p-value is greater than 0.05, you can't reject the null hypothesis that your tracked variables are independent. The pattern you noticed may well be a coincidence.

On the flip side, if the p-value is less than 0.05 (5%), the result is statistically significant: a pattern this strong would rarely appear by chance if the variables were unrelated, so it's reasonable to conclude they're linked.

If you want to be extra sure that your two categorical variables are related, you could choose a stricter threshold of 0.01 or 1%. But in general, a p-value of under 5% is considered statistically significant.

Step 3: Calculate your expected and actual values

Use the Chi-square calculation to determine the p-value of your particular categories and check it against your benchmark p-value above.

In this example, the table shows whether men and women prefer Product X or Y from a brand's product line. The data is grouped by product preference below. These are the actual values, also known as the observed values.

Data in Google Sheets (Image Source: Google Sheets)

To determine your p-value, start with the data's actual (observed) values: add up each row and each column in the table, then work out the overall total.

Calculating your p-value (Image Source: Google Sheets)

The next step is calculating the expected value for each entry. You can do this by multiplying that entry's row total by its column total and then dividing by the overall total.

Calculating the expected value (Image Source: Google Sheets)

This becomes:

Results from calculating the expected value (Image Source: Google Sheets)
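
The expected-value calculation can be sketched in a few lines of Python. The observed counts here are hypothetical (the figures in the screenshots aren't reproduced), and expected_counts is an illustrative helper name rather than a function from any library.

```python
# Hypothetical observed counts (rows: Men, Women; columns: Product X, Product Y).
# These numbers are made up for illustration and are not the article's data.
observed = [
    [30, 20],
    [10, 40],
]


def expected_counts(table):
    """Expected count per cell = row total * column total / overall total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [row_total * col_total / grand_total for col_total in col_totals]
        for row_total in row_totals
    ]


expected = expected_counts(observed)
for row in expected:
    print(row)

# Sample-size rule of thumb, applied here to the expected counts:
# collect any cell that falls below 5.
too_small = [value for row in expected for value in row if value < 5]
print("Cells below 5:", len(too_small))
```

With these made-up counts, both rows total 50 and the columns total 40 and 60, so every row gets the same expected split of 20 and 30.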

Now, for each entry, subtract the expected value from the observed value, square the difference, and then divide the result by the expected value.

Calculating the Chi-square value (Image Source: Google Sheets)

This becomes:

Calculating the figures needed to determine the Chi-square value (Image Source: Google Sheets)

Add all these values to give us our Chi-square value: 14.92 + 20.45 + 13.19 + 18.08 = 66.64.

(Now pat yourself on the back! That was a lot of maths.)
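
If you'd like a script to do that arithmetic, here's a minimal Python sketch of the Chi-square sum. The observed and expected tables are hypothetical examples, and chi_square_statistic is an illustrative name; the last lines simply re-add the four per-cell figures quoted above.

```python
def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts, made up for illustration (not the article's data).
observed = [[30, 20], [10, 40]]
expected = [[20.0, 30.0], [20.0, 30.0]]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 2))

# The four per-cell figures quoted in this step add up the same way.
article_total = 14.92 + 20.45 + 13.19 + 18.08
print(round(article_total, 2))
```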

Step 4: Calculate your degrees of freedom

To determine the degrees of freedom, or the maximum number of independent values that have the freedom to change, use the following formula.

Degrees of freedom = (rows − 1) × (columns − 1)

There are 2 rows and 2 columns in this example table, so (2 − 1) × (2 − 1) = 1. So, your degrees of freedom = 1.
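
The same formula as a small Python helper, in case your table is larger than 2 × 2. The function name and the sample tables are illustrative assumptions; only the counts themselves go in the table, not the row or column totals.

```python
def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a table given as a list of rows."""
    rows = len(table)
    columns = len(table[0])
    return (rows - 1) * (columns - 1)


# A 2 x 2 table, like the gender and product example.
print(degrees_of_freedom([[30, 20], [10, 40]]))

# A 3 x 3 table, such as three age groups by three work locations.
print(degrees_of_freedom([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```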

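Step 5: Use a p-value calculator

The final step is to take your Chi-square value of 66.64 from step 3 and your degrees of freedom value of 1 from step 4.

Plug these into a reliable p-value calculator for the Chi-square distribution, then compare the p-value output against the threshold you set in step 2. A Chi-square value of 66.64 with 1 degree of freedom gives a p-value far below 0.05, so in this example you'd reject the idea that gender and product preference are independent.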
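
For readers who want to see what such a calculator does, here's a rough Python sketch that uses only the standard library. It relies on the closed-form series for the upper tail of the Chi-square distribution with whole-number degrees of freedom; chi_square_p_value is a hypothetical helper name, and if SciPy is installed, scipy.stats.chi2.sf is the more usual choice.

```python
import math


def chi_square_p_value(statistic, dof):
    """Upper-tail probability of the Chi-square distribution for integer dof."""
    if statistic < 0 or dof < 1:
        raise ValueError("statistic must be >= 0 and dof must be >= 1")
    half = statistic / 2
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. dof/2 - 1
        term = 1.0
        total = 1.0
        for i in range(1, dof // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) plus
    # sqrt(2/pi) * exp(-x/2) * sum of x^(r - 1/2) / (1*3*...*(2r - 1))
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(statistic)  # the r = 1 term
    total = 0.0
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= statistic / (2 * r - 1)
        total += term
    return tail + math.sqrt(2 / math.pi) * math.exp(-half) * total


# Chi-square value and degrees of freedom from steps 3 and 4.
p_value = chi_square_p_value(66.64, 1)
print(p_value)
print("Significant at 0.05:", p_value < 0.05)
```

A handy sanity check is that a Chi-square value of about 3.84 with 1 degree of freedom sits right at the 0.05 threshold.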

Use Paperform integrations to start your cross-tabulation analysis

Cross-tabulation is a crucial part of statistical data analysis for surveys. It helps you unravel a heap of numbers and information to make meaningful connections between the variables.

If you've already collected the data, why not embrace an analytical approach and let the numbers tell their story? You might even uncover gems of intel that your competitors still need to figure out.

Paperform is the perfect tool to bridge the gap between form data and cross-tab analysis. Our digital Swiss army knife integrates with 2,000+ other popular apps, including Google Sheets. You can also export all your form results to a CSV file at any time, and import it into your spreadsheet of choice to start cross-tabulation.

Start gathering your survey data today with Paperform’s 14-day free trial, and let the cross-tabulation begin.

About the author
Rebecca Noori
Freelance Contributor
Rebecca Noori is a freelance content marketer who writes about productivity and HR solutions for B2B SaaS companies. You’ll also find her supporting beginner freelancers when they’re starting out.
