Week 3: Crosstabs
This is an Individual Discussion. The discussion prompt is a mini-assignment where students post statistical analyses conducted via SPSS. The instructor then gives feedback within the discussion as a response post. Once feedback is given, students are able to make corrections (if needed) within response posts to the instructor’s feedback. If no corrections are needed, students should respond and acknowledge they have read the instructor’s feedback.
Initial Discussion posts are due Wednesday. All interaction and corrections should be completed by Sunday. There is no interaction with peers. The responses are only visible to each individual student and the instructor. Initial posts should be thorough, completing all tasks given in the discussion prompt. All posts should demonstrate college-level writing skills. You can review the general Discussion Guidelines here as well.
This week we talk about the uses of a crosstabulation (crosstabs) and the benefits of creating this “snapshot” of your data.
For this discussion, begin with a brief introduction to your study as a reminder of what your project is about. Include:
1. Your overall research question
2. A broad research hypothesis; that is, the relationship of your independent variable (IV) to your dependent variable (DV). (For example, educational attainment affects family income in US adults.)
Next, create two crosstabs and include them in your post: one for your IV and DV, and one for your control variable and the DV. Be sure to explain your findings, including a description of the data, a calculation of the epsilons, and a discussion of the 10% rule for each crosstab. In short, an epsilon is the difference between the highest and lowest column percentages in a given row. As long as at least one epsilon reaches the 10% threshold (a difference of 10 percentage points), we will treat the two variables as related “enough” to warrant further statistical analysis.
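The epsilon arithmetic can also be checked by hand or outside SPSS. The following is a minimal Python sketch for illustration only, not part of the assignment. The counts, category labels, and function names are hypothetical, and the sketch assumes the IV defines the columns, the DV defines the rows, and an epsilon of exactly 10 counts as meeting the rule.

```python
# Hypothetical counts: rows are DV categories, columns are IV categories.
counts = {
    "Low income":  {"No degree": 60, "Degree": 20},
    "High income": {"No degree": 40, "Degree": 80},
}

def column_percentages(table):
    """Convert raw counts to column percentages (each column sums to 100)."""
    columns = list(next(iter(table.values())))
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        label: {c: 100.0 * row[c] / totals[c] for c in columns}
        for label, row in table.items()
    }

def epsilons(table):
    """Epsilon per row: highest minus lowest column percentage in that row."""
    pct = column_percentages(table)
    return {label: max(row.values()) - min(row.values()) for label, row in pct.items()}

def meets_ten_percent_rule(table, threshold=10.0):
    """True if at least one row's epsilon reaches the threshold (assumed inclusive)."""
    return any(e >= threshold for e in epsilons(table).values())

print(epsilons(counts))
print(meets_ten_percent_rule(counts))
```

In this made-up table each column contains 100 cases, so the column percentages equal the counts, and both rows have an epsilon of 40 percentage points.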
When a variable is continuous (interval/ratio level of measurement), such as age of respondent, we do not run a crosstab on it directly, because the result is a sprawling table with many empty and low-frequency cells. Such a crosstab does not help us understand the data. Instead, first recode the variable to reduce its level of measurement to ordinal or nominal (that is, group the values into categories), and then run the crosstab. (Please refer to the Lessons for further information.)
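To illustrate the idea of recoding, here is a small Python sketch that collapses age in years into four ordinal groups. The cut points, group labels, sample ages, and the function name recode_age are all hypothetical; choose categories that make sense for your own variable and follow the Lessons for the SPSS steps.

```python
from collections import Counter

def recode_age(age):
    """Collapse age in years into an ordinal category (hypothetical cut points)."""
    if age < 30:
        return "18-29"
    if age < 50:
        return "30-49"
    if age < 65:
        return "50-64"
    return "65+"

# Made-up ages for ten respondents.
ages = [19, 23, 34, 41, 47, 52, 58, 66, 71, 88]
age_groups = [recode_age(a) for a in ages]

# Frequency of each recoded category, ready to serve as crosstab columns.
group_counts = Counter(age_groups)
print(group_counts)
```

Ten distinct ages would have produced ten nearly empty columns in a crosstab; after recoding there are four columns with two or three cases each.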
When you run your crosstabs, be sure to also include a measure of association. As a reminder, here are the guidelines:
Both DV and IV are nominal variables: Lambda (when it is not a 2X2 table)
Both DV and IV are nominal variables and it is a 2X2 table: Phi
Both DV and IV are ordinal variables: Gamma
One variable ordinal AND the other variable dichotomous nominal (like Yes/No, male/female, etc.): Gamma
One variable ordinal AND the other variable nominal (not dichotomous, has more than 2 categories): Cramer’s V
Both DV and IV are I/R variables: Pearson’s r
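The guidelines above amount to a small decision rule, which can be written out as a Python sketch. The function name choose_measure and the level labels "nominal", "ordinal", and "interval_ratio" are hypothetical conventions for this illustration. Combinations the guidelines do not mention (for example, one nominal and one interval/ratio variable) raise an error rather than guess.

```python
def choose_measure(iv_level, dv_level, iv_categories=0, dv_categories=0):
    """Return the measure of association named in the guidelines.

    Levels are "nominal", "ordinal", or "interval_ratio"; category counts
    matter only when a nominal variable is involved.
    """
    levels = {iv_level, dv_level}
    if levels == {"nominal"}:
        is_two_by_two = iv_categories == 2 and dv_categories == 2
        return "Phi" if is_two_by_two else "Lambda"
    if levels == {"ordinal"}:
        return "Gamma"
    if levels == {"nominal", "ordinal"}:
        # A dichotomous nominal variable is treated like an ordinal one.
        nominal_categories = iv_categories if iv_level == "nominal" else dv_categories
        return "Gamma" if nominal_categories == 2 else "Cramer's V"
    if levels == {"interval_ratio"}:
        return "Pearson's r"
    raise ValueError("combination not covered by the guidelines")

print(choose_measure("nominal", "ordinal", iv_categories=2, dv_categories=4))
```

For example, a Yes/No IV paired with a four-category ordinal DV returns Gamma, while a three-category nominal IV paired with the same DV returns Cramer's V.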
Be sure to discuss the strength and direction of the potential relationship between the variables. Keep in mind that measures of association such as lambda and gamma are based on Proportional Reduction in Error (PRE). Thus the format of interpretation will be: Knowing the IV will reduce error in predicting the DV by X%.
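To make the PRE logic concrete, the sketch below computes lambda by hand for a made-up 2x3 table, treating the row variable as the DV. It is an illustration under stated assumptions, not SPSS output: the counts, labels, and the function name lambda_pre are hypothetical, and the function does not handle the edge case where no prediction errors are made without the IV.

```python
# Hypothetical counts: rows are DV categories, columns are IV categories.
counts = {
    "Rents": {"Urban": 60, "Suburban": 30, "Rural": 10},
    "Owns":  {"Urban": 40, "Suburban": 70, "Rural": 90},
}

def lambda_pre(table):
    """Lambda with the row variable as the DV: (E1 - E2) / E1."""
    columns = list(next(iter(table.values())))
    n = sum(sum(row.values()) for row in table.values())
    # E1: errors made by guessing the modal DV category for every case.
    errors_without_iv = n - max(sum(row.values()) for row in table.values())
    # E2: errors made by guessing the modal DV category within each IV column.
    errors_with_iv = sum(
        sum(row[c] for row in table.values()) - max(row[c] for row in table.values())
        for c in columns
    )
    return (errors_without_iv - errors_with_iv) / errors_without_iv

lam = lambda_pre(counts)
interpretation = f"Knowing the IV will reduce error in predicting the DV by {lam:.0%}."
print(interpretation)
```

Here guessing "Owns" for everyone yields 100 errors out of 300 cases, while guessing the most common DV category within each IV column yields 80 errors, so lambda is (100 - 80) / 100 = 0.20, or a 20% reduction in error.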