1B: Exploring Sample Data

How did you do? Did you think one car performed better than the other? What data were you paying attention to as you were racing?

Before we analyze the data your class has collected, let’s look at a sample dataset called sample1. In the visualization app below, the Group ID selects which dataset is shown; sample1 is a sample taken from a previous group of players. The X Variable can be Car, Order, or Player ID. The Y Variable can be Finish Time or Top Speed Reached. Try switching between different X and Y variables to see how the graph changes. Then complete the questions below the app to make sure you understand how the app works.


To use the app, start with the following settings, then answer the questions below.
  • Group ID:   sample1
  • X Variable:   Car
  • Y Variable:   Finish Time
  • Check:   Add Boxplot
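
If you are curious what the Add Boxplot option summarizes, the short Python sketch below computes the five numbers a boxplot displays (minimum, first quartile, median, third quartile, maximum) for each car, and flags values more than 1.5 times the interquartile range beyond the quartiles. This is only an illustration: the car names and finish times are made-up values, not the sample1 data, and the app may use a different quartile method or outlier rule.

```python
from statistics import quantiles

# Hypothetical finish times in seconds, keyed by car. These are NOT the sample1 data.
finish_times = {
    "Car A": [41.2, 39.8, 44.5, 40.1, 42.7, 58.3],
    "Car B": [43.0, 45.6, 42.2, 47.1, 44.4, 43.9],
}


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot displays."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def outliers(values):
    """Return values more than 1.5 * IQR below Q1 or above Q3.

    This is a common boxplot convention, assumed here for illustration.
    """
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


for car, times in finish_times.items():
    print(car, five_number_summary(times), "possible outliers:", outliers(times))
```

A flagged value is only a prompt to look more closely at that race. Whether it should be removed depends on what happened during the race, not on the rule alone.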



1C: Get Curious

    1. To make good decisions with data, you often need to consider more than what you can see in a single graph or table. Sometimes an outside influence that the original experiment did not account for affects the outcome. For example, confounding variables (variables that the researcher did not include in the study but that might be connected to both the independent variable and the dependent variable) may influence the results. To evaluate whether the two cars truly differ in speed, we should identify and consider possible confounding variables. List at least two potential confounding variables that might get in the way of determining which car is faster.

    2. In the Racer App above, replace sample1 with the Group ID used by your class.
      a. Look carefully through the data from your class. Identify any players that you think should be removed from the dataset.
      b. Do the results from your class agree with the sample1 data? In what ways (if any) do they disagree?
      c. Do you believe the tutorial data from your class provides convincing evidence that one car is better than the other?
      d. Based on the previous question, what evidence did you use to determine that one car was better than another, or what additional evidence would be needed before you could determine that?

    3. Think about different ways we could have conducted this study to improve the quality of our data.
      a. Are there better variables to use? For example, what would be the benefits of using Top Speed Reached as the response variable instead of Finish Time?
      b. What are the benefits of randomizing the order of the cars?
      c. How many races would need to be completed to convince you that one car is better than the other?
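
As you think about the question on randomizing the order of the cars, it may help to see what randomizing would look like in practice. The Python sketch below gives each player an independently shuffled order in which to race the cars. The function name `assign_car_order`, the car names, and the player IDs are all hypothetical and chosen for illustration; this is not how the Racer app assigns cars.

```python
import random


def assign_car_order(player_ids, cars=("Car A", "Car B"), seed=None):
    """Give each player an independently shuffled order in which to race the cars.

    Car names are placeholders. Pass a seed to make the assignment reproducible.
    """
    rng = random.Random(seed)
    orders = {}
    for player in player_ids:
        order = list(cars)
        rng.shuffle(order)  # each ordering of the cars is equally likely
        orders[player] = order
    return orders


players = ["P1", "P2", "P3", "P4"]  # hypothetical player IDs
orders = assign_car_order(players, seed=1)
for player, order in orders.items():
    print(player, "races", " then ".join(order))
```

Because chance decides which car each player drives first, anything that changes from one race to the next is not systematically tied to one car.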


     




    Questions?

    If you have any questions or comments, please email us at DASIL@grinnell.edu

    Dataspace is supported by the Grinnell College Innovation Fund and was developed by Grinnell College faculty and students. Partial support provided by the Transforming Undergraduate Education in Science (TUES) program at the National Science Foundation under DUE #0510392, DUE #1043814, and DUE #1712475. Copyright © 2021. All rights reserved.

    This page was last updated on 5 August 2022.