Tutor profile: Zac W.
Hi Zac, can you explain what correlation is and how it relates to regression?
Correlation measures the linear association between two variables in a data set. Said differently, correlation tells you how strongly, and in which direction, the points on the X-Y plane follow a straight line. The correlation coefficient, "r", is the measure that we use to quantify the strength of this relationship. The correlation coefficient can take on values from -1 to 1. When "r" = 1, all the data points fall perfectly on an upward-sloping line. If "r" = -1, then all the data points fall perfectly on a downward-sloping line. If "r" = 0.7, then the data is a little scattered, but Y generally increases as X increases. As "r" gets closer to +1 or -1, the relationship gets stronger, while "r" approaching 0 indicates that the linear association between the variables is weaker. Simple linear regression (one input variable) uses the correlation coefficient to build a line that predicts an output from a given input. For instance, we can build a linear regression model that predicts the price of a home from the square footage of the home. This model constructs a line through the data that "best fits" (I can explain that concept later) the data set. The correlation coefficient is used to calculate the slope coefficient (which is called "beta") of the line. The slope coefficient = "r" * (Sy/Sx). In simpler words, the slope coefficient is equal to the correlation coefficient multiplied by the standard deviation of the data on the Y-axis divided by the standard deviation of the data on the X-axis. To summarize, we look at the correlation coefficient to determine the strength of the linear association between two variables in a data set. With this strength, we can calculate the slope of the line that we will use to predict some "y" given some "x" (e.g. predicting the price of a home from the size of the home).
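To make the slope formula concrete, here is a minimal sketch in Python using only the standard library. The square footage and price figures are made up for illustration, and the function names (`pearson_r`, `predict`) are my own, not from any particular textbook. The intercept line uses the standard fact that the best-fit line passes through the point (mean of x, mean of y).

```python
import math
import statistics

# Hypothetical data: home size in square feet (x) and price in $1000s (y).
sqft = [1000, 1500, 2000, 2500, 3000]
price = [200, 260, 340, 390, 460]


def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


r = pearson_r(sqft, price)

# Sample standard deviations of x and y (Sx and Sy).
s_x = statistics.stdev(sqft)
s_y = statistics.stdev(price)

# Slope coefficient ("beta") = r * (Sy / Sx).
slope = r * (s_y / s_x)

# The best-fit line passes through (mean of x, mean of y).
intercept = statistics.mean(price) - slope * statistics.mean(sqft)


def predict(square_feet):
    """Predicted price (in $1000s) for a home of the given size."""
    return intercept + slope * square_feet


print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.1f}")
print(f"Predicted price for 2200 sq ft: {predict(2200):.1f}")
```

Because this made-up data sits very close to a line, "r" comes out just below 1. Scattering the prices more would lower "r" and, with the same standard deviations, flatten the slope.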
Hi Zac, in my stats class this week we learned about different distributions. Can you explain the difference between the normal distribution and skewed distributions?
The normal distribution has a symmetric, bell-shaped curve. The median and the mean are equal and both sit at the exact center of this curve. Keep in mind, the median is the 50th percentile, which is the middle value when the data is sorted, while the mean is the average of the whole data set. For skewed distributions, the median and mean are generally no longer equal to each other. This happens because the mean is pulled in the direction of any extreme values, while the median is not. If a distribution is skewed to the right, there are data points that are significantly higher than the rest of the observations in the population, which pulls the mean higher than the median. The opposite is true for a distribution that is skewed to the left. Visually speaking, the direction of the skew is the side where the curve is much thinner and stretches away from the majority of the data. So, if the bulk of the curve is on the right, and there is a thin tail off to the left, the distribution is skewed to the left, because this indicates that there are data points to the left that skew the average and pull it below the median.
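A quick way to see the mean get pulled toward a tail is to compare the mean and median on a few small data sets. This is only a sketch with made-up numbers (five values cannot really show a bell curve), but the direction of the pull is the same as in a full distribution.

```python
import statistics

# Hypothetical data sets, chosen so that the median is 50 in each.
symmetric = [40, 45, 50, 55, 60]       # balanced around the center
right_skewed = [40, 45, 50, 55, 200]   # one very high value (tail on the right)
left_skewed = [-100, 45, 50, 55, 60]   # one very low value (tail on the left)

for name, data in [
    ("symmetric", symmetric),
    ("right-skewed", right_skewed),
    ("left-skewed", left_skewed),
]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name}: mean = {mean}, median = {median}")
```

In the symmetric set the mean equals the median. In the right-skewed set the mean lands above the median, and in the left-skewed set it lands below, while the median stays put in all three.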
Hi Zac, in my accounting class this week, we learned about the Income Statement and Balance Sheet. Can you explain to me the differences between the two and their relationship?
The Income Statement provides information about a company's financial performance during a certain period (typically a year or a quarter). A company records Revenues and Expenses during this time period to calculate either Net Income or a Net Loss. The Balance Sheet provides a snapshot of a company's Assets, Liabilities, and Equity balances as of a certain date (typically year-end or quarter-end). This differs from the Income Statement in that the Balance Sheet shows account balances at a point in time, while the Income Statement records earnings over a time period. The key relationship between the Income Statement and Balance Sheet is that Net Income (an Income Statement account) is closed out to Retained Earnings (a Balance Sheet account). In this way, Net Income connects a company's performance during one period to its cumulative earnings over multiple periods.
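Here is a small sketch of that closing step with made-up figures. It is a simplification: it ignores dividends and any other adjustments to Retained Earnings, and the variable names are only for illustration.

```python
# Income Statement: activity over one period (hypothetical figures).
revenues = 500_000
expenses = 420_000
net_income = revenues - expenses  # a negative result would be a Net Loss

# Balance Sheet: Retained Earnings balance at the start of the period.
beginning_retained_earnings = 150_000

# Closing entry: Net Income for the period is added to Retained Earnings.
# Simplifying assumption: no dividends or other adjustments this period.
ending_retained_earnings = beginning_retained_earnings + net_income

print(f"Net Income for the period: {net_income}")
print(f"Ending Retained Earnings:  {ending_retained_earnings}")
```

The ending balance then becomes the beginning Retained Earnings for the next period, which is how each period's performance accumulates on the Balance Sheet.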