How are impossible values and missing values represented in the R language?
Missing values are represented as NA ('Not Available'), while impossible values, such as the result of 0/0, are represented as NaN ('Not a Number'). Simply deleting missing values is generally considered poor practice, because a likely cause of missing values is a problem with the data collection process. The engineer should first find the root cause of the missing values and take the necessary steps to resolve it.
What is the difference between supervised learning and unsupervised learning?
Supervised learning requires that the training data is labelled; in other words, the target variable must be present in the dataset. For example, to perform a supervised task such as classification, the data in the training set must first be labelled so that the model can be trained on it. Unsupervised learning, on the other hand, does not require labelled data: it looks for structure, such as clusters, in the inputs alone.
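To make the contrast concrete, here is a minimal sketch in plain Python. The toy data, the function names (fit_nearest_centroid, predict, two_means) and the choice of a nearest-centroid classifier and a two-cluster k-means are all assumptions made for illustration, not something the answer above prescribes. The supervised function needs the labels to learn; the unsupervised one never sees them.

```python
# Toy 1-D data (made up for illustration): one group near 1.0, one near 5.0.
points = [0.8, 1.0, 1.3, 4.7, 5.0, 5.4]
labels = ["low", "low", "low", "high", "high", "high"]  # only used when supervised


def fit_nearest_centroid(xs, ys):
    """Supervised: use the labels to compute one centroid per class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(xs, iterations=10):
    """Unsupervised: split xs into two clusters without using any labels.

    Assumes xs contains at least two distinct values.
    """
    c0, c1 = min(xs), max(xs)  # simple deterministic starting centres
    for _ in range(iterations):
        cluster0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        cluster1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0 = sum(cluster0) / len(cluster0)
        c1 = sum(cluster1) / len(cluster1)
    return cluster0, cluster1


centroids = fit_nearest_centroid(points, labels)
print(predict(centroids, 1.1))  # classified using what the labels taught the model
print(two_means(points))        # groups found from the values alone
```

Note that the clustering step returns groups but no names for them: without labels, the algorithm can say which points belong together, not what each group means.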
What is statistical power?
Statistical power is the probability that a study will detect an effect when the effect is really present. The higher the statistical power, the less likely you are to conclude that there is no effect when there actually is one. That mistake is known as a Type II error, so power equals 1 minus the Type II error rate.
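As a rough illustration of the relationship between power and the Type II error rate, the sketch below computes the power of a two-sided one-sample z-test using only the Python standard library. The test choice, the function name z_test_power and the example numbers (effect size 0.5, standard deviation 1.0, sample size 30, significance level 0.05) are assumptions picked for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test with known sigma.

    effect: true difference between the real mean and the null mean
    sigma:  known population standard deviation
    n:      sample size
    alpha:  significance level (Type I error rate)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection threshold
    shift = effect * sqrt(n) / sigma            # mean of the test statistic when the effect is real
    # Probability that the statistic falls in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


power = z_test_power(effect=0.5, sigma=1.0, n=30)
beta = 1 - power  # Type II error rate: missing an effect that is present
print(f"power = {power:.3f}, Type II error rate = {beta:.3f}")
```

Calling the function with a larger n or a larger effect shows power rising and the Type II error rate falling, which is why sample size is usually chosen with a target power in mind.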