# Tutor profile: Mahitha T.

## Questions

### Subject: R Programming

R code: `y <- 1; f <- function(x) { y <- 2; y ^ 2 + g(x) }; g <- function(x) { x * y }`. Since y is assigned 2 inside f, I expected f(6) to be 16, so why is the output of f(6) 10?

This is because of lexical scoping in R. Under dynamic scoping, a free variable is looked up in the environment from which the function was called; under lexical scoping, it is looked up in the environment where the function was defined. The function f(x) returns y^2 + g(x). Inside f, y has been defined as 2, so y^2 is 4, and g(x) is called from inside f with x passed as 6. Now comes the catch: what is the value of y inside g? The function g is defined in the global environment, so y is looked up there and its value is 1. Therefore g(6) returns 6 * 1 = 6, and f(6) returns 4 + 6 = 10. The expected 16 would only result if g saw the y = 2 set inside f (4 + 6 * 2), which is what dynamic scoping would do.
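The lookup described above can be checked directly. The sketch below reproduces the code from the question and adds one variant, `f_nested` with an inner `g_inner`, which is a hypothetical illustration and not part of the original question. It shows that the result changes only when the helper is defined inside the function that sets y.

```r
y <- 1

g <- function(x) {
  # y is a free variable here; R resolves it where g was defined,
  # which is the global environment, so y is 1
  x * y
}

f <- function(x) {
  y <- 2          # local to f; g never sees this binding
  y ^ 2 + g(x)    # 4 + g(x)
}

result <- f(6)
print(result)     # 10, i.e. 4 + 6 * 1

# Hypothetical variant (names are illustrative): the helper is defined
# inside the function, so its free variable y resolves to the local y = 2
f_nested <- function(x) {
  y <- 2
  g_inner <- function(x) x * y
  y ^ 2 + g_inner(x)
}

nested_result <- f_nested(6)
print(nested_result)   # 16, i.e. 4 + 6 * 2
```

Where a function is called from makes no difference to the lookup; only where it is defined does.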

### Subject: Machine Learning

Why is 'Naive Bayes Algorithm' naive?

The Naive Bayes algorithm is based on Bayes' theorem, which translates to: posterior probability = (conditional probability * prior probability) / evidence. The conditional probability (the likelihood of the features given the class) is computed as the product of the individual conditional probabilities of each feature, which is only valid if the features are independent of one another given the class. Since this independence is rarely, if ever, met in practice, the algorithm is called 'naive'. In simple terms, if you like pasta and you like ice cream, Naive Bayes assumes the two are independent, serves you a pasta ice cream, and expects you to like it.
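Written out, the rule and the 'naive' step look roughly as follows. The symbols are assumptions introduced here for illustration: $C$ is the class and $x_1, \dots, x_n$ are the features.

$$
P(C \mid x_1, \dots, x_n) = \frac{P(x_1, \dots, x_n \mid C)\, P(C)}{P(x_1, \dots, x_n)},
\qquad
P(x_1, \dots, x_n \mid C) \approx \prod_{i=1}^{n} P(x_i \mid C)
$$

The first expression is Bayes' theorem itself. The second is the independence assumption: the joint likelihood is replaced by a product of per-feature terms.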

### Subject: Statistics

Given a data set of numbers, when is it safe to assume that the data is normally distributed?

The Central Limit Theorem states that, given ‘large enough’ sample sizes, the distribution of the sample means tends to a normal distribution. The interesting thing about the theorem is that it says nothing about the distribution of the individual observations in the sample or of the population at large, beyond requiring a finite variance. Then what is large enough? The answer depends on various factors, one of the most important being the shape of the population distribution from which the samples are taken. The closer that shape is to normal, the smaller the required sample size. For a population that is far from normal, such as the strongly skewed exponential distribution, a sample size of about n = 30 is a commonly quoted rule of thumb rather than a guarantee. In summary, the theorem justifies treating sample means as approximately normal, but it is never safe to simply assume that the data themselves are normally distributed.
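A small simulation can make the distinction between the data and the sample means concrete. This is a minimal sketch under stated assumptions: the seed, the number of repetitions, the rate of the exponential distribution, and the helper `skewness` are all chosen here for illustration and do not come from the original answer.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility

n_per_sample <- 30           # the rule-of-thumb sample size from the text
n_samples    <- 10000        # assumed number of repeated samples

# Raw data: individual draws from Exponential(rate = 1), strongly right-skewed
raw_data <- rexp(n_samples, rate = 1)

# Sample means: each entry is the mean of 30 exponential draws
sample_means <- replicate(n_samples, mean(rexp(n_per_sample, rate = 1)))

# Illustrative helper: third standardized moment, 0 for a symmetric distribution
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

skew_raw   <- skewness(raw_data)       # theoretical value is 2
skew_means <- skewness(sample_means)   # theoretical value is 2 / sqrt(30), about 0.37

print(c(raw = skew_raw, means = skew_means))
```

The sample means are much closer to symmetric than the raw draws, yet some skew remains at n = 30, so normality of the means is an approximation. The raw data stay skewed no matter how many observations are collected.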
