# Tutor profile: Katie G.

## Questions

### Subject: Study Skills

If you are particularly struggling with a reading assignment, what are ways you can make it easier?

Reading can sometimes feel sluggish and difficult, especially if you are burned out or just plain exhausted from studying. There are quite a few ways to help your brain absorb information from readings.

First, break the reading into more manageable sections --- chapters, pages, paragraphs, or even sentences. After reading a section, try to write down a summary of what was on the page. Use your own words, and feel free to look back at the reading for help. By writing down your own interpretation of the reading, you engage different parts of your brain and give the material more chances to stick.

Second, use active reading techniques! Try to come up with at least one question and one comment per page (even if it feels pointless) and write it either in the margins or on a sticky note. By engaging with the material, even at a surface level, you make the task more interesting for your brain!

If your reading has terms you need to remember, keep a glossary of terms separate from your notes. This way you have quick access to them, and a glossary also makes it easy to create flashcards if those help you study. Finally, if you have time, try reading over the material once before coming back and actually studying it. That way you have the general context of what you are reading and a better understanding of what is important.

### Subject: Statistics

Given two random variables $$Y_1$$ and $$Y_2$$ with joint distribution function $( f_{Y_1,Y_2}(y_1,y_2) = \begin{cases} 2(1-y_1), & 0 \le y_1 \le 1,\; 0\le y_2\le 1,\\ 0, & \text{otherwise,} \end{cases} $) find the probability distribution of $$U=Y_1Y_2$$.

This problem is asking for a bivariate transformation. Every bivariate transformation has three steps:

1. Find inverse functions for the transformation.
2. Find the Jacobian determinant of those inverse functions.
3. Find the distribution using a general formula.

First we need to find two transformation functions of the form $(u_1=h_1(y_1,y_2)\quad u_2=h_2(y_1,y_2)$) and their inverses of the form $(y_1=h_1^{-1}(u_1,u_2)\quad y_2=h_2^{-1}(u_1,u_2)$)

We are given $$U = Y_1Y_2$$, so let's make that the function for $$u_2$$ and find the inverse: $(u_2 = y_1y_2 = h_2(y_1,y_2) \implies y_2 = \frac{u_2}{y_1} = h_2^{-1}(y_1,u_2)$) This inverse function is not in the correct form --- $$h_2^{-1}(u_1,u_2)$$ --- so we need to choose $$u_1 = h_1(y_1,y_2)$$ such that $$h_1^{-1}$$ and $$h_2^{-1}$$ are entirely in terms of $$u_1$$ and $$u_2$$. The easiest choice seems to be $(u_1 = y_1 = h_1(y_1,y_2)\implies y_1 = u_1 = h_1^{-1}(u_1,u_2)$) $(u_2 = y_1y_2 = h_2(y_1,y_2) \implies y_2 = \frac{u_2}{u_1} = h_2^{-1}(u_1,u_2)$)

Next we need to find the Jacobian determinant of $$h_1^{-1}(u_1,u_2)$$ and $$h_2^{-1}(u_1,u_2)$$. The Jacobian matrix is simply the matrix of partial derivatives of the inverse functions with respect to both of their parameters: $(\begin{bmatrix} \frac{\partial h_1^{-1}}{\partial u_1} & \frac{\partial h_1^{-1}}{\partial u_2}\\ \frac{\partial h_2^{-1}}{\partial u_1} & \frac{\partial h_2^{-1}}{\partial u_2} \end{bmatrix} $) All of those derivatives are as follows: $$\frac{\partial h_1^{-1}}{\partial u_1} = \frac{\partial (u_1)}{\partial u_1} = 1$$ $$\frac{\partial h_1^{-1}}{\partial u_2} = \frac{\partial (u_1)}{\partial u_2} = 0$$ $$\frac{\partial h_2^{-1}}{\partial u_1} = \frac{\partial (u_2/u_1)}{\partial u_1} = -u_2/u_1^2$$ $$\frac{\partial h_2^{-1}}{\partial u_2} = \frac{\partial (u_2/u_1)}{\partial u_2} = 1/u_1$$ Plugging these into the Jacobian matrix, we get: $(\begin{bmatrix} 1 & 0\\ -u_2/u_1^2 & 1/u_1 \end{bmatrix} $) And now we have to find the determinant.
Recall $(\det\begin{bmatrix} a & b\\ c & d \end{bmatrix} = a\cdot d - b\cdot c $) So the determinant of our Jacobian matrix is $(J = \det\begin{bmatrix} 1 & 0\\ -u_2/u_1^2 & 1/u_1 \end{bmatrix} = 1\cdot \frac{1}{u_1} - 0\cdot\frac{-u_2}{u_1^2} = \frac{1}{u_1} $)

Now we can finally make a distribution. The general formula for the joint distribution of $$u_1$$ and $$u_2$$ is $(f_{U_1,U_2}(u_1,u_2)=f_{Y_1,Y_2}\big(h_1^{-1}(u_1,u_2), h_2^{-1}(u_1,u_2)\big)\lvert J \rvert$) While this may look complicated, it is really simple. Our original joint distribution for $$Y_1$$ and $$Y_2$$ is $$2(1-y_1)$$, so when we plug in the inverse functions we get $(f_{Y_1,Y_2}\big(h_1^{-1}(u_1,u_2), h_2^{-1}(u_1,u_2)\big) = \begin{cases}2(1-u_1), & 0\le u_1\le 1,\; 0\le u_2/u_1\le 1\\0,&\text{otherwise}\end{cases}$) And then we just need to multiply in the absolute value of the Jacobian to get our distribution: $(f_{U_1,U_2}(u_1,u_2) = \begin{cases}2(1-u_1)\lvert \frac{1}{u_1}\rvert, & 0\le u_1\le 1,\; 0\le u_2/u_1\le 1\\0,&\text{otherwise}\end{cases}$)

Cleaning this up a little bit: because all values are positive, we know $$\frac{1}{u_1}$$ will always be positive as well, so we can remove the absolute value. For the bounds, if $$0\le u_2/u_1\le 1$$, then we know $$u_2 \le u_1$$, or else the ratio would be greater than 1. So we can simplify the bounds to $$0 \le u_2 \le u_1 \le 1$$. Our joint distribution is therefore $(f_{U_1,U_2}(u_1,u_2) = \begin{cases}\frac{2(1-u_1)}{u_1}, & 0\le u_2 \le u_1\le 1\\0,&\text{otherwise}\end{cases}$)

Now, the problem asked for the probability distribution of $$U$$, not of $$u_1$$ and $$u_2$$. To get the final answer, we must integrate our joint distribution function with respect to $$u_1$$ to get the marginal distribution of $$U_2 = U$$. The bounds for the integral are the bounds for $$u_1$$, which are $$u_2 \le u_1 \le 1$$.
$(\int_{u_2}^1 \frac{2(1-u_1)}{u_1}\; du_1 = \int_{u_2}^1 \frac{2}{u_1}-2\; du_1 = \Big[2\ln u_1-2u_1\Big]_{u_2}^1 = 2(u_2 - \ln u_2 - 1)$) So our final answer is $( f_{U}(u) = \begin{cases} 2(u - \ln u - 1), & 0 < u\le 1\\ 0, & \text{otherwise} \end{cases} $)
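If you want to sanity-check a derivation like this, simulation is a great tool. The sketch below (a minimal, stdlib-only Python script; the sample size and the interval $$(0.1, 0.3]$$ are arbitrary choices) draws $$Y_1$$ by inverse-CDF sampling --- since $$F_{Y_1}(y) = 1-(1-y)^2$$, we can set $$y = 1-\sqrt{1-p}$$ for uniform $$p$$ --- draws $$Y_2$$ uniformly, and compares the simulated probability of $$U=Y_1Y_2$$ landing in an interval against the probability from numerically integrating $$f_U(u) = 2(u - \ln u - 1)$$.

```python
import math
import random

# Y1 has density 2(1 - y1) on [0, 1], so F(y) = 1 - (1 - y)^2 and
# inverse-CDF sampling gives y = 1 - sqrt(1 - p) for p ~ Uniform(0, 1).
# Y2 is Uniform(0, 1) and independent of Y1 (the joint density factors).
random.seed(0)
n = 200_000
samples = [(1 - math.sqrt(1 - random.random())) * random.random()
           for _ in range(n)]

def f_U(u):
    # The derived marginal density of U = Y1 * Y2 on (0, 1].
    return 2 * (u - math.log(u) - 1)

# Simulated vs. analytic probability that U falls in (a, b];
# the analytic value comes from a midpoint-rule integral of f_U.
a, b = 0.1, 0.3
empirical = sum(a < u <= b for u in samples) / n
steps = 1_000
width = (b - a) / steps
analytic = sum(f_U(a + (i + 0.5) * width) for i in range(steps)) * width
print(f"simulated P({a} < U <= {b}) = {empirical:.4f}")
print(f"analytic  P({a} < U <= {b}) = {analytic:.4f}")
```

The two numbers should agree to within Monte Carlo error, which is good evidence the transformation, Jacobian, and marginalization steps were carried out correctly.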

### Subject: Data Science

When dealing with data in high-dimensional spaces (i.e. there are many interconnected variables that can be used to uniquely identify a data point), data scientists can represent the data on a manifold. A manifold is a way to take complex data and project it onto a lower dimension without losing too much of the original structure. What are the main ways data scientists can perform regression of manifold-valued data?

The three main ways data scientists deal with manifold-valued data in regression are intrinsic manifold regression, kernel-based regression, and manifold-learning-based methods.

Intrinsic manifold regression takes typical linear regression techniques and applies them directly on the manifold. These methods need the manifold to be defined in some analytical way, meaning the program (or the data scientist) needs to know its exact shape. Sometimes this is easy! But when we deal with messy real-world data, exact definitions are often hard to come by, which makes linear regression techniques difficult to apply to many sets of manifold-valued data.

Kernel-based regression was originally developed for complex data types like trees and graphs. The basic idea is to map non-linear data into a high-dimensional inner product space --- a vector space equipped with an inner product, so distances and angles between mapped points are well defined. Once the data is mapped, many standard regression techniques become available. The drawbacks of kernel-based regression on some data sets are that (1) you need to define a kernel function, and (2) it needs a lot of computational power compared to other methods.

Manifold-learning-based methods reduce the data's dimensionality while attempting to preserve as much of its geometry as possible. They create a lower-dimensional embedding --- and if we have low-dimensional data, we don't need high-dimensional regression techniques --- and then map the results back to the manifold in a process called backscoring. Backscoring is an active research area in machine learning and statistics right now.
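To make the kernel idea concrete, here is a toy sketch of one of the simplest kernel-based regressors, Nadaraya-Watson smoothing, on ordinary 1-D data. The Gaussian kernel, the bandwidth value, and the sine-wave data are all illustrative assumptions, not a fixed recipe --- real manifold-valued pipelines would use a kernel suited to the data type.

```python
import math

def gaussian_kernel(x, xi, bandwidth):
    # Similarity between a query point x and a training point xi;
    # the bandwidth controls how quickly influence decays with distance.
    return math.exp(-((x - xi) ** 2) / (2 * bandwidth ** 2))

def kernel_regress(x, xs, ys, bandwidth=0.1):
    # Weight every training response by its kernel similarity to x,
    # then return the weighted average (Nadaraya-Watson estimator).
    weights = [gaussian_kernel(x, xi, bandwidth) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Noise-free samples of sin(2*pi*x); the smoother predicts between them.
xs = [i / 10 for i in range(11)]
ys = [math.sin(2 * math.pi * x) for x in xs]
print(kernel_regress(0.25, xs, ys))
```

Notice the two costs mentioned above: we had to pick a kernel function and a bandwidth, and every prediction touches every training point, which is why kernel methods get expensive on large data sets.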
