
# Tutor profile: Katie G.

Katie G.
University Student in Mathematics and Computer Science

## Questions

### Subject: Study Skills

TutorMe
Question:

If you are particularly struggling with a reading assignment, what are ways you can make it easier?

Katie G.

### Subject: Statistics

TutorMe
Question:

Given two random variables $$Y_1$$ and $$Y_2$$ with the joint density function $( f_{Y_1,Y_2}(y_1,y_2) = \begin{cases} 2(1-y_1), & 0 \le y_1 \le 1,\; 0\le y_2\le 1,\\ 0, & \text{otherwise,} \end{cases}$) find the probability distribution of $$U=Y_1Y_2$$.

Katie G.

This problem calls for a bivariate transformation. Every bivariate transformation follows three steps:

1. Find inverse functions for the transformation.
2. Find the Jacobian determinant of those inverse functions.
3. Find the distribution using a general formula.

First, we need two transformation functions of the form $(u_1=h_1(y_1,y_2)\quad u_2=h_2(y_1,y_2)$) and their inverses of the form $(y_1=h_1^{-1}(u_1,u_2)\quad y_2=h_2^{-1}(u_1,u_2)$)

We are given $$U = Y_1Y_2$$, so let's make that the function for $$u_2$$ and find the inverse: $(u_2 = y_1y_2 = h_2(y_1,y_2) \implies y_2 = \frac{u_2}{y_1}$)

This inverse still depends on $$y_1$$, so it is not yet in the required form $$h_2^{-1}(u_1,u_2)$$. We need to choose $$u_1 = h_1(y_1,y_2)$$ such that $$h_1^{-1}$$ and $$h_2^{-1}$$ are written entirely in terms of $$u_1$$ and $$u_2$$. The easiest choice seems to be $(u_1 = y_1 = h_1(y_1,y_2)\implies y_1 = u_1 = h_1^{-1}(u_1,u_2)$) $(u_2 = y_1y_2 = h_2(y_1,y_2) \implies y_2 = \frac{u_2}{u_1} = h_2^{-1}(u_1,u_2)$)

Next, we need the Jacobian determinant of $$h_1^{-1}(u_1,u_2)$$ and $$h_2^{-1}(u_1,u_2)$$. The Jacobian matrix collects the partial derivatives of each inverse function with respect to each of its arguments: $(\begin{bmatrix} \frac{\partial h_1^{-1}}{\partial u_1} & \frac{\partial h_1^{-1}}{\partial u_2}\\ \frac{\partial h_2^{-1}}{\partial u_1} & \frac{\partial h_2^{-1}}{\partial u_2} \end{bmatrix}$)

The four derivatives are:

$$\frac{\partial h_1^{-1}}{\partial u_1} = \frac{\partial (u_1)}{\partial u_1} = 1$$

$$\frac{\partial h_1^{-1}}{\partial u_2} = \frac{\partial (u_1)}{\partial u_2} = 0$$

$$\frac{\partial h_2^{-1}}{\partial u_1} = \frac{\partial (u_2/u_1)}{\partial u_1} = -u_2/u_1^2$$

$$\frac{\partial h_2^{-1}}{\partial u_2} = \frac{\partial (u_2/u_1)}{\partial u_2} = 1/u_1$$

Plugging these into the Jacobian matrix, we get: $(\begin{bmatrix} 1 & 0\\ -u_2/u_1^2 & 1/u_1 \end{bmatrix}$)

Now we have to find the determinant.
Recall that $(\det\begin{bmatrix} a & b\\ c & d \end{bmatrix} = a\cdot d - b\cdot c$) So the determinant of our Jacobian matrix is $(J = \det\begin{bmatrix} 1 & 0\\ -u_2/u_1^2 & 1/u_1 \end{bmatrix} = 1\cdot \frac{1}{u_1} - 0\cdot\frac{-u_2}{u_1^2} = \frac{1}{u_1}$)

Now we can finally build the distribution. The general formula for the joint density of $$U_1$$ and $$U_2$$ is $(f_{U_1,U_2}(u_1,u_2)=f_{Y_1,Y_2}\big(h_1^{-1}(u_1,u_2), h_2^{-1}(u_1,u_2)\big)\lvert J \rvert$)

This may look complicated, but it only requires substituting the inverse functions into the original density. Our original joint density for $$Y_1$$ and $$Y_2$$ is $$2(1-y_1)$$ on $$0\le y_1\le 1,\; 0\le y_2\le 1$$, so when we plug in $$y_1 = u_1$$ and $$y_2 = u_2/u_1$$ we get $(f_{Y_1,Y_2}\big(h_1^{-1}(u_1,u_2), h_2^{-1}(u_1,u_2)\big) = \begin{cases}2(1-u_1), & 0\le u_1\le 1,\; 0\le u_2/u_1\le 1\\0,&\text{otherwise}\end{cases}$)

Then we multiply by the absolute value of the Jacobian determinant: $(f_{U_1,U_2}(u_1,u_2) = \begin{cases}2(1-u_1)\lvert \frac{1}{u_1}\rvert, & 0\le u_1\le 1,\; 0\le u_2/u_1\le 1\\0,&\text{otherwise}\end{cases}$)

Cleaning this up a little: $$u_1$$ is positive wherever the density is nonzero, so $$\frac{1}{u_1}$$ is positive as well and we can drop the absolute value. For the bounds, $$0\le u_2/u_1\le 1$$ with $$u_1 > 0$$ means $$0 \le u_2 \le u_1$$, and combining this with $$0\le u_1\le 1$$ simplifies the bounds to $$0 \le u_2 \le u_1 \le 1$$. Our joint density is therefore $(f_{U_1,U_2}(u_1,u_2) = \begin{cases}\frac{2(1-u_1)}{u_1}, & 0\le u_2 \le u_1\le 1\\0,&\text{otherwise}\end{cases}$)

The problem asked for the distribution of $$U$$ alone, not of $$U_1$$ and $$U_2$$ jointly. To get the final answer, we integrate the joint density with respect to $$u_1$$ to obtain the marginal density of $$U_2 = U$$. The bounds for the integral are the bounds for $$u_1$$, which are $$u_2 \le u_1 \le 1$$.
$(\int_{u_2}^1 \frac{2(1-u_1)}{u_1}\; du_1 = \int_{u_2}^1 \left(\frac{2}{u_1}-2\right) du_1 = \Big[2\ln u_1-2u_1\Big]_{u_2}^1 = (0 - 2) - (2\ln u_2 - 2u_2) = 2(u_2 - \ln u_2 - 1)$)

Writing $$u$$ for $$u_2$$, our final answer is $( f_{U}(u) = \begin{cases} 2(u - \ln u - 1), & 0 < u\le 1\\ 0, & \text{otherwise} \end{cases}$) The lower bound is strict because $$\ln u$$ is undefined at $$u = 0$$.
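
As a sanity check on the result above, here is a minimal simulation sketch in Python. It is not part of the original answer, and the names `sample_u` and `cdf_u` are hypothetical. It relies on two assumptions derived from the given density: the joint density factors as $$2(1-y_1)\cdot 1$$, so $$Y_1$$ and $$Y_2$$ are independent with $$Y_2$$ uniform on $$[0,1]$$; and integrating $$f_U$$ from $$0$$ to $$u$$ gives the distribution function $$F_U(u) = u^2 - 2u\ln u$$.

```python
import math
import random


def sample_u(rng):
    """Draw one value of U = Y1 * Y2 under the joint density 2(1 - y1)."""
    # Y1 has marginal density 2(1 - y1), so its CDF is 1 - (1 - y1)^2.
    # Inverting that CDF turns a uniform draw into a draw of Y1.
    y1 = 1.0 - math.sqrt(1.0 - rng.random())
    # Y2 is uniform on [0, 1] and independent of Y1.
    y2 = rng.random()
    return y1 * y2


def cdf_u(u):
    """Integral of 2(t - ln t - 1) from 0 to u, valid for 0 < u <= 1."""
    return u * u - 2.0 * u * math.log(u)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200_000
samples = [sample_u(rng) for _ in range(n)]

# Compare the simulated and derived values of P(U <= u) at a few points.
checks = {}
for u in (0.1, 0.25, 0.5, 0.75):
    empirical = sum(s <= u for s in samples) / n
    checks[u] = (empirical, cdf_u(u))
    print(f"u={u}: simulated {empirical:.4f}, derived {cdf_u(u):.4f}")
```

If the derivation is right, the simulated and derived columns should agree to about two decimal places, and `cdf_u(1.0)` should equal 1, which confirms that the density integrates to 1.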

### Subject: Data Science

TutorMe
Question:

When dealing with data in high-dimensional spaces (i.e. there are many interconnected variables that can be used to uniquely identify a data point), data scientists can represent the data on a manifold. A manifold is a way to take complex data and project it onto a lower dimension without losing too much of the original structure. What are the main ways data scientists can perform regression of manifold-valued data?

Katie G.

The three main ways data scientists deal with manifold-valued data in regression are intrinsic manifold regression, kernel-based regression, and manifold learning-based methods.

Intrinsic manifold regression takes typical linear regression techniques and applies them directly on the manifold. These methods need the manifold to be defined analytically, which means the program (or the data scientist) needs to know its exact shape. Sometimes this is easy! But when we deal with messy real-world data, exact definitions are often hard to come by, which makes these techniques difficult to apply to many sets of manifold-valued data.

Kernel-based regression was originally developed for complex data types like trees and graphs. The basic idea is to take non-linear data and map it to a high-dimensional inner product space, that is, a vector space equipped with an inner product that lets us measure lengths and angles. Once the data lives in such a space, we have a lot of regression techniques at our disposal. The drawbacks of kernel-based regression are that (1) you need to define a kernel function, and (2) you need a lot of computational power compared to other methods.

Manifold learning-based methods reduce the data's dimensions while attempting to preserve as much of its geometry as possible. They create a lower-dimensional embedding, run the regression there, and then map the result back to the manifold in a process called backscoring. The advantage is that once we have low-dimensional data, we don't need high-dimensional regression techniques. Backscoring is an active research area in machine learning and statistics right now.
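
To make the kernel-based approach more concrete, here is a minimal sketch in plain Python. Everything specific in it is an assumption chosen for illustration rather than something stated in the answer above: the manifold is the unit circle, the kernel is a Gaussian of the straight-line (chordal) distance between two points on the circle, the regression technique is kernel ridge regression, and the function names are hypothetical.

```python
import math


def chordal_kernel(a, b, gamma=2.0):
    """Gaussian kernel between two angles, using chordal distance on the circle."""
    d2 = (math.cos(a) - math.cos(b)) ** 2 + (math.sin(a) - math.sin(b)) ** 2
    return math.exp(-gamma * d2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(angles, targets, lam=1e-3):
    """Kernel ridge regression: solve (K + lam * I) alpha = targets."""
    n = len(angles)
    K = [[chordal_kernel(angles[i], angles[j]) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, targets)


def predict(angles, alpha, query):
    """Prediction is a weighted sum of kernel values against the training points."""
    return sum(a * chordal_kernel(x, query) for a, x in zip(alpha, angles))


# Toy data: 20 evenly spaced points on the circle, with target sin(angle).
train_angles = [2 * math.pi * i / 20 for i in range(20)]
train_targets = [math.sin(t) for t in train_angles]

alpha = fit(train_angles, train_targets)
prediction = predict(train_angles, alpha, 0.4)
print(prediction, math.sin(0.4))
```

The sketch shows both costs named above: a kernel function has to be chosen by hand, and fitting requires solving a linear system whose size grows with the number of training points.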

## FAQs

What is a lesson?
A lesson is a virtual lesson space on our platform where you and a tutor can communicate. You'll have the option to communicate using video/audio as well as text chat. You can also upload documents, edit papers in real time and use our cutting-edge virtual whiteboard.
How do I begin a lesson?
If the tutor is currently online, you can click the "Start Lesson" button above. If they are offline, you can always send them a message to schedule a lesson.
Who are TutorMe tutors?
Many of our tutors are current college students or recent graduates of top-tier universities like MIT, Harvard and USC. TutorMe has thousands of top-quality tutors available to work with you.