Cannot Conclude That Inner City Students Have Better Physical Strength

a) With 95% confidence, estimate the average annual income of workers in that enterprise.

b) At factory B, the proportion of workers with an annual income of 6.5 million is 12%. With a significance level of 5%, can it be said that the proportion of workers with an income of 6.5 million in factory B is higher than in factory A?

4.20. Checking the weight of 36 sugar packets (nominal weight 500 grams) from an automatic packaging machine gives the following data table:

Weight (grams)	495	497	499	501	503
Number of packets	5	8	10	7	6

Assume the weight of a sugar packet is a normally distributed random variable.

a) Find a 95% confidence interval for the variance of the weight of the packaged sugar.

b) It is suspected that the packing machine is underfilling the packets. At the 5% significance level, draw a conclusion about this suspicion.

4.21. The average height of 100 male students in an inner-city high school is 1.68m, the adjusted sample deviation is 6m. Testing 120 students in a suburban district, the average height is 1.64m, the adjusted sample deviation is 5cm. With a significance level of 5%, can we conclude that inner-city students have different physical development?

4.22. Testing randomly selected products at two factories, we get the following data:

Factory	Number of products tested	Number of scraps
1	n_1 = 100	20
2	n_2 = 120	36

At the significance level α = 0.01, can the scrap rates of the two factories be considered the same?

4.23. Checking 100 products in the first warehouse finds 8 defective products; checking 150 products in the second warehouse finds 18 defective products. At the 0.05 significance level, can we conclude that the quality of goods in the two warehouses is different?

4.24. A company has a computer system that can process 1300 invoices in 1 hour. The company has just imported a new computer system, this system runs a test for 40 hours and shows that the average number of invoices processed in 1 hour is 1378 with a standard deviation of 215. At the 2.5% significance level, determine whether the new system is better than the old system?

INSTRUCTIONS AND ANSWERS TO CHAPTER 4 EXERCISES

4.1. Rejecting the shipper's opinion

4.2. The company should decide to open a supermarket.

4.3. Need to increase electricity consumption standards

4.6. The company still holds at least 42% of the market

4.13. Effective improvement

4.14. Reliable reporting

4.15. New effective techniques

4.21. It cannot be concluded that inner-city students have better physical strength.

4.22. The scrap rate in the two factories can be considered the same.

4.23. The quality of the two warehouses is not different.

CHAPTER 5: REGRESSION ANALYSIS

5.1. Multidimensional random variables

5.1.1. Concept of 2-dimensional random variable

a. Concept

In Chapter 2, we studied the probabilistic nature of individual random variables. In practice, however, we often have to consider several interrelated random variables at the same time, which leads to the concept of multidimensional random variables. For example, when studying a product on the market, we are interested in many aspects at once, such as material, color, price, and quality. Studying each of those aspects separately may give incomplete information. For simplicity, we study the 2-dimensional random variable (X, Y), where X and Y are one-dimensional random variables. Most of the results for 2-dimensional random variables can be extended to n-dimensional random variables.

If X and Y are discrete random variables, then (X, Y) is a discrete 2-dimensional random variable; if X and Y are continuous, then (X, Y) is a continuous 2-dimensional random variable.

b. Distribution function of 2-dimensional random variable

Consider two events $A = (X < x)$ and $B = (Y < y)$. The distribution function of the 2-dimensional random variable $(X, Y)$ is defined as follows:

$F(x, y) = P(AB) = P(X < x, Y < y), \quad \forall x, y \in \mathbb{R}$

(Sometimes $F(x, y)$ is also called the joint distribution function.)

* Properties of the distribution function $F(x, y)$

(i) $0 \le F(x, y) \le 1$.

(ii) $F(x, y)$ is non-decreasing in each argument.

(iii) $F(-\infty, y) = F(x, -\infty) = 0$; $F(+\infty, +\infty) = 1$.

(iv) If $x_1 < x_2$ and $y_1 < y_2$, then

$P(x_1 \le X < x_2,\; y_1 \le Y < y_2) = F(x_2, y_2) - F(x_2, y_1) - F(x_1, y_2) + F(x_1, y_1)$

- Two random variables X, Y are called independent if

$F(x, y) = F_1(x) \cdot F_2(y)$

where $F_1$ and $F_2$ are the distribution functions of X and Y, respectively.

5.1.2. Probability distribution of discrete 2-dimensional random variables

The probability distribution table of a discrete 2-dimensional random variable is:

x_i \ y_j	y_1	y_2	...	y_j	...	y_m
x_1	P(x_1, y_1)	P(x_1, y_2)	...	P(x_1, y_j)	...	P(x_1, y_m)
x_2	P(x_2, y_1)	P(x_2, y_2)	...	P(x_2, y_j)	...	P(x_2, y_m)
...	...	...	...	...	...	...
x_i	P(x_i, y_1)	P(x_i, y_2)	...	P(x_i, y_j)	...	P(x_i, y_m)
...	...	...	...	...	...	...
x_n	P(x_n, y_1)	P(x_n, y_2)	...	P(x_n, y_j)	...	P(x_n, y_m)

where:

$x_i$ ($i = 1, \dots, n$) are the possible values of the component X;

$y_j$ ($j = 1, \dots, m$) are the possible values of the component Y;

$P(x_i, y_j) = P(X = x_i, Y = y_j) = p_{ij}$, $i = 1, \dots, n$; $j = 1, \dots, m$, and $\sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) = 1$.

From the above definition, we can find the probability distribution function:

$F(x, y) = \sum_{x_i < x} \sum_{y_j < y} p_{ij}$

The marginal distributions of a 2-dimensional random variable are determined from:

$P(X = x_i) = p_1(x_i) = \sum_{j=1}^{m} p_{ij}, \quad i = 1, \dots, n$

$P(Y = y_j) = p_2(y_j) = \sum_{i=1}^{n} p_{ij}, \quad j = 1, \dots, m$

Example 5.1. Given a 2-dimensional random variable (X, Y) with the following joint distribution law:

x_i \ y_j	1	2
1	0.18	0.08
2	0.22	0.16
3	0.16	0.2

Find the distribution law of each variable X and Y.

Solution

Summing each row of the table gives the distribution law of X:

x_i	1	2	3
p_i	0.26	0.38	0.36

Summing each column gives the distribution law of Y:

y_j	1	2
p_j	0.56	0.44
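The row and column sums used in Example 5.1 can be checked with a few lines of Python. This is only a minimal sketch: the helper name `marginals` and the list-of-lists layout of the table are assumptions made for illustration, not part of the text.

```python
# Joint table of Example 5.1: rows are x = 1, 2, 3; columns are y = 1, 2.
joint = [
    [0.18, 0.08],
    [0.22, 0.16],
    [0.16, 0.20],
]

def marginals(table):
    """Return (row sums, column sums) of a joint probability table.

    Row sums give the distribution of X, column sums the distribution of Y.
    Values are rounded to suppress floating-point noise.
    """
    p_x = [round(sum(row), 10) for row in table]
    p_y = [round(sum(col), 10) for col in zip(*table)]
    return p_x, p_y

p_x, p_y = marginals(joint)
print("P(X = x_i):", p_x)
print("P(Y = y_j):", p_y)
```

Each list should sum to 1, which is a quick sanity check on the joint table itself.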

Example 5.2. Given the joint distribution table of X and Y:

x_i \ y_j	1	2	3
1	0.1	0.25	0.1
2	0.15	0.05	0.35

Find the probability distributions of the random variables X and Y, then calculate $F(2, 3)$ and $F(2, 2)$.

Solution

Taking the sums of the rows and of the columns, we have:

x_i	1	2
p_i	0.45	0.55

y_j	1	2	3
p_j	0.25	0.3	0.45

We have

$F(2, 3) = \sum_{x_i < 2} \sum_{y_j < 3} p_{ij} = p_{11} + p_{12} = 0.35$

$F(2, 2) = \sum_{x_i < 2} \sum_{y_j < 2} p_{ij} = p_{11} = 0.1$
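The two values of F in Example 5.2 can be reproduced by summing the table entries with $x_i < x$ and $y_j < y$. The sketch below assumes the strict-inequality convention $F(x, y) = P(X < x, Y < y)$ that the example's results imply; the function name `joint_cdf` is hypothetical.

```python
# Joint table of Example 5.2: rows are x = 1, 2; columns are y = 1, 2, 3.
x_vals = [1, 2]
y_vals = [1, 2, 3]
joint = [
    [0.10, 0.25, 0.10],
    [0.15, 0.05, 0.35],
]

def joint_cdf(x, y):
    """F(x, y) = P(X < x, Y < y), using strict inequalities."""
    total = 0.0
    for i, xi in enumerate(x_vals):
        for j, yj in enumerate(y_vals):
            if xi < x and yj < y:
                total += joint[i][j]
    return total

print(round(joint_cdf(2, 3), 10))  # p11 + p12
print(round(joint_cdf(2, 2), 10))  # p11
```

With the other common convention, $P(X \le x, Y \le y)$, the same table would give different values, so the choice of inequality matters for discrete variables.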

5.1.3. Probability distribution of continuous 2-dimensional random variables

A non-negative, continuous function $f(x, y)$ is called the probability density function of a 2-dimensional continuous random variable (X, Y) if it satisfies:

$P(X \in A, Y \in B) = \int_A dx \int_B f(x, y)\,dy$

where A and B are sets of real numbers.

The probability distribution function of a 2-dimensional random variable (X, Y), denoted $F(x, y)$, is defined as follows:

$F(x, y) = P(X < x, Y < y)$

We have

$F(x, y) = P(X < x, Y < y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du$

If $f(x, y)$ is continuous in both variables, then

$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y}$

The density function of a 2-dimensional random variable has the following properties:

(i) $f(x, y) \ge 0$

(ii) $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\,dx\,dy = 1$

(iii) $P((X, Y) \in D) = \iint_D f(x, y)\,dx\,dy$

Example 5.3. Given a 2-dimensional random variable (X, Y) with the joint density function

$f(x, y) = a(x^2 + y^2)$ if $x^2 + y^2 \le 1$,
$f(x, y) = 0$ if $x^2 + y^2 > 1$.

Find the coefficient a.

Solution

The coefficient a is determined from the equation

$a \iint_D (x^2 + y^2)\,dx\,dy = 1$

where D is the disk $x^2 + y^2 \le 1$.

Converting to polar coordinates, we have

$a \int_0^{2\pi} d\varphi \int_0^1 r^3\,dr = 1 \;\Rightarrow\; a \cdot 2\pi \cdot \frac{1}{4} = 1 \;\Rightarrow\; a = \frac{2}{\pi}$
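The normalization in Example 5.3 can be checked numerically. The following is a rough sketch that assumes the density is $a(x^2 + y^2)$ on the unit disk and approximates the double integral with a midpoint rule in polar coordinates; the function name and grid sizes are illustrative choices only.

```python
import math

def integral_over_unit_disk(g, n_r=200, n_phi=200):
    """Midpoint-rule approximation of the integral of g(x, y) over x^2 + y^2 <= 1."""
    dr = 1.0 / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for k in range(n_phi):
            phi = (k + 0.5) * dphi
            x, y = r * math.cos(phi), r * math.sin(phi)
            total += g(x, y) * r * dr * dphi  # the factor r is the polar Jacobian
    return total

# The density must integrate to 1, so a is the reciprocal of the integral of x^2 + y^2.
a = 1.0 / integral_over_unit_disk(lambda x, y: x * x + y * y)
print(a, 2.0 / math.pi)
```

The two printed numbers should agree to several decimal places, which supports the analytic value $a = 2/\pi$.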

Example 5.4. Given the joint density function of X and Y as

$f(x, y) = 1$ with $0 \le x, y \le 1$

(and $f(x, y) = 0$ elsewhere). Calculate the joint distribution function $F(x, y)$ and $P(0.2 \le X < 0.7;\; 0.25 \le Y < 0.45)$.

Solution

$F(x, y) = 0$ if $x \le 0$ or $y \le 0$;
$F(x, y) = xy$ if $0 < x \le 1$ and $0 < y \le 1$;
$F(x, y) = x$ if $0 < x \le 1$ and $y > 1$;
$F(x, y) = y$ if $x > 1$ and $0 < y \le 1$;
$F(x, y) = 1$ if $x > 1$ and $y > 1$.

$P(0.2 \le X < 0.7;\; 0.25 \le Y < 0.45) = \int_{0.2}^{0.7} dx \int_{0.25}^{0.45} dy = 0.5 \times 0.2 = 0.1$
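The probability in Example 5.4 can also be obtained from the distribution function alone, using the rectangle formula $F(x_2, y_2) - F(x_2, y_1) - F(x_1, y_2) + F(x_1, y_1)$. The sketch below assumes the uniform density on the unit square from the example; the helper names `clamp01`, `F`, and `rect_prob` are hypothetical.

```python
def clamp01(t):
    """Clip t to the interval [0, 1]."""
    return min(max(t, 0.0), 1.0)

def F(x, y):
    """Joint distribution function of the uniform density on the unit square.

    Clipping both arguments reproduces all five cases of the piecewise formula.
    """
    return clamp01(x) * clamp01(y)

def rect_prob(x1, x2, y1, y2):
    """P(x1 <= X < x2, y1 <= Y < y2) from the rectangle formula."""
    return F(x2, y2) - F(x2, y1) - F(x1, y2) + F(x1, y1)

print(round(rect_prob(0.2, 0.7, 0.25, 0.45), 10))
```

Here the four terms are 0.315, 0.175, 0.09 and 0.05, which combine to the same 0.1 found by direct integration.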

5.2. Characteristic numbers of 2-dimensional random variables

5.2.1. Characteristic numbers of component variables

The random variables X and Y have important characteristic numbers which are expectation and variance.

- For discrete variables, we have:

$EX = \sum_{i=1}^{n} \sum_{j=1}^{m} x_i P(x_i, y_j)$; $EY = \sum_{j=1}^{m} \sum_{i=1}^{n} y_j P(x_i, y_j)$;

$VX = \sum_{i=1}^{n} \sum_{j=1}^{m} x_i^2 P(x_i, y_j) - (EX)^2$; $VY = \sum_{j=1}^{m} \sum_{i=1}^{n} y_j^2 P(x_i, y_j) - (EY)^2$.

- For continuous variables, we have:

$EX = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x f(x, y)\,dx\,dy$; $EY = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} y f(x, y)\,dx\,dy$;

$VX = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x^2 f(x, y)\,dx\,dy - (EX)^2$;

$VY = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} y^2 f(x, y)\,dx\,dy - (EY)^2$.

5.2.2. Correlation coefficient

a. Covariance

The covariance of random variables X and Y, denoted cov(X, Y) or  XY , is a number defined as follows:

$\mathrm{cov}(X, Y) = E\{[X - EX][Y - EY]\}$

If $\mathrm{cov}(X, Y) = 0$, then we say that the two random variables X and Y are uncorrelated.

Attention:

$\mathrm{cov}(X, Y) = E(XY) - EX \cdot EY$

- If (X, Y) is discrete, then: $\mathrm{cov}(X, Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} x_i y_j P(x_i, y_j) - EX \cdot EY$.

- If (X, Y) is continuous, then: $\mathrm{cov}(X, Y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x y f(x, y)\,dx\,dy - EX \cdot EY$.

Comment:

If X and Y are two independent random variables, then they are uncorrelated;

$\mathrm{cov}(X, X) = VX$.

b. Correlation coefficient

The correlation coefficient of two random variables X and Y, denoted $r_{XY}$, is determined as follows:

$r_{XY} = \frac{\mathrm{cov}(X, Y)}{S_X \cdot S_Y}$

in which $S_X$, $S_Y$ are the standard deviations of X and Y.

Meaning: The correlation coefficient measures the degree of linear dependence between X and Y. The closer $|r_{XY}|$ is to 1, the stronger the linear relationship; the closer $|r_{XY}|$ is to 0, the weaker the linear relationship.

Example 5.5. Given the joint distribution table of X and Y as

x_i \ y_j	1	2	3
1	0.1	0.25	0.1
2	0.15	0.05	0.35

Calculate the covariance and correlation coefficient of X and Y.

Solution

We have the distribution tables of X and Y:

x_i	1	2
p_i	0.45	0.55

y_j	1	2	3
p_j	0.25	0.3	0.45

$EX = 1.55$; $EY = 2.2$; $VX = 0.2475$; $VY = 0.66$

$E(XY) = \sum_{i=1}^{2} \sum_{j=1}^{3} x_i y_j p_{ij} = 3.5$

Covariance:

$\mathrm{cov}(X, Y) = E(XY) - EX \cdot EY = 3.5 - 1.55 \times 2.2 = 0.09$

Correlation coefficient:

$r_{XY} = \frac{\mathrm{cov}(X, Y)}{S_X \cdot S_Y} = \frac{0.09}{\sqrt{0.2475 \times 0.66}} \approx 0.22$
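The arithmetic of Example 5.5 can be reproduced directly from the joint table. This is a minimal sketch; the helper `expectation` and the variable names are assumptions introduced for illustration.

```python
import math

# Joint table of Example 5.5: rows are x = 1, 2; columns are y = 1, 2, 3.
x_vals = [1, 2]
y_vals = [1, 2, 3]
joint = [
    [0.10, 0.25, 0.10],
    [0.15, 0.05, 0.35],
]

def expectation(g):
    """E[g(X, Y)] as a double sum over the joint table."""
    return sum(
        g(x, y) * joint[i][j]
        for i, x in enumerate(x_vals)
        for j, y in enumerate(y_vals)
    )

ex = expectation(lambda x, y: x)
ey = expectation(lambda x, y: y)
vx = expectation(lambda x, y: x * x) - ex ** 2
vy = expectation(lambda x, y: y * y) - ey ** 2
exy = expectation(lambda x, y: x * y)

cov = exy - ex * ey              # cov(X, Y) = E(XY) - EX * EY
r = cov / math.sqrt(vx * vy)     # S_X * S_Y = sqrt(VX * VY)

print(round(ex, 4), round(ey, 4), round(vx, 4), round(vy, 4))
print(round(exy, 4), round(cov, 4), round(r, 2))
```

A correlation coefficient of about 0.22 indicates only a weak positive linear relationship between X and Y.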

Example 5.6. The 2-dimensional random variable (X, Y) has the joint density function

$f(x, y) = \frac{1}{2\pi}$ if $4x^2 + y^2 \le 4$,
$f(x, y) = 0$ if $4x^2 + y^2 > 4$.

Prove that X and Y are dependent and calculate the covariance.

Solution

The marginal density functions are: