
# [Andrew Ng Machine Learning Notes] 2. Univariate Linear Regression

2022-11-24 23:08:28【Pandaconda】

Personal blog: https://blog.csdn.net/Newin2020?spm=1011.2415.3001.5343

Column focus: in-class notes for students studying Andrew Ng's machine learning videos.

Column introduction: in this column I will organize my notes covering all of the content of Andrew Ng's machine learning videos, for everyone's reference.

Video source: Andrew Ng's Machine Learning course series.

If you found this helpful, please like and bookmark; your support is my greatest motivation to keep creating.

## 2. Univariate Linear Regression

Common notation:

Hypothesis function (hypothesis)

The goal is to find the two optimal parameters of the hypothesis function, so that the resulting curve best fits the data.
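For reference, the hypothesis in univariate linear regression takes the standard form from the course, with $\theta_0$ and $\theta_1$ as the two parameters to be found:

$$h_\theta(x) = \theta_0 + \theta_1 x$$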

### 1. Cost Function

- Definition: the value of the **cost function** measures how well the hypothesis fits the data; smaller values mean higher accuracy.
- We therefore look for the minimum of the cost function, obtain the corresponding parameter values, and thereby get the best-fitting curve.

Squared error cost function

The division by 2 in front of the formula is there to simplify the later derivative calculations. This cost function can handle most regression problems.
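For reference, over $m$ training examples the squared error cost function takes the standard form:

$$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)^2$$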

This is our linear regression model.

We can also simplify the hypothesis function (for example, by fixing one of the two parameters) so as to better understand what the cost function represents.

The corresponding cost function plot is as follows:

From the plot we can see that the cost function reaches its minimum when the parameter equals 1; substituting that value back into the hypothesis function gives the curve that best fits the data. With more parameters the picture gets more complex; below is a 3D surface for two parameters:

Summary: for a regression problem, the task therefore reduces to finding the minimum of the cost function. Below is the objective function for linear regression.
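As a minimal sketch of that objective (the function and variable names are my own, not from the original post), the cost can be computed directly:

```python
def compute_cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum((h(x) - y)^2)
    for the hypothesis h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

A perfect fit gives a cost of zero; any mismatch between the hypothesis and the data makes it strictly positive.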

### 2. Gradient Descent

- Definition: start from initial parameter values, then keep changing the parameters to find smaller and smaller values of J.

- Note: one property of gradient descent is that, depending on the initial position, you may end up at two different local optima, as in the figure below.

Gradient Descent Algorithm

Assignment vs. the equals sign:

- `:=` represents assignment in computing, i.e., assign the value of b to a.
- `=` represents a truth judgment in computing, i.e., test whether a equals b.

In the formula, α controls the rate of descent: the larger its value, the bigger each gradient step. But α can be neither too large nor too small, for the following reasons:

- If α is too small, it takes very many steps to reach the lowest point.
- If α is too large, each step may overshoot the bottom, so the algorithm may fail to converge or even diverge.
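A small sketch illustrates both failure modes on the toy objective J(θ) = θ², whose gradient is 2θ (this one-dimensional example is my own illustration, not from the course):

```python
def descend(theta, alpha, steps=20):
    """Run `steps` iterations of gradient descent on J(theta) = theta**2."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta  # gradient of theta**2 is 2*theta
    return theta

small = descend(5.0, alpha=0.1)  # each step multiplies theta by 0.8: converges
large = descend(5.0, alpha=1.1)  # each step multiplies theta by -1.2: diverges
```

With α = 0.1 the iterate shrinks toward the minimum at 0; with α = 1.1 each step overshoots the bottom and the magnitude grows without bound.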

In the gradient descent algorithm, the parameters must be updated simultaneously: the left side of the figure below shows the correct procedure, the right side the incorrect one.
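A sketch of the difference (the coupled toy gradients are my own illustration): the correct version evaluates both partial derivatives at the old parameter values before assigning either one.

```python
# Toy objective J(t0, t1) = (t0*t1 - 1)**2, whose partials couple the parameters.
def grad0(t0, t1):
    return 2 * (t0 * t1 - 1) * t1

def grad1(t0, t1):
    return 2 * (t0 * t1 - 1) * t0

def step_simultaneous(t0, t1, alpha):
    # Correct: both gradients see the OLD values, then assign together.
    new0 = t0 - alpha * grad0(t0, t1)
    new1 = t1 - alpha * grad1(t0, t1)
    return new0, new1

def step_sequential(t0, t1, alpha):
    # Incorrect: t1's gradient already sees the updated t0.
    t0 = t0 - alpha * grad0(t0, t1)
    t1 = t1 - alpha * grad1(t0, t1)
    return t0, t1
```

Starting from the same point, the two versions land at different parameters, so the sequential variant is no longer performing gradient descent on J.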

The derivative term at the far right of the formula is the partial derivative of J with respect to the parameter, i.e., the slope of the tangent line. In detail:

- If, at the chosen parameter value, the tangent slope on the curve of J is positive, then the derivative term is positive, so a positive value is subtracted from the parameter. On the plot, the parameter decreases, moving left toward the lowest point of the curve.
- If the tangent slope is negative, then the derivative term is negative, so a negative value is subtracted, i.e., a positive value is added. On the plot, the parameter increases, moving right toward the lowest point of the curve.

As the plot shows, once the algorithm reaches a local minimum the tangent slope is zero, so the parameters stop changing, as in the figure below:

Summary: as long as α stays within a reasonable range, the algorithm can reach a local minimum even with α held fixed. The closer we get to the lowest point, the smaller the tangent slope becomes, until it equals zero; so the step size shrinks automatically until a local minimum is reached.

### 3. Gradient Descent for Linear Regression

- Having learned the linear regression model (right side of the figure below) and the gradient descent algorithm (left side), the next problem is to combine the two.

Now we substitute the linear regression model into the gradient descent algorithm. First we work out the derivative term, finding the expression for the partial derivative with respect to each parameter.

Finally, substituting back gives the formal update expressions for the parameters, as shown below:
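For reference, the standard update rules from the course (repeated until convergence, with simultaneous assignment of both parameters) are:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)x^{(i)}$$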

In general, gradient descent may reach different local optima from different initial values. But the cost function of linear regression always has a single, unique minimum: it is a `convex function`, as shown in the figure below:

So when gradient descent is applied to the linear regression cost function, it always reaches the global optimal solution; there are no other local optima.

The gradient descent process is shown in the figure below: the parameter values change continuously until the curve that best fits the data is found.

Summary: overall, the algorithm we used above is as follows:

Batch gradient descent

This means that every step of gradient descent traverses the entire training set.
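The whole algorithm can be sketched in a few lines (a minimal illustration with my own names; each iteration sums the error over all m examples, which is what makes it "batch"):

```python
def batch_gradient_descent(xs, ys, alpha=0.05, iters=2000):
    """Fit h(x) = theta0 + theta1*x by batch gradient descent:
    every update uses the whole training set."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iters):
        # Errors h(x) - y over the entire training set.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1
```

On data generated from y = 1 + 2x, this recovers the intercept 1 and slope 2, since the convex cost has only that one minimum.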

This is, so far, the first machine learning algorithm we have learned.

Copyright notice

Author: [Pandaconda]. Please include the original link when reprinting. Thank you.

https://en.chowdera.com/2022/328/202211242306246890.html
