What is an artificial neural network
2022-05-15 07:51:48【User 7353950】
- 1. Artificial neural networks
- 2. Biological neural networks
- 3. Silicon-based intelligence and carbon-based intelligence
- The computer: silicon-based intelligence
- The human brain: carbon-based intelligence
- 4. The M-P model
- The perceptron: the simplest neural network structure
- The single-layer perceptron: unable to handle the XOR problem
- The multilayer perceptron: hidden layers and back propagation
1. Artificial neural networks
The mainstream research approach behind this line of work is connectionism, which uses artificial neural networks to simulate human intelligence. The artificial neural network (Artificial Neural Network, ANN) has been a research hotspot in artificial intelligence since the 1980s. It abstracts the neural network of the human brain from an information-processing perspective, builds simple neuron models, and composes them into different networks through different patterns of connection. The artificial neural network borrows the ideas of biological neural networks but is a drastically simplified version of them: it simulates the structure and function of the brain's nervous system by engineering means, using a large number of nonlinear parallel processors to stand in for the brain's many neurons, and the complex connections among those processors to stand in for the synapses between neurons.
2. Biological neural networks
The human brain consists of roughly 100 billion nerve cells connected by on the order of 100 trillion synapses; together, these cells and their synapses form an enormous biological neural network.
- The processes protruding from each neuron are divided into dendrites and axons.
- Dendrites are numerous and highly branched, with each branch able to branch again; they are generally short, and their function is to receive signals.
- There is only one axon, generally long; its function is to carry the nerve signals gathered by the dendrites and cell body out to other neurons.
- Neurons in the brain receive excitatory and inhibitory postsynaptic potentials through their dendrites and, in response, generate action potentials that travel along their axons.
Biological neural networks have the following characteristics:
- Each neuron is a multi-input, single-output information-processing unit, and its inputs come in two types: excitatory and inhibitory.
- Nerve cells connect and communicate with one another through synapses. When the signal strength a cell receives exceeds a certain threshold, the cell is activated and sends an activation signal onward through its synapses.
- Neurons exhibit spatial integration and threshold behavior; higher-level neurons can realize "new functions" that lower-level neurons do not have.
- There is a fixed time lag between a neuron's input and its output, determined mainly by synaptic delay.
The attributes of external things generally reach us as light waves, sound waves, radio waves, and so on, which stimulate the body's biological sensors.
3. Silicon-based intelligence and carbon-based intelligence
Human intelligence is carbon-based intelligence built on organic matter, while artificial intelligence is silicon-based intelligence built on inorganic matter. The essential difference between the two is architecture, which determines whether data transmission and data processing can happen at the same time.
The computer: silicon-based intelligence
Data transmission and data processing cannot happen at the same time.
A core feature of the von Neumann architecture is the separation of the computing unit from the storage unit, the two being connected by a data bus. The arithmetic unit receives data from the storage unit over the bus and, after the computation finishes, sends the result back to the storage unit over the same bus. Data is not stored for storage's sake; it is stored so that it can be retrieved quickly when needed, and the role of storage is to improve the efficiency of data processing.
The human brain: carbon-based intelligence
Data transmission and data processing happen at the same time.
Both are accomplished by the interaction between synapses and neurons, simultaneously; there is no ordering of transmission before processing. Within the same time and space, the mammalian brain can exchange and process information across a distributed nervous system, something computers cannot match. Biological memory is a process of retaining the essential, and it cannot be simulated by simple storage.
4. The M-P model
The M-P model is an abstract, simplified model built from the structure and working principle of biological neurons. An M-P neuron receives one or more inputs, forms a linear weighted sum of them, and applies a nonlinear transformation to produce the output. Suppose the input signal is the $(N+1)$-dimensional vector $(x_0, x_1, \ldots, x_N)$ and the weight of the $i$-th component is $w_i$; then the output can be written as
$$y = \phi\left(\sum_{i=0}^{N} w_i x_i\right)$$
Here $\phi(\cdot)$ is a transfer function that converts the weighted input into the output; it is usually designed as a continuous, bounded, nonlinear increasing function. In the M-P neuron, however, McCulloch and Pitts restricted both the inputs and the output to binary signals, and the transfer function is a discontinuous sign function parameterized by a preset threshold: when the input exceeds the threshold the function outputs 1, and otherwise 0. An M-P neuron therefore works like a logic gate in a digital circuit and can implement functions such as logical AND and logical OR.
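As a minimal sketch of this idea (the specific weights and thresholds below are illustrative choices, not values from the text), an M-P neuron with fixed unit weights reproduces the AND and OR gates just by changing its threshold:

```python
def mp_neuron(inputs, weights, threshold):
    """M-P neuron: output 1 if the weighted sum of the binary
    inputs reaches the threshold, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0

# Logical AND: both inputs must fire (weights 1, 1; threshold 2)
def logic_and(x1, x2):
    return mp_neuron([x1, x2], weights=[1, 1], threshold=2)

# Logical OR: either input may fire (weights 1, 1; threshold 1)
def logic_or(x1, x2):
    return mp_neuron([x1, x2], weights=[1, 1], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, logic_and(a, b), logic_or(a, b))
```

Note that the weights and threshold here are fixed by hand; the M-P neuron itself has no learning rule, which is exactly the gap the perceptron fills.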
The perceptron: the simplest neural network structure
In 1958 the American psychologist Frank Rosenblatt proposed a neural network with a single layer of computing units, called the perceptron (Perceptron), built on the structure of the M-P model. Rosenblatt was influenced by the "Hebbian theory" put forward in 1949 by the Canadian psychologist Donald Hebb, whose core idea is that learning is realized mainly through the formation and change of synapses between neurons: the more two neurons communicate, the stronger the connection between them becomes, and the effect of learning emerges as this connection is continually reinforced. From the perspective of artificial neural networks, the significance of this theory is that it gives a criterion for changing the weights between model neurons:
- If two neurons are activated at the same time, the weight between them should be increased.
- If two neurons are activated separately, the weight between them should be decreased.
The perceptron is not a physical device but a supervised learning algorithm for binary classification: it decides whether an input represented by a vector belongs to a particular class. It consists of input units and an output unit. The input units receive external signals; the output unit is an M-P neuron, i.e. a threshold logic unit. Each input signal (feature) enters the M-P neuron with a certain weight, and the neuron uses the sign function to map the linear combination of features to a classification output. Given a training set containing a number of input-output pairs, the learning steps are:
- Initialize the weights $w_i(0)$ and the threshold; the weights may be initialized to 0 or to small random numbers.
- For sample $j$ of the training set, feed its input vector $x_j$ into the initialized perceptron and obtain the output $y_j(t)$.
- From $y_j(t)$ and the desired output $d_j$ given for sample $j$, update the weight vector according to the rule $w_i(t+1)=w_i(t)+\eta\,[d_j-y_j(t)]\,x_{j,i}$.
- Repeat the previous two steps until the number of training iterations reaches a preset value.
The third step, updating the perceptron's weights, is the core of the learning algorithm. Here $0<\eta\le 1$ is the learning-rate parameter, a scale factor for correcting errors. If the classification result matches the true result, the weights stay unchanged; if the output should be 0 but is 1, the weights of the components of $x_j$ whose value is 1 are decreased; if the output should be 1 but is 0, those weights are increased.
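The learning loop above can be sketched in plain Python (the AND-gate training data, learning rate, and epoch count are illustrative assumptions). The threshold is absorbed into the weights by fixing the extra input $x_0 = 1$, so $w_0$ plays the role of the threshold:

```python
def step(z):
    # Sign/threshold transfer function: fires (1) when the weighted sum is positive
    return 1 if z > 0 else 0

def predict(weights, x):
    return step(sum(w * xi for w, xi in zip(weights, x)))

def train_perceptron(samples, eta=1.0, epochs=20):
    """samples: list of (input_vector, desired_output); each input vector
    starts with the constant 1, so weights[0] acts as the threshold."""
    weights = [0.0] * len(samples[0][0])  # initialize all weights to 0
    for _ in range(epochs):
        for x, d in samples:
            y = predict(weights, x)
            # w_i(t+1) = w_i(t) + eta * [d_j - y_j(t)] * x_{j,i}
            weights = [w + eta * (d - y) * xi for w, xi in zip(weights, x)]
    return weights

# Learn the linearly separable AND function
and_data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w = train_perceptron(and_data)
print([predict(w, x) for x, _ in and_data])  # → [0, 0, 0, 1]
```

Because AND is linearly separable, this loop converges after a handful of epochs; with the zero initialization the run is fully deterministic.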
The premise of perceptron learning is convergence: on linearly separable data, the perceptron learning algorithm converges after a finite number of iterations, yielding a hyperplane whose decision surface lies between the two classes. In essence, when performing binary classification, the perceptron takes the total distance from all misclassified points to the hyperplane as its loss function and reduces this loss step by step with stochastic gradient descent until the classification is correct.
Besides its convergence property, the perceptron is adaptive: given a training data set, the algorithm adjusts its parameters by error correction without manual intervention, a great advance over the M-P neuron.
The single-layer perceptron: unable to handle the XOR problem
The single-layer perceptron can only solve linearly separable classification problems; it cannot handle the XOR problem. Linear separability means that all positive and negative examples can be completely separated, without error, by a hyperplane in the high-dimensional input space. If a circle is divided into a black semicircle and a white one, the problem is linearly separable; but for a Tai Chi (yin-yang) diagram, no single straight line can completely separate black from white, so the problem is not linear.
If the training data set is not linearly separable, that is, the positive examples cannot be separated from the negative ones by any hyperplane, then the perceptron cannot classify all input vectors correctly.
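This can be checked directly by running the perceptron update rule on the XOR truth table (the epoch count below is an illustrative choice): no matter how long it trains, at least one sample stays misclassified after every epoch, because no single hyperplane separates the two XOR classes.

```python
def step(z):
    return 1 if z > 0 else 0

def predict(weights, x):
    return step(sum(w * xi for w, xi in zip(weights, x)))

# XOR truth table; the leading 1 is the bias input
xor_data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 0)]

weights = [0.0, 0.0, 0.0]
for epoch in range(100):
    for x, d in xor_data:
        y = predict(weights, x)
        weights = [w + (d - y) * xi for w, xi in zip(weights, x)]
    errors = sum(predict(weights, x) != d for x, d in xor_data)
# errors never reaches 0: XOR is not linearly separable
print(errors)
```

Since every weight vector misclassifies at least one XOR point, the final error count is always at least 1; the weights simply cycle instead of converging.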
The multilayer perceptron: hidden layers and back propagation
The multilayer perceptron solves the XOR problem: a hidden layer is added between the input and output layers, and the network is trained by back propagation.
- Hidden layer: the core structure of the multilayer perceptron is the hidden layer, which serves for feature detection. It is called "hidden" because its neurons belong to neither the input nor the output of the network. Hidden neurons transform the training data into a new feature space and pick out its salient features. Between layers the multilayer perceptron is fully connected: every neuron in any layer is connected to all neurons (nodes) in the previous layer, with the strength of each connection determined by a weight of the network.
- Back propagation: an error function is obtained by subtracting the network output from the true value, and the weights are updated according to this error function. During training the signal flows toward the output, but the computed error propagates in the direction opposite to the signal, which is why this way of learning is called back propagation. By taking the partial derivative of the error function with respect to each weight, the error is minimized and the whole network is trained.
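A minimal sketch of such a network in plain Python, trained on XOR (the hidden-layer size, learning rate, and epoch count are illustrative choices): one hidden layer of sigmoid units, with the squared error propagated backward to update both layers of weights.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# XOR training set
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

H = 4  # hidden units (illustrative choice)
# Small random initial weights; the last entry of each row is the bias
w_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(H)]
w_out = [random.uniform(-1, 1) for _ in range(H + 1)]
eta = 0.5

def forward(x):
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hid]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, out

def train(epochs=4000):
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, d in data:
            hidden, out = forward(x)
            total += (d - out) ** 2
            # Output-layer delta for squared error with a sigmoid unit
            delta_out = (out - d) * out * (1 - out)
            # Hidden-layer deltas (computed before the weights are updated)
            delta_hid = [delta_out * w_out[h] * hidden[h] * (1 - hidden[h])
                         for h in range(H)]
            # Gradient-descent updates: output layer, then hidden layer
            for h in range(H):
                w_out[h] -= eta * delta_out * hidden[h]
            w_out[-1] -= eta * delta_out
            for h in range(H):
                w_hid[h][0] -= eta * delta_hid[h] * x[0]
                w_hid[h][1] -= eta * delta_hid[h] * x[1]
                w_hid[h][2] -= eta * delta_hid[h]
        losses.append(total)
    return losses

losses = train()
print(losses[0], losses[-1])  # the error shrinks as training proceeds
```

The hidden layer gives the network a new feature space in which the four XOR points become separable, which is exactly what the single-layer perceptron lacked.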