
What is an artificial neural network

2022-05-15 07:51:48 · User 7353950


  • One. Artificial neural networks
  • Two. Biological neural networks
  • Three. Silicon-based intelligence and carbon-based intelligence
    • Computer: silicon-based intelligence
    • The human brain: carbon-based intelligence
  • Four. The M-P model
    • Perceptron: the simplest neural network structure
    • Single-layer perceptron: unable to handle the XOR problem
    • Multilayer perceptron: hidden layers and back propagation

One. Artificial neural networks

The mainstream research approach behind artificial neural networks is connectionism, which uses networks of simple units to simulate human intelligence. The artificial neural network (ANN) has been a research hotspot in the field of artificial intelligence since the 1980s. It abstracts the neural network of the human brain from the perspective of information processing, builds simple models of neurons, and composes them into different networks according to different connection patterns. An artificial neural network borrows ideas from biological neural networks but is a radically simplified version of them: it uses engineering techniques to simulate the structure and function of the human nervous system, with a large number of simple nonlinear parallel processors standing in for the brain's neurons, and the complex connections among the processors standing in for the synaptic interactions among those neurons.

Two. Biological neural networks

The human brain is composed of roughly 100 billion nerve cells and an even larger number of synapses; together, these nerve cells and their synapses form an enormous biological neural network.

  • The processes protruding from each neuron are divided into dendrites and axons.
  • Dendrites are numerous and highly branched, and each branch can branch again; they are generally short, and their function is to receive signals.
  • Each neuron has only one axon, which is generally long; its function is to transmit the nerve signals gathered from the dendrites and the cell body to other neurons.
  • A neuron receives excitatory and inhibitory postsynaptic potentials through its dendrites and, when sufficiently excited, produces an action potential that travels along its axon.

Biological neural networks have the following characteristics:

  1. Each neuron is a multi-input, single-output information processing unit, and its inputs can be either excitatory or inhibitory.
  2. Nerve cells connect and communicate with one another through synapses; when the accumulated signal strength exceeds a certain threshold, the cell is activated and sends an activation signal through its synapses to other nerve cells.
  3. Neurons exhibit spatial integration and threshold behavior; higher-level neurons can process "new functions" that lower-level neurons do not have.
  4. There is a fixed time lag between a neuron's input and its output, determined mainly by synaptic delay.

The attributes of external things generally reach us as light waves, sound waves, radio waves, and other signals that stimulate the human body's biological sensors.

Three. Silicon-based intelligence and carbon-based intelligence

Human intelligence is carbon-based intelligence built on organic matter, while artificial intelligence is silicon-based intelligence built on inorganic matter. The essential difference between the two lies in their architecture, which determines whether data transmission and data processing can occur at the same time.

Computer: silicon-based intelligence

Data transmission and processing cannot proceed simultaneously.

A core feature of the von Neumann architecture is the separation of the computing unit from the storage unit, the two being connected by a data bus. The computing unit receives data from the storage unit over the bus and, after the operation completes, transmits the result back to the storage unit over the same bus. Data is not stored for its own sake; it is stored so that it can be quickly retrieved when needed, and the role of storage is to improve the efficiency of data processing.

The human brain: carbon-based intelligence

Data transmission and processing proceed simultaneously.

Transmission and processing are carried out together through the interaction of synapses and neurons; there is no ordering of transmission before processing. Within the same time and space, the mammalian brain can exchange and process information across a distributed nervous system, something a computer cannot match. Biological memory is a process of retaining the essentials, and it cannot be simulated by simple storage.

Four. The M-P model

The M-P model is an abstract, simplified model built from the structure and working principle of biological neurons. An M-P neuron receives one or more inputs, takes a linear weighted sum of them, and applies a nonlinear function to produce the output. Assume the input signal of an M-P neuron is an $(N+1)$-dimensional vector $(x_0, x_1, \dots, x_N)$ and the weight of the $i$-th component is $w_i$. The output can then be written as

$$y = \varphi\left(\sum_{i=0}^{N} w_i x_i\right)$$

Here $\varphi(\cdot)$ is the transfer function, which converts the weighted input into the output; it is usually designed to be a continuous, bounded, nonlinear increasing function. In the M-P neuron, McCulloch and Pitts restricted both inputs and outputs to binary signals, and the transfer function they used was the discontinuous sign function, parameterized by a preset threshold: when the weighted input exceeds the threshold, the function outputs 1, and otherwise 0. An M-P neuron therefore works like a logic gate in a digital circuit and can implement functions such as "logical AND" and "logical OR".
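The gate-like behavior described above can be sketched in a few lines of Python. The function names, weights, and thresholds below are illustrative choices, not part of the original M-P formulation:

```python
# A minimal sketch of an M-P neuron: a weighted sum of binary inputs passed
# through a threshold function that outputs 1 or 0.

def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted input sum reaches the threshold, else 0."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= threshold else 0

# Logical AND: both inputs must be 1 to reach the threshold of 2.
def logic_and(x1, x2):
    return mp_neuron([x1, x2], weights=[1, 1], threshold=2)

# Logical OR: a single active input already reaches the threshold of 1.
def logic_or(x1, x2):
    return mp_neuron([x1, x2], weights=[1, 1], threshold=1)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", logic_and(x1, x2), "OR:", logic_or(x1, x2))
```

Changing only the threshold switches the same neuron between AND and OR, which is exactly why the M-P neuron behaves like a configurable logic gate.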

Perceptron: the simplest neural network structure

In 1958, the American psychologist Frank Rosenblatt proposed a neural network with a single layer of computing units, called the perceptron (Perceptron). It is based on the structure of the M-P model. Rosenblatt was influenced by the "Hebbian theory" proposed in 1949 by the Canadian psychologist Donald Hebb, whose core idea is that learning is realized mainly through the formation and change of synapses between neurons: the more two neurons communicate, the stronger the connection between them becomes, and the effect of learning emerges as connections are continually strengthened. From the perspective of artificial neural networks, the significance of this theory is that it gives a criterion for changing the weights between model neurons:

  1. If two neurons are activated at the same time, the weight between them should be increased.
  2. If two neurons are activated separately, the weight between them should be reduced.

The perceptron is not a physical device but a supervised learning algorithm for binary classification: it decides whether an input, represented as a vector, belongs to a particular class. A perceptron consists of input lines and an output unit. The input lines receive external signals, and the output unit is an M-P neuron, i.e. a threshold logic unit. Each input signal (feature) enters the M-P neuron with a certain weight, and the neuron applies the sign function to the linear combination of features to produce the classification output. Given a training set containing a number of input-output pairs, the learning steps are:

  1. Initialize the weights $w(0)$ and the threshold; the weights can be initialized to 0 or to small random numbers.
  2. For the $j$-th sample in the training set, feed its input vector $x_j$ into the initialized perceptron and obtain the output $y_j(t)$.
  3. According to $y_j(t)$ and the given output $d_j$ for sample $j$, update the weight vector by the rule $w_i(t+1) = w_i(t) + \eta\,[d_j - y_j(t)]\, x_{j,i}$.
  4. Repeat the previous two steps until the number of training iterations reaches a preset value.

The weight update in the third step is the core of the learning algorithm. The parameter $0 < \eta \le 1$ is called the learning rate and acts as a scale factor on the error correction. If the classification result matches the true result, the weights are left unchanged. If the output should be 0 but is 1, the weights of the components of $x_j$ whose input value is 1 are reduced; if the output should be 1 but is 0, the weights of those components are increased.
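The learning procedure above can be sketched as follows. The training data, learning rate, and epoch count are illustrative choices; the update line implements the rule $w_i(t+1) = w_i(t) + \eta\,[d_j - y_j(t)]\,x_{j,i}$ from step 3, with the threshold absorbed into a bias term:

```python
# A minimal perceptron sketch: zero-initialized weights updated by the rule
# w_i <- w_i + eta * (d_j - y_j) * x_{j,i}; dataset and eta are illustrative.

def predict(weights, bias, x):
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s >= 0 else 0

def train_perceptron(samples, eta=0.1, epochs=20):
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0          # step 1: initialize weights to zero
    for _ in range(epochs):                 # step 4: repeat for a preset number of passes
        for x, d in samples:
            y = predict(weights, bias, x)   # step 2: current output for sample j
            err = d - y                     # step 3: desired minus actual output
            bias += eta * err
            weights = [w + eta * err * xi for w, xi in zip(weights, x)]
    return weights, bias

# Logical AND is linearly separable, so the perceptron learns it exactly.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
print([predict(w, b, x) for x, _ in and_data])  # [0, 0, 0, 1]
```

When the prediction is already correct, `err` is zero and the weights stay unchanged, exactly as the text describes.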

The premise that a perceptron can learn is that its algorithm converges. For linearly separable data, the perceptron learning algorithm converges after a finite number of iterations, yielding a decision hyperplane that lies between the two classes. In essence, when performing binary classification, the perceptron takes the total distance from all misclassified points to the hyperplane as its loss function and continually reduces this loss by stochastic gradient descent until correct classification is achieved.

Besides its convergence properties, the perceptron is also adaptive: given a training data set, the algorithm adjusts its parameters through error correction without manual intervention, a great advance over the M-P neuron.

Single-layer perceptron: unable to handle the XOR problem

A single-layer perceptron can only solve linear classification problems; it has no way to handle the XOR problem. Linear classification means that all positive and negative examples can be completely separated, without error, by a hyperplane in the feature space. If a circle is split into a black semicircle and a white semicircle, separating them is a linearly separable problem; but in a Tai Chi (yin-yang) diagram, no single straight line can completely separate black from white, so that problem is not linear.

If the training data set is not linearly separable, i.e. the positive examples cannot be separated from the negative examples by a hyperplane, then no perceptron can classify all input vectors correctly.
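The XOR limitation can be illustrated by brute force (this grid search is an illustration, not a formal proof): scan a grid of weights and biases for a single threshold unit and count how many of the four XOR cases each setting classifies correctly.

```python
# Brute-force illustration that no single threshold unit
# y = [w1*x1 + w2*x2 + b >= 0] reproduces XOR: scan a grid of parameters
# and record the best number of correctly classified XOR cases.

xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def correct_cases(w1, w2, b):
    """Count how many XOR cases this linear threshold unit gets right."""
    return sum(
        (1 if w1 * x1 + w2 * x2 + b >= 0 else 0) == d
        for (x1, x2), d in xor_data
    )

grid = [i / 4 for i in range(-8, 9)]  # weights and bias in [-2, 2], step 0.25
best = max(correct_cases(w1, w2, b) for w1 in grid for w2 in grid for b in grid)
print(best)  # 3: every setting misclassifies at least one XOR case
```

The best any single linear threshold unit can do on XOR is 3 of 4 cases (for example, an OR-like unit fails only on input (1, 1)), which is the non-separability the text describes.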

Multilayer perceptron: hidden layers and back propagation

The multilayer perceptron solves the XOR problem by adding a hidden layer between the input and output layers and by learning its weights through back propagation.

  • Hidden layer. The core structure of the multilayer perceptron is the hidden layer, which performs feature detection. It is called hidden because its neurons belong to neither the input nor the output of the network. Hidden neurons transform the training data into a new feature space in which the salient features of the data can be identified. Between layers, the multilayer perceptron is fully connected: every neuron in a layer is connected to all neurons or nodes in the previous layer, with the strength of each connection determined by a weight.
  • Back propagation. An error function is obtained by subtracting the network output from the true value, and the weights are updated according to this error. During training, the signal flows toward the output, but the computed error propagates in the direction opposite to the signal, which is why this way of learning is called back propagation. By taking the partial derivative of the error function with respect to each weight, the error is minimized and the whole network is trained.
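The hidden-layer idea can be made concrete with a tiny 2-2-1 network that computes XOR. The weights below are hand-chosen for illustration; in practice they would be learned by back propagation rather than set by hand:

```python
# A hand-wired 2-2-1 multilayer perceptron for XOR (weights chosen by hand
# for illustration; back propagation would normally learn them).
# Hidden unit h1 acts as OR, hidden unit h2 acts as NAND; the output unit
# computes h1 AND h2, which is exactly XOR.

def step(s):
    """Threshold transfer function: 1 if s >= 0, else 0."""
    return 1 if s >= 0 else 0

def xor_mlp(x1, x2):
    h1 = step(x1 + x2 - 0.5)        # OR:   fires when at least one input is 1
    h2 = step(-x1 - x2 + 1.5)       # NAND: fires unless both inputs are 1
    return step(h1 + h2 - 1.5)      # AND of the two hidden features

print([xor_mlp(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# [0, 1, 1, 0]
```

The hidden layer maps the four input points into a new feature space (OR, NAND) in which the classes become linearly separable, so a single output unit can finish the job; this is the feature-space transformation the bullet above describes.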
