Team:AHUT China/Model

Model

At present, our project is still at the laboratory stage and has not yet reached large-scale application. To identify realistic scenarios for large-scale application, and with the help of our instructor group, we held a seminar with the Astronautics Model Team of Anhui University of Technology to develop a social practice program.

We got in touch with a factory in Maanshan (Maanshan Steel Plant), and the model team provided a six-rotor drone, as shown in Fig.1:

Fig.1 Six-rotor drone

Zhao Lei, a member of our team who has studied embedded programming, used several gas sensors (which measure the mass of carbon dioxide, sulfur dioxide and other gases per liter of air) and an STM32 ARM microcontroller to develop a drone-mountable carbon dioxide detection device, as shown in Fig.2:

Fig.2 Gas detector

We used the drone to carry the detection device into the air and sampled the air near the outlet of the factory's exhaust system. The concentration of carbon dioxide near the smoke extraction device was extremely high, 5-20 times the normal value. According to the literature on greenhouse gas emissions in the steel industry [4] and the eleven national standards for greenhouse gas management, including Greenhouse Gas Accounting and Reporting for Industrial Enterprises [3], the mass ratio of the gases in the exhaust from the production process of such factories is approximately oxygen : carbon dioxide : sulfur dioxide : hydrogen sulfide : carbon monoxide : hydrogen chloride : fluoride : nitrogen oxides : other = 14 : 10 : 3 : 3 : 3 : 2 : 3 : 8 : 54. After several measurements and averaging, we obtained the content and mass percentage of each gas in the factory exhaust:

Gas name	Content under standard conditions (mg/L)	Mass percentage
Carbon dioxide	123.0025	10.2599%
Oxygen	155.1683	12.9429%
Sulfur dioxide	48.5526	4.0499%
Hydrogen sulfide	56.8314	4.7404%
Carbon monoxide	38.2593	3.1913%
Hydrogen chloride	25.9654	2.1658%
Fluoride	46.9342	3.9149%
Nitrogen oxides	96.2349	8.0272%
Other	607.9167	50.7077%
Total	1198.8653	100.0000%
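
The mass percentages in the table follow directly from the measured contents: each gas's content is divided by the total. The short sketch below recomputes them as a consistency check. The dictionary and function names are illustrative and not part of the original work.

```python
# Contents in mg/L, copied from the table above.
content_mg_per_l = {
    "Carbon dioxide": 123.0025,
    "Oxygen": 155.1683,
    "Sulfur dioxide": 48.5526,
    "Hydrogen sulfide": 56.8314,
    "Carbon monoxide": 38.2593,
    "Hydrogen chloride": 25.9654,
    "Fluoride": 46.9342,
    "Nitrogen oxides": 96.2349,
    "Other": 607.9167,
}


def mass_percentages(contents):
    """Return each gas's share of the total content, in percent."""
    total = sum(contents.values())
    return {gas: 100.0 * mass / total for gas, mass in contents.items()}


percentages = mass_percentages(content_mg_per_l)
for gas, pct in percentages.items():
    print(f"{gas:18s} {pct:8.4f}%")
```

By construction the recomputed shares sum to 100%, so this is a quick way to spot a transcription slip in either column.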

The measured gas contents are consistent with the literature data. Following the literature proportions of the gases in the exhaust, we ran a simulation in the laboratory: a gas mixture with these proportions was prepared manually and bubbled through water to produce an unsaturated solution. As the aeration time increased, the amount of carbon dioxide in the solution was measured repeatedly, and these measurements served as the data source for our mathematical modeling.

First, a correlation analysis of the experimental data in MATLAB showed that the color readings (five dimensions: B, G, R, H, S) are linearly correlated with the carbon dioxide concentration to some degree. This agrees with the literature [1], which derives the same conclusion from the Lambert-Beer absorption law: there is a definite relationship between the concentration of a substance and its color reading. Second, we applied multiple regression analysis to the data to obtain the relationship between the concentration and the five-dimensional color reading, and took the resulting mathematical expression (the mathematical model) as the empirical formula, or regression equation.
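
As an illustration of the correlation step, the sketch below computes the Pearson correlation coefficient between concentration and two of the color channels (G and S), using the values listed in Table 1. It is a plain-Python stand-in for the MATLAB analysis, not the team's script, and the helper name `pearson` is hypothetical.

```python
from math import sqrt

# (concentration in mg/L, G reading, S reading), taken from Table 1.
rows = [
    (0, 148, 14), (0, 147, 16), (0, 146, 20), (0, 146, 20), (0, 145, 19),
    (20, 115, 82), (20, 115, 81), (20, 115, 83),
    (30, 114, 87), (30, 114, 89), (30, 114, 89), (30, 114, 88),
    (50, 99, 110), (50, 99, 109), (50, 99, 110),
    (80, 96, 119), (80, 96, 119), (80, 96, 120),
    (100, 96, 115), (100, 96, 114), (100, 96, 116),
    (150, 86, 131), (150, 87, 129), (150, 86, 130), (150, 86, 131),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


conc = [r[0] for r in rows]
r_g = pearson(conc, [r[1] for r in rows])  # G falls as concentration rises
r_s = pearson(conc, [r[2] for r in rows])  # S rises with concentration
print(f"corr(concentration, G) = {r_g:.3f}")
print(f"corr(concentration, S) = {r_s:.3f}")
```

A coefficient close to -1 or +1 indicates a strong linear trend. The readings level off at high concentrations, which hints at the nonlinearity that the quadratic model addresses later.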

The mathematical model relating color readings to carbon dioxide concentration is a regression equation. We first established a linear regression model between carbon dioxide concentration and color reading, but its residuals were large and the fit was poor.

We therefore considered a nonlinear quadratic regression model, built with the rstool function of the MATLAB Statistics Toolbox, and judged the models by their residuals and residual standard deviation. In the final quadratic regression model the residual standard deviation is small, the predictions are very good, and the residuals are about an order of magnitude smaller than those of the multiple linear regression model. The quadratic regression model is therefore better than the linear regression model; comparing the errors of the two models shows that the quadratic regression equation is more precise.

Model establishment and solution:

Following the analysis above, we first established a linear regression model. Applying linear regression in MATLAB to the experimental data (Table 1), we obtained a linear regression equation between carbon dioxide concentration and color reading.

Multiple linear regression model

Using multiple linear regression, we plotted the residuals (see Fig.3). The residual plot shows that, except for the 15th data point, the residuals are close to zero and their confidence intervals contain zero, which indicates that the regression model fits the original data reasonably well; the 15th point can be regarded as an outlier and removed. After removing it, we ran the multiple linear regression again and obtained the residual plot (see Fig.4), the significance test indicators of the regression equation (see Table 2) and the individual residual values (see Table 3). From Table 2, the coefficient of determination is R² = 0.9250310882931, indicating that the regression equation is significant. In the F test, the probability p corresponding to F satisfies p < α, so H₀ is rejected and the regression model (VIII) holds. However, the estimated error variance is too large.
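$$y = 2910.630153554265 + 3.587352490846\,x_1 - 21.155917919245\,x_2 + 4.796418968805\,x_3 - 6.750902382498\,x_4 - 10.532016102969\,x_5 \qquad \text{(VIII)}$$
(In model (VIII), y is the concentration in mg/L and x1, x2, x3, x4, x5 are the B, G, R, H, S readings.)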
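
To make model (VIII) concrete, the sketch below evaluates it for the first row of Table 1 (B, G, R, H, S = 153, 148, 157, 138, 14, true concentration 0 mg/L). The coefficients are copied from the equation. The function name `predict_linear` is illustrative.

```python
# Coefficients of model (VIII), in the order B, G, R, H, S.
INTERCEPT = 2910.630153554265
COEFFS = (
    3.587352490846,
    -21.155917919245,
    4.796418968805,
    -6.750902382498,
    -10.532016102969,
)


def predict_linear(b, g, r, h, s):
    """Predicted CO2 concentration (mg/L) from the five color readings."""
    readings = (b, g, r, h, s)
    return INTERCEPT + sum(c * x for c, x in zip(COEFFS, readings))


predicted = predict_linear(153, 148, 157, 138, 14)
residual = 0 - predicted  # observed concentration minus prediction
print(f"predicted = {predicted:.4f} mg/L, residual = {residual:.4f}")
```

The prediction is about 2.38 mg/L for a sample whose true concentration is 0, so the residual is about -2.38, in agreement with the first entry of the residual table for the linear model.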

Concentration (mg/L)	B	G	R	H	S
0	153	148	157	138	14
0	153	147	157	138	16
0	153	146	158	137	20
0	153	146	158	137	20
0	154	145	157	141	19
20	144	115	170	135	82
20	144	115	169	136	81
20	145	115	172	135	83
30	145	114	174	135	87
30	145	114	176	135	89
30	145	114	175	135	89
30	146	114	175	135	88
50	142	99	175	137	110
50	141	99	174	137	109
50	142	99	176	136	110
80	141	96	181	135	119
80	141	96	182	135	119
80	140	96	182	135	120
100	139	96	175	136	115
100	139	96	174	136	114
100	139	96	176	136	116
150	139	86	178	136	131
150	139	87	177	137	129
150	138	86	177	137	130
150	139	86	178	137	131

Table 1 Experimental data of carbon dioxide

Fig.3 Residual plot of the carbon dioxide linear regression

Fig.4 Residual plot of the linear regression of carbon dioxide concentration on color reading after removing the outlier

Coefficient of determination R²	F	Probability p corresponding to F	Estimated error variance
0.9250310882931	44.4199047583412	0.0000000016617	270.6516543935724

Table 2 Significance test indicators of the carbon dioxide linear regression equation

Concentration (mg/L)	Residual value
0	-2.384256481544441
0	-2.476142194851974
0	6.948682946475856
0	6.948682946475856
0	3.473424932212367
20	-14.672434139092047
20	-13.657128890758031
20	-17.320608464579323
30	4.058700090441562
30	15.529894358769411
30	20.326313327574439
30	6.206944733759315
50	-31.576255061221445
50	-33.724499704539312
80	-8.948829979216725
80	-13.745248948021299
80	0.374119645794281
100	11.627226785928087
100	5.891629651764106
100	17.362823920092069
150	4.191048334563675
150	15.830255399174575
150	8.793706073744261
150	10.941950717061900

Table 3 Residual values of the multiple linear regression model after removing the outlier
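
The indicators in Table 2 can be cross-checked from the residuals in Table 3. The sketch below assumes the standard definitions for a regression with p = 6 coefficients (intercept plus five readings) fitted to n = 24 points: error variance = SSE/(n - p), R² = 1 - SSE/SST, and F = (SSR/(p - 1)) / (SSE/(n - p)). Variable names are illustrative.

```python
# Concentrations of the 24 points kept after removing the outlier (mg/L).
concentrations = [0] * 5 + [20] * 3 + [30] * 4 + [50] * 2 + [80] * 3 + [100] * 3 + [150] * 4

# Residuals copied from Table 3, in the same order.
residuals = [
    -2.384256481544441, -2.476142194851974, 6.948682946475856,
    6.948682946475856, 3.473424932212367, -14.672434139092047,
    -13.657128890758031, -17.320608464579323, 4.058700090441562,
    15.529894358769411, 20.326313327574439, 6.206944733759315,
    -31.576255061221445, -33.724499704539312, -8.948829979216725,
    -13.745248948021299, 0.374119645794281, 11.627226785928087,
    5.891629651764106, 17.362823920092069, 4.191048334563675,
    15.830255399174575, 8.793706073744261, 10.941950717061900,
]

n = len(residuals)  # 24 observations
p = 6               # intercept + 5 color readings

sse = sum(e * e for e in residuals)                     # residual sum of squares
mean_y = sum(concentrations) / n
sst = sum((y - mean_y) ** 2 for y in concentrations)    # total sum of squares

error_variance = sse / (n - p)
r_squared = 1.0 - sse / sst
f_stat = ((sst - sse) / (p - 1)) / error_variance

print(f"error variance = {error_variance:.4f}")
print(f"R^2            = {r_squared:.6f}")
print(f"F              = {f_stat:.4f}")
```

Under these assumptions the three quantities come out close to the values reported in Table 2 (about 270.65, 0.925 and 44.42).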

The residual values show that the model still needs improvement. The multiple linear regression model could be refined further by removing the outliers that appear in the new residual plot, but the gain from doing so is limited and each removal further compromises the integrity of the data. We therefore tried a multivariate quadratic regression.

Multiple quadratic regression model

Establishment and solution of multiple quadratic regression models

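A multivariate quadratic regression equation is established using rstool(x, y, 'model', alpha). The 'model' argument is a string that selects one of the following four models (the default is the linear model):
linear: constant and linear terms only
purequadratic: constant, linear and squared terms
interaction: constant, linear and cross-product terms
quadratic: constant, linear, cross-product and squared terms (the full quadratic model)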

The function output includes the regression parameters, the residual standard deviation and the residuals. By changing the value of 'model' and comparing the residual standard deviations of the resulting models, one can determine which fits best.
For this problem we finally adopted the full quadratic model for the multivariate quadratic regression, that is, model (IX):

$$y = \beta_0 + \sum_{j=1}^{5} \beta_j x_j + \sum_{1 \le j \le k \le 5} \beta_{jk} x_j x_k \qquad \text{(IX)}$$

(In model (IX), y represents the concentration and x1, x2, x3, x4, x5 represent B, G, R, H, S.)
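
A full quadratic model in five variables has 1 constant, 5 linear and 15 second-order terms (5 squares plus 10 cross products), 21 coefficients in total. The sketch below builds that term vector for one row of Table 1. It illustrates the structure of the model only and is not the internal implementation of rstool. The function name is hypothetical.

```python
from itertools import combinations_with_replacement


def full_quadratic_terms(readings):
    """Return [1, x1..x5, all products xj*xk with j <= k] for one sample."""
    terms = [1.0]
    terms.extend(float(x) for x in readings)
    terms.extend(float(a * b) for a, b in combinations_with_replacement(readings, 2))
    return terms


row = (153, 148, 157, 138, 14)  # B, G, R, H, S of the first sample in Table 1
terms = full_quadratic_terms(row)

n_params = len(terms)           # number of regression coefficients
n_samples = 25                  # rows in Table 1
print(f"coefficients: {n_params}, residual degrees of freedom: {n_samples - n_params}")
```

With 25 samples and 21 coefficients only 4 residual degrees of freedom remain, which is worth keeping in mind when interpreting how small the residuals of the quadratic fit are.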

Substituting the data into the multivariate quadratic regression gives the fit. The results are shown in Fig.5, Table 4 and model (X).

Fig.5 Result

Concentration (mg/L)	Residual value
0	-0.183023306644486
0	0.295698988685444
0	-0.0573058051979842
0	-0.0573058051979842
0	-0.0112789762431476
20	0.483971591391310
20	0.0544367711663654
20	-0.617107404175840
30	0.438573762845408
30	0.223558673598745
30	-0.471712868260511
30	0.0136040522083931
50	0.551662284327904
50	-0.775350805535709
50	0.0956137145112734
80	0.0435429326025769
80	0.243119500199100
80	-0.357436173322640
100	-1.73816418466959
100	0.0231545045717212
100	1.60688363169902
150	0.0231545045717212
150	0.875543694299267
150	0.637870694485173
150	-1.34395668067009

Table 4 Residual values of the multiple quadratic regression model

Test of multiple quadratic regression model

We test the quadratic regression model with the residual standard deviation. The regression residual $e_i = Y_i - \hat{Y}_i$ measures how well the regression model fits the sample data. To use regression analysis for prediction, the regression residual standard deviation needs to be calculated. The regression residual standard deviation, denoted $S$, is the accuracy index of predictions made with the regression equation, and it can be used to test the reliability of the model's predictions. If $S$ is close to 0, the model deviates little from the sample data and the predictions are more reliable (accurate); the larger $S$ is, the more the model deviates from the sample data and the less reliable the predictions are. In practical problems $S$ tends to be large, so the index $S/\bar{Y}$ is usually used to evaluate a model: when $S/\bar{Y} < 15\%$, the prediction model can be considered good. From the results, the regression residual standard deviation is RMSE = 1.65062261369908 and $S/\bar{Y}$ = 0.02821577117, so the prediction model is very good. The residual values also show that the model fits very well. No original data were removed, which preserves the integrity of the data; the drawback is the complexity of the equation.
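
The residual standard deviation can be recomputed from the residuals in Table 4. The sketch below assumes the RMSE is defined as sqrt(SSE/(n - p)) with n = 25 samples and p = 21 coefficients of the full quadratic model. It also assumes that the mean in the S/mean index is the average of the 25 concentrations in Table 1, which the text does not state explicitly, so the ratio printed here is indicative and need not reproduce the reported figure digit for digit.

```python
from math import sqrt

# Residuals copied from Table 4.
residuals = [
    -0.183023306644486, 0.295698988685444, -0.0573058051979842,
    -0.0573058051979842, -0.0112789762431476, 0.483971591391310,
    0.0544367711663654, -0.617107404175840, 0.438573762845408,
    0.223558673598745, -0.471712868260511, 0.0136040522083931,
    0.551662284327904, -0.775350805535709, 0.0956137145112734,
    0.0435429326025769, 0.243119500199100, -0.357436173322640,
    -1.73816418466959, 0.0231545045717212, 1.60688363169902,
    0.0231545045717212, 0.875543694299267, 0.637870694485173,
    -1.34395668067009,
]

# Concentrations of the 25 samples in Table 1 (mg/L).
concentrations = [0] * 5 + [20] * 3 + [30] * 4 + [50] * 3 + [80] * 3 + [100] * 3 + [150] * 4

n_params = 21                          # coefficients of the full quadratic model
dof = len(residuals) - n_params        # 25 - 21 = 4

sse = sum(e * e for e in residuals)
rmse = sqrt(sse / dof)                 # residual standard deviation S

mean_y = sum(concentrations) / len(concentrations)  # assumed meaning of the mean
ratio = rmse / mean_y

print(f"S = {rmse:.4f}, S / mean = {ratio:.4f}")
```

Under these assumptions S comes out at about 1.65, matching the reported RMSE, and the ratio is well below the 15% threshold.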

References

[1] Yang Haiyan, Jia Guiru. A Method for Rapid Detection of Colored and Transparent Solution Concentration Based on Digital Colorimetry [J]. Journal of China Agricultural University, 2006, 11(3): 47-50.
[2] Wang Yan, Yan Silian, Wang Aiqing. Mathematical Statistics and MATLAB Engineering Data Analysis [M]. Beijing: Tsinghua University Press, 2006: 126-177.
[3] National Standards Committee. "National Greenhouse Gas Emissions Accounting and Reporting and Other 11 National Standards for Greenhouse Gas Management".
[4] Zhang Li, Wang Pretty, Li Wei, Li Sujing. "Situation Analysis of Greenhouse Gas Emissions in China's Steel Industry", 2015, 12 (source of the gas content proportions).
