## Non linear regression – 1

Recent posts have described how to fit linear equations (or equations that can be made linear) to a set of data, using the Excel Linest functions, or User Defined Functions (UDF’s) linking to the ALGLIB library (see Using LINEST for non-linear curve fitting and ALGLIB linear and polynomial fitting functions).

This post will describe how to perform non-linear regression using the Excel Solver, and the following one using ALGLIB non-linear regression functions, and the relative advantages and disadvantages of the two methods.

Non-linear regression uses an iterative process to minimise the square of the difference between the values in the data being fitted, and the values generated by the regression equation.  This is exactly the sort of problem that the Excel Solver is designed to solve, and setting up the solution is fairly straightforward.  The steps are:

• Paste the base data into a spreadsheet, with independent variables and dependent variable in adjacent columns.
• Enter a list of coefficients that will be used in the regression function, using guessed values.
• Enter the regression function in a cell adjacent to the top row of data, using absolute addresses for the coefficients, and relative addresses for the variables.
• In the adjacent column enter a formula returning the square of the difference between data values and the value of the regression function
• Copy these two cells down over the full list of data.
• Sum the “square of the difference” column
• Use the Solver to minimise the sum of the squares by adjusting the regression function coefficients.

This process is illustrated in the spreadsheet NonLinFit-Solver.xls (click to download), which uses the Solver to fit two alternative functions to data for concrete shrinkage.

The screenshot below shows the top of the data, and the columns calculating the square of the differences for each function:

Non-linear regression using Solver (click for full size view)

The values to be minimised are in cells E23 and G23, by adjusting the values in ranges B15:B18 and D15:D20 respectively.  This is done (in two separate operations) by simply entering these ranges into the Solver dialog box, and clicking the Solve button.  The set-up for the first equation is shown in the screenshot below:

Solver dialog box

The results for the first function are seen in the screen shots below:

k1calc.1 = A*LN(T*B)-C*T-D*Th; natural scale

k1calc.1 = A*LN(T*B)-C*T-D*Th; Log scale

It can be seen that although the general trend has been captured there is still a considerable difference between the regression lines and the data points in some place.

The second function provides a much better fit:

k1calc.2 = (A+B*exp(C*Th))*T^D/(T^E + F * Th); natural scale

k1calc.2 = (A+B*exp(C*Th))*T^D/(T^E + F * Th); log scale

The good fit is to be expected, because the base data was generated with an equation of the same form.  The solver analysis has generated coefficients (A to F) that are close to those used to generate the data, but not an exact fit.  The performance of the ALGLIB functions with the same data will be examined in the next post.

This entry was posted in AlgLib, Concrete, Excel, Maths, Newton and tagged , , , . Bookmark the permalink.

### 4 Responses to Non linear regression – 1

1. Obaida Al-Saggar says:

Thanks for the helpful post. I have a question:In the excel sheet. some arbitrary values are provided for variables/names A,B,C,D etc.

What are these variables and how did you calculate them ?

thanks again

Like

2. dougaj4 says:

Hi Obaida – to generate the data I used the formula in the Australian concrete structures code, which is:
k1 = (0.8 + 1.2 *exp(-0.005*Th))*T^0.8/(T^0.8 + 0.15 * Th)
where Th is the thickness in mm and T is the time in days.

The coefficents are empirical numbers chosen to match the real concrete shrinkage used in preparing the code. The second formula used in the spreadsheet has exactly the same form as that used in the code, and you will see that the values for A to F are close to the code values. These numbers were calculated by the Solver to match the results of the formula to the given data. If you change the numbers to the nearest integer, or -.002 for C, then run the Solver again, you should get the same results.

The first formula is a simplified version, that was intended to illustrate that if you try to fit the data to the wrong type of formula, you might get an approximate match, but the results won’t capture all the features of the data, and may give misleading results if used for extrapolation. If you enter your own numbers for A to D, in column B, and re-run the solver, you should again get back to the values shown in the screen shots, which are the values found by the Solver to minimise the square of the difference from the data you are trying to match. Note that when you run the solver make sure that the range in the “By Changing Variable Cells” box is the one you want to use (i.e. either B15:B18 or D15:D20) and the “Set objective” cell is E23 for formula 1 or G23 for formula 2.

Like