15.3.3 Theory of Nonlinear Curve Fitting
How Origin Fits the Curve
The aim of nonlinear fitting is to estimate the parameter values that best describe the data. In general, the process of nonlinear curve fitting can be described as follows:
 Generate an initial function curve from the initial values.
 Iteratively adjust the parameter values to bring the curve closer to the data points.
 Stop when the minimum distance satisfies the stopping criteria; the parameter values at that point give the best fit.
Origin provides several algorithms, which differ in their iterative procedure and in the statistic used to define the minimum distance.
Explicit Functions
Fitting Model
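A general nonlinear model can be expressed as follows:
$$Y = f(X, \boldsymbol{\theta}) + \varepsilon$$
where $X = (x_1, x_2, \ldots, x_k)'$ is the vector of independent variables, $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_p)'$ is the vector of parameters, and $\varepsilon$ is the error term.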
 Examples of the Explicit Function

Least-Squares Algorithms
The least-squares algorithm chooses the parameters that minimize the deviations of the theoretical curve(s) from the experimental points. This method is also called chi-square minimization, defined as follows:
$$\chi^2 = \sum_{i=1}^{n} \left[ \frac{Y_i - f(x_i', \hat{\theta})}{\sigma_i} \right]^2$$
where $x_i'$ is the row vector for the ith (i = 1, 2, ... , n) observation and $\sigma_i$ is the error of the ith observation.
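The chi-square sum can be illustrated with a few lines of Python. This is a minimal sketch, not Origin's implementation: the exponential model, the data, and the names `model` and `chi_square` are assumptions made only for illustration.

```python
import math

def model(x, theta):
    """Hypothetical explicit model: y = y0 + A * exp(-x / t)."""
    y0, A, t = theta
    return y0 + A * math.exp(-x / t)

def chi_square(xs, ys, theta, sigmas=None):
    """Sum of squared deviations, each scaled by its error sigma_i."""
    if sigmas is None:
        sigmas = [1.0] * len(xs)          # no weighting: sigma_i = 1
    return sum(((y - model(x, theta)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))

xs = [0.0, 1.0, 2.0, 3.0]
true_theta = (1.0, 2.0, 1.5)
ys = [model(x, true_theta) for x in xs]   # noise-free synthetic data

chi2_true = chi_square(xs, ys, true_theta)       # parameters that generated the data
chi2_off = chi_square(xs, ys, (1.0, 2.5, 1.5))   # perturbed amplitude
```

The fit seeks the parameter vector with the smallest chi-square; here `chi2_true` is zero because the data were generated without noise, while any other parameter vector gives a positive value.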
Origin provides two options to adjust the parameter values in the iterative procedure:
Levenberg-Marquardt (LM) Algorithm
The Levenberg-Marquardt (LM) algorithm^{11} is an iterative procedure that combines the Gauss-Newton method and the steepest descent method.
The algorithm works well in most cases and has become the standard for nonlinear least-squares routines.
 1. Compute the $\chi^2(\theta)$ value from the given initial values $\theta$.
 2. Pick a modest value for $\lambda$, say $\lambda = 0.001$.
 3. Solve the Levenberg-Marquardt function^{11} for $\delta\theta$ and evaluate $\chi^2(\theta + \delta\theta)$.
 4. If $\chi^2(\theta + \delta\theta) \geq \chi^2(\theta)$, increase $\lambda$ by a factor of 10 and go back to step 3.
 5. If $\chi^2(\theta + \delta\theta) < \chi^2(\theta)$, decrease $\lambda$ by a factor of 10, update the parameter values to be $\theta + \delta\theta$, and go back to step 3.
 6. Stop when the change in the $\chi^2$ values computed in two successive iterations is small enough (compared with the tolerance).
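The steps above can be sketched in Python for a two-parameter model. This is an illustrative sketch under stated assumptions, not Origin's code: the model y = a·exp(-k·x), the data, the damping form (diagonal of J'J scaled by 1 + λ, as in Numerical Recipes), and all function names are hypothetical choices.

```python
import math

def model(x, theta):
    a, k = theta
    return a * math.exp(-k * x)

def jacobian_row(x, theta):
    """Partial derivatives of the model with respect to (a, k)."""
    a, k = theta
    e = math.exp(-k * x)
    return [e, -a * x * e]

def chi2(xs, ys, theta):
    return sum((y - model(x, theta)) ** 2 for x, y in zip(xs, ys))

def lm_step(xs, ys, theta, lam):
    """Solve the damped normal equations for a trial parameter vector."""
    jtj = [[0.0, 0.0], [0.0, 0.0]]
    jtr = [0.0, 0.0]
    for x, y in zip(xs, ys):
        row = jacobian_row(x, theta)
        r = y - model(x, theta)
        for i in range(2):
            jtr[i] += row[i] * r
            for j in range(2):
                jtj[i][j] += row[i] * row[j]
    a11 = jtj[0][0] * (1.0 + lam)      # damping applied to the diagonal only
    a22 = jtj[1][1] * (1.0 + lam)
    a12 = jtj[0][1]
    det = a11 * a22 - a12 * a12
    d0 = (a22 * jtr[0] - a12 * jtr[1]) / det
    d1 = (a11 * jtr[1] - a12 * jtr[0]) / det
    return [theta[0] + d0, theta[1] + d1]

def lm_fit(xs, ys, theta, lam=0.001, tol=1e-12, max_iter=200):
    current = chi2(xs, ys, theta)                 # step 1 (lam default: step 2)
    for _ in range(max_iter):
        trial = lm_step(xs, ys, theta, lam)       # step 3
        trial_chi2 = chi2(xs, ys, trial)
        if trial_chi2 >= current:                 # step 4: reject, damp more
            lam *= 10.0
            if lam > 1e12:                        # no further progress possible
                break
            continue
        improvement = current - trial_chi2        # step 5: accept, damp less
        theta, current = trial, trial_chi2
        lam /= 10.0
        if improvement < tol:                     # step 6: converged
            break
    return theta, current

xs = [0.5 * i for i in range(11)]
ys = [model(x, (2.0, 0.7)) for x in xs]           # noise-free synthetic data
theta_fit, chi2_fit = lm_fit(xs, ys, [1.0, 0.3])
```

A large λ makes the step resemble a short steepest-descent move, while a small λ approaches the Gauss-Newton step, which is why λ is raised after a failed trial and lowered after a successful one.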
Downhill Simplex Algorithm
Besides the LM method, Origin also provides a Downhill Simplex approximation^{9,10}. In geometry, a simplex is a polytope of N + 1 vertices in N dimensions. In nonlinear optimization, an analog exists for an objective function of N variables. During the iterations, the Simplex algorithm (also known as Nelder-Mead) adjusts the parameter "simplex" until it converges to a local minimum.
Unlike the LM method, the Simplex method does not require derivatives, and it is effective when the computational burden is small. If you do not have good initial parameter values, you can use this method to obtain approximate values and then refine them with LM. The Simplex method tends to be more stable, in that it is less likely to wander into a meaningless part of the parameter space; on the other hand, it is generally much slower than LM, especially very close to a local minimum.
There is no "perfect" algorithm for nonlinear fitting, and many factors (e.g., initial values) may affect the result. For complicated models, one method may do better than the other, so you may want to try both.
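A derivative-free simplex search can be sketched as follows. This is a simplified textbook Nelder-Mead variant written for illustration, not the routine Origin uses; the reflection, expansion, and contraction coefficients (1, 2, 0.5), the stopping rule, the exponential model, and all names are assumptions.

```python
import math

def nelder_mead(f, start, step=0.5, tol=1e-14, max_iter=1000):
    """Minimize f over N variables with a simplex of N + 1 vertices."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):                       # initial simplex around the start
        v = list(start)
        v[i] += step
        simplex.append(v)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:    # vertices agree: stop
            break
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            """Point on the line through the centroid, away from the worst vertex."""
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)         # midpoint of centroid and worst
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:                            # shrink every vertex toward the best
                simplex = [best] + [[(b + x) / 2.0 for b, x in zip(best, v)]
                                    for v in simplex[1:]]
    simplex.sort(key=f)
    return simplex[0]

def model(x, theta):
    a, k = theta
    return a * math.exp(-k * x)

xs = [0.5 * i for i in range(11)]
ys = [model(x, (2.0, 0.7)) for x in xs]      # noise-free synthetic data

def chi2(theta):
    return sum((y - model(x, theta)) ** 2 for x, y in zip(xs, ys))

best = nelder_mead(chi2, [1.0, 0.3])
```

Only function values are compared, so no partial derivatives are needed; the price is many more function evaluations than a gradient-based method near the minimum.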
Orthogonal Distance Regression (ODR) Algorithm
The Orthogonal Distance Regression (ODR) algorithm minimizes the residual sum of squares by adjusting both fitting parameters and values of the independent variable in the iterative process. The residual in ODR is not the difference between the observed value and the predicted value for the dependent variable, but the orthogonal distance from the data to the fitted curve.
Origin uses the ODR algorithm in ODRPACK95^{8}.
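For an explicit function, the ODR algorithm can be expressed as:
$$\min \sum_{i=1}^{n} \left( w_{yi} \cdot \epsilon_i^2 + w_{xi} \cdot \delta_i^2 \right)$$
subject to the constraints:
$$y_i = f(x_i + \delta_i; \theta) - \epsilon_i, \quad i = 1, \ldots, n$$
where $w_{xi}$ and $w_{yi}$ are the user input weights of $x_i$ and $y_i$, $\delta_i$ and $\epsilon_i$ are the residuals of the corresponding $x_i$ and $y_i$, and $\theta$ is the fitting parameter.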
For more details of the ODR algorithm, please refer to ODRPACK95^{8}.
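To make the ODR objective concrete, the sketch below evaluates it for a straight line with fixed parameters. It is an illustration only and does not call ODRPACK95: the line model, the sample points, and the function names are assumptions. For a line, the x-shifts that minimize the objective have a closed form, which the sketch uses; a full ODR fit would also adjust the fitting parameters.

```python
def odr_objective(xs, ys, a, b, deltas, wx=1.0, wy=1.0):
    """Sum of wy*eps_i^2 + wx*delta_i^2 for the line f(x) = a + b*x."""
    total = 0.0
    for x, y, d in zip(xs, ys, deltas):
        eps = a + b * (x + d) - y          # from y_i = f(x_i + delta_i) - eps_i
        total += wy * eps ** 2 + wx * d ** 2
    return total

def best_deltas(xs, ys, a, b, wx=1.0, wy=1.0):
    """x-shifts that minimize the objective for fixed line parameters."""
    return [wy * b * (y - a - b * x) / (wy * b * b + wx)
            for x, y in zip(xs, ys)]

xs = [0.0, 1.0, 2.0]
ys = [1.0, 1.0, 3.0]
a, b = 0.0, 1.0                             # the line y = x

vertical = odr_objective(xs, ys, a, b, [0.0, 0.0, 0.0])
orthogonal = odr_objective(xs, ys, a, b, best_deltas(xs, ys, a, b))
```

With unit weights, `orthogonal` is the sum of squared perpendicular distances from the points to the line, whereas `vertical` (all shifts set to zero) is the ordinary sum of squared y residuals.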
Comparison between ODR and LM
To choose between the ODR and LM algorithms for your fitting, you may refer to the following table:

| | Orthogonal Distance Regression | Levenberg-Marquardt |
|---|---|---|
| Application | Both implicit and explicit functions | Only explicit functions |
| Weight | Supports both x weight and y weight | Supports only y weight |
| Residual Source | The orthogonal distance from the data to the fitted curve | The difference between the observed value and the predicted value |
| Iteration Process | Adjusting the values of fitting parameters and independent variables | Adjusting the values of fitting parameters |

Implicit Functions
Fitting Model
A general implicit function can be expressed as:
$$f(X, Y, \boldsymbol{\theta}) - const = 0$$
where $X$ and $Y$ are the variables, $\boldsymbol{\theta}$ are the fitting parameters, and $const$ is a constant.
 Examples of the Implicit Function:

Orthogonal Distance Regression (ODR) Algorithm
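The ODR method can be used for both implicit functions and explicit functions. For more details of the ODR method, please refer to the description of the ODR method in the section above.
For implicit functions, the ODR algorithm can be expressed as:
$$\min \sum_{i=1}^{n} \left( w_{xi} \cdot \delta_{xi}^2 + w_{yi} \cdot \delta_{yi}^2 \right)$$
subject to:
$$f(x_i + \delta_{xi}, y_i + \delta_{yi}, \theta) = 0, \quad i = 1, \ldots, n$$
where $w_{xi}$ and $w_{yi}$ are the user input weights of $x_i$ and $y_i$, $\delta_{xi}$ and $\delta_{yi}$ are the residuals of the corresponding $x_i$ and $y_i$, and $\theta$ is the fitting parameter.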
Weighted Fitting
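When the measurement errors are unknown, $\sigma_i$ are set to 1 for all i, and the curve fitting is performed without weighting. However, when the experimental errors are known, we can treat these errors as weights and use weighted fitting. In this case, the chi-square can be written as:
$$\chi^2 = \sum_{i=1}^{n} w_i \left[ Y_i - f(x_i', \hat{\theta}) \right]^2$$
where $w_i$ is the weight of the ith observation (for Instrumental weighting, $w_i = 1/\sigma_i^2$).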
There are a number of weighting methods available in Origin. Please read Fitting with Errors and Weighting in the Origin Help file for more details.
Parameters
The fit-related formulas are summarized here:
The Fitted Value
Computing the fitted values in nonlinear regression is an iterative procedure. You can read a brief introduction in the above section (How Origin Fits the Curve), or see the below-referenced material for more detailed information.
Parameter Standard Errors
During LM iteration, we need to calculate the partial derivatives matrix F, whose element in the ith row and jth column is:
$$F_{ij} = \frac{\partial f(x_i, \theta)}{\sigma_i \, \partial \theta_j}$$
where $\sigma_i$ is the error of y for the ith observation if Instrumental weight is used. If there is no weight, $\sigma_i = 1$. $F_{ij}$ is evaluated for each observation $x_i$ in each iteration.
Then we can get the Variance-Covariance Matrix for parameters by:
$$C = (F'F)^{-1} s^2$$
where $F'$ is the transpose of the F matrix, and s^{2} is the mean residual variance, also called Reduced Chi-Sqr, or the Deviation of the Model, which can be calculated as follows:
$$s^2 = \frac{RSS}{n - p}$$
where n is the number of points, and p is the number of parameters.
The square root of a main diagonal value of this matrix C is the Standard Error of the corresponding parameter:
$$s_{\theta_i} = \sqrt{C_{ii}}$$
where C_{ii} is the element in the ith row and ith column of the matrix C. C_{ij} is the covariance between θ_{i} and θ_{j}.
You can choose whether to exclude s^{2} when calculating the covariance matrix. This will affect the Standard Error values. To exclude s^{2}, clear the Use Reduced Chi-Sqr check box on the Advanced page under the Fit Control panel. The covariance is then calculated by:
$$C = (F'F)^{-1}$$
So the Standard Error now becomes:
$$s_{\theta_i} = \sqrt{\left[ (F'F)^{-1} \right]_{ii}}$$
The parameter standard errors can give us an idea of the precision of the fitted values. Typically, the magnitude of the standard error values should be lower than the fitted values. If the standard error values are much greater than the fitted values, the fitting model may be overparameterized.
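The covariance formula can be checked numerically on a small case. The sketch below is an assumption-laden illustration, not Origin's code: it uses a straight line y = a + b·x only to keep the example short (the rows of F are then [1, x_i]), takes σ_i = 1, and uses hypothetical data and function names. With a nonlinear model, the rows of F would hold the model's partial derivatives instead.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def standard_errors(xs, ys, a, b, use_reduced_chi_sqr=True):
    """Return (se_a, se_b, C), where C = (F'F)^-1 * s^2 or the unscaled (F'F)^-1."""
    n, p = len(xs), 2
    # F'F for rows [df/da, df/db] = [1, x_i] with sigma_i = 1
    f11 = float(n)
    f12 = sum(xs)
    f22 = sum(x * x for x in xs)
    det = f11 * f22 - f12 * f12
    inv = [[f22 / det, -f12 / det], [-f12 / det, f11 / det]]
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - p) if use_reduced_chi_sqr else 1.0
    C = [[s2 * v for v in row] for row in inv]
    return math.sqrt(C[0][0]), math.sqrt(C[1][1]), C

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.9]
a_hat, b_hat = fit_line(xs, ys)
se_a, se_b, C = standard_errors(xs, ys, a_hat, b_hat)
se_a_raw, se_b_raw, C_raw = standard_errors(xs, ys, a_hat, b_hat,
                                            use_reduced_chi_sqr=False)
```

The two calls differ only by the factor s^{2}, which mirrors the effect of the check box described above: the scaled errors reflect the scatter of the data, while the unscaled ones depend only on the design and the weights.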
The Standard Error for Derived Parameter
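Origin estimates the standard errors for the derived parameters according to the Error Propagation formula, which is an approximate formula.
Let $z = f(\theta_1, \theta_2, \ldots, \theta_p)$ be the function with a combination (linear or nonlinear) of $p$ variables $\theta_1, \theta_2, \ldots, \theta_p$.
The general law of error propagation is:
$$\sigma_z^2 = \sum_{i=1}^{p} \sum_{j=1}^{p} \frac{\partial z}{\partial \theta_i} C_{ij} \frac{\partial z}{\partial \theta_j}$$
where $C_{ij}$ is the covariance value for $(\theta_i, \theta_j)$, and $i = 1, 2, \ldots, p$, $j = 1, 2, \ldots, p$.
You can choose whether to exclude the mean residual variance $s^2$ when calculating the covariance matrix $C$, which affects the Standard Error values for derived parameters. To exclude $s^2$, clear the Use Reduced Chi-Sqr check box on the Advanced page under the Fit Control panel.
For example, using three variables
$$z = f(\theta_1, \theta_2, \theta_3)$$
we get:
$$\sigma_z^2 = \left( \frac{\partial z}{\partial \theta_1} \right)^2 C_{11} + \left( \frac{\partial z}{\partial \theta_2} \right)^2 C_{22} + \left( \frac{\partial z}{\partial \theta_3} \right)^2 C_{33} + 2 \frac{\partial z}{\partial \theta_1} \frac{\partial z}{\partial \theta_2} C_{12} + 2 \frac{\partial z}{\partial \theta_1} \frac{\partial z}{\partial \theta_3} C_{13} + 2 \frac{\partial z}{\partial \theta_2} \frac{\partial z}{\partial \theta_3} C_{23}$$
Now, let the derived parameter be $z$, and let the fitting parameters be $\theta_1, \theta_2, \ldots, \theta_p$. The standard error for the derived parameter $z$ is $\sigma_z$.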
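A short numeric example may help here. The sketch below applies the propagation law to a hypothetical derived parameter z = θ1·θ2; the parameter values and the covariance matrix are made-up numbers chosen for easy arithmetic, and the function name is an assumption.

```python
import math

def propagate(grad, C):
    """Variance of z: sum over i, j of (dz/dtheta_i) * C_ij * (dz/dtheta_j)."""
    p = len(grad)
    return sum(grad[i] * C[i][j] * grad[j]
               for i in range(p) for j in range(p))

theta = (2.0, 3.0)                       # hypothetical fitted values
C = [[0.04, 0.01],                       # hypothetical covariance matrix
     [0.01, 0.09]]

# For z = theta1 * theta2: dz/dtheta1 = theta2 and dz/dtheta2 = theta1
grad = (theta[1], theta[0])

var_z = propagate(grad, C)               # 9*0.04 + 4*0.09 + 2*3*2*0.01
se_z = math.sqrt(var_z)
```

The cross term 2·3·2·0.01 shows why the off-diagonal covariance matters: ignoring it would give 0.72 instead of 0.84 for the variance of z.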
Confidence Intervals
Origin provides two methods to calculate the confidence intervals for parameters: the Asymptotic-Symmetry method and the Model-Comparison method.
Asymptotic-Symmetry Method
One assumption in regression analysis is that the data are normally distributed, so we can use the standard error values to construct the Parameter Confidence Intervals. For a given significance level, α, the (1-α)×100% confidence interval for the parameter is:
$$\hat{\theta}_j - t_{(\alpha/2,\, n-p)} \sqrt{C_{jj}} \leq \theta_j \leq \hat{\theta}_j + t_{(\alpha/2,\, n-p)} \sqrt{C_{jj}}$$
The parameter confidence interval indicates how likely the interval is to contain the true value.
The confidence interval illustrated above is Asymptotic, which is the most frequently used method to calculate the confidence interval. "Asymptotic" here means it is an approximate value.
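As a small worked example, the interval can be computed once the standard error and the critical t value are known. In this sketch the estimate, the standard error, and the function name are hypothetical; the critical value 2.776 is the two-sided Student's t value for α = 0.05 with 4 degrees of freedom and is passed in by hand rather than computed.

```python
def asymptotic_ci(estimate, std_err, t_crit):
    """Symmetric interval: estimate -/+ t_crit * standard error."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width

# Hypothetical fit result: theta_hat = 2.0 with standard error 0.1, n - p = 4
lcl, ucl = asymptotic_ci(2.0, 0.1, 2.776)
```

The interval is symmetric about the estimate by construction, which is the sense in which this method is called Asymptotic-Symmetry.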
Model-Comparison Method
If you need more accurate values, you can use the Model-Comparison Based method to estimate the confidence interval.
If the Model-Comparison method is used, the upper and lower confidence limits are calculated by searching for the values of each parameter θ_{j} that make RSS(θ_{j}) (minimized over the remaining parameters) greater than RSS by a factor of (1+F/(n-p)):
$$RSS(\theta_j) = RSS \left( 1 + \frac{F}{n - p} \right)$$
where F = Ftable(α, 1, n-p) and RSS is the minimum residual sum of squares found during the fitting session.
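t Value
You can choose to perform a t-test on each parameter to test whether its value is equal to 0. The null hypothesis of the t-test on the jth parameter is:
$$H_0: \theta_j = 0$$
And the alternative hypothesis is:
$$H_a: \theta_j \neq 0$$
The t-value can be computed as:
$$t = \frac{\hat{\theta}_j - 0}{\sqrt{C_{jj}}}$$
Prob>|t|
The probability, assuming H_{0} in the t-test above is true, of obtaining a t-value at least as extreme as the one observed:
$$prob = 2 \left( 1 - tcdf(|t|, df_{Error}) \right)$$
where tcdf(t, df) computes the lower tail probability for Student's t distribution with df degrees of freedom.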
Dependency
If the equation is overparameterized, there will be mutual dependency between parameters. The dependency for the ith parameter is defined as:
$$1 - \frac{1}{C_{ii} (C^{-1})_{ii}}$$
where C_{ii} is the ith diagonal element of the variance-covariance matrix C and (C^{-1})_{ii} is the (i, i)th diagonal element of the inverse of matrix C. If this value is close to 1, there is strong dependency.
To learn more about how this value is used to assess the quality of a fit model, see the Model Diagnosis Using Dependency Values page.
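The dependency formula can be tried on a two-parameter case, where the matrix inverse is available in closed form. This is a minimal sketch with a made-up covariance matrix; the function name is an assumption and the code handles only 2 x 2 matrices.

```python
def dependencies_2x2(C):
    """Dependency 1 - 1 / (C_ii * (C^-1)_ii) for each of two parameters."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    inv_diag = [C[1][1] / det, C[0][0] / det]    # diagonal of the inverse of C
    return [1.0 - 1.0 / (C[i][i] * inv_diag[i]) for i in range(2)]

C = [[0.04, 0.03],                               # hypothetical covariance matrix
     [0.03, 0.09]]
dep = dependencies_2x2(C)
```

For two parameters the dependency reduces to the squared correlation between them (here 0.5 squared, i.e. 0.25 for both), so values near 1 correspond to parameters that are almost perfectly correlated.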
CI Half Width
The Confidence Interval Half Width is:
$$CI\ Half\ Width = \frac{UCL - LCL}{2}$$
where UCL and LCL are the Upper Confidence Limit and Lower Confidence Limit, respectively.
Statistics
Several fit statistics formulas are summarized below:
Degree of Freedom
The Error degree of freedom. Please refer to the ANOVA Table for more details.
Residual Sum of Squares
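The residual sum of squares:
$$RSS(X, \hat{\theta}) = \sum_{i=1}^{n} w_i \left[ Y_i - f(x_i', \hat{\theta}) \right]^2$$
Reduced Chi-Sqr
The Reduced Chi-square value, which equals the residual sum of squares divided by the degrees of freedom:
$$Reduced\ \chi^2 = \frac{RSS}{df_{Error}} = \frac{RSS}{n - p}$$
R-Square (COD)
The R^{2} value shows the goodness of a fit, and can be computed by:
$$R^2 = \frac{TSS - RSS}{TSS} = 1 - \frac{RSS}{TSS}$$
where TSS is the total sum of squares, and RSS is the residual sum of squares.
Adj. R-Square
The adjusted R^{2} value:
$$\bar{R}^2 = 1 - \frac{RSS / df_{Error}}{TSS / df_{Total}}$$
R Value
The R value is the square root of R^{2}:
$$R = \sqrt{R^2}$$
For more information on R^{2}, adjusted R^{2} and R, please see Goodness of Fit.
Root-MSE (SD)
Root mean square of the error, or the Standard Deviation of the residuals, equal to the square root of reduced χ^{2}:
$$Root\ MSE = \sqrt{Reduced\ \chi^2}$$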
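These statistics follow directly from the observed and fitted values, as the sketch below shows. It is an illustration under assumptions: unit weights, the corrected total sum of squares (so the total degrees of freedom are n - 1), hypothetical data, and a hypothetical function name.

```python
import math

def fit_statistics(ys, fitted, p):
    """Summary statistics for a fit with p parameters and unit weights."""
    n = len(ys)
    df_error = n - p
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    mean_y = sum(ys) / n
    tss = sum((y - mean_y) ** 2 for y in ys)       # corrected total sum of squares
    reduced_chi_sqr = rss / df_error
    r_square = 1.0 - rss / tss
    adj_r_square = 1.0 - (rss / df_error) / (tss / (n - 1))
    return {
        "df": df_error,
        "rss": rss,
        "reduced_chi_sqr": reduced_chi_sqr,
        "r_square": r_square,
        "adj_r_square": adj_r_square,
        "r": math.sqrt(r_square),
        "root_mse": math.sqrt(reduced_chi_sqr),
    }

ys = [1.0, 2.0, 3.0, 4.0]                          # hypothetical observations
fitted = [1.1, 1.9, 3.2, 3.8]                      # hypothetical fitted values
stats = fit_statistics(ys, fitted, p=2)
```

For these numbers the residual sum of squares is 0.10 and the total sum of squares is 5.0, so R^{2} is 0.98 while the adjusted value is 0.97, slightly lower because it accounts for the two fitted parameters.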
ANOVA Table
The ANOVA Table:
Note: The ANOVA table is not available for implicit function fitting.

| | df | Sum of Squares | Mean Square | F Value | Prob > F |
|---|---|---|---|---|---|
| Model | p | SS_{reg} = TSS - RSS | MS_{reg} = SS_{reg} / p | MS_{reg} / MSE | p-value |
| Error | n - p | RSS | MSE = RSS / (n - p) | | |
| Uncorrected Total | n | TSS | | | |
| Corrected Total | n - 1 | TSS_{corrected} | | | |

Note: In nonlinear fitting, Origin outputs both corrected and uncorrected total sum of squares:
Corrected model:
$$TSS_{corrected} = \sum_{i=1}^{n} w_i (y_i - \bar{y})^2, \quad \bar{y} = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}$$
Uncorrected model:
$$TSS = \sum_{i=1}^{n} w_i y_i^2$$
The F value here is a test of whether the fitting model differs significantly from the model y=constant. Additionally, the p-value, or significance level, is reported with an F-test. We can reject the null hypothesis if the p-value is less than $\alpha$, which means that the fitting model differs significantly from the model y=constant.
Confidence and Prediction Bands
Confidence Band
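The confidence interval for the fitting function says how good your estimate of the value of the fitting function is at particular values of the independent variables. You can claim with (1-α)×100% confidence that the correct value for the fitting function lies within the confidence interval, where α is the significance level. This confidence interval for the fitting function is computed as:
$$f(x_1, x_2, \ldots, x_k; \hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_p) \pm t_{(\alpha/2,\, n-p)} \left[ s^2 \, f (F'F)^{-1} f' \right]^{1/2}$$
where:
$$f (F'F)^{-1} f' = \sum_{j=1}^{p} \sum_{k=1}^{p} \frac{\partial f}{\partial \theta_j} \left[ (F'F)^{-1} \right]_{jk} \frac{\partial f}{\partial \theta_k}$$
Prediction Band
The prediction interval for the significance level α is the interval within which (1-α)×100% of all the experimental points in a series of repeated measurements are expected to fall at particular values of the independent variables. This prediction interval for the fitting function is computed as:
$$f(x_1, x_2, \ldots, x_k; \hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_p) \pm t_{(\alpha/2,\, n-p)} \left[ s^2 \left( 1 + f (F'F)^{-1} f' \right) \right]^{1/2}$$
where $f (F'F)^{-1} f'$ is defined as for the confidence band and $s^2$ is the Reduced Chi-Sqr.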
Notes: The Confidence Band and Prediction Band in the fitted curve plot are not available for implicit function fitting.
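The band half-widths can be evaluated at a single x value as sketched below. This is an illustration under assumptions, not Origin's routine: a straight line y = a + b·x stands in for the fitting function (so the partial derivatives are [1, x0]), weights are 1, the data and function name are hypothetical, and the critical t value is supplied by hand (3.182 is the two-sided value for α = 0.05 with 3 degrees of freedom).

```python
import math

def bands_at(x0, xs, ys, t_crit):
    """Fitted value and confidence/prediction half-widths at x0 for a line fit."""
    n, p = len(xs), 2
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - p)                              # Reduced Chi-Sqr
    # (F'F)^-1 for rows [1, x_i] with sigma_i = 1
    f11, f12, f22 = float(n), sum(xs), sum(x * x for x in xs)
    det = f11 * f22 - f12 * f12
    inv = [[f22 / det, -f12 / det], [-f12 / det, f11 / det]]
    grad = [1.0, x0]                                # df/da, df/db at x0
    quad = sum(grad[j] * inv[j][k] * grad[k]
               for j in range(2) for k in range(2))
    conf_half = t_crit * math.sqrt(s2 * quad)
    pred_half = t_crit * math.sqrt(s2 * (1.0 + quad))
    return a + b * x0, conf_half, pred_half, quad

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.9]
y0, conf_half, pred_half, quad = bands_at(3.0, xs, ys, t_crit=3.182)
```

The prediction half-width is always the larger of the two, because it adds the scatter of a single new observation to the uncertainty of the fitted curve itself.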

Reference
 1. William H. Press, et al. Numerical Recipes in C++. Cambridge University Press, 2002.
 2. Norman R. Draper, Harry Smith. Applied Regression Analysis, Third Edition. John Wiley & Sons, Inc. 1998.
 3. George Casella, et al. Applied Regression Analysis: A Research Tool, Second Edition. Springer-Verlag New York, Inc. 1998.
 4. G. A. F. Seber, C. J. Wild. Nonlinear Regression. John Wiley & Sons, Inc. 2003.
 5. David A. Ratkowsky. Handbook of Nonlinear Regression Models. Marcel Dekker, Inc. 1990.
 6. Douglas M. Bates, Donald G. Watts. Nonlinear Regression Analysis & Its Applications. John Wiley & Sons, Inc. 1988.
 7. Marko Ledvij. Curve Fitting Made Easy. The Industrial Physicist. Apr./May 2003. 9:24-27.
 8. J. W. Zwolak, P. T. Boggs, and L. T. Watson. "Algorithm 869: ODRPACK95: A weighted orthogonal distance regression code with bound constraints." ACM Transactions on Mathematical Software, Vol. 33, Issue 4, August 2007.
 9. Nelder, J. A., and R. Mead. 1965. Computer Journal, vol. 7, pp. 308-313.
 10. Numerical Recipes in C, Ch. 10.4, Downhill Simplex Method in Multidimensions.
 11. Numerical Recipes in C, Ch. 15.5, Nonlinear Models.
