2.4.1 findBase
Brief Information
Find a region of XY data suitable for baseline
Additional Information
Minimum Origin Version Required: 8.0 SR5
Command Line Usage
1. findBase iy:=(1,2) pts:=4 max:=0.5;
XFunction Execution Options
Please refer to the XFunction Execution Options page for additional option switches when accessing the XFunction from script.
Variables
Display Name

Variable Name

I/O and Type

Default Value

Description

Input Data

iy

Input
XYRange

<active>

Specify the input data.

Minimum Continuous Flat Points (0.5 = half, 5 = 5 points)

flat

Input
double

15

Used only when dir is set to 0. It defines the minimum length of the baseline: a flat section longer than this value is regarded as the baseline. If the value is smaller than 1, it is interpreted as the ratio of the baseline length to the length of the whole input curve; otherwise, it is interpreted as the minimum length of the baseline in points.

Region Options(0=any, >0 for from begin, <0 from end)

dir

Input
int

0

Use this option to specify where the baseline is located. If the value is 0, the baseline can appear anywhere on the curve; if the value is positive, the baseline is at the beginning of the curve; if the value is negative, the baseline is at the end of the curve. For a nonzero value, the absolute value gives the offset from the beginning (positive value) or the end (negative value) of the curve to the first point searched for the baseline, that is, the first point at which the piecewise linear fit begins.

Slope Threshold

h

Input
double

0

Specify a slope threshold. The slope of the baseline should be less than this.

Initial Points (0.01=1%, 5 = 5 points)

pts

Input
double

4

If dir is not equal to 0, specify the number of points in the first segment on which a linear fit is performed to find the baseline. If dir = 0, this is the size of every linear fit segment.

Check Linear Tolerance

tol

Input
double

10

Specify the tolerance factor used to multiply the slope/intercept error value for continuous linear test.

Max Number of Points to Search (0.5 = half, 50 = 50 points)

max

Input
double

0.5

Specify the maximum number of points used for searching the baseline. If the value is less than 1, it is interpreted as a fraction of the size of the whole curve. Otherwise, it is interpreted as the maximum number of points used for searching the baseline.

Number of Points to Increment to Search

step

Input
double

0

If dir = 0, this variable has no effect. Otherwise, it specifies the distance between two adjacent segments on which a linear fit is performed to find where the linear segment ends.

Begin Index of Base Region

i1

Output
int

<>

Specify the output of the beginning index of the baseline.

End Index of Base Region

i2

Output
int

<>

Specify the output of the ending index of the baseline.

Intercept

a

Output
double

<>

Specify the output for the intercept of the baseline.

Slope

b

Output
double

<>

Specify the output of slope of the baseline.

Intercept Error

aerr

Output
double

<>

Specify the output of the intercept error of the baseline.

Slope Error

berr

Output
double

<>

Specify the output of the slope error of the baseline.

Pearson r (correlation coefficient)

r

Output
double

<>

Specify the output for the Pearson correlation coefficient (r) of the linear fit used to find the baseline.

Fitted Line

oy

Output
XYRange

<optional>

Specify the output range of the fitted baseline data.

1=Show Internal Messages

cntrl

Input
int

0

Specify whether to show the internal messages.
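
Several of the variables above (flat, pts and max) accept either a fraction of the curve or an absolute point count. The following Python sketch illustrates that convention only; it is not Origin code. The helper name to_point_count and the rounding rule are assumptions made for illustration, since this page does not say how a fractional result is rounded.

```python
def to_point_count(value, total_points):
    """Interpret a fraction-or-count setting as a number of points.

    Hypothetical helper: the name and the rounding are assumptions.
    """
    if value < 1:
        # Values below 1 are read as a fraction of the whole curve.
        return int(round(value * total_points))
    # Values of 1 or more are read as an absolute number of points.
    return int(value)


# max:=0.5 on a 200-point curve versus max:=50 on the same curve
print(to_point_count(0.5, 200))
print(to_point_count(50, 200))
```

The single threshold at 1 means that a setting of exactly 1 is read as one point, not as 100% of the curve.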

Description
This function finds a baseline region in data that have peaks. It performs a piecewise linear fit on sections of the data and compares the change in slope with the error estimate to determine whether a linear region has started turning.
Example
// single peak integration by auto finding base
// on both sides of the peak
newbook;
string ff$=system.path.program$+ "Samples\Curve Fitting\Gaussian.dat";
impASC ff$;
plotxy (1,2);
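// search from the beginning: the left base ends at findbase.i2
findbase d:=2 f:=0.1 p:=6 t:=5;i1=findbase.i2;
// search from the end: the right base begins at findbase.i1
findbase d:=-2 f:=0.1 p:=6 t:=5;i2=findbase.i1;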
range aa = 1[$(i1):$(i2)];
integ1 aa oy:=<optional>;
type "Integration of "+integ1.iy$;
type "from "+integ1.x1$+" to "+integ1.x2$;
type "Area = $(integ1.area)";
type "Center=$(integ1.x0)";
// also draw vertical lines to show on graph the integration range
draw l v integ1.x1;
draw l v integ1.x2;
Algorithm
When dir > 0
This means that the baseline, which is a continuous linear segment, appears at the beginning of the input curve.
The XFunction performs a piecewise linear fit to find such a continuous linear segment: linear fits are performed on successive sections of the data, and the change in slope between sections is used to determine whether the linear region has ended.
Let us set the following:
n = npts + step;
piece_size = 3 * step;
m = n + piece_size;
where npts is the number of initial points given by the pts variable, and step is the step variable of the XFunction.
First, the XFunction performs a linear fit on the first npts points and stores the slope as v0. A linear fit is then performed on the n-th to m-th data points; its slope is stored as vn and its slope error as vErr. The XFunction then checks whether the following is true:
|vn - v0| > dR * vErr
where dR is the tolerance specified by the tol variable.
If this is true, the linear region has ended. Only the section that contains the first npts points is linear, and this section is taken as the baseline.
If this is not true, the linear region continues into the second segment, and the XFunction keeps searching for the point where the linear part ends. v0 is set to vn, and a linear fit is performed on the next segment, which lies step points to the right of the current one. The new vn is the slope of this fit, and vErr is the new slope error. Again, the XFunction checks whether the linear part has ended by testing whether |vn - v0| > dR * vErr.
This repeats until the XFunction finds that the linear part has ended or the maximum number of data points allowed for the piecewise linear fit has been reached. The continuous linear region that starts from the beginning of the input curve is taken as the candidate baseline. Finally, this candidate baseline is checked to see whether it is good enough. See checking the candidate baseline below.
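
The search described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Origin's implementation. The function names are hypothetical, the stopping test is taken to be |vn - v0| > tol * vErr, the slope error is taken to be the ordinary least-squares standard error of the slope, and the baseline is assumed to extend to the last point of the last accepted segment. The sketch also requires step >= 1.

```python
import math


def linear_fit(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, slope_std_error)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # The slope error needs at least 3 points; report 0 otherwise.
    b_err = math.sqrt(rss / (n - 2) / sxx) if n > 2 else 0.0
    return a, b, b_err


def find_base_from_start(x, y, npts, step, tol, max_pts=None):
    """Return the 0-based index of the last point accepted as baseline."""
    if step < 1:
        raise ValueError("this sketch needs step >= 1")
    limit = len(x) if max_pts is None else min(max_pts, len(x))
    piece_size = 3 * step
    _, v0, _ = linear_fit(x[:npts], y[:npts])
    end = npts - 1          # so far only the first npts points are linear
    n = npts + step         # 1-based start of the next segment
    m = n + piece_size      # 1-based end of the next segment
    while m <= limit:
        _, vn, v_err = linear_fit(x[n - 1:m], y[n - 1:m])
        if abs(vn - v0) > tol * v_err:
            break           # slope changed too much: linear region ended
        v0 = vn             # accept the segment, then slide right by step
        end = m - 1         # assumption: baseline reaches the segment end
        n += step
        m += step
    return end


# Synthetic curve: a flat, slightly noisy baseline followed by a steep ramp.
x = list(range(60))
y = [0.01 * (-1) ** i if i < 30 else 5.0 * (i - 29) for i in range(60)]
last = find_base_from_start(x, y, npts=5, step=2, tol=1.0)
print("baseline covers indices 0 to", last)
```

Because the slope change is compared against a multiple of the slope error, a large tol lets the search run further into the rising part of the curve before it stops. The synthetic example uses tol=1.0 for that reason.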
When dir < 0
This means that the baseline, which is a continuous linear segment, appears at the end of the input curve. The basic idea for searching this continuous linear segment is the same as in the "dir > 0" case, except that the piecewise linear fit begins from the rightmost point of the input curve.
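
One way to picture the search from the end is to run a start-anchored search on the reversed curve and map the resulting index back. The Python sketch below shows only that index mapping; it is not Origin code. Both search_from_end and the toy_search stand-in are hypothetical names, and the toy search replaces the piecewise linear fit with a simple amplitude test to keep the example short.

```python
def search_from_end(x, y, search_from_start):
    """Run a start-anchored search on the reversed curve and map it back.

    search_from_start(x, y) must return the 0-based index of the last
    baseline point, counted from the start of the arrays it is given.
    """
    n = len(x)
    last_rev = search_from_start(x[::-1], y[::-1])
    return n - 1 - last_rev  # first baseline index in the original order


def toy_search(x, y):
    """Toy stand-in for the real search: baseline lasts while |y| < 0.5."""
    last = -1
    for i, yi in enumerate(y):
        if abs(yi) >= 0.5:
            break
        last = i
    return last


x = [0, 1, 2, 3, 4, 5]
y = [9.0, 7.0, 3.0, 0.1, 0.0, -0.1]
first = search_from_end(x, y, toy_search)
print("baseline covers indices", first, "to", len(x) - 1)
```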
When dir = 0
This means that the baseline can appear at any location on the input curve.
The XFunction searches for a continuous linear segment from left to right with the method used in the "dir > 0" case. As soon as a continuous linear segment is found whose length is greater than the minimum baseline length determined by the flat variable, the search stops and this segment is taken as the candidate baseline. This candidate baseline is then checked to see whether it is good enough. See checking the candidate baseline below.
Checking the candidate baseline
The candidate baseline is checked to see whether it is good enough. All linear parts within it are found, and only the part whose slope is less than the slope threshold specified by the h variable of the XFunction is used as the final baseline.
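
The threshold check can be illustrated with a short Python sketch. It is not Origin code. The function name select_baseline and the (i1, i2, slope) tuple layout are hypothetical, and comparing the absolute value of the slope against h is an assumption, since this page only says the slope should be less than the threshold.

```python
def select_baseline(parts, h):
    """Keep the linear parts whose slope magnitude is below the threshold h.

    parts: list of (i1, i2, slope) tuples describing the linear parts
    found inside the candidate baseline (hypothetical layout).
    """
    return [part for part in parts if abs(part[2]) < h]


# Two linear parts: a nearly flat one and a clearly rising one.
parts = [(0, 20, 0.002), (21, 30, 0.8)]
print(select_baseline(parts, h=0.05))
```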
Keywords:spectroscopy
