
Fits a principal curve: a smooth curve that passes through the middle of the data x in an orthogonal sense. The curve is a non-parametric generalization of a linear principal component. If a closed curve is fit (using smoother = "periodic_lowess"), the starting curve defaults to a circle, and each fit is followed by a bias correction suggested by Jeff Banfield.

Usage

principal_curve(
  x,
  start = NULL,
  thresh = 0.001,
  maxit = 10,
  stretch = 2,
  smoother = c("smooth_spline", "lowess", "periodic_lowess"),
  approx_points = FALSE,
  trace = FALSE,
  plot_iterations = FALSE,
  ...
)

# S3 method for class 'principal_curve'
lines(x, ...)

# S3 method for class 'principal_curve'
plot(x, ...)

# S3 method for class 'principal_curve'
points(x, ...)

whiskers(x, s, ...)

Arguments

x

a matrix of points in arbitrary dimension.

start

either a previously fit principal curve, or else a matrix of points that in row order define a starting curve. If missing or NULL, then the first principal component is used. If the smoother is "periodic_lowess", then a circle is used as the start.

thresh

convergence threshold on shortest distances to the curve.

maxit

maximum number of iterations.

stretch

A stretch factor for the endpoints of the curve, allowing the curve to grow to avoid bunching at the end. Must be a numeric value between 0 and 2.

smoother

choice of smoother. The default is "smooth_spline"; the other choices are "lowess" and "periodic_lowess". The latter allows one to fit closed curves. Beware: with "lowess" you may want to pass iter = 0 to lowess().

approx_points

Whether to approximate the curve after smoothing to reduce computation time. If FALSE, no approximation occurs. Otherwise, approx_points must be the number of points used to approximate the curve; preferably about 100.

trace

If TRUE, the iteration information is printed.

plot_iterations

If TRUE, the iterations are plotted.

...

additional arguments to the smoothers

s

a parametrized curve, represented by a polygon.
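
The arguments above can be combined as in the following minimal sketch. It assumes the princurve package is installed and attached; the simulated data, the seed, and the object names (fit_default, fit_restart, fit_lowess, fit_approx) are illustrative choices, not part of the package.

```r
library(princurve)

set.seed(1)  # illustrative seed, only so the sketch is reproducible
u <- runif(100, -1, 1)
x <- cbind(u, u^2 + rnorm(100, sd = 0.1))  # noisy parabola

# Defaults: smooth_spline smoother, start = NULL (first principal component)
fit_default <- principal_curve(x)

# Reuse a previous fit as the starting curve and allow more iterations
fit_restart <- principal_curve(x, start = fit_default, maxit = 20)

# lowess smoother; iter = 0 is handed to the smoother through `...`
fit_lowess <- principal_curve(x, smoother = "lowess", iter = 0)

# Approximate the smoothed curve by 100 points to reduce computation
fit_approx <- principal_curve(x, approx_points = 100)
```

Restarting from an earlier fit can be useful when a first run stops at maxit without converging; whether it helps depends on the data.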

Value

An object of class "principal_curve" is returned. For this object the following generic methods are currently available: plot, points, lines.

It has components:

s

a matrix corresponding to x, giving their projections onto the curve.

ord

an index, such that s[ord, ] is smooth.

lambda

for each point, its arc-length from the beginning of the curve. The curve is parametrized approximately by arc-length, and hence is unit-speed.

dist

the sum-of-squared distances from the points to their projections.

converged

A logical indicating whether the algorithm converged or not.

num_iterations

Number of iterations completed before returning.

call

the call that created this object; allows it to be updated().
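
As a hedged illustration of the components listed above, the sketch below fits a curve to simulated data and inspects the returned object. It assumes the princurve package is installed; the data, seed, and the name curve_pts are made up for the example.

```r
library(princurve)

set.seed(1)  # illustrative seed
u <- runif(100, -1, 1)
x <- cbind(u, u^2 + rnorm(100, sd = 0.1))
fit <- principal_curve(x)

# One projected point per row of x
dim(fit$s)

# Projections reordered along the curve, e.g. for drawing it as a line
curve_pts <- fit$s[fit$ord, ]

# Arc-length position of each point, and the overall fit quality
range(fit$lambda)
fit$dist

# Convergence diagnostics
fit$converged
fit$num_iterations
```

Checking converged and num_iterations after a fit is a cheap way to notice that maxit was reached before the threshold was met.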

References

Hastie, T. and Stuetzle, W., Principal Curves, JASA, Vol. 84, No. 406 (Jun., 1989), pp. 502-516, doi:10.2307/2289936.

Examples

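# Simulate 100 noisy points along a parabola
x <- runif(100, -1, 1)
x <- cbind(x, x^2 + rnorm(100, sd = 0.1))

# Fit a principal curve with the default smoother
fit <- principal_curve(x)

# Plot the fitted curve, redraw it as a line, and add the fitted points
plot(fit)
lines(fit)
points(fit)

# Connect each data point to its projection on the curve
whiskers(x, fit$s)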
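
A closed curve can be fit in a similar way. The following is a minimal sketch, assuming the princurve package is installed; the noisy-circle data, the seed, and the name fit_closed are illustrative assumptions rather than part of the package's own examples.

```r
library(princurve)

set.seed(1)  # illustrative seed
theta <- runif(200, 0, 2 * pi)

# Noisy points around the unit circle
x <- cbind(cos(theta), sin(theta)) + matrix(rnorm(400, sd = 0.05), ncol = 2)

# periodic_lowess fits a closed curve; with start = NULL a circle is the start
fit_closed <- principal_curve(x, smoother = "periodic_lowess")

plot(fit_closed)
points(x, col = "grey")  # overlay the raw data for comparison
```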