Polyfit (polyfit)
The polyfit function uses polynomial regression to fit a smooth, nonlinear curve through a bivariate scatter plot. The polyfit function takes three parameters:

The numeric field containing the independent (x) variable

The numeric field containing the dependent (y) variable

The integer degree of the polynomial

The degree determines how many bends the fitted curve can have: a polynomial of degree n has at most n - 1 turning points. A degree of 1 performs linear regression. Typically the degree is an integer from 1 to 5.
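To make the role of the degree concrete, the following is a minimal Python sketch of ordinary least-squares polynomial fitting. It is not the implementation behind polyfit, which this page does not describe; the helper names fit_polynomial and predict and the sample data are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first.

    Illustrative only: this solves the normal equations directly, which is
    adequate for small degrees such as 1 to 5.
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical (x, y) samples, not taken from the iris data set.
xs = [1.0, 1.5, 3.0, 4.5, 5.0, 6.0]
ys = [0.2, 0.3, 1.1, 1.5, 1.9, 2.3]

line = fit_polynomial(xs, ys, 1)   # degree 1: ordinary linear regression
cubic = fit_polynomial(xs, ys, 3)  # degree 3: allows up to two bends
print([predict(line, x) for x in xs])
print([predict(cubic, x) for x in xs])
```

With degree 1 the sketch returns an intercept and a slope, which is the linear regression case mentioned above. Higher degrees add coefficients and let the curve bend more, at the cost of following noise in the sample more closely.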
Sample syntax
select polyfit(petal_length_d, petal_width_d, 3) as prediction,
residual,
petal_length_d,
petal_width_d
from iris
limit 150
Result set
The result set contains a random sample of records that match the WHERE clause. If no WHERE clause is included, the random sample is taken from the entire result set. The size of the result set can be controlled with the LIMIT clause. The default size, if no limit is applied, is 25000.
The polyfit function returns the predicted value for each record. Three additional fields can be selected when the polyfit function is used:

residual: the residual value for each sample. The residual value is the sample's dependent (y) value minus the predicted value. The residual represents the error of the regression prediction for each sample.

the independent variable for each sample

the dependent variable for each sample
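The residual definition above amounts to a single subtraction per sample. The following Python sketch shows it under stated assumptions: the function names and the observed and predicted values are hypothetical, and the root mean squared error is included only as one common way to summarize residuals, not as something polyfit returns.

```python
import math


def residuals(ys, predictions):
    """Residual per sample: dependent (y) value minus predicted value."""
    return [y - p for y, p in zip(ys, predictions)]


def root_mean_squared_error(res):
    """Single-number summary of the residuals (an added illustration)."""
    return math.sqrt(sum(r * r for r in res) / len(res))


# Hypothetical values standing in for the dependent variable and prediction fields.
observed = [0.2, 0.4, 1.3, 1.5, 2.1]
predicted = [0.25, 0.35, 1.2, 1.6, 2.0]

res = residuals(observed, predicted)
print(res)
print(root_mean_squared_error(res))
```

A positive residual means the model predicted too low for that sample, and a negative residual means it predicted too high.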
Visualization
There are a number of visualizations that can flow from the regression result set.
The first visualization shown is a scatter plot with petal_length_d on the x-axis and petal_width_d on the y-axis. This can be used to visualize the relationship between the two variables in the regression analysis.
The second visualization shows the petal_length_d variable on the x-axis and the prediction for petal_width_d on the y-axis.
The last visualization plots the predictions on the x-axis and the residual on the y-axis. This residual plot can be used to visualize the error of the regression model across the full range of predictions.
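The three visualizations differ only in which result-set fields go on each axis. The following Python sketch builds the three point series from a handful of rows. It assumes the result set is available as a list of dictionaries keyed by the field names in the sample query; the row values and the plot_series helper are hypothetical.

```python
# Hypothetical rows shaped like the sample query's result set.
rows = [
    {"petal_length_d": 4.5, "petal_width_d": 1.5, "prediction": 1.45, "residual": 0.05},
    {"petal_length_d": 1.4, "petal_width_d": 0.2, "prediction": 0.25, "residual": -0.05},
    {"petal_length_d": 5.9, "petal_width_d": 2.1, "prediction": 2.15, "residual": -0.05},
]


def plot_series(rows, x_key, y_key):
    """Return (x, y) points for one plot, sorted by x so a line can be drawn."""
    return sorted((row[x_key], row[y_key]) for row in rows)


# 1. Scatter plot of the two variables.
scatter = plot_series(rows, "petal_length_d", "petal_width_d")
# 2. Fitted curve: independent variable against the prediction.
fitted = plot_series(rows, "petal_length_d", "prediction")
# 3. Residual plot: prediction against residual.
residual_plot = plot_series(rows, "prediction", "residual")

for name, series in [("scatter", scatter), ("fitted", fitted), ("residual", residual_plot)]:
    print(name, series)
```

Each series can be passed to whatever charting tool is in use. In the residual plot, points scattered evenly around zero suggest the chosen degree fits well, while a visible trend suggests the degree may be too low.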