I have extracted data from https://asuntojen.hintatiedot.fi/ about flat prices in Espoo. Can we predict a flat price based on apartment size ?
The data consists of 1019 flat prices and apartment sizes. The idea is to check is there a correlation between flat price based on apartment size. How much should a flat of 400 square meters cost ? The largest flat in the data sample has size of 288 sqm. How much do the prices vary ?
First, a scatter graph was created. Independent variable, apartment size, was set in X axis and dependent variable flat price in Y axis.

We see visually some trends, the bigger flats get, the wider is also deviation.
The correlation is roughly 0,71. That can be calculated with CORRELATION function. That function includes only two arguments, X values and Y values.
To calculate the trend, we need SLOPE and INTERCEPT functions.
Y = SLOPE * X + INTERCEPT
SLOPE calculates how steep the regression line is and INTERCEPT where Y axis and regression line meet.
Both the formulas have two arguments: values for Y axis and values for X axis.

Another way to have the SLOPE and INTERCEPT values, is to use LINEST function.

SLOPE and INTERCEPT are in the first row. 0,50 is the R squared value. The LINEST function returns several values, but SLOPE, INTERCEPT and R squared are useful for us.

Another way to define the trend line and R squared, it to have right mouse click on top of the scatter graph. The select add trendline.

Scroll bit down to see the tick boxes display equation and R squared value. Then the values are visible on the graph.
Based on the equation, the price of the flat with 400 sqm would be appr. 1,3 M€.


When you format trendline, you can add forward value. 400 sqm flat would cost less than 1,4 M€.
One additional topic is standard deviation. We can see from the graph flats with 50 sqm have less deviation than flats with 147 sqm.
=STDEV.P(IF(ROUND(B2:B1020;0)=E21;C2:C1020))
This is the formula, which counts standard deviation for the flats with specific size. The flat sizes are presented with precision of 0,5 sqm. Therefore, the flat sizes are rounded to full square meter. In E21 cell we have the flat size 50 sqm and in E22 147 sqm.

For a flat with 50 sqm, the standard deviation is appr. 56,5 k€.

But with flats of 147 sqm, the standard deviation is nearly 280 k€.

When we count the average price of a flat of 50, that is around 200k€ and for 147 sqm flat around 664 k€. The standard deviations for flats are significant. Therefore, estimate for a flat of 400 sqm, includes variation.
If a flat of 400 sqm costs roughly 1,3 M€, a standard deviation could be around 400 k€.













































































































































