Simple deviation graph

You have a sample data about persons. The data includes also birth year. We should investigate what can we find out about age distribution.  The data is imaginative, and the data can be found at the end of this blog.

We need to group the data like how many persons are born in which decade and create a graph. How many persons are born in 1940s and how many in 1950s and so on.

Count all the persons whose birth year is 1940-1949 to get the number for 1940 bucket. We need to count how many persons are born in 1940 or later. Then we minus how many persons are born in 1950 or later. The result is the persons born in 1940s.

For example, there are 85 persons persons born in 1950 or later and 67 persons born in 1960 or later. Therefore, we have 18 persons born in 1950s.

For the first the value we need to count all the values minus the values higher or equal than 1940.

The COUNTIF function holds arguments range and criteria. First we define range like C3:C104 where the values are and then the criteria when the value in range is equal or greater than the value in the cell F5. To define equal or greater than F5, we use the phrase “>=”&(F5). COUNT simply counts how many times the criteria is fulfilled.

To create a graph, activate the range F3:G14. Leave the cell F3 empty but populate the G3.

Then select insert and choose graph.

The graph:

Now we have visualized the data.

We can see, that the first peak is from those who were born in 1940s and 1950s. Second peak is from 1970s  and 1980s. The last peak is from 2010s. It looks like the sample includes three generations. The first generation is from 40s and 50s, their descendants were born in 70s and 80s. The third generation was typically born in 2010s. The clearest and most obvious peak is in 2010s. The second clearest is the first generation. We have only few observations from 1990s and 2000s.

The sample data:

1971
2019
1979
1970
1951
1945
1966
1950
1944
1943
1953
1950
1981
1989
1978
1976
2010
2012
1968
2019
1951
1970
1949
1943
1949
2016
1961
1974
1963
1949
1947
1970
1986
1976
2013
1988
1957
2022
2015
1980
1984
2010
1951
1950
1994
1989
1952
1946
1983
1959
2012
1955
2021
1930
2018
1963
1954
1974
1980
1974
1973
1985
1941
2013
1982
2019
2020
1985
1967
1968
1999
2004
1947
1953
1948
1945
2019
2005
1951
1945
1951
1983
1987
2008
1999
1966
1974
2004
1969
1992
2001
1948
1967
1967
1988
2011
2016
1948
1953
1953
1959
2019

VLOOKUP and TRIM

You have a simple VLOOKUP, but somehow it does not work. Even though VLOOKUP is correctly written. This might depend on several things.

VLOOKUP is looking for qw34 value in the range (B3:C5). Then one column right from qw34 is selected. The expected value is 1, but the VLOOKUP does not return that value.

As said, this might depend on several things. In our case, there are invisible spaces after qw34 in the cell of B3. This can be found with LEN function. LEN return the number of characters in a string, invisible spaces are counted and increasing the length.

The VLOOKUP does not return 1 as the value in B3 is “qw34  ” with six characters and the value in F3 is “qw34” with four characters. VLOOKUP does not simply find “qw34” in B3:B5.

We can correct with TRIM function, which is removing spaces.

To do this easiest way, we can execute Visual Basic code to manipulate the cells with TRIM.

Sub z_vlookup_trim()

Dim za, zn

Do

    za = ActiveCell.Value

    zn = Application.WorksheetFunction.Trim(za)

    ActiveCell.Value = zn

    ActiveCell.Offset(1, 0).Select

Loop Until ActiveCell = “”

End Sub

First, the variables za and zn are introduced. The variable za is receiving the value from the active cell. If you execute the macro when the cell B3 is activated, then za gets the value “qw34  ”. The variable zn is za manipulated with Excel function TRIM. The value for zn is “qw34”. The VBA proceeds downwards till the active cell is empty.

Place the cursor in B3 and execute the macro.

After the macro has been executed, the TRIM has trimmed the B3:B5 area from spaces. VLOOKUP in G3 finds the value one. One benefit using a macro is that we don’t have to add any new values in a worksheet.

The expression Application.WorksheetFunction. is a useful way to call an Excel function inside VBA macro.