I have blogged how to count values in the data range, like how many times letter “A” appears in the data range. Let’s take such a large data range that it does not fit Excel workbook.

I have a text file with good 3M rows of values. Each line holds value either A, B, C, or D.
How many times does the letter “A” exists in the file ? We cannot count with normal Excel COUNT. As the data does not fit normal Excel workbook.

3M rows were loaded into Excel data model.

We use COUNTA function in DAX.
The formula is: =COUNTA(C3M[Val]) .
C3M is the table and Val field.

Pivot settings.

The results per letter.
Next step is to count percentual division between four letters. Therefore, we need to count the total number of rows.

The formula is: =calculate(COUNT(C3M[Val]),ALL(C3M))
The CALCULATE formula calculates the total number of rows in the data range.

To have the percentage distribution, we need to just divide count of each letter by total number of rows. To have the percentage value, the format is percentage.
The formula is: =divide([Count],[Count_all]) .
We divide one measure with another.

The letter A counts for a half, other letter divide the remaining half.
Count_all measure is counting with CALCULATE the total number of records. The expression for CALCULATE is table C3M and field Val, the filter is ALL as we are counting all the records in the table.
One option is again to use MS Access, as Access can take 3M rows of data.

The table has two fields running number and the letters in VAL field.

A simple SQL query, just remember the square brackets around the table.

The results in absolute number.

Here is the SQL query to count percentage values. How many percents of all the letters is letter “A” ?
SELECT C3M.Val, ROUND(Count(Val)/(SELECT COUNT(*) FROM [C3M]), 2) AS Expr1
FROM C3M
GROUP BY C3M.Val;
Now the square brackets were not needed with FROM command. To have percent values, we need to take the total number of letter “A” rows with Count(Val). That is divided with all the records in the table (SELECT COUNT(*) FROM [C3M]). In order to decrease the number of decimals, ROUND function is framing the division calculation.

Every second letter in the data range is “A”. Remaining letters B, C, D are dividing evenly remaining half.
SQL is powerful tool to calculate numeric values out of a large data. When using DAX we needed few formulas, but with SQL we needed only few lines. In case you want to present distribution of the four letters, you can easily create a graph like this in Excel.







