XML data

Sometime data you want to investigate is in XML-format. If you download data from a system and data happens to be in XML format. This blog demonstrates how you can analyze data with Excel and Access. The example data is taken from page https://www.w3schools.com/xml/cd_catalog.xml .  Thanks for organizing and providing the data.

Here is a part of the data:

<CATALOG>

<CD>

<TITLE>Empire Burlesque</TITLE>

<ARTIST>Bob Dylan</ARTIST>

<COUNTRY>USA</COUNTRY>

<COMPANY>Columbia</COMPANY>

<PRICE>10.90</PRICE>

<YEAR>1985</YEAR>

</CD>

<CD>

<TITLE>Hide your heart</TITLE>

<ARTIST>Bonnie Tyler</ARTIST>

<COUNTRY>UK</COUNTRY>

<COMPANY>CBS Records</COMPANY>

<PRICE>9.90</PRICE>

<YEAR>1988</YEAR>

</CD>

<CD>

<TITLE>Greatest Hits</TITLE>

<ARTIST>Dolly Parton</ARTIST>

<COUNTRY>USA</COUNTRY>

<COMPANY>RCA</COMPANY>

<PRICE>9.90</PRICE>

<YEAR>1982</YEAR>

</CD>

<CD>

<TITLE>Still got the blues</TITLE>

<ARTIST>Gary Moore</ARTIST>

<COUNTRY>UK</COUNTRY>

<COMPANY>Virgin records</COMPANY>

<PRICE>10.20</PRICE>

<YEAR>1990</YEAR>

</CD>

XML is starting with tag <CATALOG> and end with </CATALOG> which is not visible here. Each CD starts with tag <CD> and ends with </CD>. Data fields are title, artist, country, company, price, and year. For example, title starts with tag <TITLE> and ends with </TITLE>. In each line there can be only one start and end tag or just one of those.

Copy and paste the sample data into Notepad++. Save with extension XML.

Data looks like this. By the way, if you don’t yet use Notepad++, I recommend that, as that holds more features than normal Notepad.

Vertical bar on left side helps to characterize each CD.

One way is just to take file – open and browse XML-file.

Excel asks you how you would like to open the file.

As an XML table.

Press ok.

Data is just file.

You can also calculate from table like value of the collection is 237 units and average year is 1988.

If you open the file with second radio button.

Read-only workbook.

Read only table.

If you select source task pane, the third option.

Press ok.

Again press ok.

At the right hand side, you will have XML catalog.

Drag and drop fields from the XML source to worksheet.

Select Developer – XML – refresh data.

Data is brought to worksheet.

One way to handle XML-data with Excel, is to make a data model and use XML file as data source.

Select data – get data – from file – from XML.

Data is organized correctly.

Press transform data.

Data type for price was changed for decimal number and year as whole number.

Just create a pivot table based on the XML file.

The total price of CD collection can be counted with a simple DAX.

Results.

Also the average year is counted.

Let’s see how XML file can be used with MS Access.

External data can be fed into Access via new data source – from file – XML file.

Before uploading the data, I changed the decimal separator to be dot, as price is with dot.

Data looks neat.

Another simple query.

The result is same as in Excel.

Query about average year in CD collection.

The result is same than in Excel.

If you happen to have data in XML-format, that can be used in Excel and in Access. XML format should not be overlooked with Excel or Access. You don’t need fancy tools to analyze XML-data.  The example I used was simple and neat. Real life XML-files can be more complex and cause issues which was not seen here.

Scenario Manager

Scenario manager functionality can be used to simulate and  compare different scenario for example the worst, the most probable and the best scenario.

Let’s use the data we had for the previous blog.

This is the most probable or normal scenario.

Volume                48

Price                     32

Var cost               18

Fix cost                700

The best scenario would be

Volume                55

Price                     33

Var cost               17

Fix cost                680

The worst scenario would be

Volume                46

Price                     30

Var cost               18,5

Fix cost                715

Of course, we can make three different cost volume profit analyses, but we can also use scenario manager functionality.

Select data – what if analysis – scenario manager

Press add.

Fill accordingly. Note, that in changing cells we are entering a range B2:B5. Separate cells could be entered, too.

This is the normal scenario. In the input box the cell addresses are visible. You need to map in your mind that volume is B2, price is B3 and so on.

Press again add to create the best scenario.

Fill the best scenario values.

Press ok.

Press add once more.

Create the worst scenario.

Fill the values for the worst scenario.

Select a scenario and double click it. When I double clicked the best scenario, the values were updated into B2:B5. Also the results were updated.

This way you can update the scenarios into Excel model.

If you press summary.

In the summary we are comparing the margin and the result. Press ok.

Excel created a Scenario Summary view. Here we can see that current values are the same as the best scenario. The values are identified by cells not by the real names in one column left.

The changing cells are

B2: Sales volume

B3: Sales price

B4: Variable cost per unit

B5: Fixed costs

Results:

F4: Margin

F6: Result

The best scenario is generating the best values as all the changing cells have the best values. The result is clearly positive. Unlike the worst scenario, which have all the lowest sales values and highest cost values. Therefore the result is clearly negative.

With scenario manager you can compare how different scenarios affect your calculations. First, you need to key in the scenarios. After that you switch from scenario to another by clicking each scenario and see how the different scenarios affect your calculations.