WeeklyAlteryxTips #12: Get an overview of CSV files using the Browse tool and the Auto Field tool

This post shows how to get a quick overview of CSV files when you receive them.

Suppose you have a CSV file named “receipt.csv”. Designer loads the fields of a CSV file as V_WString with a size of 254.

So let’s look at this file using the Browse tool.

The Browse tool lets you view the details of the data by clicking a field in the Results window or a field name in the Configuration window. The following screenshot shows where you can click in the Configuration window to see the details of the data.

Let’s look at the details of “amount” (it is not in the screenshot above).

The screenshot above shows that the OK count is 104,681, the unique value count is 488, and there is no data flagged as not OK. It also shows that the statistics are Length Statistics. That is to say, the statistics are calculated as if the data were text. “amount” is sales data, so it should be numeric. In other words, you need to change it to the appropriate data type to see meaningful statistics.
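To make the difference concrete, here is a minimal Python sketch (not Alteryx code) that summarizes the same column twice: once as text, where only string lengths can be measured, and once as numbers. The function names and the sample values are hypothetical and are not taken from “receipt.csv”.

```python
from statistics import mean


def length_statistics(values):
    """Summarize a column as text: only the string lengths are measured."""
    lengths = [len(v) for v in values]
    return {
        "shortest": min(lengths),
        "longest": max(lengths),
        "average_length": mean(lengths),
    }


def value_statistics(values):
    """Summarize the same column after converting each value to a number."""
    numbers = [int(v) for v in values]
    return {"min": min(numbers), "max": max(numbers), "average": mean(numbers)}


# Hypothetical sample values standing in for the "amount" column.
amount = ["158", "81", "1200", "81"]

print(length_statistics(amount))  # tells you about text length only
print(value_statistics(amount))   # tells you about the sales amounts
```

The text summary says nothing about how large the sales amounts are, which is why the data type has to be fixed first.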

But you may not know the details of data that other people hand to you. In that case, you can use the Auto Field tool.

The workflow is as follows.

The Auto Field tool changes each field to an appropriate data type. For string types, however, it shrinks the size to the minimum that can hold the existing text. Sometimes this can cause trouble. For example, if you append a little text to the values in such a column using the Formula tool, it will not work as expected, because the column is too small to hold the longer text.
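The size issue can be illustrated with a small Python sketch. It is only a model, not Alteryx code: it assumes that a fixed-size string field drops any characters beyond its size. The helper names and the sample values are hypothetical.

```python
def minimal_string_size(values):
    """Smallest field size that still holds every existing value."""
    return max(len(v) for v in values)


def write_to_field(value, size):
    """Model of a fixed-size string field: characters past `size` are dropped.

    This truncation behavior is an assumption made for illustration.
    """
    return value[:size]


# Hypothetical text column.
store_names = ["Tokyo", "Osaka", "Nagoya"]

size = minimal_string_size(store_names)  # size fits the longest existing value
suffix = " store"

# Appending text without widening the field loses the added characters.
too_small = [write_to_field(name + suffix, size) for name in store_names]

# Widening the field first keeps the full text.
wide_enough = [write_to_field(name + suffix, size + len(suffix)) for name in store_names]

print(too_small)
print(wide_enough)
```

In this model, the fix is to increase the field size before appending text.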

Back to the main topic. After using the Auto Field tool, the metadata of this data is as follows.

To see more details, I click the column name “amount” in the Configuration window.

You can now find “Value Statistics” in that window, so you can see the min, max, average and so on. You can also see the histogram, which makes it easy to check the distribution of the data.
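As a rough idea of what a histogram of a numeric column summarizes, here is a minimal Python sketch that counts values per fixed-width bin. It is not how the Browse tool is implemented. The function name, the bin width, and the sample values are all hypothetical.

```python
from collections import Counter


def histogram(numbers, bin_width):
    """Count how many values fall into each bin of width `bin_width`.

    Each key is the lower bound of a bin; empty bins are omitted.
    """
    counts = Counter((n // bin_width) * bin_width for n in numbers)
    return dict(sorted(counts.items()))


# Hypothetical numeric values standing in for the "amount" column.
amount = [81, 158, 120, 1200, 95, 310]

for lower_bound, count in histogram(amount, 100).items():
    print(f"{lower_bound:>5} to {lower_bound + 99:>5}: {'#' * count}")
```

A value far from the others, such as 1200 here, stands out immediately, which is the kind of check the histogram makes easy.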

Summary

  • You can quickly get an overview of the data using the Auto Field tool and the Browse tool.
  • Note that the Auto Field tool consumes a lot of CPU resources, so I recommend that you don’t use it in production workflows. Instead, you can use the “Save Field Configuration” and “Load Field Name and Types” options to reuse the field types it found.
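The idea behind the second point can be sketched in Python: detect the types once, save the result, and reapply the saved types on later runs instead of scanning all the data again. This is only an analogy for saving and loading a field configuration, not Alteryx code. The function names, the JSON format, and the sample data are hypothetical.

```python
import json

CASTS = {"int": int, "float": float, "str": str}


def infer_type(values):
    """Expensive step: scan every value to find the narrowest type that fits."""
    for name in ("int", "float"):
        try:
            for v in values:
                CASTS[name](v)
            return name
        except ValueError:
            continue
    return "str"


def infer_schema(columns):
    """Infer a type name for every column."""
    return {name: infer_type(values) for name, values in columns.items()}


def apply_schema(columns, schema):
    """Cheap step: convert columns using types that were decided earlier."""
    return {
        name: [CASTS[schema[name]](v) for v in values]
        for name, values in columns.items()
    }


# Hypothetical data in which every field was loaded as text.
columns = {"amount": ["158", "81"], "store": ["Tokyo", "Osaka"]}

saved = json.dumps(infer_schema(columns))         # done once, during development
typed = apply_schema(columns, json.loads(saved))  # done on every later run
print(saved)
print(typed)
```

The trade-off is that the saved types are only valid while the incoming files keep the same layout.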

Next post

The next post will be about Data Investigation.
