WeeklyAlteryxTips#1 Multiple file read #1


I would like to start posting English versions of the weekly Alteryx tips that I publish in Japanese.

The first article is about loading multiple files with a single Input Data tool.

Many tasks involve reading a large number of local files. If the files share the same schema (field names and data types), loading them together is very easy, but if the schemas differ, the same approach does not work. In this article, I will show you how to load multiple files with the same schema.

Read multiple files with the same schema

If the files have the same schema, you can load them all at once by using the “*” wildcard in the file name setting of the Input Data tool.

For example, suppose a folder contains the following files.

  • Book1.xlsx
  • Book2.xlsx

In this case, you can simply change the file name in the Input Data tool from “Book1.xlsx” to “Book*.xlsx”. In other words, keep the part of the file name that is common to all files and replace the part that varies with “*” (the wildcard). The tool then loads every file whose name matches the pattern.
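To make the matching rule concrete, here is a minimal sketch in Python of which names a pattern like “Book*.xlsx” selects. It uses Python's own `fnmatch` module, not Designer's matcher, so details such as case sensitivity may differ. The names “Summary.xlsx” and “Book1.csv” are hypothetical and added only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical folder contents; only the first two names come from the article.
file_names = ["Book1.xlsx", "Book2.xlsx", "Summary.xlsx", "Book1.csv"]

# Keep the common part of the name and replace the varying part with "*".
pattern = "Book*.xlsx"

# fnmatchcase does a case-sensitive comparison of each name with the pattern.
matched = [name for name in file_names if fnmatchcase(name, pattern)]
print(matched)  # ['Book1.xlsx', 'Book2.xlsx']
```

Only the names that start with “Book” and end with “.xlsx” are selected; everything else in the folder is ignored.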

Note 1: For Excel files

For Excel files, the sheet name is fixed in the file specification, so this method can only batch-load sheets that have the same name in every workbook. You have to use a different method if you want to batch-load sheets with different names, whether they are in the same workbook or in different workbooks.
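Before relying on the wildcard for Excel files, it can help to confirm that every workbook really contains the sheet you plan to read. The following is a rough sketch in Python, outside Designer, that reads sheet names directly from the xlsx package (an xlsx file is a zip archive whose `xl/workbook.xml` lists the sheets). The function names are hypothetical, and the namespace below assumes the common “transitional” xlsx format; workbooks saved as Strict Open XML use a different namespace and would need adjusting.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by xl/workbook.xml in ordinary (transitional) xlsx files.
NS = {"main": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

def sheet_names(xlsx_path):
    """Return the sheet names listed in the workbook, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [s.get("name") for s in root.findall("main:sheets/main:sheet", NS)]

def workbooks_missing_sheet(paths, sheet):
    """Return the paths of workbooks that do not contain the given sheet name."""
    return [str(p) for p in paths if sheet not in sheet_names(p)]
```

For example, `workbooks_missing_sheet(["Book1.xlsx", "Book2.xlsx"], "Sheet1")` would return the workbooks that a wildcard read of “Sheet1” could not load as expected (the sheet name here is only an example).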

Note 2: For CSV files, be careful of field swapping

Because Designer reads every field in a CSV file as a String type, the only schema that effectively exists for a CSV file is the number of fields. Therefore, if the field positions are swapped between files, the files are still read together as they are.

If you read files like the ones above using the “*” wildcard, you will get the following.

If such files exist, you must read them as files with different schemas. How to load files with different schemas is a topic for another time (in that case, you have to use a batch macro).

Recent versions of Designer issue clear warnings in this situation, so be sure to check the Results window carefully.

Excel files (xlsx) behave in the same way. For yxdb files, on the other hand, field names are checked strictly: a file whose field names differ is treated as having a different schema and is skipped.
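As an additional check outside Designer, you can compare the header rows of the CSV files before loading them with a wildcard. The sketch below is only an illustration in Python: the function names are hypothetical, and it assumes comma-delimited files with a header in the first row, encoded as UTF-8 (with or without a BOM).

```python
import csv
from pathlib import Path

def read_header(path):
    """Return the first row of a CSV file as a list of field names."""
    # "utf-8-sig" is an assumption; it also strips a leading BOM if present.
    with open(path, newline="", encoding="utf-8-sig") as f:
        return next(csv.reader(f), [])

def find_header_mismatches(folder, pattern="*.csv"):
    """Return names of files whose header differs from the first file's header.

    The comparison is order-sensitive, so swapped fields count as a mismatch.
    """
    paths = sorted(Path(folder).glob(pattern))
    if not paths:
        return []
    reference = read_header(paths[0])
    return [p.name for p in paths[1:] if read_header(p) != reference]
```

Any file name returned by `find_header_mismatches` has a different field order, field count, or field names from the first file and should not be loaded with the same wildcard.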

Note 3: For spatial files

For spatial files (shapefiles, etc.), files that use different geodetic or coordinate systems must also be loaded as files with different schemas. For example, if files in different plane rectangular coordinate systems are loaded as the same schema, the resulting latitude and longitude will be wrong.
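For shapefiles, one rough way to spot mixed coordinate systems before a wildcard load is to compare the `.prj` sidecar files, which hold the coordinate system definition as text. The Python sketch below groups `.shp` files by the whitespace-normalized text of their `.prj` file. The function name is hypothetical, and the comparison is purely textual, so two files that describe the same coordinate system with different wording would land in different groups.

```python
from collections import defaultdict
from pathlib import Path

def group_shapefiles_by_prj(folder):
    """Group .shp file names by the normalized text of their .prj sidecar."""
    groups = defaultdict(list)
    for shp in sorted(Path(folder).glob("*.shp")):
        prj = shp.with_suffix(".prj")
        if prj.exists():
            text = prj.read_text(encoding="utf-8", errors="replace")
            key = " ".join(text.split())  # collapse whitespace and newlines
        else:
            key = "<no .prj>"  # coordinate system unknown
        groups[key].append(shp.name)
    return dict(groups)
```

If the result contains more than one group, the files in different groups should be treated as having different schemas rather than loaded with one wildcard.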

Sample workflow download

Contents of the next blog

The next post will be the second part of multiple file loading (how to load files with different schemas).
