Recently I wrote post about XE Profiler – the new feature in SQL Server Management Studio 17.3. You can find it here. However it is not only feature added to this version of SSMS. Microsoft has also added the new Import Flat File wizard to this version. It is a response to many complaints about an inconvenient Flat File import process received from many SSMS users.
New Import Flat File wizard supposes to make a flat file import easier and more comfortable.
What a Flat File is?
Flat File is a plain text file with no internal hierarchy. It’s often use to store tabular data because its creation is simple. Two most popular types of Flat Files are:
- Delimited (such as comma- and tab-separated)
Id, Car, Price
1, Ford, 1500
2, Chevrolet, 2000
- Fixed width (when values in one column have the same length in each row)
Id Car Price
1 Ford 1500
2 Chevrolet 2000
One of the most well-known and often-used types of delimited flat files is a comma-separated values (CSV) file.
Basic CSV rules (from wikipedia):
- Adjacent fields must be separated by a single comma. However, CSV formats vary greatly in this choice of separator character. In particular, in locales where the comma is used as a decimal separator, semicolon, TAB, or other characters are used instead.
- Any field may be quoted (that is, enclosed within double-quote characters). Some fields must be quoted, as specified in following rules.
- Fields with embedded commas or double-quote characters must be quoted.
- Each of the embedded double-quote characters must be represented by a pair of double-quote characters.
- Fields with embedded line breaks must be quoted (however, many CSV implementations do not support embedded line breaks).
- The first record may be a “header”, which contains column names in each of the fields
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
Import and Export Data wizard
In older versions of SQL Server Management Studio we have Import and Export Data wizard available. It works pretty well when importing data between tables or databases, but it’s not so comfortable in use when importing data from flat files.
This wizard requires detailed information from the user about source file format. We have to specify a delimiter, a separator, a text qualifier, a code page and few other things. As one of the next steps we have to review whether the wizard correctly recognized column’s datatypes and their lengths. It’s a very cumbersome process which often ends with switching between the next and the previous wizard’s tabs to correct any errors and to finally import data successfully.
This is how above example of CSV file looks after import by this wizard:
Import Flat File wizard
In SQL Server Management Studio 17.3 Microsoft gave users new functionality: Import Flat File wizard.
This wizard is much simpler in use.
- Introduction: Welcome page describes shortly how to use this Import Flat File wizard. If you don’t want to see this page every time while using this process you can check “Do not show this page again” option.
- Specify input file: Use this page to choose source file and destination table name. Click browse button to search for *.txt and *.csv files. The new table should be unique, if not then the wizard will not let you to continue.
- Preview Data: Here you can take a look on your data preview. It will show you up to first 50 rows.
- Modify Columns: On this page you can correct column names, data types and additional attributes, recognized and set by wizard.
- Summary: This page presents a summary of the import process.
- Results: On this last page users receive information whether import was successful.
That’s it! Only 5 steps and file is imported. As you probably already noticed there is one huge difference between this import process and the previous one. This time we didn’t have to describe file format. That is because of a completely new import framework called PROSE. Microsoft Program Synthesis using Examples SDK can analyze text data, search for patterns and “learn” structure of the file. This is how it discovers column names, datatypes, delimiters and other properties.
Let’s take a look on import results.
As you can see it didn’t work well for quoted fields with embedded double-quote characters. On the other hand it recognized NULL value correctly.
OK, we already know the framework recognizes NULL values and can parse comma separated values. What about other possible formats?
Semicolon separated values
1997;Ford;E350;"ac, abs, moon";3000.00
1999;Chevy;"Venture ""Extended Edition""";"";4900.00
1999;Chevy;"Venture ""Extended Edition, Very Large""";;5000.00
1996;Jeep;Grand Cherokee;"MUST SELL!
air, moon roof, loaded";4799.00
No surprises – It works in the exactly same way.
Fixed width file
Year Make Model Description Price
1997 Ford E350 ac, abs, moon 3000.00
1999 Chevy Venture "Extended Ed." 4900.00
1999 Chevy Venture "Extended Ed., Large" 5000.00
Nope. I didn’t work.
1997,Ford,E350,"ac, abs, moon",3000.00,TRUE
1999,Chevy,"Venture ""Extended Edition""","",4900.00,FALSE
1999,Chevy,"Venture ""Extended Edition, Very Large""",,5000.00,TRUE
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00,FALSE
TRUE and FALSE values are recognized and correctly mapped to bit values.
Report File (*.rpt)
What about files with query results? Maybe this is the perfect solution to load this data back to the database?
Unfortunately such file format is not supported.
Not supported file format
What happens if we will provide unsupported file format like JSON or SQL Server Error Log file (after writing my last week’s blog post I had to try this)? Such tries will end with below error.
New Import Flat File functionality can decrease time required to import simple flat files. Unfortunately for more complex ones it doesn’t work well yet. It recognizes well TRUE, FALSE and NULL values. However it cannot parse fixed-width files. It can handle quoted string values, but has issues when parsing quoted fields which contain double-quoted characters. If you find this annoying I encourage you to vote up my connect item: https://connect.microsoft.com/SQLServer/feedback/details/3144015