SSIS Dataflow Discoverer (DFLD)

What is Dataflow Discoverer?

Dataflow Discoverer (DFLD) is a command-line discovery utility that detects and writes the metadata of SSIS dataflow columns to a SQL Server lookup table. DFLD detects Dataflows nested within containers up to any level of nesting.

Why use Dataflow Discoverer?

SSIS dataflow column metadata generated by this utility could be used to enhance the error reporting capability of your SSIS package. For example to get the name of ErrorColumn from the Error Outputs which is not available by default in SSIS yet.

What are the minimum system requirements to use Dataflow Discoverer?

  1. .NET framework 4 or above
  2. SQL Server Database (any version)

How to install Dataflow Discoverer?

Verify that your system meets minimum requirements, then

  1. Execute the script on your SQL Server. By default the sql script will create a SQL Server database – ETLMetadata and a lookup table – DFLMetadata. You could also create the table DFLMetadata in any other SQL Server database, in that case just comment out the database creation segment of the sql script.

  2. Copy the executable to a local folder. It is recommended to add the executable path to System Path environment variable.

How to use Dataflow Discoverer?

From the command prompt, type:

dfld /param1:value1 /param2:value2 /param3:value3 … /paramN:valueN

List of Parameters:

Parameter Shorthand Mandatory Description
/package /p Y Fully qualified path of SSIS Package in File System. If the path contains spaces, enclose in double quotes.
/server /s Y IP Address/Name of Database Server hosting the DFLMetadata table. If Instance Name is applicable, then specify as Database Server/Instance Name.
/database /d Y Database Name hosting the DFLMetadata table. This is usually ETLMetadata, unless you have deployed the table DFLMetadata to a different database.
/user /u N Database User Name for SQL authentication. This parameter is not required for Windows authentication.
/pwd /pwd N Password of Database User required for SQL authentication. This parameter is not required for Windows authentication.

If any parameter contains space, then enclose in double quotes. From command prompt, type dfld /? for more help options.

Examples:

  • Database uses windows authentication

dfld /p:C:\users\eagle\populateDim.dtsx /s:localhost /d:ETLmetadata

  • Database uses SQL server authentication

dfld /p:C:\users\eagle\populateRef.dtsx /s:localhost /d:ETLmetadata /u:userxyz /pwd:XXXXXXX

  • Package Name contains space

dfld /p:”C:\users\eagle\populate fact.dtsx” /s:localhost /d:ETLmetadata

On successful completion, your console will display statistics similar to the one shown below

Screen Capture 1 – Dataflow Discoverer Console Output

What metadata is captured by Dataflow Discoverer?

 

Metadata Data Type Description
PackageID Varchar(38) Unique GUID of the package
PackageName Varchar(200) Recommended to keep package name under 200 characters
DataflowID Varchar(38) GUID of the Dataflow
DataflowName Varchar(200) Recommended to keep dataflow name under 200 characters
IOType Char(1) Flag to indicate whether the collection is Input or Output. I= Input Collection, O= Output Collection
IOName Varchar(200) Name of Input/output collection
IOID Int ID of Input/output collection
ColumnName Varchar(200) Recommended to keep Column name under 200 characters
ColumnDesc Varchar(500)  
LineageID Int It’s an integer value which identifies the column in the buffer
ColumnID Int Unique value that identifies a column within a Dataflow in the package.
ColumnDataType Varchar(30)  
ColumnPrecision Int  
ColumnScale Int  
UpstreamComponentName Varchar(200)  
SortKeyPosition Int  
ColumnCodePage Int  
MappedColumnID Int  
UsageType Int  
ColumnType Int  

 

How to get name of ErrorColumn from the Error Outputs in my SSIS Package?

From your SSIS Dataflow, lookup DFLMetadata table to fetch the name of ErrorColumn. Check-out this post which demonstrates how this is done https://bennyaustin.wordpress.com/2011/02/04/ssis-name-of-errorcolumn/

At what stage in my ETL development cycle should I use Dataflow Discoverer?

You can generate the dataflow column metadata using DFLD at any stage in your SSIS package development cycle. It is recommended to refresh your dataflow column metadata just before deployment when your package is ready to go.

How to capture/refresh Dataflow column metadata from multiple packages using Dataflow Discoverer?

It’s a typical ops requirement to keep the DFLMetadata up to date. This can be automated by creating a SSIS package that scans your deployment folder for package (* .dtsx) files. For each package found, execute the DFLD using an Execute Process Task in the control flow. Check-out this post which demonstrates how this is done https://bennyaustin.wordpress.com/2011/02/05/refresh-dataflow-column-metadata/

Where could I find the documentation for the source code of Dataflow Discoverer?

Source code is available for download from http://dfld.codeplex.com/releases/view/60334. The source code is adequately documented to be self-explanatory.

Benny Austin

4 comments

  1. I am getting following error when I used Data Flow discoverer, can you please help?

    ERROR: Connection string parameters of DFLMetadata table is invalid

    Thanks
    Raghib

    • There might be an issue with the connection parameters you are passing to dfld. You might want to refer to “How to use Dataflow Discoverer?” section of the post for instructions and examples.

  2. Hi Benny,
    Had a quick look at your Data Flow Discovery and from the initial review it does not capture the driver details involved in the DFT. Is this correct? From my particular usage that would be of interest. The situation is we want to identify any of our packages that have MS Attunity drivers and change them.

    Thanks
    Alan

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s