DFLD: How to capture/refresh Dataflow column metadata from multiple packages using Dataflow Discoverer?

Dataflow Discoverer (DFLD) is a command-line discovery utility that detects and writes the metadata of SSIS dataflow columns to a SQL Server lookup table. DFLD detects Dataflows nested within containers up to any level of nesting. For information on how to install and use the DFLD utility, visit http://dfld.codeplex.com

DFLD works on one package at a time. It’s a typical ops requirement to keep the DFLMetadata of all SSIS packages up to date. This can be automated by creating an SSIS package similar to the one shown below.

Screen Capture 1 - Package to automate refresh of DFLMetadata

Use a Foreach Loop container to fetch all your SSIS packages (*.dtsx) from the deployment folders, including sub-folders. Capture the path of each SSIS package in a local variable as shown in Screen Capture 2.

Screen Capture 2 - For Each Container configuration
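The enumeration performed by the Foreach Loop container can be sketched outside SSIS as well, for instance when driving DFLD from a script. The sketch below is a minimal Python equivalent of a Foreach File enumerator with "Traverse subfolders" enabled; the deployment folder path is a hypothetical example, not from the original post.

```python
from pathlib import Path

def find_packages(deploy_root):
    """Recursively enumerate SSIS packages (*.dtsx) under deploy_root,
    mirroring a Foreach File enumerator with 'Traverse subfolders' on."""
    return sorted(str(p) for p in Path(deploy_root).rglob("*.dtsx"))

# Hypothetical deployment folder:
# for pkg in find_packages(r"C:\Deploy\SSIS"):
#     print(pkg)
```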

For each package found, execute DFLD using an Execute Process Task in the control flow with parameterized command-line arguments, as shown in Screen Capture 3.

Screen Capture 3 - Execute Process Task Configuration
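For illustration, the per-package invocation that the Execute Process Task parameterizes can be sketched as a small script. Note the `/p`, `/s`, and `/d` switch names below are assumptions modeled on the example in the comment thread, and the server/database names are hypothetical; confirm the actual switches against the DFLD documentation before use.

```python
import subprocess
from pathlib import Path

def build_dfld_command(package_path, server, database, dfld_exe="DFLD.exe"):
    """Build a DFLD command line for one package. The /p, /s, /d switch
    names are assumptions based on the comment thread, not confirmed
    against the DFLD documentation."""
    return [dfld_exe,
            f"/p:{package_path}",
            f"/s:{server}",
            f"/d:{database}"]

def refresh_metadata(deploy_root, server, database, dry_run=True):
    """Invoke DFLD for every *.dtsx package under deploy_root.
    With dry_run=True the commands are only printed, not executed."""
    for pkg in sorted(Path(deploy_root).rglob("*.dtsx")):
        cmd = build_dfld_command(str(pkg), server, database)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```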

This automation package can be scheduled to run periodically, or as a step after every deployment.

Benny Austin


  1. Hi there,
    This is a brilliant post, and I implemented it today.
    I have a quick question: I want to call the packages from Integration Services instead of from the file system… is that possible?
    I mean like this

    dfld /p:server\packagepath.dtsx /s:wdc-Sitsql002\datasource /d:ETLmetadata

    as I don’t want to call them from the file system.

    Please let me know if you have any solution

    • Sai, thanks for your feedback. DFLD was written prior to SQL Server 2012, when msdb deployment was not common, so DFLD is file-based at the moment. Future versions of DFLD could incorporate this feature. In the meantime, if you come up with a workaround, please feel free to post it here.

  2. Great tool. It gets me close to populating a source-to-target metadata tool. I’m having a bit of difficulty mapping input and destination columns, though. I’m trying to figure out what ID maps the two in the package file. Or is it just their order in the package file? Let me know if you have any suggestions.

    • Hi Joel, thanks for your comments. The information about input-to-output mapping is specific to the type of Dataflow component and is available as Custom Properties of that component. For example, the Derived Column component has information about the formula that makes up the new column, and the Pivot component has information about the source column and the pivot key. At the moment DFLD captures Common and Data Type properties. I am considering enhancing DFLD to include Custom Properties as well. I’ll keep you posted. Meanwhile, the UpstreamComponentName might be of some help to trace where a column originated.
