Unit Testing Template For Etl Tool

Active 11 months ago

The first part of unit testing is matching the developed package against the requirements. Below are some tips that can help make unit testing of SSIS packages thorough, broad, and efficient. I followed this approach to testing SSIS packages while working with SSIS 2005 (SQL Server 2005).

  • How do you unit test or use TDD methods for ETLs and reporting projects? If you're testing a configuration for an ETL tool, you don't need to re-create the logic in the ETL tool to do that; just use the tool. The risk in using only unit tests for ETL is that they won't cover the integrations.

This question mentions two libraries, neither of which is maintained, and one has broken links to its source and documentation.


SSISUnit was last updated in 2008, and SSISTester has broken links in its documentation and hasn't been updated since 2013.

The answers on social.msdn.microsoft.com also generally point to one of those two libraries, or some sort of custom solution.

Are there any other options?

Are there any updates related to newer versions of SSIS (2015+)?

I have already checked similar questions:

mattrowsboats

1 Answer

The most basic way to perform SSIS unit testing is to create your own testing package.
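
A testing package of that kind is not shown here. As a different and much cruder starting point, one could drive the package under test from a script and check how it exits. The sketch below is an assumption, not the approach from this answer: it shells out to the `dtexec` utility with `/F` to load a package from a file and treats exit code 0 as success. The package path and the `run_package` helper are hypothetical.

```python
import subprocess


def run_package(dtsx_path, runner=subprocess.run):
    """Run an SSIS package with dtexec and report whether it succeeded.

    `runner` is injectable so the function can be exercised without SSIS installed.
    """
    # /F tells dtexec to load the package from a file on disk.
    result = runner(["dtexec", "/F", dtsx_path], capture_output=True, text=True)
    # dtexec returns exit code 0 only when the package executed successfully.
    return result.returncode == 0


# Usage on a machine with SSIS installed (hypothetical path):
#     assert run_package(r"C:\etl\LoadCustomers.dtsx")
```

This only tells you that the package ran to completion; it says nothing about whether the data it produced is right.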


The most popular tools for SSIS unit testing are the ones you listed:

  • SSISUnit
  • SSISTester

After a deeper search, I found another option: BizUnit. The BizUnit framework, which is predominantly used for Biz Unit testing, can be customized to test SSIS packages as well. More info in the link below

Also, if by testing you mean package validation (metadata, connections, etc.), you can follow my answer in this SO question

Hadi

Active 2 years, 10 months ago

I know that several small companies do not test their ETL processes, but that seems suboptimal from a software engineering perspective.

How do people usually test (unit tests, functional tests) an ETL process?

Thanks a lot.

Hello lad

3 Answers


Testing an ETL is usually a problem. More precisely, testing isn't the problem; the problem is how to get reasonable test data. ETL is typically tested on production data. Aside from the security issue, the problem with production data is that it does not cover the functionality of the ETL sufficiently (typically about 40% of business rules aren't covered by a production data sample) and it takes too much time to process.

Recently we developed a test data generator (for more detail, please look for GTL QAceGen: Business Logic Driven Data Generator on Informatica Market Place) which generates test data into source tables/files based on a business rule specification. The tool takes into consideration any foreign keys applied, and it works with any major ETL tool and/or database.

This tool helps to speed up the testing cycle by at least 50% (compared to manual testing) and covers 100% of all business rules. It also generates quite detailed reports and, more importantly, these tests can be repeated at any time (i.e. regression tests).

Pavel Kochan

We recently worked on a project where the governance board demanded 'You must have Unit Tests' and so we tried our best.

What worked for us was having each ETL solution start and end with a QA/Test package.

Anything unexpected discovered by these packages was logged into an audit table, and a Fail Package event was then raised to stop the entire job. We figured it was better to run with yesterday's good data than risk reporting against possibly bad 'today' data.

The starting package would do db schema and data sanity checks. Data sanity involved checking for duplicate or missing data caused by a lack of referential integrity in the source systems. Schema checks ensured that any schema changes that did not get applied during continuous integration were detected.
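
To make the starting package more concrete, here is a minimal sketch of the same idea in Python with SQLite rather than SSIS. Everything named here (`etl_audit`, `src_customer`, `FailPackage`, `run_start_checks`) is made up for illustration and is not taken from our packages. It runs one schema check and one duplicate-key check, writes an audit row for each, and raises to stop the job if either fails.

```python
import sqlite3


class FailPackage(Exception):
    """Raised to stop the job when a pre-load check fails."""


def create_audit_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS etl_audit "
        "(check_name TEXT, passed INTEGER, detail TEXT)"
    )


def audit(conn, check_name, passed, detail):
    # Every check writes a row, pass or fail, so the audit table lists all checks run.
    conn.execute(
        "INSERT INTO etl_audit (check_name, passed, detail) VALUES (?, ?, ?)",
        (check_name, int(passed), detail),
    )


def run_start_checks(conn, table, expected_columns, key_column):
    # Schema check: compare the actual column names with the expected ones.
    # PRAGMA table_info is SQLite-specific; other databases expose catalog views.
    actual = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    schema_ok = actual == list(expected_columns)
    audit(conn, f"{table}: schema", schema_ok, f"columns={actual}")

    # Data sanity check: count key values that occur more than once.
    dupes = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT {key_column} FROM {table} "
        f"GROUP BY {key_column} HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    audit(conn, f"{table}: duplicate keys", dupes == 0, f"duplicates={dupes}")

    # Persist the audit rows before stopping, so the failure is on record.
    conn.commit()
    if not (schema_ok and dupes == 0):
        raise FailPackage(f"pre-load checks failed for {table}")


conn = sqlite3.connect(":memory:")
create_audit_table(conn)
conn.execute("CREATE TABLE src_customer (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO src_customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
run_start_checks(conn, "src_customer", ["id", "name"], "id")
```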

The end package would check the results of any transformations. These included:

  • Comparing record counts between source and destination
  • Checking specific transforms (e.g. all date values changed to the appropriate SK value, all string values RTrimmed)
  • Ensuring all SK fields were populated (-1 instead of nulls)
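
The three kinds of end-of-run checks listed above can be sketched as plain SQL counts. This is an illustration only, written in Python against SQLite; the table and column names (`src_sales`, `fact_sales`, `customer`, `date_sk`) are invented, and the trailing-space test uses SQLite's `LENGTH`/`RTRIM`, which would need adapting for other databases.

```python
import sqlite3


def end_package_checks(conn):
    """Return {check name: offending row count}; 0 means the check passed."""
    def scalar(sql):
        return conn.execute(sql).fetchone()[0]

    return {
        # Record counts should match between source and destination.
        "row count difference": abs(
            scalar("SELECT COUNT(*) FROM src_sales")
            - scalar("SELECT COUNT(*) FROM fact_sales")
        ),
        # Strings should carry no trailing spaces after the transform.
        "untrimmed strings": scalar(
            "SELECT COUNT(*) FROM fact_sales "
            "WHERE LENGTH(customer) <> LENGTH(RTRIM(customer))"
        ),
        # Surrogate keys must be populated (-1 for unknown, never NULL).
        "null surrogate keys": scalar(
            "SELECT COUNT(*) FROM fact_sales WHERE date_sk IS NULL"
        ),
    }


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_sales (id INTEGER, customer TEXT, sale_date TEXT);
    CREATE TABLE fact_sales (id INTEGER, customer TEXT, date_sk INTEGER);
    INSERT INTO src_sales VALUES (1, 'Ann  ', '2020-01-01'), (2, 'Bob', NULL);
    INSERT INTO fact_sales VALUES (1, 'Ann', 20200101), (2, 'Bob', -1);
""")
results = end_package_checks(conn)
failed = {name: count for name, count in results.items() if count != 0}
print(failed or "all checks passed")
```

Returning counts rather than booleans makes the audit record more useful, since it shows how many rows were affected.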

Most of these tests were SQL statements that used the built-in schema objects of our database, so they were not too onerous to create.

In addition, as part of our development process we would create views that had the end result of any transformations we were doing. We would make use of these views to validate our package transformations.
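
A rough sketch of validating against such a view, again in Python with SQLite and with made-up names (`v_expected_customer`, `dim_customer`, `diff_against_view`): the view states what the transformation should produce, and two `EXCEPT` queries report rows that are missing from or unexpected in the loaded table.

```python
import sqlite3


def diff_against_view(conn, table, view):
    """Return (missing, unexpected) rows when comparing a table with a view."""
    # EXCEPT is set-based, so duplicated rows are not detected by this comparison.
    missing = conn.execute(
        f"SELECT * FROM {view} EXCEPT SELECT * FROM {table}"
    ).fetchall()
    unexpected = conn.execute(
        f"SELECT * FROM {table} EXCEPT SELECT * FROM {view}"
    ).fetchall()
    return missing, unexpected


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (id INTEGER, name TEXT);
    CREATE TABLE dim_customer (id INTEGER, name TEXT);
    -- The view states what the transformation should produce.
    CREATE VIEW v_expected_customer AS
        SELECT id, UPPER(RTRIM(name)) AS name FROM src_customer;
    INSERT INTO src_customer VALUES (1, 'ann  '), (2, 'bob');
    -- Stand-in for the package output; row 2 was not upper-cased.
    INSERT INTO dim_customer VALUES (1, 'ANN'), (2, 'bob');
""")
missing, unexpected = diff_against_view(conn, "dim_customer", "v_expected_customer")
print("missing:", missing)
print("unexpected:", unexpected)
```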

Each of these checks created a record in our special audit table. That way we could provide a comprehensive list of all the tests and checks we had done on each run of the process to satisfy the governance people.

(We also had a separate set of packages that would unit test each QA test by means of creating dummy tables, populating them, running the test then confirming the appropriate audit record was written. As Nick stated, this was a lot of work and of little real value)

Joe

We've set up a system where for each ETL procedure we have defined an input dataset and an expected result dataset. Then we have created a system which, utilizing Robot Framework, runs three-part tests for each ETL procedure where the first part inserts the input dataset into the source data tables, the second part runs the ETL, and the third part compares the actual results with our expected results.
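
The three-part structure can be sketched without Robot Framework. The example below is plain Python with SQLite, not our actual setup; the `etl_trim_names` procedure, the table names, and the `run_etl_test` helper are hypothetical stand-ins.

```python
import sqlite3


def etl_trim_names(conn):
    """Hypothetical ETL under test: copy customers, trimming trailing spaces."""
    conn.execute(
        "INSERT INTO dim_customer (id, name) "
        "SELECT id, RTRIM(name) FROM src_customer"
    )


def run_etl_test(conn, etl, input_rows, expected_rows):
    # Part 1: reset the tables and insert the input dataset into the source table.
    conn.execute("DELETE FROM src_customer")
    conn.execute("DELETE FROM dim_customer")
    conn.executemany("INSERT INTO src_customer (id, name) VALUES (?, ?)", input_rows)

    # Part 2: run the ETL procedure under test.
    etl(conn)

    # Part 3: compare the actual results with the expected dataset.
    actual = conn.execute("SELECT id, name FROM dim_customer ORDER BY id").fetchall()
    return actual == sorted(expected_rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (id INTEGER, name TEXT);
    CREATE TABLE dim_customer (id INTEGER, name TEXT);
""")
passed = run_etl_test(
    conn,
    etl_trim_names,
    input_rows=[(1, "Ann  "), (2, "Bob")],
    expected_rows=[(1, "Ann"), (2, "Bob")],
)
print("passed" if passed else "failed")
```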

This works pretty well for us, but there are a couple of downsides: first of all, we create the test datasets manually for each ETL procedure which takes some work, and secondly, this means that testing for 'unexpected' inputs is not done.


For the automated unit testing we have a separate environment in which we can install builds of our entire DW automatically.


Juha K
