Sitting at work the other day going through some old packages that are still in production. We are in the middle of upgrading from 2008 to 2016 so it gives us a great opportunity to review any package that may be a slow running package or one that gives constant failures. Many of the packages are on there 2nd or 3rd upgrade and many still have VB script in script tasks performing simple tasks like setting dates for files. This has led to a discussion among us about the best way to go about fixing the issues, since many of the packages need to have the VB removed or changed to C# to work efficiently with catalogs in SQL Server 2016.

After some discussing and proving why my point was correct once again lol, we figured that wherever possible we are going to remove any script task and replace those with expressions in variables. Why replace something that works, some may ask. The answer is a simple one with a more complex logic behind it. Variables are almost free, yes, I know that a script task has no monetary value to it either but in terms of overhead when executing the package compared to a script task.

When a SSIS package is executed it validates connection strings, data flows and any other task that is within the package. It also gets the variables ready, so if you have a date variable or a collation for a file name or any of the other 1000s uses for expressions, those also get ready when first executing the package. Now that does not seem like a very good argument that we want the variables to be first and that’s why we should get rid of the script tasks.

Let’s look at this with script tasks used to set the dates and collate the file names. In order to make those dates usable or to have the file path in a connection string what needs to happen. That piece of the puzzle needs to be converted to yes, a variable. So now the variable has been set when the package first executed and now it is being set again with the script task. Why do something twice why not use an expression set the variable once and be done with it.

Now some of you might be saying that the script task is only adding a second here or there what’s the big deal. Well simple math shows a different story, we have about 150 packages all running on the same server throughout the day, many of those are even using the same databases and sometimes even the same tables. If out of those 150 packages 75 of them use script tasks, but some are using 2 to 3 script tasks to set multiple variables, so we are back up to the 150 possibly even higher for the number of script tasks being executed.

Now if each one takes one second, we are possibly losing a little more than 2 minutes every day because of the script tasks. Yes, I know, in the grand scheme 2 minutes is not very much time but when looking at it from 150 packages many of them running every 15 to 20 minutes that starts adding up more time. So, let’s say of the 75 packages that contain a script task, 30 of them need to run every 20 minutes and half of those have more than one script task. So now we are at 45 seconds every 20 minutes so 45(number of script tasks) * (60/20)(3 how many times the packages execute every hour) * 24(hours in a day) = 3240 seconds every day or 54 minutes.   

The 2 minutes has now increased to 54 minutes throughout the day. If the rest of the package all ran as originally designed or the database is properly indexed, then the 54 minutes from those packages may never be an issue. As we all know, databases do what they are designed for and that is to collect data. Now as more data is collected the speed to read and write to those tables now is getting slower. A package that used to take 2 minutes to run now takes 5 and the next one and the next one, eventually you have multiple packages performing large amounts of ETL processes and they just stop. You check the logs, no errors are showing, packages are showing that they executed or are running. That’s when it hits you, packages are still running, some that should never run at the same time are now running at the same time and the data is not in the tables, the business side is starting to call and send emails because they can’t get their reports and why. Because you wanted a script task to set the date for you.  

Now I know there are several other factors in the scenario that would contribute to the failure and many are enough to cover pages of discussions and arguments. The one thing to be always thinking when designing, building or repairing packages is to optimize performance. One way to optimize performance is to save the 1 to 2 seconds that the script task is using up.

Besides performance there is an excellent reason not to use script tasks for things such as dates and filenames. Think of the next person that needs to rerun the package to fix a data issue. If the date is set in a script task then the code in the task needs to be changed, run the package to fix the data, change the script back to what it was and hopefully not messing anything up. If the date was set in a variable and maybe use two for the date then one variable can be easily found, changed and changed back to fix the data. When I build packages, I prefer to build any date as flexible as possible, I use multiple variables to get to one date, this allows me to change one variable to number of days back and execute the package to recover lost data.

I always build my packages using a back-date loop, this allows me to go back and fix data in an easy and smooth way. The loop provides three parts, first a Boolean variable that if needing to back date it is set to true, next start and stop variables to set the dates that need to be fixed and finally a loop that executes the package and advances the day. I plan to eventually do a tutorial on how to set up this easy but very useful task that I use all the time in new builds and every repair that I must make that uses dates.

If you only have a few packages or they run very little, then the time constraint is not a big deal and for most people this is the case. If looking for maximum optimization and ease of data repair later then please please please stop using script tasks for something that can be done in an expression. You may want to show off you C# or VB skills to others but save those for bigger projects like pulling Active Directory information.

Welcome to Reporting Solutions, If you are reading this than thank you. You are probably wondering why did someone start another site to dedicate to something that is already out there on so many other places. The answer is simple, I hate technical jargon, yes if talking with colleagues, taking a class or trying to impress the boss that is fine. When I get annoyed by it is when I am trying to understand something, maybe how it works or how do I need to tweak the code to make it fit my situation. Why can’t someone simply say put the do hickey in the thing a bob. Instead they give a five page explanation as to why they think it it or is not working due to complex algorithms and the flex capacitor and how you need to check the specks on the inline rotary girders. When all I really needed was the syntax yet again to add the dang zero to the front of my date in the expression. Yes I have a bad memory when it comes to remembering the correct order sometimes.

My vision for this site is to provide several things, one is a place for someone with no experience in tools like SSIS and SSRS to be able to follow simple tutorials and start making packages. Second I want a place that I can put many of the things I have created that took several hours or days and when I looked online there really wasn’t much about it. The nice thing is now that I figured out some of the tricks and created some core components, I can now easily throw together a package in a matter of hours not days. Third I want to have a forum where people with little to no experience can feel free to ask questions without worrying about getting a lecture about why they didn’t already know that. Finally I just want the fun and experience of building a webpage.

As time allows I plan on adding tutorials in as plain of English as possible with as many screenshots as needed, all examples I will provide will be based off of Microsoft’s AdventureWorks database, using only tools that are free to download off of the internet. I will start at the beginning like connection strings then move onto creating a simple lift and move package (that is just grabbing a file from point A and placing it in point B). Some of the basic components will include a back date loop, I do not build a package anymore without including this nice little piece of code. A file cleanup for archived files, simple alerting and monitoring tools. 

Any questions, comments or you would like to help contribute just let me know either through the forum or contact page. I encourage any type of help as I am far from knowing everything let alone much of anything. The only thing I ask and will be enforced on this site is be kind, use simple terms as much as possible, and if someone asks a question that can be answered in a simple line or two then do it, do not create long complex solutions when there doesn’t need to be one. As time goes on and something strikes me or more often then not bugs me about something I will create another entry. Until then thank you enjoy the site.