I’m happy when more people at my job start to work with and experiment in PowerShell.
As someone who has been using the product since the early Monad beta days, I have grown to both love and hate PowerShell and what it can do for me and my work. And if I’m honest, I’m happy, excited, and a little scared when more people at my work start writing and running their own scripts.
I’m happy because those people are working toward advancing their careers with Microsoft products. They are ensuring that they will be able to administer servers and services efficiently, especially as PowerShell seems to be becoming the glue that holds more and more things together (administratively speaking).
I’m excited because as more and more of my coworkers start to work with and learn PowerShell, more people within the organization are able to assist with automation tasks - either by directly writing scripts and resources, or by doing peer review of code.
I’m scared because some of my ugliest, dumbest, oldest, least effective code is still floating out there on the various file shares and is being used as examples. Bad examples.
One of the benefits PowerShell brings to administration is the pipeline: the ability to pass objects along and manipulate them as they go through. While it is true that this has existed in other tools and platforms, PowerShell has changed the game on the Windows administration side. One of the benefits of using the pipeline is being able to work with objects without a huge impact on RAM, because each object can be processed and passed on instead of the whole set being held at once. However, if PowerShell is used incorrectly, it has the potential to use a large amount of RAM. On a physical server this usually results in a large paging file and slow performance. If you are, however, using something along the lines of Citrix PVS, you can end up blue screening the server and crashing all the sessions on it.
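To make the difference between streaming and collecting concrete, here is a minimal sketch. `Get-SampleItem` is a hypothetical stand-in for any cmdlet that emits many objects, and the log exists only to record the order in which things happen. It needs PowerShell 5 or later for the `::new()` syntax.

```powershell
# Shared log used only to record the order of events.
$log = [System.Collections.Generic.List[string]]::new()

# Hypothetical stand-in for a cmdlet that emits objects one at a time.
function Get-SampleItem {
    param([int]$Count = 3)
    foreach ($i in 1..$Count) {
        $log.Add("produced $i")
        [pscustomobject]@{ Id = $i }
    }
}

# Streaming: each object is handed down the pipeline as soon as it is produced.
Get-SampleItem | ForEach-Object { $log.Add("consumed $($_.Id)") }
$streamed = $log -join ', '

$log.Clear()

# Collecting: every object is held in $all before any of them is consumed.
$all = Get-SampleItem
$all | ForEach-Object { $log.Add("consumed $($_.Id)") }
$collected = $log -join ', '

$streamed   # produced 1, consumed 1, produced 2, consumed 2, produced 3, consumed 3
$collected  # produced 1, produced 2, produced 3, consumed 1, consumed 2, consumed 3
```

With three tiny objects the difference is invisible, but with tens of thousands of large objects the second pattern has to hold all of them at the same time.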
This fact was highlighted recently at my work. A technician who is relatively new to PowerShell ran a script and took down one of the new support desktops we were testing. This desktop runs XenDesktop 7.6 on top of VMware, with Citrix PVS delivering the disk. The technician ran a script to build a report on some AD users and write the data out to a CSV file. Unfortunately, because it did not use the pipeline and had some other inefficiencies, the script was taking up more than 15 GB of RAM. Once that XenDesktop server ran out of space, it blue screened and everyone was kicked off. Thankfully it was still in testing, but this highlights an issue with code that is written in isolation without others looking at it, especially by someone newer to PowerShell. This is not meant to make fun of this tech. They learned their lesson, have continued to experiment in PowerShell, and over time will more than likely get a lot better at it. But this serves as a great example of why the pipeline is so powerful.
Below is an excerpt from the original code
When I looked at the code, two things went through my mind.
- Ugly, ugly, ugly unoptimized code. Caching the objects in RAM multiple times is why this code was using 15 GB of RAM.
- The tech is still learning and is probably using one of my old pieces of code as a template. I am 100% guilty of doing these very same things in some of my earlier scripts, so I can’t be angry. I can only offer advice and guidance.
But this bit of ugly code allowed me to talk with the tech and highlight WHY the pipeline is wonderful. To begin with, I’m going to show the code I sent to the tech to help them avoid crashing the server. As a matter of comparison, the script below consumed 50 MB of RAM. This is a drastic difference compared to 15 GB.
So what is the big difference between the two scripts? What makes the second one so much more optimized than the first? The pipeline.
Another big improvement (and point of optimization) is using the native Microsoft Active Directory module instead of the aging Quest AD snap-in (it does have its place, but that place is becoming more and more limited to very specific use cases). Not only is this faster, but the default data returned by the Microsoft AD module is smaller than what Quest returns, resulting in lower memory usage and faster execution.
Continuing to compare the first and second scripts, you will notice one thing in particular: in both, all of the user accounts are stored in a users variable. But what about the pipeline? Why is it not used to process each user individually? There seems to be a small issue when using the AD cmdlets: when a large number of accounts is processed through the pipeline, the module throws an error. If all of those accounts are stored in a variable first, it is able to process them without issue.
Remember how the native cmdlets return less data? Even though all of these user accounts are stored in a variable, it is still quicker and consumes less RAM because fewer properties are being returned from Active Directory. It is possible to optimize beyond the pipeline.
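As a rough illustration of asking for fewer properties, the sketch below requests only the one extra attribute a group report needs. It is not the script I sent to the tech, and the filter is an assumption for illustration. `Get-ADUser` returns a small default property set, and `MemberOf` has to be requested explicitly with `-Properties`.

```powershell
# Splatted parameters: ask only for the one extra attribute the report needs.
# The '*' filter (all users) is an assumption made for this sketch.
$adParams = @{
    Filter     = '*'
    Properties = 'MemberOf'
}

# Guarded so the sketch is harmless on a machine without the ActiveDirectory module.
if (Get-Command -Name Get-ADUser -ErrorAction SilentlyContinue) {
    $users = Get-ADUser @adParams
}

# By contrast, -Properties * pulls every attribute for every account,
# which is usually far more data than a group report needs.
```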
Where the time and resource savings come into play in the second script is in how groups are enumerated for a user account and in how the new objects are created and (in the case of the second script) not stored. In the first script, all of the items in memberOf are stored in the groups variable, while in the second script all of the groups are processed immediately. Looking at the creation of the new objects after the group enumeration and processing, all of the new objects created in the first script are stored in an array. It is only after ALL the user accounts have been enumerated that this array is exported to a CSV file. Compare this to the second script. The new object is created and immediately put out to the pipeline, which allows it to go out to the CSV file as soon as the item is processed. Done this way, the only things that really have to take up RAM are the object being created and the users list. Nothing else is stored long term in RAM, which is why it can run in 50 MB or so. Compare that to the first script, which stores multiple sets of objects in RAM, and WHY the pipeline is important becomes obvious.
Objects that the first script stores:
- All of the user objects and all of their properties
- All of each account’s memberOf entries
- The newly created objects combining the two above
Objects that the second script stores:
- All of the user objects and a specific set of properties - not all
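The two storage patterns can be sketched side by side. This is not either of the actual scripts. The accounts, group names, and file name are made up, and the rows are deliberately simple.

```powershell
# Made-up accounts standing in for real directory data.
$users = @(
    [pscustomobject]@{ SamAccountName = 'asmith'; MemberOf = @('Sales', 'VPN') }
    [pscustomobject]@{ SamAccountName = 'bjones'; MemberOf = @('IT') }
)
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'pipeline-sketch-report.csv'

# Pattern A: build everything in memory, export at the very end.
$report = @()
foreach ($user in $users) {
    # Extra variable holding the group list.
    $groups = $user.MemberOf
    # The array grows with every account and is kept until the loop is done.
    $report += [pscustomobject]@{
        User   = $user.SamAccountName
        Groups = $groups -join ';'
    }
}
$report | Export-Csv -Path $csvPath -NoTypeInformation

# Pattern B: emit each row as soon as it is built. Export-Csv receives it
# right away, so no report array is ever built.
$users | ForEach-Object {
    [pscustomobject]@{
        User   = $_.SamAccountName
        Groups = $_.MemberOf -join ';'
    }
} | Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back only to check what was written.
$rows = @(Import-Csv -Path $csvPath)
```

Both patterns write the same file. The difference is only in what has to sit in RAM while the report is being built.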
One thing that beginner PowerShellers often have trouble understanding is that the pipeline can go many layers deep. At first it is hard to visualize how things are processed, but the nuances can be learned over time. Notice that in the second script there are multiple layers of pipelining. Each layer processes what was sent to it from the previous one and then just sends the object on down until finally it’s appended to a CSV file.
So next time you have a script that is taking up too much RAM, look over your script and think about where you can use a pipeline. It is not always the answer, but more than likely it is not being used enough if your RAM is running hot.
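Here is a small sketch of what layered pipelines can look like, again with made-up accounts rather than the real script. The `Enabled` flag, the distinguished names, and the file name are assumptions for illustration.

```powershell
# Made-up accounts; MemberOf holds distinguished names as it would in AD.
$users = @(
    [pscustomobject]@{ SamAccountName = 'asmith'; Enabled = $true;  MemberOf = @('CN=Sales,OU=Groups,DC=example,DC=com', 'CN=VPN,OU=Groups,DC=example,DC=com') }
    [pscustomobject]@{ SamAccountName = 'bjones'; Enabled = $false; MemberOf = @('CN=IT,OU=Groups,DC=example,DC=com') }
    [pscustomobject]@{ SamAccountName = 'cwu';    Enabled = $true;  MemberOf = @('CN=IT,OU=Groups,DC=example,DC=com') }
)
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'pipeline-sketch-layers.csv'

$users |
    Where-Object { $_.Enabled } |
    ForEach-Object {
        # Outer layer: one enabled user at a time.
        $name = $_.SamAccountName
        $_.MemberOf |
            ForEach-Object {
                # Inner layer 1: reduce the distinguished name to the group's CN.
                ($_ -split ',')[0] -replace '^CN=', ''
            } |
            ForEach-Object {
                # Inner layer 2: build the row and send it on down.
                [pscustomobject]@{ User = $name; Group = $_ }
            }
    } |
    Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back only to check what was written.
$rows = @(Import-Csv -Path $csvPath)
```

Each row passes through the filter, the outer loop, both inner stages, and out to the file before the next one is built, which is the behavior described above.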