Jun 24 2014
 

As with all things, the longer you play with something, the more you learn about it. It has been nearly 5 years since I wrote my original article on multithreading, where I used PowerShell jobs to run multiple items at a time. In the conversations following that post, a reader submitted a fairly nasty note saying that multithreading with jobs was in fact not real multithreading. He also chose some words for me that made sure I couldn’t post his comments. It hurt my very fragile feelings and, as a result, I started looking at how I could really, truly multithread with PowerShell. The result of that effort is this script. This is a true multithreading script: no “ifs”, “ands” or “buts”.

It runs MUCH faster than my other rendition and includes a much more advanced feature set than my other script. The downside is that the script is very difficult to understand if you are new to PowerShell. If you would like a script that is more transparent and easier to understand, please refer to the version that uses Jobs.

Okay, so the big addition for this script is the ability to run either a script or a built-in cmdlet. You can also run it within the pipeline! To do that I had to include the begin, process and end blocks. It makes the script a bit more complex, but it really pays off when you pipe your custom script into Out-GridView or pipe your advanced filtering script into a multithreaded one! I have even pulled my SCCM collections via Get-WmiObject and piped them into a multithreaded script! Cool stuff!

Okay, so on to the breakdown.

First we need to get all of our parameters. If this doesn’t make sense, please find my post on parameters!
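The parameter block itself is not reproduced here, so the following is only a sketch of what it might look like, based on the parameter names and defaults described below. The wrapper function name Show-MultiThreadParams is hypothetical (the real script declares its parameters at the top of the .ps1 file), and the default for MaxResultTime is an assumption.

```powershell
# Hypothetical wrapper function, used only so the param block can run on its own.
function Show-MultiThreadParams {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, Position = 0)]
        [string]$Command,

        # Accepts a single object, an array, or pipeline input.
        [Parameter(ValueFromPipeline = $true)]
        $ObjectList,

        [string]$InputParam,
        [hashtable]$AddParam = @{},
        [string[]]$AddSwitch = @(),
        [int]$MaxThreads = 20,        # default stated in the article
        [int]$SleepTimer = 200,       # milliseconds, default stated in the article
        [int]$MaxResultTime = 120     # seconds, ASSUMED default
    )
    process {
        # Echo the effective settings so the defaults are visible.
        New-Object PSObject -Property @{
            Command    = $Command
            MaxThreads = $MaxThreads
            SleepTimer = $SleepTimer
        }
    }
}

$Settings = Show-MultiThreadParams -Command "Get-Service"
```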

All of these are defined as follows:
Command
This is where you provide the PowerShell cmdlet or script file that you want to multithread. You can also choose a built-in cmdlet. Keep in mind that a script file is read into a script block, so any unforeseen errors are likely caused by the conversion to a script block.

ObjectList
The ObjectList represents the arguments that are provided to the child script. This is an open-ended argument and can take a single object from the pipeline, an array, a collection, or a file name. The multithreading script does its best to work out which you have provided and handle it accordingly. If you provide a file, it is read with one object per line, and each line is passed as-is, as a string, to the script you are running. If this is not desired, use an array.

InputParam
This allows you to specify the parameter for which your input objects are to be evaluated. As an example, if you were to provide a computer name to the Get-Process cmdlet as just an argument, it would attempt to find all processes where the name was the provided computer name and fail. You need to specify that the parameter that you are providing is the “ComputerName”.

AddParam
This allows you to specify additional parameters to the running command. For instance, if you are trying to find the status of the “BITS” service on all servers in your list, you will need to specify the “Name” parameter. This parameter takes a hash table of name / value pairs, for example @{"Name" = "BITS"}.

AddSwitch
This allows you to add additional switches to the command you are running. For instance, you may want to add “RequiredServices” to the “Get-Service” cmdlet. This parameter takes a single string or an array of strings, for example "RequiredServices" or @("RequiredServices", "DependentServices").

MaxThreads
This is the maximum number of threads to run at any given time. If resources are too congested try lowering this number. The default value is 20.

SleepTimer
This is the delay between passes of the loop that checks for finished child threads. The default value is 200ms. If CPU utilization is high, consider increasing this delay. If the child script takes a long time to run, you might increase this value to around 1000 (1 second between checks).
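To make the AddParam and AddSwitch shapes described above concrete, here is a small sketch of how the two values might be built. The script file name in the commented call comes from the reader comments below, and AllServers.txt is a hypothetical server list.

```powershell
# A hash table of extra parameter name / value pairs for the child command.
$AddParam = @{ "Name" = "BITS" }

# One or more switch names, given as plain strings (no leading dash).
$AddSwitch = @("RequiredServices", "DependentServices")

# Hypothetical call, shown as a comment because it needs the script and a server list:
# .\Run-CommandMultiThreaded.ps1 -Command "Get-Service" -InputParam ComputerName `
#     -ObjectList (Get-Content .\AllServers.txt) -AddParam $AddParam -AddSwitch $AddSwitch
```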

Now we need to set everything up. This stuff needs to execute outside of the pipeline, so we place it in the “begin” block of the script.

Okay, so to break this down, first we need to make our ISS, or initial session state. This is basically the session state to be used when we open our Runspace. Next we create our Runspaces. The RunspacePool is really what’s going to do the multithreading. It will handle starting our threads and continuously start new ones as required. It is the operating environment for our command pipeline. Finally we open the RunSpacePool. Note that “.Open()” opens the RunSpacePool synchronously, creating a Windows PowerShell execution environment.
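The setup just described can be sketched roughly as follows. This is a minimal illustration using the standard runspace classes rather than a copy of the script's begin block, and the variable names are my own choices.

```powershell
$MaxThreads = 20

# The initial session state describes what each runspace starts out with.
$ISS = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()

# A pool that keeps between 1 and $MaxThreads runspaces alive.
$RunspacePool = [System.Management.Automation.Runspaces.RunspaceFactory]::CreateRunspacePool(1, $MaxThreads, $ISS, $Host)

# Open() blocks until the pool is ready for use.
$RunspacePool.Open()
$PoolState = $RunspacePool.RunspacePoolStateInfo.State

# Clean up; in the real script this happens at the very end.
$RunspacePool.Close()
$RunspacePool.Dispose()
```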

Now I am running a detection on what the user provided for the $Command parameter. First I look at all of the currently loaded cmdlets. If it is one of those, then we continue. Otherwise we assume it is a script file. If it is a script file, then we need to read the file into a script block that we can pass to our future threads. To do this we need to change the default $OFS (Output Field Separator). For more understanding here, please read my other post!
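Here is a rough sketch of that detection and of the $OFS trick. Note that it looks the name up with Get-Command -Name, which is a swapped-in technique and not necessarily how the script matches; the helper name Get-ThreadCode and the temporary demo file are hypothetical.

```powershell
# Create a throwaway two-line script so the sketch can run on its own.
$ScriptPath = Join-Path ([System.IO.Path]::GetTempPath()) "ofs-demo.ps1"
Set-Content -Path $ScriptPath -Value 'param($Name)', '"Hello $Name"'

# Hypothetical helper: returns $null for a known command, a script block for a file.
function Get-ThreadCode {
    param([string]$Command)

    # If the name resolves to a cmdlet or function, no script block is needed.
    if (Get-Command -Name $Command -CommandType Cmdlet, Function -ErrorAction SilentlyContinue) {
        return $null
    }

    # Get-Content returns one string per line. Setting $OFS makes the
    # array-to-string conversion join those lines with newlines, not spaces.
    $OFS = "`r`n"
    return [ScriptBlock]::Create("$(Get-Content -Path $Command)")
}

$CmdletCode = Get-ThreadCode -Command "Get-ChildItem"
$ScriptCode = Get-ThreadCode -Command $ScriptPath
$Greeting   = & $ScriptCode "World"

Remove-Item -Path $ScriptPath
```

Without the $OFS change the lines of the file would be joined with spaces, which breaks any script that relies on line breaks.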

Okay, so the next step is to start receiving items from the pipeline. We can do this by starting the process block. Note that the process block is executed for each item we find in the pipeline. Meanwhile, if you did not execute in the pipeline it is executed once for the script as a whole. What this means is that we need to assume that $ObjectList will either be a single item or multiple items. The best way to do that is to use a ForEach Loop.
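A small sketch of why the ForEach loop covers both cases. The function name Test-ProcessBlock is hypothetical and only exists to show the behaviour.

```powershell
function Test-ProcessBlock {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $ObjectList
    )
    process {
        # From the pipeline, $ObjectList is one item per process call.
        # As a parameter, it is the whole array in a single process call.
        ForEach ($Object in $ObjectList) {
            "Item: $Object"
        }
    }
}

$FromPipeline  = "a", "b", "c" | Test-ProcessBlock
$FromParameter = Test-ProcessBlock -ObjectList "a", "b", "c"
```

Either way, the loop body sees each item exactly once.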

So now we have to build the thread that we are going to execute. We do this by adding either the command, or the script. The first IF block is to determine which. A thread either takes an existing PowerShell command, or a Scriptblock. If you remember we built this out in the Begin statement above. Once that is done we need to start giving the user the power to control the item we are calling.

First things first, we look at $InputParam. This is what allows the user to execute the child script not just with the argument provided, but with a named parameter as well. We see this is useful with the Get-Process cmdlet. Let’s say that you want to see the processes running on 20 different servers. If you just ran Get-Process ServerName you would be looking at your local machine for any processes with the name “ServerName” and you would get no return (probably). Instead you would want to run Get-Process -ComputerName ServerName. The trick here is that when you do this you’ve actually changed things! When an item is just hanging at the end of the statement, it is called an Argument. When you pair the item you are setting with the name of the setting, it is called a Parameter. So if the user wants to specify the parameter name, we are actually adding a different item to our thread!
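The difference between an argument and a parameter can be seen in a small sketch. It uses a made-up two-parameter script instead of Get-Process so that it runs anywhere; the server name is hypothetical.

```powershell
# A stand-in child script with two parameters, so we can see which one gets bound.
$Code = 'param($Name, $ComputerName) "Name=[$Name] ComputerName=[$ComputerName]"'

# AddArgument: the value is positional, so it lands in the first parameter ($Name).
$ByArgument = [PowerShell]::Create().AddScript($Code).AddArgument("Server01")

# AddParameter: the value is bound to the parameter we name.
$ByParameter = [PowerShell]::Create().AddScript($Code).AddParameter("ComputerName", "Server01")

$ArgResult   = $ByArgument.Invoke()
$ParamResult = $ByParameter.Invoke()

$ByArgument.Dispose()
$ByParameter.Dispose()
```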

Now we need to see if the user wanted to add some extra parameters. For this I decided that a hash table was perfect. This is because they are built much like a parameter, as they are name / value pairs. The user can provide as many as they want in a single hash table, and we can easily run a ForEach to evaluate this. Again we are going to use the .AddParameter() statement to evaluate.
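As a sketch of that loop, assuming a made-up child script with two parameters:

```powershell
$AddParam = @{ Greeting = "Hi"; Name = "World" }

$Thread = [PowerShell]::Create().AddScript('param($Name, $Greeting) "$Greeting $Name"')

# Each key / value pair in the hash table becomes one named parameter.
ForEach ($Key in $AddParam.Keys) {
    # AddParameter returns the PowerShell object, so discard it to keep output clean.
    $Thread.AddParameter($Key, $AddParam[$Key]) | Out-Null
}

$Out = $Thread.Invoke()
$Thread.Dispose()
```

As one of the comments below points out, the PowerShell class also has an AddParameters() method that accepts a whole dictionary at once, which would replace the loop.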

Finally we need to add any switches that the user wants. A good example of this is the -Force switch on cmdlets like Get-ChildItem. We can do this with an array of strings, which again allows the user to put in as many switches as they like. One interesting note here is that I had to use .AddParameter() for this as well. Instead of creating a method called .AddSwitch(), Microsoft simply chose to add an overloaded definition of .AddParameter(). Their documentation (link below) does in fact say that providing just a string adds a switch instead of a parameter.
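A sketch of the switch handling, using a made-up child script with a single switch called Loud:

```powershell
$AddSwitch = @("Loud")

$Thread = [PowerShell]::Create().AddScript('param([switch]$Loud) if ($Loud) { "LOUD" } else { "quiet" }')

ForEach ($Switch in $AddSwitch) {
    # The one-argument overload of AddParameter adds a switch.
    # Out-Null matters: without it the returned object leaks into the output.
    $Thread.AddParameter($Switch) | Out-Null
}

$Out = $Thread.Invoke()
$Thread.Dispose()
```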

Now that we have our thread all set up, we simply attach the RunspacePool that we set up in the begin block, and then tell it to execute our thread!

To avoid having a thread hanging around that we lose track of, we need to do some tracking here. To do this we first catch the thread by capturing the return value of the call that starts it in a $Handle variable. This provides the link back to the thread in our “End” block. I then create a custom object, which I have named $Job, to hold all these little gems of knowledge. Then I add my custom object to the tracking array called $Jobs. This method of creating objects was taught to me by my reader from this post!
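Attaching the pool, starting the thread and tracking it might look roughly like this. It is a sketch with a trivial child script, and the exact way the $Job object is built is an assumption; the final loop simply waits for everything so the example cleans up after itself.

```powershell
$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 2)
$RunspacePool.Open()
$Jobs = @()

ForEach ($Object in 1..3) {
    $Thread = [PowerShell]::Create().AddScript('param($n) $n * 2').AddArgument($Object)

    # Attach the shared pool, then start the thread without waiting for it.
    $Thread.RunspacePool = $RunspacePool
    $Handle = $Thread.BeginInvoke()

    # Keep the handle, the thread and the input together for later.
    $Job = "" | Select-Object Handle, Thread, Object
    $Job.Handle = $Handle
    $Job.Thread = $Thread
    $Job.Object = $Object
    $Jobs += $Job
}

# For this sketch only: wait for each job in order and collect its output.
$Results = ForEach ($Job in $Jobs) {
    $Job.Thread.EndInvoke($Job.Handle)
    $Job.Thread.Dispose()
}

$RunspacePool.Close()
$RunspacePool.Dispose()
```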

At this point all of the code to start the jobs is complete! Now we just have to grab all of the jobs back. That calls for the “End” block, which is executed once per script run. Since that is the case, we need one main loop to ensure that it stays running for as long as we need it to. Within the “End” block we will watch for jobs to finish and provide their output as they complete.

The first part of this is simply to provide some form of pretty output. First, I evaluate which jobs are still running and create a (truncated) string that names them. I added this so that if the child script is failing on just one or two servers, you will know which ones to fix. Then comes a rather complex Write-Progress statement, which I’ll let you look into. For more info on Write-Progress you can read my blog articles on Write-Progress or on getting the progress of a child job.
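A simplified sketch of that status output follows. The server names, the total and the 60-character cut-off are all made up for illustration, and the real Write-Progress call in the script is more involved.

```powershell
$MaxThreads = 20
$Total      = 10                                      # hypothetical number of jobs started
$Remaining  = @("Server01", "Server02", "Server03")   # hypothetical jobs still running

# Build a short, readable list of what is still outstanding.
$RemainingText = $Remaining -join ", "
If ($RemainingText.Length -gt 60) {
    $RemainingText = $RemainingText.Substring(0, 60) + "..."
}

$Done = $Total - $Remaining.Count
Write-Progress -Activity "Waiting for jobs - $MaxThreads maximum threads" `
    -PercentComplete ($Done / $Total * 100) `
    -Status "$Done / $Total complete. Remaining: $RemainingText"
```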

Okay, so now we look for jobs that are completed. We can do this by looking into our array of objects for entries whose handle (that we captured above) has .IsCompleted set to $true! We then run a ForEach over each of these to end its invocation and dispose of it. Keep in mind that it is the .EndInvoke() call that returns all of the output from the thread; .Dispose() only releases it. What this means is that the script writes output as each job finishes instead of waiting for all jobs to finish!
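The collection loop might be sketched like this. The child script here just sleeps briefly and returns a string, and the way finished jobs are marked (setting the handle to $null) is an assumption made for the example.

```powershell
$SleepTimer   = 200
$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 3)
$RunspacePool.Open()

# Start three short jobs and track them.
$Jobs = ForEach ($Object in 1..3) {
    $Thread = [PowerShell]::Create().AddScript('param($n) Start-Sleep -Milliseconds (100 * $n); "done $n"').AddArgument($Object)
    $Thread.RunspacePool = $RunspacePool
    New-Object PSObject -Property @{ Thread = $Thread; Handle = $Thread.BeginInvoke(); Object = $Object }
}

$Results = @(
    # Keep looping while any job still has a live handle.
    While (@($Jobs | Where-Object { $null -ne $_.Handle }).Count -gt 0) {
        ForEach ($Job in @($Jobs | Where-Object { $null -ne $_.Handle -and $_.Handle.IsCompleted })) {
            $Job.Thread.EndInvoke($Job.Handle)   # this is what emits the thread's output
            $Job.Thread.Dispose()
            $Job.Thread = $null
            $Job.Handle = $null                  # mark the job as collected
        }
        Start-Sleep -Milliseconds $SleepTimer
    }
)

$RunspacePool.Close()
$RunspacePool.Dispose()
```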

Next I added some protection to the script. I had a couple of scripts in my arsenal that would lock up and never finish. When this happened the multithreading script would continue to run and loop forever and ever. To stop this I added a maximum time to wait for additional jobs to finish. To do this I simply look at the system clock for when the last job completed and look at the time gap. If it is greater than our $MaxResultTime parameter, then we throw an error and exit. Note that until PowerShell actually closes, those threads will continue to hang around!
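A sketch of that timeout check with a deliberately slow child script. The 1-second limit is only there to keep the example quick, and the explicit .Stop() call at the end is my own addition for tidiness, not something described above.

```powershell
$MaxResultTime = 1      # seconds; kept tiny for the sketch
$SleepTimer    = 200

# A child script that takes far longer than we are willing to wait.
$Thread = [PowerShell]::Create().AddScript('Start-Sleep -Seconds 30')
$Handle = $Thread.BeginInvoke()

# In the real loop this timestamp would be reset each time a job completes.
$ResultTimer = Get-Date
$TimedOut    = $false

While (-not $Handle.IsCompleted) {
    If (((Get-Date) - $ResultTimer).TotalSeconds -gt $MaxResultTime) {
        $TimedOut = $true
        Write-Warning "Child script appears to be frozen; giving up."
        break
    }
    Start-Sleep -Milliseconds $SleepTimer
}

# ASSUMPTION: stop the stuck thread explicitly so the sketch leaves nothing running.
$Thread.Stop()
$Thread.Dispose()
```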

As a very last step we clean up our $RunspacePool.

Well, that’s all folks! I really hope that this script provides many time-saving events for my wonderful readers. I know it has saved me hundreds of man-hours!

Following is the full script with the comment block intact for your cutting and pasting pleasure!

Further Reading:
Initial Session State:

http://msdn.microsoft.com/en-us/library/system.management.automation.runspaces.initialsessionstate%28v=vs.85%29.aspx

Runspaces:

http://msdn.microsoft.com/en-us/library/System.Management.Automation.Runspaces.Runspace(v=vs.85).aspx

PowerShell Class:

http://msdn.microsoft.com/en-us/library/system.management.automation.powershell%28v=vs.85%29.aspx

PowerShell AddParameter Method:

http://msdn.microsoft.com/en-us/library/system.management.automation.powershell.addparameter%28v=vs.85%29.aspx

  26 Responses to “True Multithreading in PowerShell”

  1. Neat stuff! You should post this out on the Technet Script Repository (http://gallery.technet.microsoft.com/scriptcenter). One thing that you could look at doing is instead of enumerating through the $AddParam hash table and using AddParameter(), just put the entire hash table in using AddParameters() as it takes a hash table. On a side note (as well as plugging my own thing :)), I did a talk on runspaces at the NorCal PowerShell User Group recently (http://learn-powershell.net/2014/06/11/norcal-powershell-user-group-presentation-on-runspaces-is-available/), the audio isn’t the best but I have a bunch of code demos as well as the slide deck posted. Keep up the awesome work!

  2. Wow. Insane performance increase in massive WMI calls. I’m talking 4000 servers with 10 calls each in 2 minutes!

    • Yea, I am really glad I got this polished enough to release. As an admin, this type of speed in multithreading definitely makes your life easier!

  3. Hey, nice work this looks great! When I run a command e.g. .\Run-CommandMultiThreaded.ps1 -Command “Get-Service” I get nothing returned to the console (just a blank line). Is there something obvious I am missing? How do I get the results or check the status of the job?

    Much appreciated

    Andrew

    • You aren’t passing any objects for it to run against. You are basically looping through nothing. If you wanted to loop through a bunch of remote systems you could make a text file with the names and provide that to the script.

  4. I’ve noticed that on line 123:

    ForEach($Switch in $AddSwitch){
    >>> $Switch
    $PowershellThread.AddParameter($Switch) | out-null
    }

    You output $Switch directly. This was really badly breaking my script and it took forever to figure out why ;)

    Great stuff otherwise, I’ve converted it to a cmdlet and we’re using it to parallelise all kinds of tasks, typically slow wmi queries or other remote commands.

  5. This is fantastic code. Well done. The speed is amazing. The only change I would like to see is to wrap this so I can simply import the script and call it like a function or object.

    As it is now, I have to use some path to get it which adds a few more steps to my code.

    For example, if I am querying SQL, my current path is something like PS SQLSERVER:\> So if I pipe to .\.ps1, it will fail. I have to add more code to exit out of that ps drive path. It’s more of a nuisance than anything else.

    Would just be easier as a function I can import. :)
    Either way, thanks for this awesome script.

    • @Corey,

      It is possible to wrap it yourself by putting

      However, this can result in a inadvertent match to a user defined function in your .ps1 file while processing the Begin block. What that block does is see if there is a function with that name available. Since this is in a separate file, it will find all of the defaults available. If you put it in your .ps1 file, it will find those in your file.

  6. This is exactly what I was looking for. Very well explained. Thank you for your post and the time it took you to write this. Now that I see the code and understand the concept I can implement it into my own scripts.

  7. Forgive my stupidity but i’m extremely new to programming languages in general and Powershell is my first language… i can’t seem to figure out what i’m doing wrong..

    i have a script where i’m querying the functionality of WMI on a bunch of remote machines via psexec (it does a buttload of other stuff as well but this is the bit i want to multithread..), code works fine when run sequentially it runs the command “winmgmt /verifyRepository” command on each machine.. but my GOD is it slow as shit… and so we come to this.. runspacepool type stuff which… is too complex for me to be honest but i was able to get Start-job to work but Start-job might as well be just as bloody slow as doing it sequentially cause of the overhead :(

    so i have the WMI check written as a function called “Verify-WMI”… since i used to run it through a;

    foreach ($Item in $Masterlist) {# do some stuff}

    it has a variable $computername inside [and a few others] that have values of \\$Item and the like…

    so i call it like so:

    .\MultiThread\Run-CommandMultiThreaded.ps1 -Command “Verify-WMI” -InputParam computername -ObjectList $Masterlist

    I’ve tried this as well but doesn’t work:

    .\MultiThread\Run-CommandMultiThreaded.ps1 -Command “Verify-WMI” -InputParam item -ObjectList $Masterlist

    somebody pls help :( i have posted my code in pastebin if it helps… here: http://pastebin.com/hvUyFqrg

    Line 2460 is where the explosions happen…

    if anyone has the time please.. i understand things so much better if i have a working example… i have stolen a LOT of my code from the internet but i’m just lost with this bit…

    any assistance MOST welcome.

    Cheers,
    Chris
    arker_@hotmail.com

  8. Can’t figure out how to define parameters when multiple parameters are required. This is my parameter definition:

    param (
    [string] $AWSAccount = $null,
    [string] $AWSRegion = $null,
    [string] $AWSService = $null
    )

    Need to run the child script 5*9*17 times.

  9. This is really great, thank you! The performance improvement is wonderful. I’m all giddy thinking about the uses I have for this. One note, and I admit I didn’t dig into it very much, but I ran into an issue with escaping folder paths when specifying a script file for the Command parameter. Following this example gave me the error below:

    .\Run-CommandMultiThreaded.ps1 -Command .\ServerInfo.ps1 -ObjectList (gc .\AllServers.txt)

    parsing “X:\Folder 1\Subfolder1\Project-Name\ServerInfo.ps1″ – Unrecognized escape sequence \F.
    At X:\Folder 1\Subfolder1\Project-Name\Run-CommandMultiThreaded.ps1:93 char:11
    + If ($(Get-Command | Select-Object Name) -match $Command){
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OperationStopped: (:) [], ArgumentException
    + FullyQualifiedErrorId : System.ArgumentException

    To avoid the error I had to remove the .\ prefix from the Command parameter:

    .\Run-CommandMultiThreaded.ps1 -Command ServerInfo.ps1 -ObjectList (gc .\AllServers.txt)

    • Hi,
      I was also getting a similar ‘Unrecognized escape sequence’.

      I ended up fixing this by changing line 96:
      From:
      If ($(Get-Command | Select-Object Name) -match $Command){
      to:
      If ($(Get-Command | Select-Object Name) -match ([regex]::Escape($Command))){

      Seems to have solved the problem.

  10. Thanks, many thanks!!!

  11. As another participant, started – Please forgive my stupidity, but after multiple tests and after reading this thread several times I started to understand that this nice script is not going to solve my problem. Is it helpful ONLY when a certain query should be run against many machines ?

    My problem is different. I need to sort info about thousands of mailboxes. So my script queries a single machine (an Exchange server). Since the task takes about 30 minutes I hoped I can cut this time by multithreading. Is it true that this script is not going to help in my case ?

    • You can run ANY script with ANY input. The script takes any input and run the specified script with the input as an argument.

      • Well, thank you for the answer, but when I run it with a script of mine (which btw, does not require any arguments) – just nothing happens. Instantly exits back to the console and that’s all. What am I doing wrong ?

        • Moreover I don’t understand the answer to Andrew (question #4) – you said he gets nothing because he was running “get-services” against nothing. Why nothing, why not getting the services of the current machine ?

          • The script he has provided requires that you supply -ObjectList (in some manner), which is a list of objects to evaluate the code you are running against. If you look at one of his examples and break it down it’s kind of clear (but I understand why you might get confused):

            # gc AllServers.txt | .\Run-CommandMultiThreaded.ps1 -Command .\ServerInfo.ps1
            # .\Run-CommandMultiThreaded.ps1 -Command .\ServerInfo.ps1 -ObjectList (gc .\AllServers.txt)

            Lets say AllServers.txt ended up with “computer1, computer2, serverA, serverB” in the txt file. In effect, this will execute:
            ServerInfo.ps1 computer1 … and return the result
            ServerInfo.ps1 computer2 … and return the result
            ServerInfo.ps1 serverA … and return the result
            ServerInfo.ps1 serverB … and return the result

            Take a look at his examples closely. ObjectList is absolutely required. If you do not supply ObjectList, the script literally sets itself up to do and return nothing. Look at the “Process” section: “ForEach ($Object in $ObjectList)”. If there is no ObjectList, the script is done.

            The author could have made that a tad bit more clear, but if you look closely, the information is there.

  12. Hello,

    Thanks for posting this code and explaining every step, it helped me a lot understanding how multi-threading works.

    I have, though, encountered a problem when using your script invoking a powershell script existing within the $env:Path variable, since it could not detect it with the Get-Command cmdlet (inside the following if statement) If ($(Get-Command | Select-Object Name) -match $Command)
    That’s why I instead instantiate a variable containing the result of (Get-Command $Command), then test whether the Path property is null or not. If the Path property is not null, I can try to get the content of the file.

  13. Nifty stuff! Now that PS v5 is out and supports classes, maybe this can be simplified a lot.

  14. You’re my hero! I have a number of reporting scripts that poll thousands of machines over high latency connections and have been using single threading or jobs up till now. I can’t wait to convert them. The performance difference between this and the jobs is dramatic and you’ve made it so easy to use. I just ran a quick test performing WMI queries against 5 remote machines and the performance improvement of using this script vs a foreach loop, as measured by Measure-Command, was almost linear with the number of threads (just as you would hope for a multithreaded process). Thanks!

  15. Hi Ryan

    Many Belated Thanks for posting this jewel.

    It was today that I found this section for comments.

    I have been using your script for one year or more, it rocks big time: I have been able to populate sets of 40 databases in several servers, using the script and a CSV file to store parameters: run it doing backup in the sources, run it doing restores and processing (sync logins, etc) in destinations.

    Regards from Jorge from Miami Nice

  16. Although this isn’t multithreading, it might be good enough for some things. Issues that I’ve resolved are:

    1. $MaxThreads default value. Changed:

    to:

    This prevents overextending the CPU which will actually reduce performance.

    2. Command matching. Changed:

    to:

    which will not match a substring by accident.
    3. Added code snippet availability. Changed:

    to:

    In this way, a piece of code can be placed there. This is useful since you can’t state a function name, you can instead place the function in there. Smaller the code fragment the better the performance.

    It is probably the size of the fragment which caused the powershell ISE to go wonky. When I gave it the name of the powershell script, it would dump the entire file into the code snippet (the last else shown in my 3rd fix) which might have caused some issues. It works fine now.

    Other than that, this is somewhat useful. Good work! :)

    A
