runspace output, parameters, and variables

by mark wilkinson, 2021-08-23 estimated read time: 8-10 minutes

tag(s): powershell, runspaces, programming

In a previous post I discussed runspaces and instances and explained some of the components involved. In this post, I'll dig a little deeper and show you a simpler way to get data out of a runspace, how to pass parameters in, and how to create shared variables between any number of concurrently executing runspaces.

In the previous post, I pointed out the differences between runspaces and instances. I have since given up on trying to change how I refer to them, so in this post I will be using the terms interchangeably. Also, if you haven't read the previous post, or it's been a while, I would give it a read so you are familiar with the example code.

Output

The typical pattern when developing with runspaces involves creating a new PowerShell instance, assigning it a runspace pool, adding a script to execute, and then calling BeginInvoke(). BeginInvoke() returns an async result object that you store in a variable and later pass to EndInvoke() to retrieve your results. I found this to be a bit confusing, so I set out to find something easier to use. It turns out BeginInvoke() has several overloads, one of which allows you to specify an object to store results in.

Here is the example code from the last post illustrating the usual method for getting data out of your runspaces:

$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 5)
$RunspacePool.Open()

$Results = @()

$Instances = @()
(1..10) | ForEach-Object {
    $Instance = [powershell]::Create().AddScript({Get-Random})
    $Instance.RunspacePool = $RunspacePool
    $Instances += New-Object PSObject -Property @{
        Instance = $Instance
        State = $Instance.BeginInvoke()
    }
}


$Instances | ForEach-Object {
    $Results += $_.Instance.EndInvoke($_.State)
}

In the second to last line, we are running EndInvoke() and storing the output in $Results. But, if you call BeginInvoke() in a slightly different way, you can skip running EndInvoke() altogether:

State = $Instance.BeginInvoke($Inputs, $Results)

This overload takes two objects of type PSDataCollection. The first, $Inputs, is a collection of objects that will be passed via the pipeline to the code in your script block. The second, $Results, captures the pipeline output of your script block.

I am not going to go into any examples about using the input parameter here, but since it is required I wanted to make sure I explained what it can be used for.
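For completeness, here is a minimal sketch of how the input collection might be used. It assumes the script block reads its pipeline input through the automatic $input variable, and that the input collection is marked complete before invoking so the pipeline knows no more input is coming. The doubling logic is purely illustrative.

```powershell
# Input and output collections for the pipeline
$Inputs  = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Results = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'

# Queue up some pipeline input, then signal that no more is coming
1..3 | ForEach-Object { $Inputs.Add($_) }
$Inputs.Complete()

# Illustrative script block: double every object received from the pipeline
$Instance = [powershell]::Create().AddScript({ $input | ForEach-Object { $_ * 2 } })

$State = $Instance.BeginInvoke($Inputs, $Results)
$State.AsyncWaitHandle.WaitOne() | Out-Null   # block until the script finishes

$Results            # pipeline output of the script block
$Instance.Dispose()
```

Since only one runspace is involved here, the output should arrive in the same order as the input.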

The cool thing about this method is that the data output by your runspaces is all collected in the $Results object as soon as execution completes. You don't have to call EndInvoke to capture the data later. Here is a full example using this method to replace the method we used in the original script in the previous post:

$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 5)
$RunspacePool.Open()

# Create pipeline input and output (results) objects
$Inputs  = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Results = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'

$ScriptBlock = { Get-Random; }

$Instances = (1..10) | ForEach-Object {
    $Instance = [powershell]::Create().AddScript($ScriptBlock)
    $Instance.RunspacePool = $RunspacePool
    [PSCustomObject]@{
        Instance = $Instance
        State = $Instance.BeginInvoke($Inputs,$Results)
    }
}

while ( $Instances.State.IsCompleted -contains $False) { Start-Sleep -Milliseconds 10 }

If you now take a look at $Results it should contain 10 random numbers. I prefer this method as we don't have to loop through and collect the results after everything is done, and it's a little easier to read and understand.
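Once the results are in, it is worth releasing the instances and the pool. The following is a minimal sketch of that cleanup, using a smaller version of the script above so it runs on its own. The $Numbers variable and the call to Complete() on the unused input collection are assumptions added for this example.

```powershell
$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 2)
$RunspacePool.Open()

$Inputs  = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Results = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Inputs.Complete()   # nothing will be piped in, so mark the input as complete

$Instances = (1..3) | ForEach-Object {
    $Instance = [powershell]::Create().AddScript({ Get-Random })
    $Instance.RunspacePool = $RunspacePool
    [PSCustomObject]@{
        Instance = $Instance
        State    = $Instance.BeginInvoke($Inputs, $Results)
    }
}

while ($Instances.State.IsCompleted -contains $False) { Start-Sleep -Milliseconds 10 }

# Copy the results into a plain array before tearing things down
$Numbers = @($Results | ForEach-Object { $_ })

# Release each PowerShell instance, then close and release the pool
$Instances | ForEach-Object { $_.Instance.Dispose() }
$RunspacePool.Close()
$RunspacePool.Dispose()
```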

Parameters

Running a script via a bunch of runspaces is fantastic if you need to get something done fast, but what if you need to pass some data into the runspace, like a server address to connect to?

In the code above, the script is being passed into the runspace via $Instance = [powershell]::Create().AddScript($ScriptBlock). It turns out adding parameters is as easy as adding .AddParameter() to that same line. For this example I'll make the scriptblock being executed a little more complex:

$ScriptBlock = {
    param(
        [int]$Min = 0,
        [int]$Max = 100
    )

    Get-Random -Maximum $Max -Minimum $Min
}

With these changes we can now pass in a minimum and maximum value for our random number. There are two ways to pass parameter values into the runspace: one is by calling AddParameter() once per parameter, the other is by using splatting.

Calling AddParameter() is good for passing a single parameter value into a runspace, but can get a little tedious when you have many parameters. When calling AddParameter() you pass in the name of the parameter followed by the value:

$Instance = [powershell]::Create().AddScript($ScriptBlock).AddParameter("Min",100).AddParameter("Max",1000)

As you can see, you can string multiple calls to AddParameter() together to pass in multiple parameters. This can be made a little neater using parameter splatting. To use splatting, define a hashtable @{} with entries for each of the parameters you want to pass in, and then make a single call to AddParameters():

$Params = @{
    Min = 100
    Max = 1000
}

$Instance = [powershell]::Create().AddScript($ScriptBlock).AddParameters($Params)

The important part here is that you use AddParameters() (plural) instead of AddParameter().
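To check that the values actually reach the param block, here is a small self-contained sketch. It uses the synchronous Invoke() method rather than BeginInvoke() purely to keep the example short. The $Value variable is introduced here for illustration.

```powershell
$ScriptBlock = {
    param(
        [int]$Min = 0,
        [int]$Max = 100
    )

    Get-Random -Maximum $Max -Minimum $Min
}

$Params = @{
    Min = 100
    Max = 1000
}

# Invoke() runs synchronously and returns the output collection directly
$Instance = [powershell]::Create().AddScript($ScriptBlock).AddParameters($Params)
$Value = $Instance.Invoke()[0]
$Instance.Dispose()

# Get-Random treats -Maximum as exclusive, so the value falls between 100 and 999
"Got $Value"
```

If the parameters were not bound, the defaults in the param block would apply and the number would fall below 100.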

Shared Data

To finish off this post, let's take a look at sharing data between runspaces. Running tasks in parallel is great, but sometimes those tasks need to work on a shared set of data. Luckily, there is a set of synchronized .NET collection types designed specifically for thread-safe operations (like those executed in multiple runspaces). Creating a thread-safe variable is simple:

$SharedData = [System.Collections.ArrayList]::Synchronized([System.Collections.ArrayList]::new())

That's it! We now have a synchronized ArrayList we can operate on in multiple runspaces at a time.

ArrayList is just one of the synchronized types available. To see a full list, check out the MSDN Documentation on System.Collections. Many of the collections support Synchronized().
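As a quick illustration, here is a minimal sketch of several runspaces appending to the same synchronized ArrayList. The script block and its $List and $Value parameters are made up for this example.

```powershell
$SharedData = [System.Collections.ArrayList]::Synchronized([System.Collections.ArrayList]::new())

# Illustrative worker: append one value to the shared list
$ScriptBlock = {
    param($List, $Value)
    [void]$List.Add("Item$Value")
}

$Instances = (1..5) | ForEach-Object {
    $Instance = [powershell]::Create().AddScript($ScriptBlock).AddParameter('List', $SharedData).AddParameter('Value', $_)
    [PSCustomObject]@{
        Instance = $Instance
        State    = $Instance.BeginInvoke()
    }
}

# Wait for every runspace to finish, then clean up
$Instances | ForEach-Object {
    $_.Instance.EndInvoke($_.State) | Out-Null
    $_.Instance.Dispose()
}

$SharedData.Count   # one entry per runspace
```

The list is handed to each runspace by reference, so every Add() lands in the same underlying collection. The order of the entries depends on which runspace gets there first.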

We have our variable, so what can we do now? One interesting example involves working through a queue of work using a limited number of runspaces. For example, you can queue up a list of work and process it using no more than 4 runspaces, each pulling the next item from the queue as it finishes the previous one. To give the runspace access to the synchronized queue we just pass it in like any other variable using AddParameter():

# Create an empty synchronized queue
$ServerQueue = [System.Collections.Queue]::Synchronized([System.Collections.Queue]::new())

# Add some fake servers to the queue
1..25 | ForEach-Object {
    $ServerQueue.Enqueue("Server$($_)")
}

$QueueCount = $ServerQueue.Count

# Create some fake work
$ScriptBlock = {
    param (
        $ServerQueue
    )

    while( $ServerQueue.Count -gt 0 ) {
        $Server = $ServerQueue.Dequeue()
        Write-Output "Starting work on $($Server)"
        Start-Sleep -Seconds $(Get-Random -Minimum 1 -Maximum 4)
        Write-Output "Work Complete"
    }
}

$Inputs = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Results = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'

# Spin up 4 runspaces to process the work
$Instances = @()
(1..4) | ForEach-Object {
    $Instance = [powershell]::Create().AddScript($ScriptBlock).AddParameter('ServerQueue', $ServerQueue)
    $Instances += New-Object PSObject -Property @{
        Instance = $Instance
        State = $Instance.BeginInvoke($Inputs,$Results)
    }
}

# Let's loop and wait for work to complete
while ( $Instances.State.IsCompleted -contains $False) {
    # Report the servers left in the queue
    Write-Host "Server(s) Remaining: $($ServerQueue.Count)/$($QueueCount)"
    Start-Sleep -Milliseconds 1000
}

In this example, we first create an empty queue, fill it with some fake server names, and then keep track of the original queue count for display purposes. The scriptblock in this example is a little interesting:

param (
    $ServerQueue
)

while( $ServerQueue.Count -gt 0 ) {
    $Server = $ServerQueue.Dequeue()
    Write-Output "Starting work on $($Server)"
    Start-Sleep -Seconds $(Get-Random -Minimum 1 -Maximum 4)
    Write-Output "Work Complete"
}

This script sits in a while loop for as long as the queue has data in it:

while( $ServerQueue.Count -gt 0 ) {

Within the loop we Dequeue() the next item on the queue and do a little fake work:

$Server = $ServerQueue.Dequeue()
Write-Output "Starting work on $($Server)"
Start-Sleep -Seconds $(Get-Random -Minimum 1 -Maximum 4)

This means that this "worker" script will run until the queue is empty.
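One thing to watch for: checking Count and then calling Dequeue() are two separate operations, so two runspaces can both see a single remaining item, and the slower one then finds the queue empty. Below is a minimal sketch of one way to guard against that. It assumes it is acceptable to hold the queue's SyncRoot lock across both calls. The Monitor calls are just one possible approach, and the fake work is trimmed down so the example runs quickly.

```powershell
$ServerQueue = [System.Collections.Queue]::Synchronized([System.Collections.Queue]::new())
1..25 | ForEach-Object { $ServerQueue.Enqueue("Server$($_)") }

$ScriptBlock = {
    param (
        $ServerQueue
    )

    while ($true) {
        $Server = $null

        # Hold the queue's lock so the Count check and Dequeue() happen together
        [System.Threading.Monitor]::Enter($ServerQueue.SyncRoot)
        try {
            if ($ServerQueue.Count -gt 0) {
                $Server = $ServerQueue.Dequeue()
            }
        }
        finally {
            [System.Threading.Monitor]::Exit($ServerQueue.SyncRoot)
        }

        # Nothing left to do, so stop this worker
        if ($null -eq $Server) { break }

        Write-Output "Starting work on $($Server)"
    }
}

$Inputs  = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Results = New-Object 'System.Management.Automation.PSDataCollection[PSObject]'
$Inputs.Complete()   # no pipeline input is used in this sketch

$Instances = (1..4) | ForEach-Object {
    $Instance = [powershell]::Create().AddScript($ScriptBlock).AddParameter('ServerQueue', $ServerQueue)
    [PSCustomObject]@{
        Instance = $Instance
        State    = $Instance.BeginInvoke($Inputs, $Results)
    }
}

while ($Instances.State.IsCompleted -contains $False) { Start-Sleep -Milliseconds 10 }
$Instances | ForEach-Object { $_.Instance.Dispose() }
```

With the lock in place, each server name should be handed to exactly one worker, so $Results ends up with one "Starting work on" line per queued server.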

The rest of this script looks as you would expect: we create some instances and start some work. The big difference here is that we are only creating 4 instances to run our work on:

(1..4) | ForEach-Object {

This approach can be useful if you have a lot of work to get through and don't want to use all the memory necessary to spin up 100+ instances when you are only going to allow a few to run at a time. The queue method lets us limit our running instances while still churning through all the work that needs to be done.

Conclusion

In this post, we covered some of the ways to move data around when using runspaces. We also discussed the synchronized data types and how they can be useful in cases where you require concurrent access to a set of data. All of these tools combined can allow you to create some pretty interesting workflow patterns. In a future post, I'll share a pattern I use that allows me to process work but exit early when errors are detected.