markw.dev - the blog 2020-04-18 https://markw.dev Mark Wilkinson me@markw.dev { sql server query store - two stories 2020-03-11 qds_production Trials and tribulations of running QDS in production. <p>This post is a part of the SQL Server community's T-SQL Tuesday event. This month is being hosted by <a href="https://tracyboggiano.com/archive/2020/03/t-sql-tuesday-124-using-query-store-or-not-lets-blog/">Tracy Boggiano</a>. Thanks Tracy!</p> <p>When the Query Data Store (QDS) feature was announced for SQL Server 2016, we were excited about the prospect of being able to have deep insight on any query running in our environment. I work for a company that deals heavily in the e-commerce space, and we have a large SQL Server footprint. Our environment is unique in that it is essentially a multi-tenant system, but all the tenants could have wildly different workloads. It's really the kind of query execution scenario QDS was built for. We had the pleasure of working with the Microsoft SQLCAT team to get 2016 and QDS up and running in our production environment before it was GA. </p> <p>In this post I'm going to share two stories about our QDS experience (from pre and post GA of the feature). One from the perspective of the Database Developer, and one from the Database Administrator. For the most part this is not a technical post full of queries and code samples. It's just me talking about some things I have experienced using QDS in production.</p> <blockquote> <p>DISCLAIMER - These are MY observations. Your story may be different, and in many respects I hope it was. We worked closely with Microsoft to roll out 2016 before it was GA, and after to make sure things were running smoothly. It just turns out that our systems stretch the limitations of a lot of products, and SQL Server is no exception.</p> </blockquote> <h1>The Developer</h1> <p>Our initial interactions with QDS were nothing short of amazing. 
We now had full details about every query that was being executed against our instances. This came with some immediate benefits that created actionable tasks. We could start asking questions like "do we need to run this query 2 million times per minute?" and "when did this procedure start performing poorly?". </p> <p>Since those early days, QDS has gotten more and more useful for us. The addition of Automatic Plan Regression Correction (APRC) was huge, showing us upwards of 15% CPU usage reduction in a very short period of time. Later they added wait information as well, which allowed us to see what waits specific plans typically accumulated. Overall it has given us all the information we could ever need to troubleshoot performance issues at the database level, but therein lies the rub. QDS is database specific, which is not super helpful when you are looking at the performance of thousands of databases.</p> <p>Without a consolidated view of our total environment, or even a whole server, QDS data was starting to become less useful. Querying the data by hand also presented some challenges, as it seems the data was hyper-optimized for INSERTs, but not for SELECTs. Because of this, running huge SELECT statements across all databases on a server can be pretty expensive, especially if you are doing it ad-hoc. Enter CentralQDS. </p> <h2>CentralQDS</h2> <p>With no obvious communications from Microsoft about a future "full server QDS", a member of my team embarked on a project to create an environment-wide QDS system. With the help of our development team (a true DBA/Dev collaboration project that any manager would be proud of) we now have a fully centralized view of all QDS data gathered across our hundreds of servers and thousands of databases. 
With the help of a PowerBI front-end, anyone can now ask even more important questions than the original "when did this procedure start performing poorly?"; instead they can ask questions like "do we have any procedures that aren't executed anymore?" or "what is the most expensive procedure in production?". This system has revolutionized how we troubleshoot performance. In most cases it has even moved performance troubleshooting away from my team (the DBAs) and back to the developers themselves.</p> <h2>Conclusions for the Developer</h2> <p>From a developer standpoint QDS is an outstanding feature (even if it has some database-centric shortcomings). It has changed how troubleshooting is done and also allows for more "big picture" planning and analysis. Without having a full-environment view it can be less useful, but the fact that the data is there to aggregate and analyse is very very useful.</p> <h1>The Administrator</h1> <p>The developer story of QDS is a good one. The administrator story is less so. If you are looking at an ancient map, and part of the map is labeled "there be dragons here", QDS is beyond that, in the part of the world that folks hadn't dared go before. While QDS has been a boon for performance improvement projects, it has a lot of hidden impact on your instances. I struggled with how best to write this section and decided to first present the catastrophic failures we had with QDS, followed by a breakdown by issue.</p> <h2>The Perfect Storm</h2> <p>Going to production with a CTP version of SQL Server can be a bit nerve-racking to be honest. In testing, SQL Server 2016 was solid and we didn't see any issues except maybe the odd query regression. 
As anyone supporting production systems knows though, all the testing and planning in the world tend to crumble under the weight of production.</p> <p>A month or so after upgrading production to SQL Server 2016 we started to see breathtaking (it's the only word clean enough to use here) amounts of tempdb contention on a number of our higher-volume instances. We struggled to find a root cause of the contention, but assumed there had been a workload change due to a recent release (there was a release the day before this started happening). We couldn't pin down the exact change, but figured it had to be related. When the contention would occur, the entire instance would become unresponsive, to the point where you couldn't even connect to it. Luckily an AG failover to the secondary would still work and seemed to clear the contention. </p> <p>After further investigation we discovered that contention was happening on the base system tables in tempdb; it wasn't standard PFS contention. This made even less sense. Based on the tables involved it pointed at auto-stats potentially being the issue. Whenever we saw the issue we usually saw contention on the base table that stores stats objects for tempdb. This was the beginning of a chain of trial and error that led us to suspect a lot of different issues:</p> <ul> <li>Too much tempdb usage</li> <li>Large sorting operations spilling into tempdb</li> <li>Index maintenance operations using tempdb</li> <li>Large table variables and other objects that can use tempdb on the backend</li> </ul> <p>In the end though, we finally discovered that it was none of that, and all of it. It turns out that in 2016 a change was made that resulted in ~100% MORE <code>PAGELATCH</code> waits when creating objects in tempdb. This combined with a few other issues to result in what I came to call "pagelatch storms". 
Eventually Microsoft released a fix for the <code>PAGELATCH</code> issue which brought things back to pre-2016 levels (<a href="https://support.microsoft.com/en-us/help/4013999">KB4013999</a>). So what were these other issues? Here's a list:</p> <ul> <li><strong>Size-based cleanup</strong> - The size-based cleanup process in QDS is BAD. On systems with a lot of unique queries it can consume upwards of 70% CPU (this is on boxes with 8+ cores) on an instance, especially if multiple cleanups are running. This is mostly because QDS is designed to handle lots of writes, but reads are expensive. Whenever size-based cleanup runs on a database with lots of unique queries, it will likely spill into tempdb. Combine that spilling with the increase in <code>PAGELATCH</code> waits in 2016 and you already have an issue. Combine it with the other issues below and it can bring down a server.<blockquote> <p>Look for future posts about some custom QDS clean-up scripts we run to avoid the size-based clean-up operations.</p> </blockquote> </li> <li><strong>Lots of tempdb usage</strong> - We use a lot of tempdb, and when the issue would occur, anything that used tempdb was dead in its tracks.</li> <li><strong>Auto-stats</strong> - While the issue was occurring auto stats updates would get blocked and then end up blocking other operations, adding to the chaos.</li> <li><strong>Missing and misplaced information</strong> - When QDS was first released the QDS operations were hidden from <code>dm_exec_requests</code>, so we had no way of telling what was happening. Beyond that, a lot of the resource consumption for QDS occurs in the <code>Default</code> resource pool, which again masked the issues.</li> <li><strong>Timing</strong> - These issues didn't start until QDS started hitting max size, so that was over a month in some cases, and it didn't hit all instances and databases at the same time. This means the issue seemed "random" when it was happening. 
Not only was it random, in some cases (which we still see today) the clean-up would just happen to fire off for multiple databases at a time, further adding to contention.</li> </ul> <h1>Conclusion</h1> <p>Overall QDS is an amazing feature worth using. Like any new technology though, you have to be on the lookout for unexpected issues. I wish Microsoft would release their own Central QDS system, as I think it would make it infinitely more useful for larger shops. When these issues first started cropping up it made me question installing CTPs in production, but honestly we would have been bitten by this issue regardless. If you are thinking about using QDS, you just need to remember a few things:</p> <ul> <li>Start with the default settings and adjust as needed</li> <li>Keep an eye on how much of the allocated QDS space is being consumed to avoid the size-based clean-up</li> <li>The more unique your queries, the quicker space will fill up</li> </ul> eightkb 2020-04-12 eightkb I'm helping organize a SQL Server internals conference! <p>I'm excited and a little scared to announce that (with the help of some friends) I'm helping launch my first conference! It's a team effort with Andrew Pruski, Anthony Nocentino, and myself. So far it's been a great experience and the community response has been amazing. We've already got a fantastic set of speakers that have submitted talks with more to come!</p> <h2>EightKB</h2> <p>EightKB is a virtual mini-conference focusing on SQL Server internals. We'll be hosting 5-7 sessions on internals topics from 45 to 75 minutes long. I've been to a few conferences and SQL Saturdays in my time and have always wished I could attend more sessions on database internals, I'm talking about things like SQLOS memory architecture talks, and how DBCC commands work under the covers. Internals just isn't the focus of most events though (with good reason), so we wanted to launch our own. 
</p> <p>Besides wanting to organize a conference around the things I'm interested in I also thought folks would appreciate a highly technical conference during quarantine. It helps me stay busy as well. I have four children so I'm usually pretty busy anyhow, but being able to dedicate some time to an event I think the SQL community could really benefit from has been great. I don't get to give back to the community that often so I'm excited to be a part of this.</p> <h2>Join Us</h2> <p>If an internals conference (with session levels from <code>300</code> to <code>Insanity</code>!!) sounds fun to you, join us on June 17th. You can learn more at the EightKB site: <a href="https://eightkb.online">https://EightkB.online</a>. See you there!</p> data céilí and my first pre-con 2020-02-14 data_ceili_and_first_pre-con A few exciting opportunities for me in the coming months. <p>While I am admittedly a home body, I do get out every once in a while and present at conferences (mostly SQL Saturday events), but this year I have two exciting opportunities that will allow me to branch out a bit. sql server</p> <h1>Data Céilí</h1> <p>I'm honored to have been selected to present about SQL Server indexing on the green track at <a href="https://www.dataceili.io/">Data Céilí</a>. Data Céilí (Gaelic for "social visit/gathering") is an exciting new data conference in Dublin Ireland organized in part by my amazing co-worker Andrew Pruski (<a href="https://dbafromthecold.com/">B</a>|<a href="https://twitter.com/dbafromthecold">T</a>). The green track is interesting in that it is a remote track, the speakers will all be presenting remotely. While I wish I could make it out to Ireland to present in person, I'm just thankful to be a part of this event. I love seeing conferences trying new things and I think the green track is a great idea. I'm excited to see how this track goes and how this conference grows in the future. 
Read more about the conference <a href="https://www.dataceili.io/">here</a> (it's not too late to sign up!).</p> <h1>SQL Saturday Raleigh - PowerShell Pre-Con</h1> <p>SQL Saturday Raleigh is coming up in April and I'm proud to say that my pre-con submission "<a href="https://www.eventbrite.com/e/powershell-top-to-bottom-by-mark-wilkinson-sql-saturday-raleigh-2020-tickets-94553803973">PowerShell Top to Bottom</a>" was accepted. This will not be my first time presenting about PowerShell, but it will be my first pre-con. It's exciting and scary at the same time, but I am really looking forward to having an entire day to focus on such a great topic. This pre-con is designed to give folks the knowledge they need to start building custom functions and modules with PowerShell. I'm covering everything from the basics to using runspaces. I know a ton of people are using PowerShell in their day-to-day work, but I think a smaller percentage would be able to fix issues if they arose, or tweak a script they downloaded to better fit their needs. My hope is that this pre-con will get people to the point where they can start doing that and potentially even start contributing their work to the community at large. You can read more about the pre-con, and the SQL Saturday event, at the <a href="https://www.sqlsaturday.com/981/eventhome.aspx">SQL Saturday Raleigh site</a>. </p> <p>I look forward to seeing (or at least talking with, in the case of Data Céilí) folks at both events. Sign up while you still can!</p> something new 2020-01-06 something_new A new blog with a new focus. <p>Hello friends, welcome to the new blog! My name is Mark Wilkinson and I live in the small town of Garner, outside of Raleigh North Carolina. I live here with my wife, four children, dog, and seven chickens. I've been working in technology for over 15 years in one capacity or another. In my current role I lead a group of outstanding DBAs at a company called ChannelAdvisor in Morrisville. 
</p> <p>I started this blog to focus more on me and what is happening in my life and less on SQL Server (like I do here: <a href="https://m82labs.com">m82labs</a>). While I do enjoy SQL Server and you will see some SQL Server posts here, I'll primarily be focussing on whatever I currently find interesting with a smattering of commentary on articles/books I've read. Currently I do a lot of professional work in PowerShell and a little bit of Python. In my personal life I use a lot of Python and Bash, <a href="https://github.com/influxdata/telegraf/tree/master/plugins/inputs/sqlserver">some Go</a>, and a little PowerShell. I've also taken to working on craft projects with my wife (what a great reason to buy new tools). </p> <p>This blog runs on a static site generator I wrote in Python3 called Antiquity (that I don't plan on releasing) that attempts to output mostly plain text, responsive web pages. I've been using static site generators for a while, starting on Github pages and eventually moving to a Jekyll blog running in an S3 bucket. I wanted to write my own just to see how hard it was, and to see how light weight I could make it. My current setup is very similar to many others:</p> <ul> <li>Uses markdown for the post format</li> <li>Uses yaml frontmatter on each post to determine posting date, title, short link, and draft status</li> <li>Supports post "collections"</li> <li>Uses a custom templating system (just find/replace at this point)</li> <li>Supports code formatting via pygments</li> </ul> <p>I hope you enjoy what you see here. I'm primarily trying to blog for myself with this blog, so you won't see any buttons to share my posts on social media, and you won't see anything trying to track you (if you do, <a href="/about/">PLEASE CONTACT ME!</a>)</p> the chasm 2020-04-18 the_chasm Self-doubt, the silent hero of production. <p>My last post was from before the world exploded. 
The current events related to COVID-19 have made it tough to write technical posts, so I'm going to mix it up a bit and talk about something else: the chasm of self-doubt.</p> <p>This post was inspired by a conversation I had with my team the other morning. We were troubleshooting a partition-based GC process and after looking at it a while folks started doubting what they knew about partitioning. This is common in our line of work. Things in the SQL Server world change quickly, and it's not uncommon for old myths to be taken as truths (like table variables being stored completely in memory). I think for an outsider, this amount of self-doubt and questioning might look unproductive and unhealthy, but I have a different take:</p> <blockquote> <p>Ideally, all engineers of any kind should always have their legs dangling into the chasm, at a minimum. It's the only thing that protects production.</p> </blockquote> <h1>falling in</h1> <p>Falling into the chasm can feel like an uncontrollable downward spiral. You will find yourself researching things you thought you knew, and often finding that you were right. Sometimes though you <em>do</em> discover that you weren't 100% correct about something, but that small misunderstanding hasn't had much impact on your work. Other times you may find that you were entirely wrong about something. This can be a bit crushing, but the important thing is to acknowledge what you find and learn from it. Nobody can get everything right all the time, and as long as you learn along the way, things will be fine.</p> <h1>dangling your feet</h1> <p>Back to my quote above, I think self-doubt is extremely healthy in an engineering team. Without a constant low-level feeling of self-doubt you can start to get cocky. Think about that period maybe 2 years into your current profession. You've learned a lot but haven't yet learned enough to understand how dangerous you are. 
If you have a good amount of self-confidence you'll likely cause a production outage that will naturally instill you with that fear and self-doubt. If not, it might take a bit longer to get there. Obviously everyone is different, and I've certainly met talented engineers that started out and already had this self-doubt. The thing to remember is that it's ok, and can even be helpful to have these feelings.</p> <h1>to new engineers</h1> <p>This may be wrong, but I have a hard time trusting folks that don't experience that fear and doubt. Most of the time it is absolutely not the fault of the engineer, it's just a matter of circumstance. If you are new and are experiencing situations where you feel folks don't trust what you are doing, just remember that a lot of those people have probably been burned in the past by an over-abundance of confidence. To a lot of us this self-doubt is the only thing protecting production.</p> <p>You can have all the controls in the world in place, but things will always break in production. Knowing that, you have to plan as much as possible and approach solutions with caution and care. If you encounter situations where you think folks don't trust your judgement, take a second and try to have a conversation about what could go wrong with the idea you are proposing; ask others for input. Like with most things, communication is key here. If you can show folks that you have some doubts, and can show how you think you've mitigated the risks, you'll build trust a lot faster.</p> <h1>living with it</h1> <p>Self-doubt is just a part of life as an engineer. It will always be there. Just know that it's normal, and can play an important role in keeping your production systems running. As engineers we should be talking about this, and encouraging conversations about it. Phrases like "I don't know" and "I need to double check that" should be welcome, and if you have doubts they should be expressed and addressed. 
Resolving your doubts as a team can be a great way to increase trust on your team and make people more comfortable sharing their ideas. It can also be a great training tool, as the whole team can learn from each other. So next time you have some doubts, say it out loud (or via chat) and get a conversation started. You and your team will only be stronger for it.</p> powershell classes (without classes) 2020-01-07 posh-classes-without Implementing class functionality in older versions of PowerShell. <p>The "Classes" feature in PowerShell v5+ can be super useful, but what happens if you need class-like functionality but are forced to use an older version of PowerShell? Never fear! We can get much of the functionality we see when using classes using the <code>PSCustomObject</code> type.</p> <p>This post was inspired by a conversation I recently had with a friend. He had implemented a module that relied on classes only to find out that he needed to support some systems running PowerShell 4. With an upgrade being out of the question, I came up with an example that could get him most of what he was looking for.</p> <blockquote> <p>If you haven't used PS classes before, I suggest you read up on them: <a href="https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_classes?view=powershell-6">Microsoft Docs: About Classes</a>. Classes are great when you are working on a project where a compiled language might be overkill but you need tighter control over data types, or need to pass complex objects around between functions. </p> </blockquote> <h2>PSCustomObject</h2> <p><code>PSCustomObject</code> has been around since PowerShell v3.0. <code>PSCustomObject</code> is similar to a hashtable in that it can store named properties, but it behaves a bit differently when displaying the data (much nicer formatting). The killer feature though, is the ability to not only add properties but also methods (just like classes!) 
to your object.</p> <p>Let's say we wanted a class that would expect a number when the object was created, and then allow you to do some simple math on the number via class methods. I know this seems like a silly use case, but it very clearly shows how you can use custom objects as if they were classes. </p> <p>If you were developing this in PowerShell v5+, you might use a class:</p> <div class="codehilite"><pre><span></span><span class="n">class</span> <span class="n">MyClass</span> <span class="p">{</span>
    <span class="no">[int]</span><span class="nv">$SomeNum</span>

    <span class="n">MyClass</span><span class="p">(</span><span class="no">[int]</span><span class="nv">$SomeNum</span><span class="p">)</span> <span class="p">{</span>
        <span class="nv">$this</span><span class="p">.</span><span class="n">SomeNum</span> <span class="p">=</span> <span class="nv">$SomeNum</span>
    <span class="p">}</span>

    <span class="no">[int]</span> <span class="n">TimesFive</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">return</span> <span class="nv">$this</span><span class="p">.</span><span class="n">SomeNum</span> <span class="p">*</span> <span class="n">5</span>
    <span class="p">}</span>

    <span class="no">[int]</span> <span class="n">TimesNum</span><span class="p">(</span><span class="no">[int]</span><span class="nv">$Num</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">return</span> <span class="nv">$this</span><span class="p">.</span><span class="n">SomeNum</span> <span class="p">*</span> <span class="nv">$Num</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="nv">$MyClassInstance</span> <span class="p">=</span> <span class="no">[MyClass]</span><span class="p">::</span><span class="n">New</span><span class="p">(</span><span class="n">5</span><span class="p">)</span>
<span class="nv">$MyClassInstance</span><span class="p">.</span><span class="n">TimesFive</span><span class="p">()</span>
</pre></div> <p>This code creates 
a new class called <code>MyClass</code>, then we instantiate a new instance of the class with <code>$SomeNum</code> defined as <code>5</code>.</p> <p>Now, let's do the same thing using <code>PSCustomObject</code> and <code>Add-Member</code>:</p> <div class="codehilite"><pre><span></span><span class="k">function</span> <span class="nb">New-MyFakeClass</span> <span class="p">{</span>
    <span class="k">param</span><span class="p">(</span>
        <span class="nv">$SomeNum</span>
    <span class="p">)</span>

    <span class="nv">$MyFakeClass</span> <span class="p">=</span> <span class="no">[PSCustomObject]</span><span class="p">@{}</span>

    <span class="no">[scriptblock]</span><span class="nv">$TimesFive</span> <span class="p">=</span> <span class="p">{</span>
        <span class="cm">&lt;#</span>
<span class="cm">            Returns &quot;SomeNum&quot; multiplied by five</span>
<span class="cm">        #&gt;</span>
        <span class="nv">$this</span><span class="p">.</span><span class="n">SomeNumber</span> <span class="p">*</span> <span class="n">5</span>
    <span class="p">}</span>

    <span class="no">[scriptblock]</span><span class="nv">$TimesNum</span> <span class="p">=</span> <span class="p">{</span>
        <span class="cm">&lt;#</span>
<span class="cm">            Returns &quot;SomeNum&quot; multiplied by an arbitrary integer</span>
<span class="cm">        #&gt;</span>
        <span class="k">param</span><span class="p">(</span>
            <span class="no">[int]</span><span class="nv">$Num</span>
        <span class="p">)</span>
        <span class="nv">$this</span><span class="p">.</span><span class="n">SomeNumber</span> <span class="p">*</span> <span class="nv">$Num</span>
    <span class="p">}</span>

    <span class="nv">$MyFakeClass</span> <span class="p">|</span> <span class="nb">Add-Member</span> <span class="n">-MemberType</span> <span class="n">NoteProperty</span> <span class="n">-Name</span> <span class="n">SomeNumber</span> <span class="n">-Value</span> <span class="nv">$SomeNum</span>
    <span class="nv">$MyFakeClass</span> <span class="p">|</span> <span class="nb">Add-Member</span> <span class="n">-MemberType</span> <span class="n">ScriptMethod</span> <span class="n">-Name</span> <span class="n">TimesFive</span> <span class="n">-Value</span> <span class="nv">$TimesFive</span>
    <span class="nv">$MyFakeClass</span> <span class="p">|</span> <span class="nb">Add-Member</span> <span class="n">-MemberType</span> <span class="n">ScriptMethod</span> <span class="n">-Name</span> <span class="n">TimesNum</span> <span class="n">-Value</span> <span class="nv">$TimesNum</span>

    <span class="nv">$MyFakeClass</span>
<span class="p">}</span>

<span class="nv">$MyFakeClassInstance</span> <span class="p">=</span> <span class="nb">New-MyFakeClass</span> <span class="n">-SomeNum</span> <span class="n">5</span>
<span class="nv">$MyFakeClassInstance</span><span class="p">.</span><span class="n">TimesFive</span><span class="p">()</span>
</pre></div> <p>This is a bit more code, so let's break it down a little. </p> <p>First we create a "factory" function <code>New-MyFakeClass</code> (a function which creates new objects):</p> <div class="codehilite"><pre><span></span><span class="k">function</span> <span class="nb">New-MyFakeClass</span> <span class="p">{</span>
    <span class="k">param</span><span class="p">(</span>
        <span class="nv">$SomeNum</span>
    <span class="p">)</span>
</pre></div> <p>Since we aren't using classes, we have to create a custom function to take the place of the class constructor. In a class, the constructor is responsible for initializing a new class object (setting initial values for properties, etc.).</p> <p>Within this function we create an empty object called <code>$MyFakeClass</code>. Next we define two script blocks. These script blocks take the place of the method definitions in the class-based example above. 
Once we have those defined we can start adding members to the object:</p> <div class="codehilite"><pre><span></span><span class="nv">$MyFakeClass</span> <span class="p">|</span> <span class="nb">Add-Member</span> <span class="n">-MemberType</span> <span class="n">NoteProperty</span> <span class="n">-Name</span> <span class="n">SomeNumber</span> <span class="n">-Value</span> <span class="nv">$SomeNum</span>
<span class="nv">$MyFakeClass</span> <span class="p">|</span> <span class="nb">Add-Member</span> <span class="n">-MemberType</span> <span class="n">ScriptMethod</span> <span class="n">-Name</span> <span class="n">TimesFive</span> <span class="n">-Value</span> <span class="nv">$TimesFive</span>
<span class="nv">$MyFakeClass</span> <span class="p">|</span> <span class="nb">Add-Member</span> <span class="n">-MemberType</span> <span class="n">ScriptMethod</span> <span class="n">-Name</span> <span class="n">TimesNum</span> <span class="n">-Value</span> <span class="nv">$TimesNum</span>
</pre></div> <p>This adds a property named <code>SomeNumber</code> of type <code>NoteProperty</code>. <code>NoteProperty</code> members act just as a class property would, you can easily access them via: <code>$MyFakeClassInstance.SomeNumber</code>. After that we now add our two script blocks, but this time we use a type of <code>ScriptMethod</code>. The interesting thing about <code>ScriptMethod</code> is that the script blocks get access to the <code>$this</code> variable just like in a more formal class object. 
This gives the script methods access to the internal properties of the class instance, in this case the <code>SomeNumber</code> property.</p> <p>Finally we can see how we use this new function to create new objects, and how we can access the <code>ScriptMethod</code> just like a standard class method: </p> <div class="codehilite"><pre><span></span><span class="nv">$MyFakeClassInstance</span> <span class="p">=</span> <span class="nb">New-MyFakeClass</span> <span class="n">-SomeNum</span> <span class="n">5</span>
<span class="nv">$MyFakeClassInstance</span><span class="p">.</span><span class="n">TimesFive</span><span class="p">()</span>
</pre></div> <h2>Thoughts</h2> <p>Another friend of mine told me that when you start using classes in PowerShell, it's time to look at a compiled language. In some cases I agree, but the maintainability of PowerShell when your team is NOT made up of software developers just can't be beat. I highly suggest reading about both <code>Class</code> and <code>PSCustomObject</code>. Even on PowerShell v5+, BOTH can be used to great effect.</p> handling unicode in powershell 2020-02-19 unicode_powershell How to properly handle unicode characters in PowerShell. <p>This post is inspired by an odd situation I ran into in a project I'm working on. I have the need to pull specific revisions of files out of a git repository, save those files, and then execute the contents. This all worked fine until it didn't. I received some complaints that unicode characters in the files were getting mangled, and sure enough they were. But why? 
In this post I'll explain what happened to me, and ways you can avoid it yourself.</p> <p>In the examples below we are going to be working with a file called "PoShUnicodeSample.txt" that contains the following:</p> <div class="codehilite"><pre><span></span>Here is some text with a Unicode character embedded: ⁋
</pre></div> <blockquote> <p><strong>NOTE</strong> The issue we are discussing in this post seems to be specific to Windows, Linux does not have the same behavior, but everything we are talking about will work on any OS.</p> </blockquote> <h1>Command Specific Encodings</h1> <p>Many commands in PowerShell will take <code>-Encoding</code> as a parameter. For example, if you want to read a file into a variable, and that file has unicode characters, the following will result in mangled data:</p> <div class="codehilite"><pre><span></span><span class="nv">$data</span> <span class="p">=</span> <span class="nb">Get-Content</span> <span class="n">-Path</span> <span class="s2">&quot;PoShUnicodeSample.txt&quot;</span>
<span class="nv">$data</span> <span class="p">|</span> <span class="nb">Out-File</span> <span class="n">-FilePath</span> <span class="s2">&quot;Temp.txt&quot;</span>
</pre></div> <p>If we open "Temp.txt" we'll see the following:</p> <div class="codehilite"><pre><span></span>Here is some text with a Unicode character embedded: ⁋
</pre></div> <p>Luckily we can fix this with <code>Encoding</code>!</p> <div class="codehilite"><pre><span></span><span class="nv">$data</span> <span class="p">=</span> <span class="nb">Get-Content</span> <span class="n">-Path</span> <span class="s2">&quot;PoShUnicodeSample.txt&quot;</span> <span class="n">-Encoding</span> <span class="n">UTF8</span>
<span class="nv">$data</span> <span class="p">|</span> <span class="nb">Out-File</span> <span class="n">-FilePath</span> <span class="s2">&quot;Temp.txt&quot;</span>
</pre></div> <p>Tada! We now have a proper unicode encoded output file, right? Almost. 
If you open the file in a text editor like VSCode it reports the file as being encoded in <code>UTF16LE</code>. If you look at the current <code>Out-File</code> documentation you'll see the default output encoding listed as <code>UTF8NoBOM</code>, but that default only applies to PowerShell 6 and later; Windows PowerShell 5.1 defaults to UTF-16LE. If we want UTF-8 we have to tell it to use that encoding via <code>-Encoding UTF8</code> (which, on Windows PowerShell 5.1, writes the file with a byte order mark).</p> <p>So, if you are working with unicode, and the encoding is important, make sure you are always setting the encoding explicitly. While troubleshooting, I thought this had solved my problem, but when I put the changes into the project I was working on, the characters were still being mangled. It took a little help from the folks in #PowershellHelp on the <a href="https://sqlcommunity.slack.com">SQL Community Slack</a> to get the issue solved.</p> <h1>Default Encodings</h1> <p>PowerShell has a set of default encodings it uses for all input and output operations. You can check what your current settings are by looking at the <code>OutputEncoding</code> and <code>InputEncoding</code> properties of the console:</p> <div class="codehilite"><pre><span></span>PS&gt; [Console]::OutputEncoding

IsSingleByte      : True
BodyName          : iso-8859-1
EncodingName      : Western European (Windows)
HeaderName        : Windows-1252
WebName           : Windows-1252
WindowsCodePage   : 1252
IsBrowserDisplay  : True
IsBrowserSave     : True
IsMailNewsDisplay : True
IsMailNewsSave    : True
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : True
CodePage          : 1252

PS&gt; [Console]::InputEncoding

IsSingleByte      : True
BodyName          : iso-8859-1
EncodingName      : Western European (Windows)
HeaderName        : Windows-1252
WebName           : Windows-1252
WindowsCodePage   : 1252
IsBrowserDisplay  : True
IsBrowserSave     : True
IsMailNewsDisplay : True
IsMailNewsSave    : True
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : True
CodePage          : 1252
</pre></div> <p>As you can see, 
on my system, the default encoding is <code>iso-8859-1</code> (Windows code page 1252). Yours may be different, and if you are using a Linux system it almost certainly will be (it will likely be UTF-8 in that case). </p> <h1>Solving my Problem</h1> <p>As I said at the top of this post, when I encountered this issue I was using <code>git show</code> to pull the content of a script file from a git repo and store it in a local file. The following syntax will accomplish that:</p> <div class="codehilite"><pre><span></span><span class="n">git</span> <span class="n">show</span> <span class="s2">&quot;origin/Branch:path/to/file.txt&quot;</span> <span class="p">|</span> <span class="nb">Out-File</span> <span class="n">-FilePath</span> <span class="s2">&quot;LocalFile.txt&quot;</span> <span class="n">-Encoding</span> <span class="s2">&quot;utf8&quot;</span> </pre></div> <p>But I found that the unicode characters were STILL being mangled. This is because the default output encoding of the console was not <code>UTF-8</code>, so the output of any command executed in that console was handled using the <code>iso-8859-1</code> encoding. This includes non-PowerShell commands, like <code>git</code>. To fix this, we have to change the default encoding of the console to UTF-8:</p> <div class="codehilite"><pre><span></span><span class="n">PS</span><span class="p">&gt;</span> <span class="no">[Console]</span><span class="p">::</span><span class="n">OutputEncoding</span> <span class="p">=</span> <span class="no">[System.Text.Encoding]</span><span class="p">::</span><span class="n">UTF8</span> </pre></div> <p>Re-running my <code>git show</code> command after that results in the unicode characters being preserved. Success! 
</p> <p>If you are always executing scripts under your own PowerShell console, and want to make sure you are always handling unicode data properly, you could add the following to your PowerShell profile:</p> <div class="codehilite"><pre><span></span><span class="no">[Console]</span><span class="p">::</span><span class="n">OutputEncoding</span> <span class="p">=</span> <span class="no">[System.Text.Encoding]</span><span class="p">::</span><span class="n">UTF8</span>
<span class="no">[Console]</span><span class="p">::</span><span class="n">InputEncoding</span> <span class="p">=</span> <span class="no">[System.Text.Encoding]</span><span class="p">::</span><span class="n">UTF8</span>
</pre></div> <p>That combined with the <code>-Encoding</code> parameter used when working with files should cover most of your needs. If you are working in an environment where you don't have access to the profile you'll just have to make sure to include the console encoding changes in your scripts.</p> <h1>Conclusion</h1> <p>Overall PowerShell offers a lot of flexibility around handling different file encodings. Unfortunately it's not obvious what encoding you'll end up with if you don't set it explicitly.</p> runspaces explained 2020-02-06 runspaces-explained Getting started with runspaces <p>PowerShell runspaces are a great, if often confusing, feature of PowerShell. 
If you need to get a lot of work done fast, and have capacity to do lots of work in parallel, runspaces can help you out.</p> <p>In this series of posts on runspaces I hope to give you the information you need to understand, use, and troubleshoot runspaces more effectively.</p> <h2>simple runspaces example</h2> <p>Below is the typical code you might see when reading about working with runspaces:</p> <div class="codehilite"><pre><span></span><span class="nv">$RunspacePool</span> <span class="p">=</span> <span class="no">[RunspaceFactory]</span><span class="p">::</span><span class="n">CreateRunspacePool</span><span class="p">(</span><span class="n">1</span><span class="p">,</span> <span class="n">5</span><span class="p">)</span>
<span class="nv">$RunspacePool</span><span class="p">.</span><span class="n">Open</span><span class="p">()</span>

<span class="nv">$ScriptBlock</span> <span class="p">=</span> <span class="p">{</span> <span class="nb">Get-Random</span> <span class="p">}</span>
<span class="nv">$Runspaces</span> <span class="p">=</span> <span class="p">@()</span>

<span class="p">(</span><span class="n">1</span><span class="p">..</span><span class="n">10</span><span class="p">)</span> <span class="p">|</span> <span class="k">ForEach</span><span class="n">-Object</span> <span class="p">{</span>
    <span class="nv">$Runspace</span> <span class="p">=</span> <span class="no">[powershell]</span><span class="p">::</span><span class="n">Create</span><span class="p">().</span><span class="n">AddScript</span><span class="p">(</span><span class="nv">$ScriptBlock</span><span class="p">)</span>
    <span class="nv">$Runspace</span><span class="p">.</span><span class="n">RunspacePool</span> <span class="p">=</span> <span class="nv">$RunspacePool</span>
    <span class="nv">$Runspaces</span> <span class="p">+=</span> <span class="nb">New-Object</span> <span class="n">PSObject</span> <span class="n">-Property</span> <span class="p">@{</span>
        <span class="n">Runspace</span> <span class="p">=</span> <span class="nv">$Runspace</span>
        <span class="n">State</span> <span class="p">=</span> <span class="nv">$Runspace</span><span class="p">.</span><span class="n">BeginInvoke</span><span class="p">()</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">while</span> <span class="p">(</span> <span class="nv">$Runspaces</span><span class="p">.</span><span class="n">State</span><span class="p">.</span><span class="n">IsCompleted</span> <span class="o">-contains</span> <span class="nv">$False</span><span class="p">)</span> <span class="p">{</span> <span class="nb">Start-Sleep</span> <span class="n">-Milliseconds</span> <span class="n">10</span> <span class="p">}</span>

<span class="nv">$Results</span> <span class="p">=</span> <span class="p">@()</span>
<span class="nv">$Runspaces</span> <span class="p">|</span> <span class="k">ForEach</span><span class="n">-Object</span> <span class="p">{</span>
    <span class="nv">$Results</span> <span class="p">+=</span> <span class="nv">$_</span><span class="p">.</span><span class="n">Runspace</span><span class="p">.</span><span class="n">EndInvoke</span><span class="p">(</span><span class="nv">$_</span><span class="p">.</span><span class="n">State</span><span class="p">)</span>
<span class="p">}</span>
</pre></div> <p>This code executes a script block ten times (just returning a random number), and allows 5 executions to run at a time via a runspace pool. This code will run just fine, and in many cases you can probably just copy and paste it into your script and be good to go. But that would be a pretty lame way to end a blog post, so let's dig a little deeper and see what all of this code does.</p> <h2>But First, Pedantics</h2> <p>So I have a problem with some of the posts I've read about runspaces. It all comes down to a small detail that I think makes a big difference in your understanding of them. 
</p> <div class="codehilite"><pre><span></span><span class="nv">$Runspace</span> <span class="p">=</span> <span class="no">[powershell]</span><span class="p">::</span><span class="n">Create</span><span class="p">()</span> </pre></div> <p>This code looks innocent. What does it do? You'd probably think it's creating a new runspace, but it's not. This code is instead creating a fresh instance of PowerShell. If you run this code and run <code>Get-Runspace</code> you'll see there is still just one listed, the one attached to your current session. So what is this <em>instance</em> we just created? </p> <p>A PowerShell instance handles almost everything about executing PowerShell code, except executing the actual commands. A PowerShell instance is a "wrapper" of sorts that abstracts a lot of the functionality related to the <strong>runspace</strong> that is doing all the work. The PowerShell instance handles creating the command pipeline (think of it like a queue of commands to run) that the runspace will use, and also handles adding commands to it. 
A quick example script can show how you might do this manually without directly using the instance:</p> <div class="codehilite"><pre><span></span><span class="nv">$PowerShell</span> <span class="p">=</span> <span class="no">[powershell]</span><span class="p">::</span><span class="n">Create</span><span class="p">()</span>
<span class="nv">$Pipeline</span> <span class="p">=</span> <span class="nv">$PowerShell</span><span class="p">.</span><span class="n">Runspace</span><span class="p">.</span><span class="n">CreatePipeline</span><span class="p">()</span>
<span class="nv">$Pipeline</span><span class="p">.</span><span class="n">Commands</span><span class="p">.</span><span class="n">Add</span><span class="p">({</span><span class="nb">Get-Variable</span><span class="p">})</span>
<span class="nv">$Pipeline</span><span class="p">.</span><span class="n">Invoke</span><span class="p">()</span>
</pre></div> <p>When you create a new PowerShell instance it comes with a default runspace. In this script we are using that runspace directly to do some work. This approach is pretty verbose though and can get really complicated, so instead we typically use the PowerShell instance itself to do this work:</p> <div class="codehilite"><pre><span></span><span class="nv">$PowerShell</span> <span class="p">=</span> <span class="no">[powershell]</span><span class="p">::</span><span class="n">Create</span><span class="p">().</span><span class="n">AddScript</span><span class="p">({</span><span class="nb">Get-Variable</span><span class="p">})</span>
<span class="nv">$PowerShell</span><span class="p">.</span><span class="n">Invoke</span><span class="p">()</span>
</pre></div> <p>The distinction between instances and runspaces isn't important for simple examples, but as we get deeper in future posts it will make it easier to understand more complex examples. 
Now that we have that out of the way we can dive into the example in a little more depth.</p> <h2>example explained</h2> <p>Starting with the first two lines in our example we are creating a runspace pool, and then opening it.</p> <div class="codehilite"><pre><span></span><span class="nv">$RunspacePool</span> <span class="p">=</span> <span class="no">[RunspaceFactory]</span><span class="p">::</span><span class="n">CreateRunspacePool</span><span class="p">(</span><span class="n">1</span><span class="p">,</span> <span class="n">5</span><span class="p">)</span>
<span class="nv">$RunspacePool</span><span class="p">.</span><span class="n">Open</span><span class="p">()</span>
</pre></div> <p>A <strong>runspace pool</strong> is a mechanism to control the number of active runspaces executing at any given time. Think of it as a simple concurrency limiter. A runspace pool is attached to any number of PowerShell instances, and in turn those instances communicate with the pool to ensure only a certain number of runspaces execute code at a time. In this case we are creating a pool with a minimum of 1 executing runspace, and a maximum of 5. If we attempt to execute more they will wait for slots to become available as other runspaces complete their work. </p> <p>Next we are creating an array to hold our instances as they are created, and then entering into a loop using the PowerShell range operator <code>(1..10)</code>. The range operator is a quick way to generate an iterable array of a given size in PowerShell. 
In this case this operator is just generating an array with 10 elements in it, integers from 1 to 10, which means the code within the loop will be executed 10 times:</p> <div class="codehilite"><pre><span></span><span class="nv">$Instances</span> <span class="p">=</span> <span class="p">@()</span>

<span class="p">(</span><span class="n">1</span><span class="p">..</span><span class="n">10</span><span class="p">)</span> <span class="p">|</span> <span class="k">ForEach</span><span class="n">-Object</span> <span class="p">{</span>
    <span class="nv">$Instance</span> <span class="p">=</span> <span class="no">[powershell]</span><span class="p">::</span><span class="n">Create</span><span class="p">().</span><span class="n">AddScript</span><span class="p">({</span><span class="nb">Get-Random</span><span class="p">})</span>
    <span class="nv">$Instance</span><span class="p">.</span><span class="n">RunspacePool</span> <span class="p">=</span> <span class="nv">$RunspacePool</span>
    <span class="nv">$Instances</span> <span class="p">+=</span> <span class="nb">New-Object</span> <span class="n">PSObject</span> <span class="n">-Property</span> <span class="p">@{</span>
        <span class="n">Instance</span> <span class="p">=</span> <span class="nv">$Instance</span>
        <span class="n">State</span> <span class="p">=</span> <span class="nv">$Instance</span><span class="p">.</span><span class="n">BeginInvoke</span><span class="p">()</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></div> <p>Within the loop we:</p> <ul> <li>Create a new instance, and add a scriptblock to it: <code>{Get-Random}</code></li> <li>Bind the instance to the runspace pool we created</li> <li>Add the new instance to our <code>$Instances</code> array (it's more complicated than this, but we'll discuss it more below)</li> </ul> <p>The code we are using to add the instance to the array is a little odd:</p> <div class="codehilite"><pre><span></span><span class="nv">$Instances</span> <span class="p">+=</span> <span class="nb">New-Object</span> <span class="n">PSObject</span> <span class="n">-Property</span> <span class="p">@{</span>
    <span class="n">Instance</span> <span class="p">=</span> <span class="nv">$Instance</span>
    <span class="n">State</span> <span class="p">=</span> <span class="nv">$Instance</span><span class="p">.</span><span class="n">BeginInvoke</span><span class="p">()</span>
<span class="p">}</span>
</pre></div> <p>Here we are creating a custom object, <code>PSCustomObject</code>, with two properties:</p> <ul> <li><code>Instance</code> - This is the new PowerShell instance we created</li> <li><code>State</code> - This is the output of <code>BeginInvoke()</code></li> </ul> <p>When we call <code>BeginInvoke()</code> we are telling the instance to execute its scriptblock <strong>asynchronously</strong>, and return an object we can use to determine the state of that execution. The object returned is an <code>AsyncResult</code> object. It has an <code>IsCompleted</code> property that tells us whether the script is complete or not, and it is also what we later hand back to the instance to retrieve the final results of the execution.</p> <p>Now back to the rest of the script. 
After all of our instances have been added to the array we enter a while loop and use the object we got back from <code>BeginInvoke()</code> to wait until all of our instances have finished executing:</p> <div class="codehilite"><pre><span></span><span class="k">while</span> <span class="p">(</span> <span class="nv">$Instances</span><span class="p">.</span><span class="n">State</span><span class="p">.</span><span class="n">IsCompleted</span> <span class="o">-contains</span> <span class="nv">$False</span><span class="p">)</span> <span class="p">{</span> <span class="nb">Start-Sleep</span> <span class="n">-Milliseconds</span> <span class="n">10</span> <span class="p">}</span> </pre></div> <p>Specifically we are generating an array of <code>IsCompleted</code> properties for all of the instances we created, and then seeing if that array contains <code>$False</code>, which would indicate something is still running.</p> <blockquote> <p>Many examples omit the sleep statement in the while loop. This can lead to lots of extra CPU usage. When you omit the sleep you enter a tight loop where the computer will check the completion state of your instances as fast as it can. Adding the sleep slows this process down and can reduce CPU consumption considerably. 
In simple tests I have seen scripts go from consuming 25% CPU during this loop to not really consuming any noticeable amount at all, just by adding this sleep statement.</p> </blockquote> <p>Once everything has completed we break out of our while loop and finally loop through the instances and get our results:</p> <div class="codehilite"><pre><span></span><span class="nv">$Results</span> <span class="p">=</span> <span class="p">@()</span>
<span class="nv">$Instances</span> <span class="p">|</span> <span class="k">ForEach</span><span class="n">-Object</span> <span class="p">{</span>
    <span class="nv">$Results</span> <span class="p">+=</span> <span class="nv">$_</span><span class="p">.</span><span class="n">Instance</span><span class="p">.</span><span class="n">EndInvoke</span><span class="p">(</span><span class="nv">$_</span><span class="p">.</span><span class="n">State</span><span class="p">)</span>
<span class="p">}</span>
</pre></div> <p>To get the results from a completed instance you have to execute the <code>EndInvoke()</code> method. <code>EndInvoke()</code> is kind of a misleading name. It isn't ending anything; instead it is retrieving whatever output was generated by an asynchronous process in an instance. If you recall, when we started the executions on our instances we called <code>BeginInvoke()</code>, which returned an <code>AsyncResult</code> object that we then stored in the <code>State</code> property of each object in our <code>$Instances</code> array. So the above code is looping through each of our instances, and calling <code>EndInvoke()</code> with the <code>State</code> property of that instance. 
It is then taking whatever data is returned and putting it into a <code>$Results</code> array for use later.</p> <blockquote> <p>While not required by any means, if you want to read a bit more on the async objects being passed around for this to work, take a look at this: <a href="https://docs.microsoft.com/en-us/dotnet/api/system.iasyncresult?redirectedfrom=MSDN&amp;view=netframework-4.8">Microsoft Docs: IAsyncResult</a></p> </blockquote> <h2>summary</h2> <p>In this post we covered a few key concepts related to runspaces and instances:</p> <ul> <li><strong>Instance</strong> - A PowerShell object, created inside the current process, that builds the command pipeline and hands it to a runspace for execution</li> <li><strong>Runspace</strong> - A thread that executes PowerShell code on behalf of a PowerShell instance</li> <li><strong>RunspacePool</strong> - Controls the number of runspaces that can be executing at any given time</li> <li><strong>BeginInvoke()</strong> - Method that can be called on any instance object to start execution. This method call will return an <code>AsyncResult</code> object that can be used to track completion and obtain the output of the script</li> <li><strong>IsCompleted</strong> - Property of the <code>AsyncResult</code> object output by <code>BeginInvoke()</code>. This will tell you when an instance has completed execution</li> <li><strong>EndInvoke()</strong> - Method that can be called on any instance object to wait for execution to finish (if it hasn't already) and return the results. It expects an <code>AsyncResult</code> object as an argument</li> </ul> <h2>conclusion</h2> <p>This was a relatively quick introduction to runspaces and instances. Hopefully you've come away with a better understanding of what they do and why you should think about using them. In future posts we'll go into more advanced topics like passing data into your instances, sharing data between instances, and debugging methods. 
When this series wraps up we will go over a more complex structure I developed to break out of work being done on parallel instances early to allow you to fail fast and not waste time waiting for everything to finish.</p> }