The AWS team recently released a new feature of their Automatic WLM that we're really excited about: Query Priorities. This feature aims to address the main limitation of Auto WLM head-on: it allows you to prioritize some queries over other queries. This is a crucial performance enhancement, and one that is needed to achieve data SLAs.

We first covered Redshift's new Automatic WLM feature on our blog before Query Priority was fully released, and found mixed results: it was highly effective (maybe too effective!) at reducing the percentage of disk-based queries, but had the side effect of increasing overall queuing on our clusters, since big queries consumed most of the cluster's memory. For our use case, this tradeoff didn't work: the additional queueing caused unacceptable delays to our data pipeline. Since writing that blog post, we ended up reverting all of our clusters to our well-tuned Manual WLM.

We've already covered the basics of setting up Automatic WLM with Query Priorities from a very high level, and in this post we're going to give a first look at the performance we're seeing with this new feature.

Automatic WLM: the basics

As a reminder, Automatic WLM dynamically assigns cluster resources to queries as they run. This is in contrast to Manual WLM, where you need to manually choose a memory and concurrency value for each queue that queries run in. The benefit of having Automatic WLM choose these values for you is less about making it easier (optimally configuring Manual WLM is pretty easy), and more about making it dynamic: in theory, each query should get the right amount of cluster resources, no more and no less. This should lead to a more optimal use of your cluster, resulting in higher query throughput and less wasted memory.

In practice, however, Automatic WLM (without Query Priority) has no way of knowing the impact additional queue time will have on your data SLAs, and hence on your business. So the net result of Automatic WLM on your data SLA might be negative, even if the cluster's use of resources is more efficient by some measure. This is exactly the effect we saw in our earlier post: long-running queries ran up to 3x faster (and fewer went to disk), since big queries were allocated more resources. But this came at the expense of additional queue time for queries overall, and the net result was a decrease in overall throughput in our data pipeline. Manual WLM allows us to tune that tradeoff, letting some queries go disk-based if it means keeping the overall latency of our pipeline below our data SLA.

Automatic WLM with Query Priority

Automatic WLM with Query Priorities intends to solve this exact problem: it allows you to let Amazon Redshift know how to prioritize queries when allocating cluster resources. Query Priorities are managed via Automatic WLM query queues, with each queue getting a priority. As with Manual WLM, queries are assigned to queues using either database user groups or "query groups" (a label you can set in the database session using the SET query_group TO statement before you submit a query). The idea is that you simply rank your workloads in terms of priority, and Redshift handles allocating cluster resources optimally.

Redshift uses these query priorities in three ways:

1. When queries are submitted to the cluster, Redshift uses the priority to decide which queries should run and which should queue. Queries in lower-priority queues will still run, but will queue longer on average than queries in higher-priority queues.
2. When queries run, higher-priority queries are allocated more resources (including memory, CPU, and I/O throughput) than lower-priority queries. Lower-priority queries get all of the system resources when there are no higher-priority queries.
3. Queries in lower-priority queues may get aborted (and requeued) more often to make room for queries in higher-priority queues. This occurs only when a higher-priority query arrives.

Following the guidelines in our Automatic WLM configuration post, we looked at one of our development Redshift clusters and first enumerated the workloads we have running. We then grouped their users into groups based on workload:

[Table: Workload — only one entry survives: "Automated queries from other parts of our system"]

Next, we assigned a priority to each workload and decided if any should have Concurrency Scaling enabled:

[Table: Workload]

Finally, we rebooted and started checking our Intermix.io dashboard for results.

Results

At a high level, our results looked markedly improved compared to our previous Automatic WLM tests.
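The three behaviors described above amount to priority scheduling: higher-priority work dispatches first, with ties broken by arrival order. As an illustration only (this is a toy model, not Redshift's actual scheduler), the dispatch decision can be sketched in a few lines of Python:

```python
import heapq
from itertools import count

class PriorityScheduler:
    """Toy model of priority-based dispatch: higher-priority queries
    run first; ties break in submission order. An illustration only,
    not Redshift's real scheduling algorithm."""

    def __init__(self):
        self._heap = []
        self._order = count()

    def submit(self, name, priority):
        # heapq is a min-heap, so negate priority: larger runs sooner.
        heapq.heappush(self._heap, (-priority, next(self._order), name))

    def next_to_run(self):
        return heapq.heappop(self._heap)[2] if self._heap else None

sched = PriorityScheduler()
sched.submit("nightly_etl", priority=1)
sched.submit("dashboard_query", priority=3)
sched.submit("adhoc_report", priority=2)
print(sched.next_to_run())  # dashboard_query dispatches first
```

The workload names here are hypothetical; the point is simply that a low-priority query submitted first still waits behind a high-priority query that arrives later, which is the queueing behavior described above.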
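Queue-to-priority assignments like those above are ultimately expressed in the cluster's wlm_json_configuration parameter. The following is a minimal sketch, assuming the documented "priority" field on Automatic WLM queues; the user group names are made up, and the exact JSON schema should be checked against the current Redshift documentation:

```python
import json

# Hypothetical workloads and priorities; user group names are invented
# for illustration, and the schema below is an assumption to verify
# against the Redshift wlm_json_configuration documentation.
queues = [
    {"user_group": ["dashboard_users"], "priority": "highest",
     "queue_type": "auto", "auto_wlm": True},
    {"user_group": ["etl_users"], "priority": "high",
     "queue_type": "auto", "auto_wlm": True},
    # Default queue: catches everything else at normal priority.
    {"priority": "normal", "queue_type": "auto", "auto_wlm": True},
]

VALID_PRIORITIES = {"lowest", "low", "normal", "high", "highest", "critical"}

def render_wlm_config(queues):
    """Validate each queue's priority and serialize the queue list
    for the wlm_json_configuration cluster parameter."""
    for q in queues:
        if q.get("priority") not in VALID_PRIORITIES:
            raise ValueError(f"unknown priority: {q.get('priority')}")
    return json.dumps(queues)

print(render_wlm_config(queues))
```

Validating the priority strings client-side before applying the parameter group is cheap insurance against a typo forcing a second cluster parameter change.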
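For contrast with the dynamic allocation discussed earlier, Manual WLM's memory arithmetic is static: each queue gets a fixed percentage of WLM memory, split evenly across its concurrency slots. A sketch of that arithmetic (the cluster size and queue settings are hypothetical):

```python
def slot_memory_mb(total_memory_mb, queue_memory_percent, concurrency):
    """Per-slot memory under Manual WLM: the queue's fixed share of
    cluster memory divided evenly across its concurrency slots."""
    return total_memory_mb * queue_memory_percent / 100 / concurrency

# Hypothetical cluster with 100 GB of WLM memory: a queue given 40%
# of memory and 5 slots yields a fixed 8 GB per running query.
print(slot_memory_mb(100_000, 40, 5))  # 8000.0 MB per slot
```

A query needing more than its slot's share spills to disk, which is exactly the knob we tune under Manual WLM: accept some disk-based queries in exchange for predictable queue times.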