linux/tools/testing/selftests/resctrl/fill_buf.c

145 lines
3 KiB
C
Raw Permalink Normal View History

// SPDX-License-Identifier: GPL-2.0
/*
* fill_buf benchmark
*
* Copyright (C) 2018 Intel Corporation
*
* Authors:
* Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
* Fenghua Yu <fenghua.yu@intel.com>
*/
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <inttypes.h>
#include <string.h>
#include "resctrl.h"
#define CL_SIZE (64)
#define PAGE_SIZE (4 * 1024)
#define MB (1024 * 1024)
static void sb(void)
{
#if defined(__i386) || defined(__x86_64)
asm volatile("sfence\n\t"
: : : "memory");
#endif
}
static void cl_flush(void *p)
{
#if defined(__i386) || defined(__x86_64)
asm volatile("clflush (%0)\n\t"
: : "r"(p) : "memory");
#endif
}
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT. Rewrite the CAT test to allocate a buffer sized to half of LLC. Then perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly. The new CAT test is using only a single process because it relies on measured effect against another run of itself rather than another process adding noise. The rest of the system is set to use the CBM bits not used by the CAT test to keep the test isolated. Replace count_bits() with count_contiguous_bits() to get the first bit position in order to be able to calculate masks based on it. This change has been tested with a number of systems from different generations. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-15 17:05:08 +02:00
void mem_flush(unsigned char *buf, size_t buf_size)
{
unsigned char *cp = buf;
size_t i = 0;
buf_size = buf_size / CL_SIZE; /* mem size in cache lines */
for (i = 0; i < buf_size; i++)
cl_flush(&cp[i * CL_SIZE]);
sb();
}
/*
* Buffer index step advance to workaround HW prefetching interfering with
* the measurements.
*
* Must be a prime to step through all indexes of the buffer.
*
* Some primes work better than others on some architectures (from MBA/MBM
* result stability point of view).
*/
#define FILL_IDX_MULT 23
static int fill_one_span_read(unsigned char *buf, size_t buf_size)
{
unsigned int size = buf_size / (CL_SIZE / 2);
unsigned int i, idx = 0;
unsigned char sum = 0;
/*
* Read the buffer in an order that is unexpected by HW prefetching
* optimizations to prevent them interfering with the caching pattern.
*
* The read order is (in terms of halves of cachelines):
* i * FILL_IDX_MULT % size
* The formula is open-coded below to avoiding modulo inside the loop
* as it improves MBA/MBM result stability on some architectures.
*/
for (i = 0; i < size; i++) {
sum += buf[idx * (CL_SIZE / 2)];
idx += FILL_IDX_MULT;
while (idx >= size)
idx -= size;
}
return sum;
}
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT. Rewrite the CAT test to allocate a buffer sized to half of LLC. Then perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly. The new CAT test is using only a single process because it relies on measured effect against another run of itself rather than another process adding noise. The rest of the system is set to use the CBM bits not used by the CAT test to keep the test isolated. Replace count_bits() with count_contiguous_bits() to get the first bit position in order to be able to calculate masks based on it. This change has been tested with a number of systems from different generations. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-12-15 17:05:08 +02:00
void fill_cache_read(unsigned char *buf, size_t buf_size, bool once)
{
int ret = 0;
while (1) {
ret = fill_one_span_read(buf, buf_size);
if (once)
break;
}
/* Consume read result so that reading memory is not optimized out. */
*value_sink = ret;
}
selftests/resctrl: Make benchmark parameter passing robust The benchmark used during the CMT, MBM, and MBA tests can be provided by the user via (-b) parameter, if not provided the default "fill_buf" benchmark is used. The user is additionally able to override any of the "fill_buf" default parameters when running the tests with "-b fill_buf <fill_buf parameters>". The "fill_buf" parameters are managed as an array of strings. Using an array of strings is complex because it requires transformations to/from strings at every producer and consumer. This is made worse for the individual tests where the default benchmark parameters values may not be appropriate and additional data wrangling is required. For example, the CMT test duplicates the entire array of strings in order to replace one of the parameters. More issues appear when combining the usage of an array of strings with the use case of user overriding default parameters by specifying "-b fill_buf <parameters>". This use case is fragile with opportunities to trigger a SIGSEGV because of opportunities for NULL pointers to exist in the array of strings. For example, by running below (thus by specifying "fill_buf" should be used but all parameters are NULL): $ sudo resctrl_tests -t mbm -b fill_buf Replace the "array of strings" parameters used for "fill_buf" with new struct fill_buf_param that contains the "fill_buf" parameters that can be used directly without transformations to/from strings. Two instances of struct fill_buf_param may exist at any point in time: * If the user provides new parameters to "fill_buf", the user parameter structure (struct user_params) will point to a fully initialized and immutable struct fill_buf_param containing the user provided parameters. * If "fill_buf" is the benchmark that should be used by a test, then the test parameter structure (struct resctrl_val_param) will point to a fully initialized struct fill_buf_param. The latter may contain (a) the user provided parameters verbatim, (b) user provided parameters adjusted to be appropriate for the test, or (c) the default parameters for "fill_buf" that is appropriate for the test if the user did not provide "fill_buf" parameters nor an alternate benchmark. The existing behavior of CMT test is to use test defined value for the buffer size even if the user provides another value via command line. This behavior is maintained since the test requires that the buffer size matches the size of the cache allocated, and the amount of cache allocated can instead be changed by the user with the "-n" command line parameter. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-10-24 14:18:47 -07:00
unsigned char *alloc_buffer(size_t buf_size, bool memflush)
{
void *buf = NULL;
uint64_t *p64;
ssize_t s64;
int ret;
ret = posix_memalign(&buf, PAGE_SIZE, buf_size);
if (ret < 0)
return NULL;
/* Initialize the buffer */
p64 = buf;
s64 = buf_size / sizeof(uint64_t);
while (s64 > 0) {
*p64 = (uint64_t)rand();
p64 += (CL_SIZE / sizeof(uint64_t));
s64 -= (CL_SIZE / sizeof(uint64_t));
}
/* Flush the memory before using to avoid "cache hot pages" effect */
if (memflush)
mem_flush(buf, buf_size);
return buf;
}
ssize_t get_fill_buf_size(int cpu_no, const char *cache_type)
{
unsigned long cache_total_size = 0;
int ret;
ret = get_cache_size(cpu_no, cache_type, &cache_total_size);
if (ret)
return ret;
return cache_total_size * 2 > MINIMUM_SPAN ?
cache_total_size * 2 : MINIMUM_SPAN;
}