linux/arch/x86/boot/compressed/misc.c
Yinghai Lu 67b6662559 x86/boot: Fix "run_size" calculation
Currently, the "run_size" variable holds the total kernel size
(size of code plus brk and bss) and is calculated via the shell script
arch/x86/tools/calc_run_size.sh. It gets the file offset and mem size
of the .bss and .brk sections from the vmlinux, and adds them as follows:

  run_size = $(( $offsetA + $sizeA + $sizeB ))

However, this is not correct (it is too large). To illustrate, here's
a walk-through of the script's calculation, compared to the correct way
to find it.

First, offsetA is found as the file offset of the first .bss or .brk
section seen in the ELF file. The sizeA and sizeB values are the
respective section sizes.

 [bhe@x1 linux]$ objdump -h vmlinux

 vmlinux:     file format elf64-x86-64

 Sections:
 Idx Name    Size      VMA               LMA               File off  Algn
  27 .bss    00170000  ffffffff81ec8000  0000000001ec8000  012c8000  2**12
             ALLOC
  28 .brk    00027000  ffffffff82038000  0000000002038000  012c8000  2**0
             ALLOC

Here, offsetA is 0x012c8000, with sizeA at 0x00170000 and sizeB at
0x00027000. The resulting run_size is 0x145f000:

 0x012c8000 + 0x00170000 + 0x00027000 = 0x145f000

However, if we instead examine the ELF LOAD program headers, we see a
different picture.

 [bhe@x1 linux]$ readelf -l vmlinux

 Elf file type is EXEC (Executable file)
 Entry point 0x1000000
 There are 5 program headers, starting at offset 64

 Program Headers:
  Type        Offset             VirtAddr           PhysAddr
              FileSiz            MemSiz              Flags  Align
  LOAD        0x0000000000200000 0xffffffff81000000 0x0000000001000000
              0x0000000000b5e000 0x0000000000b5e000  R E    200000
  LOAD        0x0000000000e00000 0xffffffff81c00000 0x0000000001c00000
              0x0000000000145000 0x0000000000145000  RW     200000
  LOAD        0x0000000001000000 0x0000000000000000 0x0000000001d45000
              0x0000000000018158 0x0000000000018158  RW     200000
  LOAD        0x000000000115e000 0xffffffff81d5e000 0x0000000001d5e000
              0x000000000016a000 0x0000000000301000  RWE    200000
  NOTE        0x000000000099bcac 0xffffffff8179bcac 0x000000000179bcac
              0x00000000000001bc 0x00000000000001bc         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes __ex_table .rodata __bug_table .pci_fixup .tracedata
          __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
          __modver
   01     .data .vvar
   02     .data..percpu
   03     .init.text .init.data .x86_cpu_dev.init .parainstructions
          .altinstructions .altinstr_replacement .iommu_table .apicdrivers
          .exit.text .smp_locks .bss .brk
   04     .notes

As mentioned, run_size needs to be the size of the running kernel
including .bss and .brk. We can see from the Section/Segment mapping
above that .bss and .brk are included in segment 03 (which corresponds
to the final LOAD program header). To find the run_size, we calculate
the end of the LOAD segment from its PhysAddr start (0x0000000001d5e000)
and its MemSiz (0x0000000000301000), minus the physical load address of
the kernel (the first LOAD segment's PhysAddr: 0x0000000001000000). The
resulting run_size is 0x105f000:

 0x0000000001d5e000 + 0x0000000000301000 - 0x0000000001000000 = 0x105f000
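
For reference, the same calculation can be scripted against the program
headers instead of the sections. The following is a minimal host-side
sketch (not part of the patch; it assumes a 64-bit little-endian vmlinux
read on a matching host, with only basic error handling):

 #include <elf.h>
 #include <stdio.h>

 int main(int argc, char **argv)
 {
 	Elf64_Ehdr eh;
 	Elf64_Phdr ph;
 	unsigned long lo = -1UL, hi = 0;
 	FILE *f;
 	int i;

 	if (argc != 2 || !(f = fopen(argv[1], "rb")))
 		return 1;
 	if (fread(&eh, sizeof(eh), 1, f) != 1)
 		return 1;
 	for (i = 0; i < eh.e_phnum; i++) {
 		/* Read each program header; only PT_LOAD matters here. */
 		fseek(f, eh.e_phoff + i * sizeof(ph), SEEK_SET);
 		if (fread(&ph, sizeof(ph), 1, f) != 1)
 			return 1;
 		if (ph.p_type != PT_LOAD)
 			continue;
 		/* Track lowest PhysAddr and highest PhysAddr + MemSiz. */
 		if (ph.p_paddr < lo)
 			lo = ph.p_paddr;
 		if (ph.p_paddr + ph.p_memsz > hi)
 			hi = ph.p_paddr + ph.p_memsz;
 	}
 	fclose(f);
 	/* For the vmlinux above: 0x205f000 - 0x1000000 = 0x105f000 */
 	printf("run_size = 0x%lx\n", hi - lo);
 	return 0;
 }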

So, from this we can see that the existing run_size calculation is
0x400000 too high: the script adds the file offset of .bss (0x012c8000),
but the offset of .bss within the running kernel is only 0x00ec8000
(its LMA 0x0000000001ec8000 minus the kernel load address
0x0000000001000000). And, as it turns out, the correct run_size is
simply equal to VO_end - VO_text, which is much easier to calculate:

 _end:  0xffffffff8205f000
 _text: 0xffffffff81000000

 0xffffffff8205f000 - 0xffffffff81000000 = 0x105f000

As a result, run_size is a simple build-time constant, so we don't need
to pass it around; we already have voffset.h for exactly this kind of
thing. We can share voffset.h between misc.c and header.S instead of
computing run_size separately. This patch moves the voffset.h creation
code to boot/compressed/Makefile, and switches misc.c to use the
VO_end - VO_text calculation for run_size.
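
To illustrate, voffset.h reduces to two defines that misc.c can subtract
at build time. (A sketch: the addresses are taken from the vmlinux above,
and the real header is generated by filtering 'nm vmlinux' output for
_text and _end.)

 #define VO__text 0xffffffff81000000
 #define VO__end  0xffffffff8205f000

 /* In misc.c, run_size then becomes a build-time constant: */
 const unsigned long run_size = VO__end - VO__text;	/* == 0x105f000 */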

Dependency chain before:

 boot/header.S ==> boot/voffset.h ==> vmlinux
 boot/header.S ==> compressed/vmlinux ==> compressed/misc.c

Dependency chain after:

 boot/header.S ==> compressed/vmlinux ==> compressed/misc.c ==> boot/voffset.h ==> vmlinux

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
[ Rewrote the changelog. ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Junjie Mao <eternal.n08@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: lasse.collin@tukaani.org
Fixes: e6023367d7 ("x86, kaslr: Prevent .bss from overlaping initrd")
Link: http://lkml.kernel.org/r/1461888548-32439-5-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29 11:03:30 +02:00


/*
 * misc.c
 *
 * This is a collection of several routines used to extract the kernel
 * which includes KASLR relocation, decompression, ELF parsing, and
 * relocation processing. Additionally included are the screen and serial
 * output functions and related debugging support functions.
 *
 * malloc by Hannu Savolainen 1993 and Matthias Urlichs 1994
 * puts by Nick Holloway 1993, better puts by Martin Mares 1995
 * High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 1996
 */

#include "misc.h"
#include "../string.h"
#include "../voffset.h"

/*
 * WARNING!!
 * This code is compiled with -fPIC and it is relocated dynamically at
 * run time, but no relocation processing is performed. This means that
 * it is not safe to place pointers in static structures.
 */

/* Macros used by the included decompressor code below. */
#define STATIC		static

/*
 * Use normal definitions of mem*() from string.c. There are already
 * included header files which expect a definition of memset() and by
 * the time we define memset macro, it is too late.
 */
#undef memcpy
#undef memset
#define memzero(s, n)	memset((s), 0, (n))
#define memmove		memmove

/* Functions used by the included decompressor code below. */
static void error(char *m);
void *memmove(void *dest, const void *src, size_t n);

/*
 * This is set up by the setup-routine at boot-time
 */
struct boot_params *boot_params;

memptr free_mem_ptr;
memptr free_mem_end_ptr;

static char *vidmem;
static int vidport;
static int lines, cols;
#ifdef CONFIG_KERNEL_GZIP
#include "../../../../lib/decompress_inflate.c"
#endif

#ifdef CONFIG_KERNEL_BZIP2
#include "../../../../lib/decompress_bunzip2.c"
#endif

#ifdef CONFIG_KERNEL_LZMA
#include "../../../../lib/decompress_unlzma.c"
#endif

#ifdef CONFIG_KERNEL_XZ
#include "../../../../lib/decompress_unxz.c"
#endif

#ifdef CONFIG_KERNEL_LZO
#include "../../../../lib/decompress_unlzo.c"
#endif

#ifdef CONFIG_KERNEL_LZ4
#include "../../../../lib/decompress_unlz4.c"
#endif

/*
 * NOTE: When adding a new decompressor, please update the analysis in
 * ../header.S.
 */
static void scroll(void)
{
	int i;

	memmove(vidmem, vidmem + cols * 2, (lines - 1) * cols * 2);
	for (i = (lines - 1) * cols * 2; i < lines * cols * 2; i += 2)
		vidmem[i] = ' ';
}
#define XMTRDY		0x20

#define TXR		0	/*  Transmit register (WRITE) */
#define LSR		5	/*  Line Status               */

static void serial_putchar(int ch)
{
	unsigned timeout = 0xffff;

	while ((inb(early_serial_base + LSR) & XMTRDY) == 0 && --timeout)
		cpu_relax();

	outb(ch, early_serial_base + TXR);
}
void __putstr(const char *s)
{
	int x, y, pos;
	char c;

	if (early_serial_base) {
		const char *str = s;

		while (*str) {
			if (*str == '\n')
				serial_putchar('\r');
			serial_putchar(*str++);
		}
	}

	if (boot_params->screen_info.orig_video_mode == 0 &&
	    lines == 0 && cols == 0)
		return;

	x = boot_params->screen_info.orig_x;
	y = boot_params->screen_info.orig_y;

	while ((c = *s++) != '\0') {
		if (c == '\n') {
			x = 0;
			if (++y >= lines) {
				scroll();
				y--;
			}
		} else {
			vidmem[(x + cols * y) * 2] = c;
			if (++x >= cols) {
				x = 0;
				if (++y >= lines) {
					scroll();
					y--;
				}
			}
		}
	}

	boot_params->screen_info.orig_x = x;
	boot_params->screen_info.orig_y = y;

	pos = (x + cols * y) * 2;	/* Update cursor position */
	outb(14, vidport);
	outb(0xff & (pos >> 9), vidport+1);
	outb(15, vidport);
	outb(0xff & (pos >> 1), vidport+1);
}
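
/*
 * Print a value as zero-padded hex, one digit per nibble: e.g.
 * __puthex(0x105f000) emits "000000000105f000" on a 64-bit build.
 */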
void __puthex(unsigned long value)
{
	char alpha[2] = "0";
	int bits;

	for (bits = sizeof(value) * 8 - 4; bits >= 0; bits -= 4) {
		unsigned long digit = (value >> bits) & 0xf;

		if (digit < 0xA)
			alpha[0] = '0' + digit;
		else
			alpha[0] = 'a' + (digit - 0xA);

		__putstr(alpha);
	}
}
void warn(char *m)
{
	error_putstr("\n\n");
	error_putstr(m);
	error_putstr("\n\n");
}

static void error(char *m)
{
	warn(m);
	error_putstr(" -- System halted");

	while (1)
		asm("hlt");
}
#if CONFIG_X86_NEED_RELOCS
static void handle_relocations(void *output, unsigned long output_len)
{
	int *reloc;
	unsigned long delta, map, ptr;
	unsigned long min_addr = (unsigned long)output;
	unsigned long max_addr = min_addr + output_len;

	/*
	 * Calculate the delta between where vmlinux was linked to load
	 * and where it was actually loaded.
	 */
	delta = min_addr - LOAD_PHYSICAL_ADDR;
	if (!delta) {
		debug_putstr("No relocation needed... ");
		return;
	}
	debug_putstr("Performing relocations... ");

	/*
	 * The kernel contains a table of relocation addresses. Those
	 * addresses have the final load address of the kernel in virtual
	 * memory. We are currently working in the self map. So we need to
	 * create an adjustment for kernel memory addresses to the self map.
	 * This will involve subtracting out the base address of the kernel.
	 */
	map = delta - __START_KERNEL_map;
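
	/*
	 * Worked example with hypothetical numbers: a kernel linked at
	 * LOAD_PHYSICAL_ADDR 0x1000000 but extracted to 0x5000000 gives
	 * delta = 0x4000000. A table entry naming the linked virtual
	 * address 0xffffffff81234567 then yields ptr = entry + map =
	 * 0x5234567: the same byte inside the relocated physical image,
	 * which is what the loops below patch.
	 */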
	/*
	 * Process relocations: 32 bit relocations first then 64 bit after.
	 * Three sets of binary relocations are added to the end of the
	 * kernel before compression. Each relocation table entry is the
	 * kernel address of the location which needs to be updated stored
	 * as a 32-bit value which is sign extended to 64 bits.
	 *
	 * Format is:
	 *
	 * kernel bits...
	 * 0 - zero terminator for 64 bit relocations
	 * 64 bit relocation repeated
	 * 0 - zero terminator for inverse 32 bit relocations
	 * 32 bit inverse relocation repeated
	 * 0 - zero terminator for 32 bit relocations
	 * 32 bit relocation repeated
	 *
	 * So we work backwards from the end of the decompressed image.
	 */
	for (reloc = output + output_len - sizeof(*reloc); *reloc; reloc--) {
		long extended = *reloc;
		extended += map;

		ptr = (unsigned long)extended;
		if (ptr < min_addr || ptr > max_addr)
			error("32-bit relocation outside of kernel!\n");

		*(uint32_t *)ptr += delta;
	}
#ifdef CONFIG_X86_64
	while (*--reloc) {
		long extended = *reloc;
		extended += map;

		ptr = (unsigned long)extended;
		if (ptr < min_addr || ptr > max_addr)
			error("inverse 32-bit relocation outside of kernel!\n");

		*(int32_t *)ptr -= delta;
	}
	for (reloc--; *reloc; reloc--) {
		long extended = *reloc;
		extended += map;

		ptr = (unsigned long)extended;
		if (ptr < min_addr || ptr > max_addr)
			error("64-bit relocation outside of kernel!\n");

		*(uint64_t *)ptr += delta;
	}
#endif
}
#else
static inline void handle_relocations(void *output, unsigned long output_len)
{ }
#endif
static void parse_elf(void *output)
{
#ifdef CONFIG_X86_64
	Elf64_Ehdr ehdr;
	Elf64_Phdr *phdrs, *phdr;
#else
	Elf32_Ehdr ehdr;
	Elf32_Phdr *phdrs, *phdr;
#endif
	void *dest;
	int i;

	memcpy(&ehdr, output, sizeof(ehdr));
	if (ehdr.e_ident[EI_MAG0] != ELFMAG0 ||
	    ehdr.e_ident[EI_MAG1] != ELFMAG1 ||
	    ehdr.e_ident[EI_MAG2] != ELFMAG2 ||
	    ehdr.e_ident[EI_MAG3] != ELFMAG3) {
		error("Kernel is not a valid ELF file");
		return;
	}

	debug_putstr("Parsing ELF... ");

	phdrs = malloc(sizeof(*phdrs) * ehdr.e_phnum);
	if (!phdrs)
		error("Failed to allocate space for phdrs");

	memcpy(phdrs, output + ehdr.e_phoff, sizeof(*phdrs) * ehdr.e_phnum);

	for (i = 0; i < ehdr.e_phnum; i++) {
		phdr = &phdrs[i];

		switch (phdr->p_type) {
		case PT_LOAD:
#ifdef CONFIG_RELOCATABLE
			dest = output;
			dest += (phdr->p_paddr - LOAD_PHYSICAL_ADDR);
#else
			dest = (void *)(phdr->p_paddr);
#endif
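			/*
			 * Example using the changelog's vmlinux: its second
			 * LOAD segment has p_paddr 0x1c00000, so with the
			 * kernel linked at LOAD_PHYSICAL_ADDR 0x1000000 it
			 * is copied to output + 0xc00000.
			 */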
			memmove(dest, output + phdr->p_offset, phdr->p_filesz);
			break;
		default: /* Ignore other PT_* */ break;
		}
	}

	free(phdrs);
}
/*
 * The compressed kernel image (ZO), has been moved so that its position
 * is against the end of the buffer used to hold the uncompressed kernel
 * image (VO) and the execution environment (.bss, .brk), which makes sure
 * there is room to do the in-place decompression. (See header.S for the
 * calculations.)
 *
 *                             |-----compressed kernel image------|
 *                             V                                  V
 * 0                       extract_offset                      +INIT_SIZE
 * |-----------|---------------|-------------------------|--------|
 *             |               |                         |        |
 *           VO__text      startup_32 of ZO          VO__end    ZO__end
 * ^                                                 ^
 * |-------uncompressed kernel image---------|
 *
 */
asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
				  unsigned char *input_data,
				  unsigned long input_len,
				  unsigned char *output,
				  unsigned long output_len)
{
	const unsigned long run_size = VO__end - VO__text;
	unsigned char *output_orig = output;

	/* Retain x86 boot parameters pointer passed from startup_32/64. */
	boot_params = rmode;

	/* Clear flags intended for solely in-kernel use. */
	boot_params->hdr.loadflags &= ~KASLR_FLAG;

	sanitize_boot_params(boot_params);

	if (boot_params->screen_info.orig_video_mode == 7) {
		vidmem = (char *) 0xb0000;
		vidport = 0x3b4;
	} else {
		vidmem = (char *) 0xb8000;
		vidport = 0x3d4;
	}

	lines = boot_params->screen_info.orig_video_lines;
	cols = boot_params->screen_info.orig_video_cols;

	console_init();
	debug_putstr("early console in extract_kernel\n");

	free_mem_ptr     = heap;	/* Heap */
	free_mem_end_ptr = heap + BOOT_HEAP_SIZE;

	/* Report initial kernel position details. */
	debug_putaddr(input_data);
	debug_putaddr(input_len);
	debug_putaddr(output);
	debug_putaddr(output_len);
	debug_putaddr(run_size);
	/*
	 * The memory hole needed for the kernel is the larger of either
	 * the entire decompressed kernel plus relocation table, or the
	 * entire decompressed kernel plus .bss and .brk sections.
	 */
	output = choose_random_location(input_data, input_len, output,
					output_len > run_size ? output_len
							      : run_size);

	/* Validate memory location choices. */
	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
		error("Destination address inappropriately aligned");
#ifdef CONFIG_X86_64
	if (heap > 0x3fffffffffffUL)
		error("Destination address too large");
#else
	if (heap > ((-__PAGE_OFFSET-(128<<20)-1) & 0x7fffffff))
		error("Destination address too large");
#endif
#ifndef CONFIG_RELOCATABLE
	if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
		error("Wrong destination address");
#endif

	debug_putstr("\nDecompressing Linux... ");
	__decompress(input_data, input_len, NULL, NULL, output, output_len,
			NULL, error);
	parse_elf(output);

	/*
	 * 32-bit always performs relocations. 64-bit relocations are only
	 * needed if kASLR has chosen a different load address.
	 */
	if (!IS_ENABLED(CONFIG_X86_64) || output != output_orig)
		handle_relocations(output, output_len);

	debug_putstr("done.\nBooting the kernel.\n");
	return output;
}