Mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (synced 2025-11-01 09:13:37 +00:00)

			 =============================
			 NO-MMU MEMORY MAPPING SUPPORT
			 =============================

The kernel has limited support for memory mapping under no-MMU conditions, such
as are used in uClinux environments. From the userspace point of view, memory
mapping is used in conjunction with the mmap() system call, the shmat() call
and the execve() system call. From the kernel's point of view, execve() mapping
is actually performed by the binfmt drivers, which call back into the mmap()
routines to do the actual work.

Memory mapping behaviour also involves the way fork(), vfork(), clone() and
ptrace() work. Under uClinux there is no fork(), and clone() must be supplied
the CLONE_VM flag.

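As an illustration of what the lack of fork() means for userspace, a program
that would normally fork() and then exec a helper can use vfork() instead. The
following is only a minimal sketch: run_child() is a hypothetical helper name,
and the use of /bin/sh is an assumption made for the example.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run "/bin/sh -c <cmd>" in a child created with
 * vfork() and return the child's exit status, or -1 on failure.
 *
 * vfork() suspends the parent and lets the child borrow its address
 * space until the child calls execve() or _exit(), so no copy of the
 * parent's memory is needed.
 */
int run_child(const char *cmd)
{
	char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
	char *const envp[] = { NULL };
	int status;
	pid_t pid;

	pid = vfork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: do nothing here except execve() or _exit(). */
		execve("/bin/sh", argv, envp);
		_exit(127);
	}

	/* Parent: resumes once the child has exec'd or exited. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would use it as, say, run_child("exit 3") and check the returned
status. The child must not modify any data or return from the function that
called vfork(), since it is running in the parent's memory.
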
The behaviour is similar between the MMU and no-MMU cases, but not identical;
and it's also much more restricted in the latter case:

 (*) Anonymous mapping, MAP_PRIVATE

	In the MMU case: VM regions backed by arbitrary pages; copy-on-write
	across fork.

	In the no-MMU case: VM regions backed by arbitrary contiguous runs of
	pages.

 (*) Anonymous mapping, MAP_SHARED

	These behave very much like private mappings, except that they're
	shared across fork() or clone() without CLONE_VM in the MMU case. Since
	the no-MMU case supports neither fork() nor clone() without CLONE_VM,
	behaviour is identical to MAP_PRIVATE there.

 (*) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, !PROT_WRITE

	In the MMU case: VM regions backed by pages read from file; changes to
	the underlying file are reflected in the mapping; copied across fork.

	In the no-MMU case:

	 - If one exists, the kernel will re-use an existing mapping to the
	   same segment of the same file if that has compatible permissions,
	   even if this was created by another process.

	 - If possible, the file mapping will be directly on the backing device
	   if the backing device has the BDI_CAP_MAP_DIRECT capability and
	   appropriate mapping protection capabilities. Ramfs, romfs, cramfs
	   and mtd might all permit this.

	 - If the backing device can't or won't permit direct sharing, but
	   does have the BDI_CAP_MAP_COPY capability, then a copy of the
	   appropriate bit of the file will be read into a contiguous bit of
	   memory and any extraneous space beyond the EOF will be cleared.

	 - Writes to the file do not affect the mapping; writes to the mapping
	   are visible in other processes (no MMU protection), but should not
	   happen.

 (*) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, PROT_WRITE

	In the MMU case: like the non-PROT_WRITE case, except that the pages in
	question get copied before the write actually happens. From that point
	on writes to the file underneath that page no longer get reflected into
	the mapping's backing pages. The page is then backed by swap instead.

	In the no-MMU case: works much like the non-PROT_WRITE case, except
	that a copy is always taken and never shared.

 (*) Regular file / blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE

	In the MMU case: VM regions backed by pages read from file; changes to
	pages written back to file; writes to file reflected into pages backing
	mapping; shared across fork.

	In the no-MMU case: not supported.

 (*) Memory backed regular file, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE

	In the MMU case: As for ordinary regular files.

	In the no-MMU case: The filesystem providing the memory-backed file
	(such as ramfs or tmpfs) may choose to honour an open, truncate, mmap
	sequence by providing a contiguous sequence of pages to map. In that
	case, a shared-writable memory mapping will be possible. It will work
	as for the MMU case. If the filesystem does not provide any such
	support, then the mapping request will be denied.

 (*) Memory backed blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE

	In the MMU case: As for ordinary regular files.

	In the no-MMU case: As for memory backed regular files, but the
	blockdev must be able to provide a contiguous run of pages without
	truncate being called. The ramdisk driver could do this if it allocated
	all its memory as a contiguous array upfront.

 (*) Memory backed chardev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE

	In the MMU case: As for ordinary regular files.

	In the no-MMU case: The character device driver may choose to honour
	the mmap() by providing direct access to the underlying device if it
	provides memory or quasi-memory that can be accessed directly. Examples
	of such are frame buffers and flash devices. If the driver does not
	provide any such support, then the mapping request will be denied.


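From userspace, the private read-only file case in the list above is requested
in the same way with or without an MMU; only what the kernel does behind the
call differs. The following is a minimal sketch of such a request, not a
definitive implementation. read_via_private_map() is a hypothetical helper
name introduced for the example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to 'len' bytes from the start of 'path'
 * into 'buf' by way of a private, read-only mapping. Returns the number
 * of bytes copied, or -1 on failure (including an empty file).
 */
long read_via_private_map(const char *path, char *buf, size_t len)
{
	struct stat st;
	void *map;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	if ((size_t)st.st_size < len)
		len = (size_t)st.st_size;

	/* No address hint and no MAP_FIXED: the kernel picks the address. */
	map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */
	if (map == MAP_FAILED)
		return -1;

	memcpy(buf, map, len);
	munmap(map, len);
	return (long)len;
}
```

Whether the kernel shares an existing mapping, maps the backing device directly
or reads a copy into memory is not visible to the caller here. The caller
should simply never write through such a mapping.
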
============================
FURTHER NOTES ON NO-MMU MMAP
============================

 (*) A request for a private mapping of less than a page in size may not return
     a page-aligned buffer. This is because the kernel calls kmalloc() to
     allocate the buffer, not get_free_page().

 (*) A list of all the mappings on the system is visible through /proc/maps in
     no-MMU mode.

 (*) Supplying MAP_FIXED or requesting a particular mapping address will
     result in an error.

 (*) Files mapped privately usually have to have a read method provided by the
     driver or filesystem so that the contents can be read into the memory
     allocated if mmap() chooses not to map the backing device directly. An
     error will result if they don't. This is most likely to be encountered
     with character device files, pipes, fifos and sockets.

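The notes above suggest a portable calling pattern for anonymous private
mappings: pass no address hint, do not use MAP_FIXED, and do not assume that a
small mapping is page-aligned. The following is a minimal sketch under those
assumptions. alloc_anon() and free_anon() are hypothetical helper names.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: obtain 'len' bytes of private anonymous memory.
 * Returns NULL on failure. The address is left entirely to the kernel,
 * and the caller must not rely on the result being page-aligned when
 * 'len' is smaller than a page.
 */
void *alloc_anon(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Release a region using the address and length it was obtained with. */
int free_anon(void *p, size_t len)
{
	return munmap(p, len);
}
```

Code that needs a page-aligned buffer should request at least a whole page
rather than depend on the alignment of a smaller request.
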
============================================
PROVIDING SHAREABLE CHARACTER DEVICE SUPPORT
============================================

To provide shareable character device support, a driver must provide a
file->f_op->get_unmapped_area() operation. The mmap() routines will call this
to get a proposed address for the mapping. The operation may return an error if
the driver doesn't wish to honour the mapping, for instance because it's too
long, at an unsupported offset or under some unsupported combination of flags.

The driver should also provide backing device information with capabilities set
to indicate the permitted types of mapping on such devices. The default is
assumed to be readable and writable, not executable, and only shareable
directly (can't be copied).

The file->f_op->mmap() operation will be called to actually inaugurate the
mapping. It can be rejected at that point. Returning -ENOSYS will cause the
mapping to be copied instead if BDI_CAP_MAP_COPY is specified.

The vm_ops->close() routine will be invoked when the last mapping on a chardev
is removed. An existing mapping will be shared, partially or not, if possible
without notifying the driver.

It is permitted also for the file->f_op->get_unmapped_area() operation to
return -ENOSYS. This will be taken to mean that the driver doesn't want to
handle the request, despite providing the operation. For instance, it might try
directing the call to a secondary driver which turns out not to implement it.
Such is the case for the framebuffer driver which attempts to direct the call
to the device-specific driver. Under such circumstances, the mapping request
will be rejected if BDI_CAP_MAP_COPY is not specified, and a copy mapped
otherwise.

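The fallback rules in this section can be summarised as a small decision
function. The following is a userspace model for illustration only, not kernel
code: the flag values, the enum and the function nommu_map_outcome() are all
made up for the example, and the treatment of a device with neither capability
(rejection) is an assumption.

```c
#include <errno.h>

/* Illustrative flag values only; NOT the kernel's actual definitions. */
#define MODEL_CAP_MAP_DIRECT	0x1u
#define MODEL_CAP_MAP_COPY	0x2u

enum map_outcome {
	MAP_OUTCOME_DIRECT,	/* mapped directly onto the device */
	MAP_OUTCOME_COPY,	/* contents copied into allocated memory */
	MAP_OUTCOME_REJECT	/* mmap() fails */
};

/*
 * Model of the fallback described in the text.
 *
 * caps     - modelled backing device capabilities
 * gua_ret  - result of get_unmapped_area(): 0 if an address was offered,
 *            otherwise a negative errno
 * mmap_ret - result of the driver's mmap() operation: 0 or negative errno
 */
enum map_outcome nommu_map_outcome(unsigned int caps, int gua_ret,
				   int mmap_ret)
{
	/* -ENOSYS from either operation means "I won't map this directly". */
	int direct_refused = (gua_ret == -ENOSYS) || (mmap_ret == -ENOSYS);

	/* Any other error is an outright rejection of the request. */
	if (!direct_refused && (gua_ret < 0 || mmap_ret < 0))
		return MAP_OUTCOME_REJECT;

	if (!direct_refused && (caps & MODEL_CAP_MAP_DIRECT))
		return MAP_OUTCOME_DIRECT;

	/* Direct mapping unavailable: copy if permitted, else reject. */
	if (caps & MODEL_CAP_MAP_COPY)
		return MAP_OUTCOME_COPY;

	return MAP_OUTCOME_REJECT;
}
```

For example, a device modelled with both capabilities whose
get_unmapped_area() returns -ENOSYS ends up with a copied mapping, whereas the
same return from a device without the copy capability causes the request to be
rejected.
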
IMPORTANT NOTE:

	Some types of device may present a different appearance to anyone
	looking at them in certain modes. Flash chips can be like this; for
	instance if they're in programming or erase mode, you might see the
	status reflected in the mapping, instead of the data.

	In such a case, care must be taken lest userspace see a shared or a
	private mapping showing such information when the driver is busy
	controlling the device. Remember especially: private executable
	mappings may still be mapped directly off the device under some
	circumstances!


==============================================
PROVIDING SHAREABLE MEMORY-BACKED FILE SUPPORT
==============================================

Provision of shared mappings on memory backed files is similar to the provision
of support for shared mapped character devices. The main difference is that the
filesystem providing the service will probably allocate a contiguous collection
of pages and permit mappings to be made on that.

It is recommended that a truncate operation applied to such a file that
increases the file size, if that file is empty, be taken as a request to gather
enough pages to honour a mapping. This is required to support POSIX shared
memory.

Memory backed devices are indicated by the mapping's backing device info having
the memory_backed flag set.


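Seen from userspace, the open, truncate, mmap sequence that such a filesystem
is expected to honour looks roughly like the sketch below. This is an
illustration rather than a reference implementation. map_shared_file() is a
hypothetical helper name, and under no-MMU conditions the path is assumed to
live on a memory-backed filesystem such as ramfs or tmpfs.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open (creating if necessary) the file at 'path',
 * set its size to 'len' bytes and map it shared and writable. Returns
 * the mapping, or NULL on failure.
 */
void *map_shared_file(const char *path, size_t len)
{
	void *map;
	int fd;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return NULL;

	/* Growing an empty file is the cue to gather the backing pages. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return NULL;
	}

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */

	return map == MAP_FAILED ? NULL : map;
}
```

Two callers that map the same file this way should see each other's writes.
The file is sized before it is mapped, which matches the order in which the
filesystem is expected to gather its pages.
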
========================================
PROVIDING SHAREABLE BLOCK DEVICE SUPPORT
========================================

Provision of shared mappings on block device files is exactly the same as for
character devices. If there isn't a real device underneath, then the driver
should allocate sufficient contiguous memory to honour any supported mapping.