linux/arch/sparc/kernel/syscalls/syscall.tbl

511 lines
21 KiB
Text
Raw Normal View History

sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
#
# system call numbers and entry vectors for sparc
#
# The format is:
# <number> <abi> <name> <entry point> <compat entry point>
#
# The <abi> can be common, 64, or 32 for this file.
#
0 common restart_syscall sys_restart_syscall
1 32 exit sys_exit sparc_exit
1 64 exit sparc_exit
2 common fork sys_fork
3 common read sys_read
4 common write sys_write
5 common open sys_open compat_sys_open
6 common close sys_close
7 common wait4 sys_wait4 compat_sys_wait4
8 common creat sys_creat
9 common link sys_link
10 common unlink sys_unlink
11 32 execv sunos_execv
11 64 execv sys_nis_syscall
12 common chdir sys_chdir
13 32 chown sys_chown16
13 64 chown sys_chown
14 common mknod sys_mknod
15 common chmod sys_chmod
16 32 lchown sys_lchown16
16 64 lchown sys_lchown
17 common brk sys_brk
18 common perfctr sys_nis_syscall
19 common lseek sys_lseek compat_sys_lseek
20 common getpid sys_getpid
21 common capget sys_capget
22 common capset sys_capset
23 32 setuid sys_setuid16
23 64 setuid sys_setuid
24 32 getuid sys_getuid16
24 64 getuid sys_getuid
25 common vmsplice sys_vmsplice
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
26 common ptrace sys_ptrace compat_sys_ptrace
27 common alarm sys_alarm
28 common sigaltstack sys_sigaltstack compat_sys_sigaltstack
29 32 pause sys_pause
29 64 pause sys_nis_syscall
30 32 utime sys_utime32
30 64 utime sys_utime
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
31 32 lchown32 sys_lchown
32 32 fchown32 sys_fchown
33 common access sys_access
34 common nice sys_nice
35 32 chown32 sys_chown
36 common sync sys_sync
37 common kill sys_kill
38 common stat sys_newstat compat_sys_newstat
39 32 sendfile sys_sendfile compat_sys_sendfile
39 64 sendfile sys_sendfile64
40 common lstat sys_newlstat compat_sys_newlstat
41 common dup sys_dup
42 common pipe sys_sparc_pipe
43 common times sys_times compat_sys_times
44 32 getuid32 sys_getuid
45 common umount2 sys_umount
46 32 setgid sys_setgid16
46 64 setgid sys_setgid
47 32 getgid sys_getgid16
47 64 getgid sys_getgid
48 common signal sys_signal
49 32 geteuid sys_geteuid16
49 64 geteuid sys_geteuid
50 32 getegid sys_getegid16
50 64 getegid sys_getegid
51 common acct sys_acct
52 64 memory_ordering sys_memory_ordering
53 32 getgid32 sys_getgid
54 common ioctl sys_ioctl compat_sys_ioctl
55 common reboot sys_reboot
56 32 mmap2 sys_mmap2 sys32_mmap2
57 common symlink sys_symlink
58 common readlink sys_readlink
59 32 execve sys_execve sys32_execve
59 64 execve sys64_execve
60 common umask sys_umask
61 common chroot sys_chroot
62 common fstat sys_newfstat compat_sys_newfstat
63 common fstat64 sys_fstat64 compat_sys_fstat64
64 common getpagesize sys_getpagesize
65 common msync sys_msync
66 common vfork sys_vfork
67 common pread64 sys_pread64 compat_sys_pread64
68 common pwrite64 sys_pwrite64 compat_sys_pwrite64
69 32 geteuid32 sys_geteuid
70 32 getegid32 sys_getegid
71 common mmap sys_mmap
72 32 setreuid32 sys_setreuid
73 32 munmap sys_munmap
73 64 munmap sys_64_munmap
74 common mprotect sys_mprotect
75 common madvise sys_madvise
76 common vhangup sys_vhangup
77 32 truncate64 sys_truncate64 compat_sys_truncate64
78 common mincore sys_mincore
79 32 getgroups sys_getgroups16
79 64 getgroups sys_getgroups
80 32 setgroups sys_setgroups16
80 64 setgroups sys_setgroups
81 common getpgrp sys_getpgrp
82 32 setgroups32 sys_setgroups
83 common setitimer sys_setitimer compat_sys_setitimer
84 32 ftruncate64 sys_ftruncate64 compat_sys_ftruncate64
85 common swapon sys_swapon
86 common getitimer sys_getitimer compat_sys_getitimer
87 32 setuid32 sys_setuid
88 common sethostname sys_sethostname
89 32 setgid32 sys_setgid
90 common dup2 sys_dup2
91 32 setfsuid32 sys_setfsuid
92 common fcntl sys_fcntl compat_sys_fcntl
93 common select sys_select
94 32 setfsgid32 sys_setfsgid
95 common fsync sys_fsync
96 common setpriority sys_setpriority
97 common socket sys_socket
98 common connect sys_connect
99 common accept sys_accept
100 common getpriority sys_getpriority
101 common rt_sigreturn sys_rt_sigreturn sys32_rt_sigreturn
102 common rt_sigaction sys_rt_sigaction compat_sys_rt_sigaction
103 common rt_sigprocmask sys_rt_sigprocmask compat_sys_rt_sigprocmask
104 common rt_sigpending sys_rt_sigpending compat_sys_rt_sigpending
105 32 rt_sigtimedwait sys_rt_sigtimedwait_time32 compat_sys_rt_sigtimedwait_time32
105 64 rt_sigtimedwait sys_rt_sigtimedwait
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
106 common rt_sigqueueinfo sys_rt_sigqueueinfo compat_sys_rt_sigqueueinfo
107 common rt_sigsuspend sys_rt_sigsuspend compat_sys_rt_sigsuspend
108 32 setresuid32 sys_setresuid
108 64 setresuid sys_setresuid
109 32 getresuid32 sys_getresuid
109 64 getresuid sys_getresuid
110 32 setresgid32 sys_setresgid
110 64 setresgid sys_setresgid
111 32 getresgid32 sys_getresgid
111 64 getresgid sys_getresgid
112 32 setregid32 sys_setregid
113 common recvmsg sys_recvmsg compat_sys_recvmsg
114 common sendmsg sys_sendmsg compat_sys_sendmsg
115 32 getgroups32 sys_getgroups
116 common gettimeofday sys_gettimeofday compat_sys_gettimeofday
117 common getrusage sys_getrusage compat_sys_getrusage
118 common getsockopt sys_getsockopt sys_getsockopt
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
119 common getcwd sys_getcwd
120 common readv sys_readv
121 common writev sys_writev
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
122 common settimeofday sys_settimeofday compat_sys_settimeofday
123 32 fchown sys_fchown16
123 64 fchown sys_fchown
124 common fchmod sys_fchmod
125 common recvfrom sys_recvfrom
126 32 setreuid sys_setreuid16
126 64 setreuid sys_setreuid
127 32 setregid sys_setregid16
127 64 setregid sys_setregid
128 common rename sys_rename
129 common truncate sys_truncate compat_sys_truncate
130 common ftruncate sys_ftruncate compat_sys_ftruncate
131 common flock sys_flock
132 common lstat64 sys_lstat64 compat_sys_lstat64
133 common sendto sys_sendto
134 common shutdown sys_shutdown
135 common socketpair sys_socketpair
136 common mkdir sys_mkdir
137 common rmdir sys_rmdir
138 32 utimes sys_utimes_time32
138 64 utimes sys_utimes
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
139 common stat64 sys_stat64 compat_sys_stat64
140 common sendfile64 sys_sendfile64
141 common getpeername sys_getpeername
142 32 futex sys_futex_time32
142 64 futex sys_futex
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
143 common gettid sys_gettid
144 common getrlimit sys_getrlimit compat_sys_getrlimit
145 common setrlimit sys_setrlimit compat_sys_setrlimit
146 common pivot_root sys_pivot_root
147 common prctl sys_prctl
148 common pciconfig_read sys_pciconfig_read
149 common pciconfig_write sys_pciconfig_write
150 common getsockname sys_getsockname
151 common inotify_init sys_inotify_init
152 common inotify_add_watch sys_inotify_add_watch
153 common poll sys_poll
154 common getdents64 sys_getdents64
155 32 fcntl64 sys_fcntl64 compat_sys_fcntl64
156 common inotify_rm_watch sys_inotify_rm_watch
157 common statfs sys_statfs compat_sys_statfs
158 common fstatfs sys_fstatfs compat_sys_fstatfs
159 common umount sys_oldumount
160 common sched_set_affinity sys_sched_setaffinity compat_sys_sched_setaffinity
161 common sched_get_affinity sys_sched_getaffinity compat_sys_sched_getaffinity
162 common getdomainname sys_getdomainname
163 common setdomainname sys_setdomainname
164 64 utrap_install sys_utrap_install
165 common quotactl sys_quotactl
166 common set_tid_address sys_set_tid_address
167 common mount sys_mount
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
168 common ustat sys_ustat compat_sys_ustat
169 common setxattr sys_setxattr
170 common lsetxattr sys_lsetxattr
171 common fsetxattr sys_fsetxattr
172 common getxattr sys_getxattr
173 common lgetxattr sys_lgetxattr
174 common getdents sys_getdents compat_sys_getdents
175 common setsid sys_setsid
176 common fchdir sys_fchdir
177 common fgetxattr sys_fgetxattr
178 common listxattr sys_listxattr
179 common llistxattr sys_llistxattr
180 common flistxattr sys_flistxattr
181 common removexattr sys_removexattr
182 common lremovexattr sys_lremovexattr
183 32 sigpending sys_sigpending compat_sys_sigpending
183 64 sigpending sys_nis_syscall
184 common query_module sys_ni_syscall
185 common setpgid sys_setpgid
186 common fremovexattr sys_fremovexattr
187 common tkill sys_tkill
188 32 exit_group sys_exit_group sparc_exit_group
188 64 exit_group sparc_exit_group
189 common uname sys_newuname
190 common init_module sys_init_module
191 32 personality sys_personality sys_sparc64_personality
191 64 personality sys_sparc64_personality
192 32 remap_file_pages sys_sparc_remap_file_pages sys_remap_file_pages
192 64 remap_file_pages sys_remap_file_pages
193 common epoll_create sys_epoll_create
194 common epoll_ctl sys_epoll_ctl
195 common epoll_wait sys_epoll_wait
196 common ioprio_set sys_ioprio_set
197 common getppid sys_getppid
198 32 sigaction sys_sparc_sigaction compat_sys_sparc_sigaction
198 64 sigaction sys_nis_syscall
199 common sgetmask sys_sgetmask
200 common ssetmask sys_ssetmask
201 32 sigsuspend sys_sigsuspend
201 64 sigsuspend sys_nis_syscall
202 common oldlstat sys_newlstat compat_sys_newlstat
203 common uselib sys_uselib
204 32 readdir sys_old_readdir compat_sys_old_readdir
204 64 readdir sys_nis_syscall
205 common readahead sys_readahead compat_sys_readahead
206 common socketcall sys_socketcall sys32_socketcall
207 common syslog sys_syslog
208 common lookup_dcookie sys_ni_syscall
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
209 common fadvise64 sys_fadvise64 compat_sys_fadvise64
210 common fadvise64_64 sys_fadvise64_64 compat_sys_fadvise64_64
211 common tgkill sys_tgkill
212 common waitpid sys_waitpid
213 common swapoff sys_swapoff
214 common sysinfo sys_sysinfo compat_sys_sysinfo
215 32 ipc sys_ipc compat_sys_ipc
215 64 ipc sys_sparc_ipc
216 32 sigreturn sys_sigreturn sys32_sigreturn
216 64 sigreturn sys_nis_syscall
217 common clone sys_clone
218 common ioprio_get sys_ioprio_get
219 32 adjtimex sys_adjtimex_time32
219 64 adjtimex sys_sparc_adjtimex
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
220 32 sigprocmask sys_sigprocmask compat_sys_sigprocmask
220 64 sigprocmask sys_nis_syscall
221 common create_module sys_ni_syscall
222 common delete_module sys_delete_module
223 common get_kernel_syms sys_ni_syscall
224 common getpgid sys_getpgid
225 common bdflush sys_ni_syscall
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
226 common sysfs sys_sysfs
227 common afs_syscall sys_nis_syscall
228 common setfsuid sys_setfsuid16
229 common setfsgid sys_setfsgid16
230 common _newselect sys_select compat_sys_select
231 32 time sys_time32
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
232 common splice sys_splice
233 32 stime sys_stime32
233 64 stime sys_stime
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
234 common statfs64 sys_statfs64 compat_sys_statfs64
235 common fstatfs64 sys_fstatfs64 compat_sys_fstatfs64
236 common _llseek sys_llseek
237 common mlock sys_mlock
238 common munlock sys_munlock
239 common mlockall sys_mlockall
240 common munlockall sys_munlockall
241 common sched_setparam sys_sched_setparam
242 common sched_getparam sys_sched_getparam
243 common sched_setscheduler sys_sched_setscheduler
244 common sched_getscheduler sys_sched_getscheduler
245 common sched_yield sys_sched_yield
246 common sched_get_priority_max sys_sched_get_priority_max
247 common sched_get_priority_min sys_sched_get_priority_min
248 32 sched_rr_get_interval sys_sched_rr_get_interval_time32
248 64 sched_rr_get_interval sys_sched_rr_get_interval
249 32 nanosleep sys_nanosleep_time32
249 64 nanosleep sys_nanosleep
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
250 32 mremap sys_mremap
250 64 mremap sys_64_mremap
all arch: remove system call sys_sysctl Since commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"), sys_sysctl is actually unavailable: any input can only return an error. We have been warning about people using the sysctl system call for years and believe there are no more users. Even if there are users of this interface if they have not complained or fixed their code by now they probably are not going to, so there is no point in warning them any longer. So completely remove sys_sysctl on all architectures. [nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu] Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Will Deacon <will@kernel.org> [arm/arm64] Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bin Meng <bin.meng@windriver.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: chenzefeng <chenzefeng2@huawei.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christian Brauner <christian@brauner.io> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Diego Elio Pettenò <flameeyes@flameeyes.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kars de Jong <jongk@linux-m68k.org> Cc: Kees Cook <keescook@chromium.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Nick Piggin <npiggin@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Olof Johansson <olof@lixom.net> Cc: Paul Burton <paulburton@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Sargun Dhillon <sargun@sargun.me> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Sven Schnelle <svens@stackframe.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com> Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 17:31:07 -07:00
251 common _sysctl sys_ni_syscall
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
252 common getsid sys_getsid
253 common fdatasync sys_fdatasync
254 32 nfsservctl sys_ni_syscall sys_nis_syscall
254 64 nfsservctl sys_nis_syscall
255 common sync_file_range sys_sync_file_range compat_sys_sync_file_range
256 32 clock_settime sys_clock_settime32
256 64 clock_settime sys_clock_settime
257 32 clock_gettime sys_clock_gettime32
257 64 clock_gettime sys_clock_gettime
258 32 clock_getres sys_clock_getres_time32
258 64 clock_getres sys_clock_getres
259 32 clock_nanosleep sys_clock_nanosleep_time32
259 64 clock_nanosleep sys_clock_nanosleep
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
260 common sched_getaffinity sys_sched_getaffinity compat_sys_sched_getaffinity
261 common sched_setaffinity sys_sched_setaffinity compat_sys_sched_setaffinity
262 32 timer_settime sys_timer_settime32
262 64 timer_settime sys_timer_settime
263 32 timer_gettime sys_timer_gettime32
263 64 timer_gettime sys_timer_gettime
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
264 common timer_getoverrun sys_timer_getoverrun
265 common timer_delete sys_timer_delete
266 common timer_create sys_timer_create compat_sys_timer_create
# 267 was vserver
267 common vserver sys_nis_syscall
268 common io_setup sys_io_setup compat_sys_io_setup
269 common io_destroy sys_io_destroy
270 common io_submit sys_io_submit compat_sys_io_submit
271 common io_cancel sys_io_cancel
272 32 io_getevents sys_io_getevents_time32
272 64 io_getevents sys_io_getevents
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
273 common mq_open sys_mq_open compat_sys_mq_open
274 common mq_unlink sys_mq_unlink
275 32 mq_timedsend sys_mq_timedsend_time32
275 64 mq_timedsend sys_mq_timedsend
276 32 mq_timedreceive sys_mq_timedreceive_time32
276 64 mq_timedreceive sys_mq_timedreceive
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
277 common mq_notify sys_mq_notify compat_sys_mq_notify
278 common mq_getsetattr sys_mq_getsetattr compat_sys_mq_getsetattr
279 common waitid sys_waitid compat_sys_waitid
280 common tee sys_tee
281 common add_key sys_add_key
282 common request_key sys_request_key
283 common keyctl sys_keyctl compat_sys_keyctl
284 common openat sys_openat compat_sys_openat
285 common mkdirat sys_mkdirat
286 common mknodat sys_mknodat
287 common fchownat sys_fchownat
288 32 futimesat sys_futimesat_time32
288 64 futimesat sys_futimesat
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
289 common fstatat64 sys_fstatat64 compat_sys_fstatat64
290 common unlinkat sys_unlinkat
291 common renameat sys_renameat
292 common linkat sys_linkat
293 common symlinkat sys_symlinkat
294 common readlinkat sys_readlinkat
295 common fchmodat sys_fchmodat
296 common faccessat sys_faccessat
297 32 pselect6 sys_pselect6_time32 compat_sys_pselect6_time32
297 64 pselect6 sys_pselect6
298 32 ppoll sys_ppoll_time32 compat_sys_ppoll_time32
298 64 ppoll sys_ppoll
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
299 common unshare sys_unshare
300 common set_robust_list sys_set_robust_list compat_sys_set_robust_list
301 common get_robust_list sys_get_robust_list compat_sys_get_robust_list
302 common migrate_pages sys_migrate_pages
303 common mbind sys_mbind
304 common get_mempolicy sys_get_mempolicy
305 common set_mempolicy sys_set_mempolicy
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
306 common kexec_load sys_kexec_load compat_sys_kexec_load
307 common move_pages sys_move_pages
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
308 common getcpu sys_getcpu
309 common epoll_pwait sys_epoll_pwait compat_sys_epoll_pwait
310 32 utimensat sys_utimensat_time32
310 64 utimensat sys_utimensat
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
311 common signalfd sys_signalfd compat_sys_signalfd
312 common timerfd_create sys_timerfd_create
313 common eventfd sys_eventfd
314 common fallocate sys_fallocate compat_sys_fallocate
315 32 timerfd_settime sys_timerfd_settime32
315 64 timerfd_settime sys_timerfd_settime
316 32 timerfd_gettime sys_timerfd_gettime32
316 64 timerfd_gettime sys_timerfd_gettime
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
317 common signalfd4 sys_signalfd4 compat_sys_signalfd4
318 common eventfd2 sys_eventfd2
319 common epoll_create1 sys_epoll_create1
320 common dup3 sys_dup3
321 common pipe2 sys_pipe2
322 common inotify_init1 sys_inotify_init1
323 common accept4 sys_accept4
324 common preadv sys_preadv compat_sys_preadv
325 common pwritev sys_pwritev compat_sys_pwritev
326 common rt_tgsigqueueinfo sys_rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
327 common perf_event_open sys_perf_event_open
328 32 recvmmsg sys_recvmmsg_time32 compat_sys_recvmmsg_time32
328 64 recvmmsg sys_recvmmsg
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
329 common fanotify_init sys_fanotify_init
330 common fanotify_mark sys_fanotify_mark compat_sys_fanotify_mark
331 common prlimit64 sys_prlimit64
332 common name_to_handle_at sys_name_to_handle_at
333 common open_by_handle_at sys_open_by_handle_at compat_sys_open_by_handle_at
334 32 clock_adjtime sys_clock_adjtime32
334 64 clock_adjtime sys_sparc_clock_adjtime
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
335 common syncfs sys_syncfs
336 common sendmmsg sys_sendmmsg compat_sys_sendmmsg
337 common setns sys_setns
338 common process_vm_readv sys_process_vm_readv
339 common process_vm_writev sys_process_vm_writev
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
340 32 kern_features sys_ni_syscall sys_kern_features
340 64 kern_features sys_kern_features
341 common kcmp sys_kcmp
342 common finit_module sys_finit_module
343 common sched_setattr sys_sched_setattr
344 common sched_getattr sys_sched_getattr
345 common renameat2 sys_renameat2
346 common seccomp sys_seccomp
347 common getrandom sys_getrandom
348 common memfd_create sys_memfd_create
349 common bpf sys_bpf
350 32 execveat sys_execveat sys32_execveat
350 64 execveat sys64_execveat
351 common membarrier sys_membarrier
352 common userfaultfd sys_userfaultfd
353 common bind sys_bind
354 common listen sys_listen
355 common setsockopt sys_setsockopt sys_setsockopt
sparc: add system call table generation support The system call tables are in different format in all architecture and it will be difficult to manually add, modify or delete the syscall table entries in the res- pective files. To make it easy by keeping a script and which will generate the uapi header and syscall table file. This change will also help to unify the implemen- tation across all architectures. The system call table generation script is added in kernel/syscalls directory which contain the scripts to generate both uapi header file and system call table files. The syscall.tbl will be input for the scripts. syscall.tbl contains the list of available system calls along with system call number and corresponding entry point. Add a new system call in this architecture will be possible by adding new entry in the syscall.tbl file. Adding a new table entry consisting of: - System call number. - ABI. - System call name. - Entry point name. - Compat entry name, if required. syscallhdr.sh and syscalltbl.sh will generate uapi header unistd_32/64.h and syscall_table_32/64/c32.h files respect- ively. Both .sh files will parse the content syscall.tbl to generate the header and table files. unistd_32/64.h will be included by uapi/asm/unistd.h and syscall_table_32/64/- c32.h is included by kernel/syscall.S - the real system call table. ARM, s390 and x86 architecuture does have similar support. I leverage their implementation to come up with a generic solution. Signed-off-by: Firoz Khan <firoz.khan@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 10:56:30 +05:30
356 common mlock2 sys_mlock2
357 common copy_file_range sys_copy_file_range
358 common preadv2 sys_preadv2 compat_sys_preadv2
359 common pwritev2 sys_pwritev2 compat_sys_pwritev2
360 common statx sys_statx
361 32 io_pgetevents sys_io_pgetevents_time32 compat_sys_io_pgetevents
361 64 io_pgetevents sys_io_pgetevents
362 common pkey_mprotect sys_pkey_mprotect
363 common pkey_alloc sys_pkey_alloc
364 common pkey_free sys_pkey_free
365 common rseq sys_rseq
# room for arch specific syscalls
392 64 semtimedop sys_semtimedop
393 common semget sys_semget
394 common semctl sys_semctl compat_sys_semctl
395 common shmget sys_shmget
396 common shmctl sys_shmctl compat_sys_shmctl
397 common shmat sys_shmat compat_sys_shmat
398 common shmdt sys_shmdt
399 common msgget sys_msgget
400 common msgsnd sys_msgsnd compat_sys_msgsnd
401 common msgrcv sys_msgrcv compat_sys_msgrcv
402 common msgctl sys_msgctl compat_sys_msgctl
403 32 clock_gettime64 sys_clock_gettime sys_clock_gettime
404 32 clock_settime64 sys_clock_settime sys_clock_settime
405 32 clock_adjtime64 sys_clock_adjtime sys_clock_adjtime
406 32 clock_getres_time64 sys_clock_getres sys_clock_getres
407 32 clock_nanosleep_time64 sys_clock_nanosleep sys_clock_nanosleep
408 32 timer_gettime64 sys_timer_gettime sys_timer_gettime
409 32 timer_settime64 sys_timer_settime sys_timer_settime
410 32 timerfd_gettime64 sys_timerfd_gettime sys_timerfd_gettime
411 32 timerfd_settime64 sys_timerfd_settime sys_timerfd_settime
412 32 utimensat_time64 sys_utimensat sys_utimensat
413 32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64
414 32 ppoll_time64 sys_ppoll compat_sys_ppoll_time64
416 32 io_pgetevents_time64 sys_io_pgetevents sys_io_pgetevents
417 32 recvmmsg_time64 sys_recvmmsg compat_sys_recvmmsg_time64
418 32 mq_timedsend_time64 sys_mq_timedsend sys_mq_timedsend
419 32 mq_timedreceive_time64 sys_mq_timedreceive sys_mq_timedreceive
420 32 semtimedop_time64 sys_semtimedop sys_semtimedop
421 32 rt_sigtimedwait_time64 sys_rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
422 32 futex_time64 sys_futex sys_futex
423 32 sched_rr_get_interval_time64 sys_sched_rr_get_interval sys_sched_rr_get_interval
424 common pidfd_send_signal sys_pidfd_send_signal
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
428 common open_tree sys_open_tree
429 common move_mount sys_move_mount
430 common fsopen sys_fsopen
431 common fsconfig sys_fsconfig
432 common fsmount sys_fsmount
433 common fspick sys_fspick
434 common pidfd_open sys_pidfd_open
# 435 reserved for clone3
436 common close_range sys_close_range
open: introduce openat2(2) syscall /* Background. */ For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). Userspace also has a hard time figuring out whether a particular flag is supported on a particular kernel. While it is now possible with contemporary kernels (thanks to [3]), older kernels will expose unknown flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during openat(2) time matches modern syscall designs and is far more fool-proof. In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_* flag set -- they affect all components of path lookup. We'd therefore like to add a new flag argument. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. */ /* * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. */ struct open_how { /* ... */ }; int openat2(int dfd, const char *pathname, struct open_how *how, size_t size); /* Description. */ The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. resolve Restrict path resolution (in contrast to O_* flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). It also only contains u64s (even though ->mode arguably should be a u16) to avoid having padding fields which are never used in the future. Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). After input from Florian Weimer, the new open_how and flag definitions are inside a separate header from uapi/linux/fcntl.h, to avoid problems that glibc has with importing that header. /* Testing. */ In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). /* Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking auto-mount during resolution). Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[5]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. Another possible avenue of future work would be some kind of CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace which openat2(2) flags and fields are supported by the current kernel (to avoid userspace having to go through several guesses to figure it out). [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: commit 629e014bb834 ("fs: completely ignore unknown open flags") [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ [6]: https://youtu.be/ggD-eb3yPVs Suggested-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18 23:07:59 +11:00
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
mm/madvise: introduce process_madvise() syscall: an external memory hinting API There is usecase that System Management Software(SMS) want to give a memory hint like MADV_[COLD|PAGEEOUT] to other processes and in the case of Android, it is the ActivityManagerService. The information required to make the reclaim decision is not known to the app. Instead, it is known to the centralized userspace daemon(ActivityManagerService), and that daemon must be able to initiate reclaim on its own without any app involvement. To solve the issue, this patch introduces a new syscall process_madvise(2). It uses pidfd of an external process to give the hint. It also supports vector address range because Android app has thousands of vmas due to zygote so it's totally waste of CPU and power if we should call the syscall one by one for each vma.(With testing 2000-vma syscall vs 1-vector syscall, it showed 15% performance improvement. I think it would be bigger in real practice because the testing ran very cache friendly environment). Another potential use case for the vector range is to amortize the cost ofTLB shootdowns for multiple ranges when using MADV_DONTNEED; this could benefit users like TCP receive zerocopy and malloc implementations. In future, we could find more usecases for other advises so let's make it happens as API since we introduce a new syscall at this moment. With that, existing madvise(2) user could replace it with process_madvise(2) with their own pid if they want to have batch address ranges support feature. ince it could affect other process's address range, only privileged process(PTRACE_MODE_ATTACH_FSCREDS) or something else(e.g., being the same UID) gives it the right to ptrace the process could use it successfully. The flag argument is reserved for future use if we need to extend the API. I think supporting all hints madvise has/will supported/support to process_madvise is rather risky. Because we are not sure all hints make sense from external process and implementation for the hint may rely on the caller being in the current context so it could be error-prone. Thus, I just limited hints as MADV_[COLD|PAGEOUT] in this patch. If someone want to add other hints, we could hear the usecase and review it for each hint. It's safer for maintenance rather than introducing a buggy syscall but hard to fix it later. So finally, the API is as follows, ssize_t process_madvise(int pidfd, const struct iovec *iovec, unsigned long vlen, int advice, unsigned int flags); DESCRIPTION The process_madvise() system call is used to give advice or directions to the kernel about the address ranges from external process as well as local process. It provides the advice to address ranges of process described by iovec and vlen. The goal of such advice is to improve system or application performance. The pidfd selects the process referred to by the PID file descriptor specified in pidfd. (See pidofd_open(2) for further information) The pointer iovec points to an array of iovec structures, defined in <sys/uio.h> as: struct iovec { void *iov_base; /* starting address */ size_t iov_len; /* number of bytes to be advised */ }; The iovec describes address ranges beginning at address(iov_base) and with size length of bytes(iov_len). The vlen represents the number of elements in iovec. The advice is indicated in the advice argument, which is one of the following at this moment if the target process specified by pidfd is external. MADV_COLD MADV_PAGEOUT Permission to provide a hint to external process is governed by a ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2). The process_madvise supports every advice madvise(2) has if target process is in same thread group with calling process so user could use process_madvise(2) to extend existing madvise(2) to support vector address ranges. RETURN VALUE On success, process_madvise() returns the number of bytes advised. This return value may be less than the total number of requested bytes, if an error occurred. The caller should check return value to determine whether a partial advice occurred. FAQ: Q.1 - Why does any external entity have better knowledge? Quote from Sandeep "For Android, every application (including the special SystemServer) are forked from Zygote. The reason of course is to share as many libraries and classes between the two as possible to benefit from the preloading during boot. After applications start, (almost) all of the APIs end up calling into this SystemServer process over IPC (binder) and back to the application. In a fully running system, the SystemServer monitors every single process periodically to calculate their PSS / RSS and also decides which process is "important" to the user for interactivity. So, because of how these processes start _and_ the fact that the SystemServer is looping to monitor each process, it does tend to *know* which address range of the application is not used / useful. Besides, we can never rely on applications to clean things up themselves. We've had the "hey app1, the system is low on memory, please trim your memory usage down" notifications for a long time[1]. They rely on applications honoring the broadcasts and very few do. So, if we want to avoid the inevitable killing of the application and restarting it, some way to be able to tell the OS about unimportant memory in these applications will be useful. - ssp Q.2 - How to guarantee the race(i.e., object validation) between when giving a hint from an external process and get the hint from the target process? process_madvise operates on the target process's address space as it exists at the instant that process_madvise is called. If the space target process can run between the time the process_madvise process inspects the target process address space and the time that process_madvise is actually called, process_madvise may operate on memory regions that the calling process does not expect. It's the responsibility of the process calling process_madvise to close this race condition. For example, the calling process can suspend the target process with ptrace, SIGSTOP, or the freezer cgroup so that it doesn't have an opportunity to change its own address space before process_madvise is called. Another option is to operate on memory regions that the caller knows a priori will be unchanged in the target process. Yet another option is to accept the race for certain process_madvise calls after reasoning that mistargeting will do no harm. The suggested API itself does not provide synchronization. It also apply other APIs like move_pages, process_vm_write. The race isn't really a problem though. Why is it so wrong to require that callers do their own synchronization in some manner? Nobody objects to write(2) merely because it's possible for two processes to open the same file and clobber each other's writes --- instead, we tell people to use flock or something. Think about mmap. It never guarantees newly allocated address space is still valid when the user tries to access it because other threads could unmap the memory right before. That's where we need synchronization by using other API or design from userside. It shouldn't be part of API itself. If someone needs more fine-grained synchronization rather than process level, there were two ideas suggested - cookie[2] and anon-fd[3]. Both are applicable via using last reserved argument of the API but I don't think it's necessary right now since we have already ways to prevent the race so don't want to add additional complexity with more fine-grained optimization model. To make the API extend, it reserved an unsigned long as last argument so we could support it in future if someone really needs it. Q.3 - Why doesn't ptrace work? Injecting an madvise in the target process using ptrace would not work for us because such injected madvise would have to be executed by the target process, which means that process would have to be runnable and that creates the risk of the abovementioned race and hinting a wrong VMA. Furthermore, we want to act the hint in caller's context, not the callee's, because the callee is usually limited in cpuset/cgroups or even freezed state so they can't act by themselves quick enough, which causes more thrashing/kill. It doesn't work if the target process are ptraced(e.g., strace, debugger, minidump) because a process can have at most one ptracer. [1] https://developer.android.com/topic/performance/memory" [2] process_getinfo for getting the cookie which is updated whenever vma of process address layout are changed - Daniel Colascione - https://lore.kernel.org/lkml/20190520035254.57579-1-minchan@kernel.org/T/#m7694416fd179b2066a2c62b5b139b14e3894e224 [3] anonymous fd which is used for the object(i.e., address range) validation - Michal Hocko - https://lore.kernel.org/lkml/20200120112722.GY18451@dhcp22.suse.cz/ [minchan@kernel.org: fix process_madvise build break for arm64] Link: http://lkml.kernel.org/r/20200303145756.GA219683@google.com [minchan@kernel.org: fix build error for mips of process_madvise] Link: http://lkml.kernel.org/r/20200508052517.GA197378@google.com [akpm@linux-foundation.org: fix patch ordering issue] [akpm@linux-foundation.org: fix arm64 whoops] [minchan@kernel.org: make process_madvise() vlen arg have type size_t, per Florian] [akpm@linux-foundation.org: fix i386 build] [sfr@canb.auug.org.au: fix syscall numbering] Link: https://lkml.kernel.org/r/20200905142639.49fc3f1a@canb.auug.org.au [sfr@canb.auug.org.au: madvise.c needs compat.h] Link: https://lkml.kernel.org/r/20200908204547.285646b4@canb.auug.org.au [minchan@kernel.org: fix mips build] Link: https://lkml.kernel.org/r/20200909173655.GC2435453@google.com [yuehaibing@huawei.com: remove duplicate header which is included twice] Link: https://lkml.kernel.org/r/20200915121550.30584-1-yuehaibing@huawei.com [minchan@kernel.org: do not use helper functions for process_madvise] Link: https://lkml.kernel.org/r/20200921175539.GB387368@google.com [akpm@linux-foundation.org: pidfd_get_pid() gained an argument] [sfr@canb.auug.org.au: fix up for "iov_iter: transparently handle compat iovecs in import_iovec"] Link: https://lkml.kernel.org/r/20200928212542.468e1fef@canb.auug.org.au Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <christian@brauner.io> Cc: Daniel Colascione <dancol@google.com> Cc: Jann Horn <jannh@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Dias <joaodias@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksandr Natalenko <oleksandr@redhat.com> Cc: Sandeep Patil <sspatil@google.com> Cc: SeongJae Park <sj38.park@gmail.com> Cc: SeongJae Park <sjpark@amazon.de> Cc: Shakeel Butt <shakeelb@google.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Tim Murray <timmurray@google.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Florian Weimer <fw@deneb.enyo.de> Cc: <linux-man@vger.kernel.org> Link: http://lkml.kernel.org/r/20200302193630.68771-3-minchan@kernel.org Link: http://lkml.kernel.org/r/20200508183320.GA125527@google.com Link: http://lkml.kernel.org/r/20200622192900.22757-4-minchan@kernel.org Link: https://lkml.kernel.org/r/20200901000633.1920247-4-minchan@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-17 16:14:59 -07:00
440 common process_madvise sys_process_madvise
441 common epoll_pwait2 sys_epoll_pwait2 compat_sys_epoll_pwait2
fs: add mount_setattr() This implements the missing mount_setattr() syscall. While the new mount api allows to change the properties of a superblock there is currently no way to change the properties of a mount or a mount tree using file descriptors which the new mount api is based on. In addition the old mount api has the restriction that mount options cannot be applied recursively. This hasn't changed since changing mount options on a per-mount basis was implemented in [1] and has been a frequent request not just for convenience but also for security reasons. The legacy mount syscall is unable to accommodate this behavior without introducing a whole new set of flags because MS_REC | MS_REMOUNT | MS_BIND | MS_RDONLY | MS_NOEXEC | [...] only apply the mount option to the topmost mount. Changing MS_REC to apply to the whole mount tree would mean introducing a significant uapi change and would likely cause significant regressions. The new mount_setattr() syscall allows to recursively clear and set mount options in one shot. Multiple calls to change mount options requesting the same changes are idempotent: int mount_setattr(int dfd, const char *path, unsigned flags, struct mount_attr *uattr, size_t usize); Flags to modify path resolution behavior are specified in the @flags argument. Currently, AT_EMPTY_PATH, AT_RECURSIVE, AT_SYMLINK_NOFOLLOW, and AT_NO_AUTOMOUNT are supported. If useful, additional lookup flags to restrict path resolution as introduced with openat2() might be supported in the future. The mount_setattr() syscall can be expected to grow over time and is designed with extensibility in mind. It follows the extensible syscall pattern we have used with other syscalls such as openat2(), clone3(), sched_{set,get}attr(), and others. The set of mount options is passed in the uapi struct mount_attr which currently has the following layout: struct mount_attr { __u64 attr_set; __u64 attr_clr; __u64 propagation; __u64 userns_fd; }; The @attr_set and @attr_clr members are used to clear and set mount options. This way a user can e.g. request that a set of flags is to be raised such as turning mounts readonly by raising MOUNT_ATTR_RDONLY in @attr_set while at the same time requesting that another set of flags is to be lowered such as removing noexec from a mount tree by specifying MOUNT_ATTR_NOEXEC in @attr_clr. Note, since the MOUNT_ATTR_<atime> values are an enum starting from 0, not a bitmap, users wanting to transition to a different atime setting cannot simply specify the atime setting in @attr_set, but must also specify MOUNT_ATTR__ATIME in the @attr_clr field. So we ensure that MOUNT_ATTR__ATIME can't be partially set in @attr_clr and that @attr_set can't have any atime bits set if MOUNT_ATTR__ATIME isn't set in @attr_clr. The @propagation field lets callers specify the propagation type of a mount tree. Propagation is a single property that has four different settings and as such is not really a flag argument but an enum. Specifically, it would be unclear what setting and clearing propagation settings in combination would amount to. The legacy mount() syscall thus forbids the combination of multiple propagation settings too. The goal is to keep the semantics of mount propagation somewhat simple as they are overly complex as it is. The @userns_fd field lets user specify a user namespace whose idmapping becomes the idmapping of the mount. This is implemented and explained in detail in the next patch. [1]: commit 2e4b7fcd9260 ("[PATCH] r/o bind mounts: honor mount writer counts at remount") Link: https://lore.kernel.org/r/20210121131959.646623-35-christian.brauner@ubuntu.com Cc: David Howells <dhowells@redhat.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-21 14:19:53 +01:00
442 common mount_setattr sys_mount_setattr
443 common quotactl_fd sys_quotactl_fd
444 common landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self
# 447 reserved for memfd_secret
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
cachestat: wire up cachestat for other architectures cachestat is previously only wired in for x86 (and architectures using the generic unistd.h table): https://lore.kernel.org/lkml/20230503013608.2431726-1-nphamcs@gmail.com/ This patch wires cachestat in for all the other architectures. [nphamcs@gmail.com: wire up cachestat for arm64] Link: https://lkml.kernel.org/r/20230511092843.3896327-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20230510195806.2902878-1-nphamcs@gmail.com Signed-off-by: Nhat Pham <nphamcs@gmail.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-10 12:58:06 -07:00
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
arch: Reserve map_shadow_stack() syscall number for all architectures commit c35559f94ebc ("x86/shstk: Introduce map_shadow_stack syscall") recently added support for map_shadow_stack() but it is limited to x86 only for now. There is a possibility that other architectures (namely, arm64 and RISC-V), that are implementing equivalent support for shadow stacks, might need to add support for it. Independent of that, reserving arch-specific syscall numbers in the syscall tables of all architectures is good practice and would help avoid future conflicts. map_shadow_stack() is marked as a conditional syscall in sys_ni.c. Adding it to the syscall tables of other architectures is harmless and would return ENOSYS when exercised. Note, map_shadow_stack() was assigned #453 during the merge process since #452 was taken by fchmodat2(). For Powerpc, map it to sys_ni_syscall() as is the norm for Powerpc syscall tables. For Alpha, map_shadow_stack() takes up #563 as Alpha still diverges from the common syscall numbering system in the other architectures. Link: https://lore.kernel.org/lkml/20230515212255.GA562920@debug.ba.rivosinc.com/ Link: https://lore.kernel.org/lkml/b402b80b-a7c6-4ef0-b977-c0f5f582b78a@sirena.org.uk/ Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-09-14 18:58:03 +00:00
453 common map_shadow_stack sys_map_shadow_stack
454 common futex_wake sys_futex_wake
455 common futex_wait sys_futex_wait
456 common futex_requeue sys_futex_requeue
457 common statmount sys_statmount
458 common listmount sys_listmount
lsm/stable-6.8 PR 20240105 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmWYKUIUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNyHw/+IKnqL1MZ5QS+/HtSzi4jCL47N9yZ OHLol6XswyEGHH9myKPPGnT5lVA93v98v4ty2mws7EJUSGZQQUntYBPbU9Gi40+B XDzYSRocoj96sdlKeOJMgaWo3NBRD9HYSoGPDNWZixy6m+bLPk/Dqhn3FabKf1lo 2qQSmstvChFRmVNkmgaQnBCAtWVqla4EJEL0EKX6cspHbuzRNTeJdTPn6Q/zOUVL O2znOZuEtSVpYS7yg3uJT0hHD8H0GnIciAcDAhyPSBL5Uk5l6gwJiACcdRfLRbgp QM5Z4qUFdKljV5XBCzYnfhhrx1df08h1SG84El8UK8HgTTfOZfYmawByJRWNJSQE TdCmtyyvEbfb61CKBFVwD7Tzb9/y8WgcY5N3Un8uCQqRzFIO+6cghHri5NrVhifp nPFlP4klxLHh3d7ZVekLmCMHbpaacRyJKwLy+f/nwbBEID47jpPkvZFIpbalat+r QaKRBNWdTeV+GZ+Yu0uWsI029aQnpcO1kAnGg09fl6b/dsmxeKOVWebir25AzQ++ a702S8HRmj80X+VnXHU9a64XeGtBH7Nq0vu0lGHQPgwhSx/9P6/qICEPwsIriRjR I9OulWt4OBPDtlsonHFgDs+lbnd0Z0GJUwYT8e9pjRDMxijVO9lhAXyglVRmuNR8 to2ByKP5BO+Vh8Y= =Py+n -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull security module updates from Paul Moore: - Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and lsm_set_self_attr(). The first syscall simply lists the LSMs enabled, while the second and third get and set the current process' LSM attributes. Yes, these syscalls may provide similar functionality to what can be found under /proc or /sys, but they were designed to support multiple, simultaneaous (stacked) LSMs from the start as opposed to the current /proc based solutions which were created at a time when only one LSM was allowed to be active at a given time. We have spent considerable time discussing ways to extend the existing /proc interfaces to support multiple, simultaneaous LSMs and even our best ideas have been far too ugly to support as a kernel API; after +20 years in the kernel, I felt the LSM layer had established itself enough to justify a handful of syscalls. Support amongst the individual LSM developers has been nearly unanimous, with a single objection coming from Tetsuo (TOMOYO) as he is worried that the LSM_ID_XXX token concept will make it more difficult for out-of-tree LSMs to survive. Several members of the LSM community have demonstrated the ability for out-of-tree LSMs to continue to exist by picking high/unused LSM_ID values as well as pointing out that many kernel APIs rely on integer identifiers, e.g. syscalls (!), but unfortunately Tetsuo's objections remain. My personal opinion is that while I have no interest in penalizing out-of-tree LSMs, I'm not going to penalize in-tree development to support out-of-tree development, and I view this as a necessary step forward to support the push for expanded LSM stacking and reduce our reliance on /proc and /sys which has occassionally been problematic for some container users. Finally, we have included the linux-api folks on (all?) recent revisions of the patchset and addressed all of their concerns. - Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit ioctls on 64-bit systems problem. This patch includes support for all of the existing LSMs which provide ioctl hooks, although it turns out only SELinux actually cares about the individual ioctls. It is worth noting that while Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this patch, they did both indicate they are okay with the changes. - Fix a potential memory leak in the CALIPSO code when IPv6 is disabled at boot. While it's good that we are fixing this, I doubt this is something users are seeing in the wild as you need to both disable IPv6 and then attempt to configure IPv6 labeled networking via NetLabel/CALIPSO; that just doesn't make much sense. Normally this would go through netdev, but Jakub asked me to take this patch and of all the trees I maintain, the LSM tree seemed like the best fit. - Update the LSM MAINTAINERS entry with additional information about our process docs, patchwork, bug reporting, etc. I also noticed that the Lockdown LSM is missing a dedicated MAINTAINERS entry so I've added that to the pull request. I've been working with one of the major Lockdown authors/contributors to see if they are willing to step up and assume a Lockdown maintainer role; hopefully that will happen soon, but in the meantime I'll continue to look after it. - Add a handful of mailmap entries for Serge Hallyn and myself. * tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits) lsm: new security_file_ioctl_compat() hook lsm: Add a __counted_by() annotation to lsm_ctx.ctx calipso: fix memory leak in netlbl_calipso_add_pass() selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test MAINTAINERS: add an entry for the lockdown LSM MAINTAINERS: update the LSM entry mailmap: add entries for Serge Hallyn's dead accounts mailmap: update/replace my old email addresses lsm: mark the lsm_id variables are marked as static lsm: convert security_setselfattr() to use memdup_user() lsm: align based on pointer length in lsm_fill_user_ctx() lsm: consolidate buffer size handling into lsm_fill_user_ctx() lsm: correct error codes in security_getselfattr() lsm: cleanup the size counters in security_getselfattr() lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation lsm: drop LSM_ID_IMA LSM: selftests for Linux Security Module syscalls SELinux: Add selfattr hooks AppArmor: Add selfattr hooks Smack: implement setselfattr and getselfattr hooks ...
2024-01-09 12:57:46 -08:00
459 common lsm_get_self_attr sys_lsm_get_self_attr
460 common lsm_set_self_attr sys_lsm_set_self_attr
461 common lsm_list_modules sys_lsm_list_modules
mseal: wire up mseal syscall Patch series "Introduce mseal", v10. This patchset proposes a new mseal() syscall for the Linux kernel. In a nutshell, mseal() protects the VMAs of a given virtual memory range against modifications, such as changes to their permission bits. Modern CPUs support memory permissions, such as the read/write (RW) and no-execute (NX) bits. Linux has supported NX since the release of kernel version 2.6.8 in August 2004 [1]. The memory permission feature improves the security stance on memory corruption bugs, as an attacker cannot simply write to arbitrary memory and point the code to it. The memory must be marked with the X bit, or else an exception will occur. Internally, the kernel maintains the memory permissions in a data structure called VMA (vm_area_struct). mseal() additionally protects the VMA itself against modifications of the selected seal type. Memory sealing is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. Memory sealing can automatically be applied by the runtime loader to seal .text and .rodata pages and applications can additionally seal security critical data at runtime. A similar feature already exists in the XNU kernel with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall [4]. Also, Chrome wants to adopt this feature for their CFI work [2] and this patchset has been designed to be compatible with the Chrome use case. Two system calls are involved in sealing the map: mmap() and mseal(). The new mseal() is an syscall on 64 bit CPU, and with following signature: int mseal(void addr, size_t len, unsigned long flags) addr/len: memory range. flags: reserved. mseal() blocks following operations for the given memory range. 1> Unmapping, moving to another location, and shrinking the size, via munmap() and mremap(), can leave an empty space, therefore can be replaced with a VMA with a new set of attributes. 2> Moving or expanding a different VMA into the current location, via mremap(). 3> Modifying a VMA via mmap(MAP_FIXED). 4> Size expansion, via mremap(), does not appear to pose any specific risks to sealed VMAs. It is included anyway because the use case is unclear. In any case, users can rely on merging to expand a sealed VMA. 5> mprotect() and pkey_mprotect(). 6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous memory, when users don't have write permission to the memory. Those behaviors can alter region contents by discarding pages, effectively a memset(0) for anonymous memory. The idea that inspired this patch comes from Stephen Röttger’s work in V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this API. Indeed, the Chrome browser has very specific requirements for sealing, which are distinct from those of most applications. For example, in the case of libc, sealing is only applied to read-only (RO) or read-execute (RX) memory segments (such as .text and .RELRO) to prevent them from becoming writable, the lifetime of those mappings are tied to the lifetime of the process. Chrome wants to seal two large address space reservations that are managed by different allocators. The memory is mapped RW- and RWX respectively but write access to it is restricted using pkeys (or in the future ARM permission overlay extensions). The lifetime of those mappings are not tied to the lifetime of the process, therefore, while the memory is sealed, the allocators still need to free or discard the unused memory. For example, with madvise(DONTNEED). However, always allowing madvise(DONTNEED) on this range poses a security risk. For example if a jump instruction crosses a page boundary and the second page gets discarded, it will overwrite the target bytes with zeros and change the control flow. Checking write-permission before the discard operation allows us to control when the operation is valid. In this case, the madvise will only succeed if the executing thread has PKEY write permissions and PKRU changes are protected in software by control-flow integrity. Although the initial version of this patch series is targeting the Chrome browser as its first user, it became evident during upstream discussions that we would also want to ensure that the patch set eventually is a complete solution for memory sealing and compatible with other use cases. The specific scenario currently in mind is glibc's use case of loading and sealing ELF executables. To this end, Stephen is working on a change to glibc to add sealing support to the dynamic linker, which will seal all non-writable segments at startup. Once this work is completed, all applications will be able to automatically benefit from these new protections. In closing, I would like to formally acknowledge the valuable contributions received during the RFC process, which were instrumental in shaping this patch: Jann Horn: raising awareness and providing valuable insights on the destructive madvise operations. Liam R. Howlett: perf optimization. Linus Torvalds: assisting in defining system call signature and scope. Theo de Raadt: sharing the experiences and insight gained from implementing mimmutable() in OpenBSD. MM perf benchmarks ================== This patch adds a loop in the mprotect/munmap/madvise(DONTNEED) to check the VMAs’ sealing flag, so that no partial update can be made, when any segment within the given memory range is sealed. To measure the performance impact of this loop, two tests are developed. [8] The first is measuring the time taken for a particular system call, by using clock_gettime(CLOCK_MONOTONIC). The second is using PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have similar results. The tests have roughly below sequence: for (i = 0; i < 1000, i++) create 1000 mappings (1 page per VMA) start the sampling for (j = 0; j < 1000, j++) mprotect one mapping stop and save the sample delete 1000 mappings calculates all samples. Below tests are performed on Intel(R) Pentium(R) Gold 7505 @ 2.00GHz, 4G memory, Chromebook. Based on the latest upstream code: The first test (measuring time) syscall__ vmas t t_mseal delta_ns per_vma % munmap__ 1 909 944 35 35 104% munmap__ 2 1398 1502 104 52 107% munmap__ 4 2444 2594 149 37 106% munmap__ 8 4029 4323 293 37 107% munmap__ 16 6647 6935 288 18 104% munmap__ 32 11811 12398 587 18 105% mprotect 1 439 465 26 26 106% mprotect 2 1659 1745 86 43 105% mprotect 4 3747 3889 142 36 104% mprotect 8 6755 6969 215 27 103% mprotect 16 13748 14144 396 25 103% mprotect 32 27827 28969 1142 36 104% madvise_ 1 240 262 22 22 109% madvise_ 2 366 442 76 38 121% madvise_ 4 623 751 128 32 121% madvise_ 8 1110 1324 215 27 119% madvise_ 16 2127 2451 324 20 115% madvise_ 32 4109 4642 534 17 113% The second test (measuring cpu cycle) syscall__ vmas cpu cmseal delta_cpu per_vma % munmap__ 1 1790 1890 100 100 106% munmap__ 2 2819 3033 214 107 108% munmap__ 4 4959 5271 312 78 106% munmap__ 8 8262 8745 483 60 106% munmap__ 16 13099 14116 1017 64 108% munmap__ 32 23221 24785 1565 49 107% mprotect 1 906 967 62 62 107% mprotect 2 3019 3203 184 92 106% mprotect 4 6149 6569 420 105 107% mprotect 8 9978 10524 545 68 105% mprotect 16 20448 21427 979 61 105% mprotect 32 40972 42935 1963 61 105% madvise_ 1 434 497 63 63 115% madvise_ 2 752 899 147 74 120% madvise_ 4 1313 1513 200 50 115% madvise_ 8 2271 2627 356 44 116% madvise_ 16 4312 4883 571 36 113% madvise_ 32 8376 9319 943 29 111% Based on the result, for 6.8 kernel, sealing check adds 20-40 nano seconds, or around 50-100 CPU cycles, per VMA. In addition, I applied the sealing to 5.10 kernel: The first test (measuring time) syscall__ vmas t tmseal delta_ns per_vma % munmap__ 1 357 390 33 33 109% munmap__ 2 442 463 21 11 105% munmap__ 4 614 634 20 5 103% munmap__ 8 1017 1137 120 15 112% munmap__ 16 1889 2153 263 16 114% munmap__ 32 4109 4088 -21 -1 99% mprotect 1 235 227 -7 -7 97% mprotect 2 495 464 -30 -15 94% mprotect 4 741 764 24 6 103% mprotect 8 1434 1437 2 0 100% mprotect 16 2958 2991 33 2 101% mprotect 32 6431 6608 177 6 103% madvise_ 1 191 208 16 16 109% madvise_ 2 300 324 24 12 108% madvise_ 4 450 473 23 6 105% madvise_ 8 753 806 53 7 107% madvise_ 16 1467 1592 125 8 108% madvise_ 32 2795 3405 610 19 122% The second test (measuring cpu cycle) syscall__ nbr_vma cpu cmseal delta_cpu per_vma % munmap__ 1 684 715 31 31 105% munmap__ 2 861 898 38 19 104% munmap__ 4 1183 1235 51 13 104% munmap__ 8 1999 2045 46 6 102% munmap__ 16 3839 3816 -23 -1 99% munmap__ 32 7672 7887 216 7 103% mprotect 1 397 443 46 46 112% mprotect 2 738 788 50 25 107% mprotect 4 1221 1256 35 9 103% mprotect 8 2356 2429 72 9 103% mprotect 16 4961 4935 -26 -2 99% mprotect 32 9882 10172 291 9 103% madvise_ 1 351 380 29 29 108% madvise_ 2 565 615 49 25 109% madvise_ 4 872 933 61 15 107% madvise_ 8 1508 1640 132 16 109% madvise_ 16 3078 3323 245 15 108% madvise_ 32 5893 6704 811 25 114% For 5.10 kernel, sealing check adds 0-15 ns in time, or 10-30 CPU cycles, there is even decrease in some cases. It might be interesting to compare 5.10 and 6.8 kernel The first test (measuring time) syscall__ vmas t_5_10 t_6_8 delta_ns per_vma % munmap__ 1 357 909 552 552 254% munmap__ 2 442 1398 956 478 316% munmap__ 4 614 2444 1830 458 398% munmap__ 8 1017 4029 3012 377 396% munmap__ 16 1889 6647 4758 297 352% munmap__ 32 4109 11811 7702 241 287% mprotect 1 235 439 204 204 187% mprotect 2 495 1659 1164 582 335% mprotect 4 741 3747 3006 752 506% mprotect 8 1434 6755 5320 665 471% mprotect 16 2958 13748 10790 674 465% mprotect 32 6431 27827 21397 669 433% madvise_ 1 191 240 49 49 125% madvise_ 2 300 366 67 33 122% madvise_ 4 450 623 173 43 138% madvise_ 8 753 1110 357 45 147% madvise_ 16 1467 2127 660 41 145% madvise_ 32 2795 4109 1314 41 147% The second test (measuring cpu cycle) syscall__ vmas cpu_5_10 c_6_8 delta_cpu per_vma % munmap__ 1 684 1790 1106 1106 262% munmap__ 2 861 2819 1958 979 327% munmap__ 4 1183 4959 3776 944 419% munmap__ 8 1999 8262 6263 783 413% munmap__ 16 3839 13099 9260 579 341% munmap__ 32 7672 23221 15549 486 303% mprotect 1 397 906 509 509 228% mprotect 2 738 3019 2281 1140 409% mprotect 4 1221 6149 4929 1232 504% mprotect 8 2356 9978 7622 953 423% mprotect 16 4961 20448 15487 968 412% mprotect 32 9882 40972 31091 972 415% madvise_ 1 351 434 82 82 123% madvise_ 2 565 752 186 93 133% madvise_ 4 872 1313 442 110 151% madvise_ 8 1508 2271 763 95 151% madvise_ 16 3078 4312 1234 77 140% madvise_ 32 5893 8376 2483 78 142% From 5.10 to 6.8 munmap: added 250-550 ns in time, or 500-1100 in cpu cycle, per vma. mprotect: added 200-750 ns in time, or 500-1200 in cpu cycle, per vma. madvise: added 33-50 ns in time, or 70-110 in cpu cycle, per vma. In comparison to mseal, which adds 20-40 ns or 50-100 CPU cycles, the increase from 5.10 to 6.8 is significantly larger, approximately ten times greater for munmap and mprotect. When I discuss the mm performance with Brian Makin, an engineer who worked on performance, it was brought to my attention that such performance benchmarks, which measuring millions of mm syscall in a tight loop, may not accurately reflect real-world scenarios, such as that of a database service. Also this is tested using a single HW and ChromeOS, the data from another HW or distribution might be different. It might be best to take this data with a grain of salt. This patch (of 5): Wire up mseal syscall for all architectures. Link: https://lkml.kernel.org/r/20240415163527.626541-1-jeffxu@chromium.org Link: https://lkml.kernel.org/r/20240415163527.626541-2-jeffxu@chromium.org Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Jann Horn <jannh@google.com> [Bug #2] Cc: Jeff Xu <jeffxu@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Stephen Röttger <sroettger@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Amer Al Shanawany <amer.shanawany@gmail.com> Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-15 16:35:20 +00:00
462 common mseal sys_mseal