File Sealing & memfd_create()
From: David Herrmann <dh.herrmann@gmail.com>
To: linux-kernel@vger.kernel.org
Subject: [PATCH 0/6] File Sealing & memfd_create()
Date: Wed, 19 Mar 2014 20:06:45 +0100
Message-ID: <1395256011-2423-1-git-send-email-dh.herrmann@gmail.com>
Cc: Hugh Dickins <hughd@google.com>,
    Alexander Viro <viro@zeniv.linux.org.uk>,
    Matthew Wilcox <matthew@wil.cx>,
    Karol Lewandowski <k.lewandowsk@samsung.com>,
    Kay Sievers <kay@vrfy.org>,
    Daniel Mack <zonque@gmail.com>,
    Lennart Poettering <lennart@poettering.net>,
    Kristian Høgsberg <krh@bitplanet.net>,
    john.stultz@linaro.org,
    Greg Kroah-Hartman <greg@kroah.com>,
    Tejun Heo <tj@kernel.org>,
    Johannes Weiner <hannes@cmpxchg.org>,
    dri-devel@lists.freedesktop.org,
    linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org,
    Andrew Morton <akpm@linux-foundation.org>,
    Linus Torvalds <torvalds@linux-foundation.org>,
    Ryan Lortie <desrt@desrt.ca>,
    "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
    David Herrmann <dh.herrmann@gmail.com>

Hi

This series introduces the concept of "file sealing". Sealing a file restricts
the set of allowed operations on the file in question. Multiple seals are
defined and each seal will cause a different set of operations to return EPERM
if it is set. The following seals are introduced:

 * SEAL_SHRINK: If set, the inode size cannot be reduced
 * SEAL_GROW: If set, the inode size cannot be increased
 * SEAL_WRITE: If set, the file content cannot be modified

Unlike existing techniques that provide similar protection, sealing allows
file-sharing without any trust-relationship. This is enforced by rejecting seal
modifications if you don't own an exclusive reference to the given file. So if
you own a file-descriptor, you can be sure that no-one besides you can modify
the seals on the given file. This allows mapping shared files from untrusted
parties without the fear of the file getting truncated or modified by an
attacker.

Several use-cases exist that could make great use of sealing:

  1) Graphics Compositors
     If a graphics client creates a memory-backed render-buffer and passes a
     file-descriptor to it to the graphics server for display, the server
     _has_ to set up SIGBUS handlers whenever mapping the given file.
     Otherwise, the client might run ftruncate() or O_TRUNC on the file in
     parallel, thus crashing the server.
     With sealing, a compositor can reject any incoming file-descriptor that
     does _not_ have SEAL_SHRINK set. This way, any memory-mappings are
     guaranteed to stay accessible. Furthermore, we still allow clients to
     increase the buffer-size in case they want to resize the render-buffer
     for the next frame. We also allow parallel writes so the client can
     render new frames into the same buffer (the client is responsible for
     never rendering into a front-buffer if you want to avoid artifacts).
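
As a rough illustration of the compositor check described above (not part of
this series), the sketch below shows a client allocating a shrink-sealed
buffer and a server refusing to map anything that lacks the seal. It assumes
the interface as it was later merged into mainline, i.e. F_ADD_SEALS,
F_GET_SEALS, F_SEAL_SHRINK and MFD_ALLOW_SEALING, rather than the
SHMEM_SET_SEALS / SHMEM_GET_SEALS / SEAL_SHRINK names proposed here. The
helper names create_render_buffer() and map_client_buffer() are hypothetical.

```c
#define _GNU_SOURCE             /* for memfd_create() and the F_*_SEALS commands */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Client side (hypothetical helper): allocate a render buffer that may
 * still grow or be written to, but can never shrink.
 * ASSUMPTION: mainline names are used, not the names from this series.
 * Returns a file descriptor, or -1 on error.
 */
static int create_render_buffer(size_t size)
{
    int fd = memfd_create("render-buffer", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) < 0 ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Server side (hypothetical helper): map a client-supplied buffer only if
 * it is sealed against shrinking, so no SIGBUS handler is needed.
 * Returns the read-only mapping and stores its length in *len_out, or
 * returns NULL if the seal is missing or any call fails.
 */
static void *map_client_buffer(int fd, size_t *len_out)
{
    assert(len_out != NULL);

    int seals = fcntl(fd, F_GET_SEALS);
    if (seals < 0 || !(seals & F_SEAL_SHRINK))
        return NULL;            /* unsealed, or not a sealable file */

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0)
        return NULL;

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return p;
}
```

With such a buffer, a later ftruncate() to a smaller size is expected to fail
while growing the file still succeeds, which matches the resize behaviour the
compositor use-case asks for.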

     Real use-case: Wayland wl_shm buffers can be transparently converted

  2) General-purpose IPC
     IPC mechanisms that do not require a mutual trust-relationship (like
     dbus) cannot do zero-copy so far. With sealing, zero-copy can be easily
     done by sharing a file-descriptor that has
     SEAL_SHRINK | SEAL_GROW | SEAL_WRITE set. This way, the source can store
     sensitive data in the file, seal the file and then pass it to the
     destination. The destination verifies these seals are set and then can
     parse the message in-line.
     Note that these files are usually one-shot files. Without any
     trust-relationship, a destination can notify the source that it released
     a file again, but a source can never rely on it. So unless the
     destination releases the file, a source cannot clear the seals for
     modification again. However, this is inherent to situations without any
     trust-relationship.

     Real use-case: kdbus messages already use a similar interface and can be
                    transparently converted to use these seals

Other similar use-cases exist (e.g., audio), but these two I am personally
working on. Interest in this interface has been raised from several other camps
and I've put respective maintainers into CC. If more information on these
use-cases is needed, I think they can give some insights.

The API introduced by this patchset is:

 * fcntl() extension:
   Two new fcntl() commands are added that allow retrieving (SHMEM_GET_SEALS)
   and setting (SHMEM_SET_SEALS) seals on a file. Only shmfs implements them
   so far and there is no intention to implement them on other file-systems.
   All shmfs based files support sealing.

   Patch 2/6

 * memfd_create() syscall:
   The new memfd_create() syscall is a public frontend to the shmem_file_new()
   interface in the kernel. It avoids the need of a local shmfs mount-point
   (as requested by android people) and acts more like MAP_ANON than
   O_TMPFILE.

   Patch 3/6

The other 4 patches are cleanups, self-tests and docs.

The commit-messages explain the API extensions in detail.
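
For orientation only, the one-shot message flow of use-case 2 and the two API
pieces listed above could be sketched as follows. This is a sketch under
assumptions, not code from the series: it uses the names of the interface as
later merged into mainline (memfd_create() with MFD_ALLOW_SEALING, and fcntl()
with F_ADD_SEALS / F_GET_SEALS and F_SEAL_*) instead of the SHMEM_SET_SEALS /
SHMEM_GET_SEALS commands proposed here, and in that merged variant seals can
only be added, never cleared. The helpers make_sealed_message() and
message_is_immutable() are hypothetical names.

```c
#define _GNU_SOURCE             /* for memfd_create() and the F_*_SEALS commands */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* The three seals the receiver insists on (mainline names, an assumption). */
#define MSG_SEALS (F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE)

/*
 * Source side (hypothetical helper): store a message in an anonymous
 * shmem-backed file and seal it so size and content are frozen.
 * Returns a file descriptor ready to be passed on, or -1 on error.
 */
static int make_sealed_message(const void *data, size_t len)
{
    assert(data != NULL);

    int fd = memfd_create("message", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;

    /* Write the payload first; F_SEAL_WRITE would reject it afterwards. */
    if (write(fd, data, len) != (ssize_t)len ||
        fcntl(fd, F_ADD_SEALS, MSG_SEALS) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Destination side (hypothetical helper): accept a descriptor for in-line
 * parsing only if all three seals are present.
 */
static int message_is_immutable(int fd)
{
    int seals = fcntl(fd, F_GET_SEALS);
    return seals >= 0 && (seals & MSG_SEALS) == MSG_SEALS;
}
```

Once message_is_immutable() succeeds, the destination can read or map the
file without copying it, since neither write() nor ftruncate() on any
descriptor for that file should be able to change it any more.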
Man-page proposals are also provided. Last but not least, the extensive
self-tests document the intended behavior, in case it is still not clear.

Technically, sealing and memfd_create() are independent, but the described
use-cases would greatly benefit from the combination of both. Hence, I merged
them into the same series. Please also note that this series is based on
earlier works (ashmem, memfd, shmgetfd, ..) and unifies these attempts.

Comments welcome!

Thanks
David

David Herrmann (4):
  fs: fix i_writecount on shmem and friends
  shm: add sealing API
  shm: add memfd_create() syscall
  selftests: add memfd_create() + sealing tests

David Herrmann (2): (man-pages)
  fcntl.2: document SHMEM_SET/GET_SEALS commands
  memfd_create.2: add memfd_create() man-page

 arch/x86/syscalls/syscall_32.tbl           |   1 +
 arch/x86/syscalls/syscall_64.tbl           |   1 +
 fs/fcntl.c                                 |  12 +-
 fs/file_table.c                            |  27 +-
 include/linux/shmem_fs.h                   |  17 +
 include/linux/syscalls.h                   |   1 +
 include/uapi/linux/fcntl.h                 |  13 +
 include/uapi/linux/memfd.h                 |   9 +
 kernel/sys_ni.c                            |   1 +
 mm/shmem.c                                 | 267 +++++++-
 tools/testing/selftests/Makefile           |   1 +
 tools/testing/selftests/memfd/.gitignore   |   2 +
 tools/testing/selftests/memfd/Makefile     |  29 +
 tools/testing/selftests/memfd/memfd_test.c | 972 +++++++++++++++++++++++++++++
 14 files changed, 1338 insertions(+), 15 deletions(-)
 create mode 100644 include/uapi/linux/memfd.h
 create mode 100644 tools/testing/selftests/memfd/.gitignore
 create mode 100644 tools/testing/selftests/memfd/Makefile
 create mode 100644 tools/testing/selftests/memfd/memfd_test.c

--
1.9.0