Commit graph

69 commits

Author SHA1 Message Date
David Howells
2480f57fb3 KEYS: Pre-clear struct key on allocation
The second word of key->payload does not get initialised in key_alloc(), but
the big_key type is relying on it having been cleared.  The problem comes when
big_key fails to instantiate a large key and doesn't then set the payload.  The
big_key_destroy() op is called from the garbage collector and this assumes that
the dentry pointer stored in the second word will be NULL if instantiation did
not complete.

Therefore just pre-clear the entire struct key on allocation rather than trying
to be clever and only initialising to 0 only those bits that aren't otherwise
initialised.

The lack of initialisation can lead to a bug report like the following if
big_key failed to initialise its file:

	general protection fault: 0000 [#1] SMP
	Modules linked in: ...
	CPU: 0 PID: 51 Comm: kworker/0:1 Not tainted 3.10.0-53.el7.x86_64 #1
	Hardware name: Dell Inc. PowerEdge 1955/0HC513, BIOS 1.4.4 12/09/2008
	Workqueue: events key_garbage_collector
	task: ffff8801294f5680 ti: ffff8801296e2000 task.ti: ffff8801296e2000
	RIP: 0010:[<ffffffff811b4a51>] dput+0x21/0x2d0
	...
	Call Trace:
	 [<ffffffff811a7b06>] path_put+0x16/0x30
	 [<ffffffff81235604>] big_key_destroy+0x44/0x60
	 [<ffffffff8122dc4b>] key_gc_unused_keys.constprop.2+0x5b/0xe0
	 [<ffffffff8122df2f>] key_garbage_collector+0x1df/0x3c0
	 [<ffffffff8107759b>] process_one_work+0x17b/0x460
	 [<ffffffff8107834b>] worker_thread+0x11b/0x400
	 [<ffffffff81078230>] ? rescuer_thread+0x3e0/0x3e0
	 [<ffffffff8107eb00>] kthread+0xc0/0xd0
	 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110
	 [<ffffffff815c4bec>] ret_from_fork+0x7c/0xb0
	 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110

Reported-by: Patrik Kis <pkis@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
2013-12-02 11:24:18 +00:00
David Howells
74792b0001 KEYS: Fix a race between negating a key and reading the error set
key_reject_and_link() marking a key as negative and setting the error with
which it was negated races with keyring searches and other things that read
that error.

The fix is to switch the order in which the assignments are done in
key_reject_and_link() and to use memory barriers.

Kudos to Dave Wysochanski <dwysocha@redhat.com> and Scott Mayhew
<smayhew@redhat.com> for tracking this down.

This may be the cause of:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
IP: [<ffffffff81219011>] wait_for_key_construction+0x31/0x80
PGD c6b2c3067 PUD c59879067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 0
Modules linked in: ...

Pid: 13359, comm: amqzxma0 Not tainted 2.6.32-358.20.1.el6.x86_64 #1 IBM System x3650 M3 -[7945PSJ]-/00J6159
RIP: 0010:[<ffffffff81219011>] wait_for_key_construction+0x31/0x80
RSP: 0018:ffff880c6ab33758  EFLAGS: 00010246
RAX: ffffffff81219080 RBX: 0000000000000000 RCX: 0000000000000002
RDX: ffffffff81219060 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880c6ab33768 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff880adfcbce40
R13: ffffffffa03afb84 R14: ffff880adfcbce40 R15: ffff880adfcbce43
FS:  00007f29b8042700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 0000000c613dc000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process amqzxma0 (pid: 13359, threadinfo ffff880c6ab32000, task ffff880c610deae0)
Stack:
 ffff880adfcbce40 0000000000000000 ffff880c6ab337b8 ffffffff81219695
<d> 0000000000000000 ffff880a000000d0 ffff880c6ab337a8 000000000000000f
<d> ffffffffa03afb93 000000000000000f ffff88186c7882c0 0000000000000014
Call Trace:
 [<ffffffff81219695>] request_key+0x65/0xa0
 [<ffffffffa03a0885>] nfs_idmap_request_key+0xc5/0x170 [nfs]
 [<ffffffffa03a0eb4>] nfs_idmap_lookup_id+0x34/0x80 [nfs]
 [<ffffffffa03a1255>] nfs_map_group_to_gid+0x75/0xa0 [nfs]
 [<ffffffffa039a9ad>] decode_getfattr_attrs+0xbdd/0xfb0 [nfs]
 [<ffffffff81057310>] ? __dequeue_entity+0x30/0x50
 [<ffffffff8100988e>] ? __switch_to+0x26e/0x320
 [<ffffffffa039ae03>] decode_getfattr+0x83/0xe0 [nfs]
 [<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs]
 [<ffffffffa039b69f>] nfs4_xdr_dec_getattr+0x8f/0xa0 [nfs]
 [<ffffffffa02dada4>] rpcauth_unwrap_resp+0x84/0xb0 [sunrpc]
 [<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs]
 [<ffffffffa02cf923>] call_decode+0x1b3/0x800 [sunrpc]
 [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50
 [<ffffffffa02cf770>] ? call_decode+0x0/0x800 [sunrpc]
 [<ffffffffa02d99a7>] __rpc_execute+0x77/0x350 [sunrpc]
 [<ffffffff81096c67>] ? bit_waitqueue+0x17/0xd0
 [<ffffffffa02d9ce1>] rpc_execute+0x61/0xa0 [sunrpc]
 [<ffffffffa02d03a5>] rpc_run_task+0x75/0x90 [sunrpc]
 [<ffffffffa02d04c2>] rpc_call_sync+0x42/0x70 [sunrpc]
 [<ffffffffa038ff80>] _nfs4_call_sync+0x30/0x40 [nfs]
 [<ffffffffa038836c>] _nfs4_proc_getattr+0xac/0xc0 [nfs]
 [<ffffffff810aac87>] ? futex_wait+0x227/0x380
 [<ffffffffa038b856>] nfs4_proc_getattr+0x56/0x80 [nfs]
 [<ffffffffa0371403>] __nfs_revalidate_inode+0xe3/0x220 [nfs]
 [<ffffffffa037158e>] nfs_revalidate_mapping+0x4e/0x170 [nfs]
 [<ffffffffa036f147>] nfs_file_read+0x77/0x130 [nfs]
 [<ffffffff811811aa>] do_sync_read+0xfa/0x140
 [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
 [<ffffffff81228ffb>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8121bed6>] ? security_file_permission+0x16/0x20
 [<ffffffff81181a95>] vfs_read+0xb5/0x1a0
 [<ffffffff81181bd1>] sys_read+0x51/0x90
 [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Scott Mayhew <smayhew@redhat.com>
2013-10-30 11:15:24 +00:00
David Howells
008643b86c KEYS: Add a 'trusted' flag and a 'trusted only' flag
Add KEY_FLAG_TRUSTED to indicate that a key either comes from a trusted source
or had a cryptographic signature chain that led back to a trusted key the
kernel already possessed.

Add KEY_FLAGS_TRUSTED_ONLY to indicate that a keyring will only accept links to
keys marked with KEY_FLAGS_TRUSTED.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
2013-09-25 17:17:01 +01:00
David Howells
b2a4df200d KEYS: Expand the capacity of a keyring
Expand the capacity of a keyring to be able to hold a lot more keys by using
the previously added associative array implementation.  Currently the maximum
capacity is:

	(PAGE_SIZE - sizeof(header)) / sizeof(struct key *)

which, on a 64-bit system, is a little more 500.  However, since this is being
used for the NFS uid mapper, we need more than that.  The new implementation
gives us effectively unlimited capacity.

With some alterations, the keyutils testsuite runs successfully to completion
after this patch is applied.  The alterations are because (a) keyrings that
are simply added to no longer appear ordered and (b) some of the errors have
changed a bit.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:18 +01:00
David Howells
e57e8669f2 KEYS: Drop the permissions argument from __keyring_search_one()
Drop the permissions argument from __keyring_search_one() as the only caller
passes 0 here - which causes all checks to be skipped.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:17 +01:00
David Howells
ccc3e6d9c9 KEYS: Define a __key_get() wrapper to use rather than atomic_inc()
Define a __key_get() wrapper to use rather than atomic_inc() on the key usage
count as this makes it easier to hook in refcount error debugging.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:16 +01:00
David Howells
16feef4340 KEYS: Consolidate the concept of an 'index key' for key access
Consolidate the concept of an 'index key' for accessing keys.  The index key
is the search term needed to find a key directly - basically the key type and
the key description.  We can add to that the description length.

This will be useful when turning a keyring into an associative array rather
than just a pointer block.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:15 +01:00
Linus Torvalds
2a74dbb9a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "A quiet cycle for the security subsystem with just a few maintenance
  updates."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  Smack: create a sysfs mount point for smackfs
  Smack: use select not depends in Kconfig
  Yama: remove locking from delete path
  Yama: add RCU to drop read locking
  drivers/char/tpm: remove tasklet and cleanup
  KEYS: Use keyring_alloc() to create special keyrings
  KEYS: Reduce initial permissions on keys
  KEYS: Make the session and process keyrings per-thread
  seccomp: Make syscall skipping and nr changes more consistent
  key: Fix resource leak
  keys: Fix unreachable code
  KEYS: Add payload preparsing opportunity prior to key instantiate or update
2012-12-16 15:40:50 -08:00
Linus Torvalds
d25282d1c9 Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing support from Rusty Russell:
 "module signing is the highlight, but it's an all-over David Howells frenzy..."

Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
  X.509: Fix indefinite length element skip error handling
  X.509: Convert some printk calls to pr_devel
  asymmetric keys: fix printk format warning
  MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
  MODSIGN: Make mrproper should remove generated files.
  MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
  MODSIGN: Use the same digest for the autogen key sig as for the module sig
  MODSIGN: Sign modules during the build process
  MODSIGN: Provide a script for generating a key ID from an X.509 cert
  MODSIGN: Implement module signature checking
  MODSIGN: Provide module signing public keys to the kernel
  MODSIGN: Automatically generate module signing keys if missing
  MODSIGN: Provide Kconfig options
  MODSIGN: Provide gitignore and make clean rules for extra files
  MODSIGN: Add FIPS policy
  module: signature checking hook
  X.509: Add a crypto key parser for binary (DER) X.509 certificates
  MPILIB: Provide a function to read raw data into an MPI
  X.509: Add an ASN.1 decoder
  X.509: Add simple ASN.1 grammar compiler
  ...
2012-10-14 13:39:34 -07:00
David Howells
cf7f601c06 KEYS: Add payload preparsing opportunity prior to key instantiate or update
Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called.  This is done with the
provision of two new key type operations:

	int (*preparse)(struct key_preparsed_payload *prep);
	void (*free_preparse)(struct key_preparsed_payload *prep);

If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases).  The second operation is called to clean up if the first
was called.

preparse() is given the opportunity to fill in the following structure:

	struct key_preparsed_payload {
		char		*description;
		void		*type_data[2];
		void		*payload;
		const void	*data;
		size_t		datalen;
		size_t		quotalen;
	};

Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.

The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.

The preparser may also propose a description for the key by attaching it as a
string to the description field.  This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function.  This cannot work with request_key() as that required the description
to tell the upcall about the key to be created.

This, for example permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.

The instantiate() and update() operations are then modified to look like this:

	int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
	int (*update)(struct key *key, struct key_preparsed_payload *prep);

and the new payload data is passed in *prep, whether or not it was preparsed.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:49:48 +10:30
David Howells
4442d7704c Merge branch 'modsign-keys-devel' into security-next-keys
Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-02 19:30:19 +01:00
David Howells
96b5c8fea6 KEYS: Reduce initial permissions on keys
Reduce the initial permissions on new keys to grant the possessor everything,
view permission only to the user (so the keys can be seen in /proc/keys) and
nothing else.

This gives the creator a chance to adjust the permissions mask before other
processes can access the new key or create a link to it.

To aid with this, keyring_alloc() now takes a permission argument rather than
setting the permissions itself.

The following permissions are now set:

 (1) The user and user-session keyrings grant the user that owns them full
     permissions and grant a possessor everything bar SETATTR.

 (2) The process and thread keyrings grant the possessor full permissions but
     only grant the user VIEW.  This permits the user to see them in
     /proc/keys, but not to do anything with them.

 (3) Anonymous session keyrings grant the possessor full permissions, but only
     grant the user VIEW and READ.  This means that the user can see them in
     /proc/keys and can list them, but nothing else.  Possibly READ shouldn't
     be provided either.

 (4) Named session keyrings grant everything an anonymous session keyring does,
     plus they grant the user LINK permission.  The whole point of named
     session keyrings is that others can also subscribe to them.  Possibly this
     should be a separate permission to LINK.

 (5) The temporary session keyring created by call_sbin_request_key() gets the
     same permissions as an anonymous session keyring.

 (6) Keys created by add_key() get VIEW, SEARCH, LINK and SETATTR for the
     possessor, plus READ and/or WRITE if the key type supports them.  The used
     only gets VIEW now.

 (7) Keys created by request_key() now get the same as those created by
     add_key().

Reported-by: Lennart Poettering <lennart@poettering.net>
Reported-by: Stef Walter <stefw@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-10-02 19:24:56 +01:00
Linus Torvalds
437589a74b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace changes from Eric Biederman:
 "This is a mostly modest set of changes to enable basic user namespace
  support.  This allows the code to code to compile with user namespaces
  enabled and removes the assumption there is only the initial user
  namespace.  Everything is converted except for the most complex of the
  filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
  nfs, ocfs2 and xfs as those patches need a bit more review.

  The strategy is to push kuid_t and kgid_t values are far down into
  subsystems and filesystems as reasonable.  Leaving the make_kuid and
  from_kuid operations to happen at the edge of userspace, as the values
  come off the disk, and as the values come in from the network.
  Letting compile type incompatible compile errors (present when user
  namespaces are enabled) guide me to find the issues.

  The most tricky areas have been the places where we had an implicit
  union of uid and gid values and were storing them in an unsigned int.
  Those places were converted into explicit unions.  I made certain to
  handle those places with simple trivial patches.

  Out of that work I discovered we have generic interfaces for storing
  quota by projid.  I had never heard of the project identifiers before.
  Adding full user namespace support for project identifiers accounts
  for most of the code size growth in my git tree.

  Ultimately there will be work to relax privlige checks from
  "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
  root in a user names to do those things that today we only forbid to
  non-root users because it will confuse suid root applications.

  While I was pushing kuid_t and kgid_t changes deep into the audit code
  I made a few other cleanups.  I capitalized on the fact we process
  netlink messages in the context of the message sender.  I removed
  usage of NETLINK_CRED, and started directly using current->tty.

  Some of these patches have also made it into maintainer trees, with no
  problems from identical code from different trees showing up in
  linux-next.

  After reading through all of this code I feel like I might be able to
  win a game of kernel trivial pursuit."

Fix up some fairly trivial conflicts in netfilter uid/git logging code.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
  userns: Convert the ufs filesystem to use kuid/kgid where appropriate
  userns: Convert the udf filesystem to use kuid/kgid where appropriate
  userns: Convert ubifs to use kuid/kgid
  userns: Convert squashfs to use kuid/kgid where appropriate
  userns: Convert reiserfs to use kuid and kgid where appropriate
  userns: Convert jfs to use kuid/kgid where appropriate
  userns: Convert jffs2 to use kuid and kgid where appropriate
  userns: Convert hpfs to use kuid and kgid where appropriate
  userns: Convert btrfs to use kuid/kgid where appropriate
  userns: Convert bfs to use kuid/kgid where appropriate
  userns: Convert affs to use kuid/kgid wherwe appropriate
  userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
  userns: On ia64 deal with current_uid and current_gid being kuid and kgid
  userns: On ppc convert current_uid from a kuid before printing.
  userns: Convert s390 getting uid and gid system calls to use kuid and kgid
  userns: Convert s390 hypfs to use kuid and kgid where appropriate
  userns: Convert binder ipc to use kuids
  userns: Teach security_path_chown to take kuids and kgids
  userns: Add user namespace support to IMA
  userns: Convert EVM to deal with kuids and kgids in it's hmac computation
  ...
2012-10-02 11:11:09 -07:00
Eric W. Biederman
9a56c2db49 userns: Convert security/keys to the new userns infrastructure
- Replace key_user ->user_ns equality checks with kuid_has_mapping checks.
- Use from_kuid to generate key descriptions
- Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t
- Avoid potential problems with file descriptor passing by displaying
  keys in the user namespace of the opener of key status proc files.

Cc: linux-security-module@vger.kernel.org
Cc: keyrings@linux-nfs.org
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-13 18:28:02 -07:00
David Howells
d4f65b5d24 KEYS: Add payload preparsing opportunity prior to key instantiate or update
Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called.  This is done with the
provision of two new key type operations:

	int (*preparse)(struct key_preparsed_payload *prep);
	void (*free_preparse)(struct key_preparsed_payload *prep);

If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases).  The second operation is called to clean up if the first
was called.

preparse() is given the opportunity to fill in the following structure:

	struct key_preparsed_payload {
		char		*description;
		void		*type_data[2];
		void		*payload;
		const void	*data;
		size_t		datalen;
		size_t		quotalen;
	};

Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.

The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.

The preparser may also propose a description for the key by attaching it as a
string to the description field.  This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function.  This cannot work with request_key() as that required the description
to tell the upcall about the key to be created.

This, for example permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.

The instantiate() and update() operations are then modified to look like this:

	int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
	int (*update)(struct key *key, struct key_preparsed_payload *prep);

and the new payload data is passed in *prep, whether or not it was preparsed.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-09-13 13:06:29 +01:00
Tejun Heo
3b07e9ca26 workqueue: deprecate system_nrt[_freezable]_wq
system_nrt[_freezable]_wq are now spurious.  Mark them deprecated and
convert all users to system[_freezable]_wq.

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
Please use system[_freezable]_wq instead.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-By: Lai Jiangshan <laijs@cn.fujitsu.com>

Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Airlie <airlied@linux.ie>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
2012-08-20 14:51:24 -07:00
Linus Torvalds
644473e9c6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace enhancements from Eric Biederman:
 "This is a course correction for the user namespace, so that we can
  reach an inexpensive, maintainable, and reasonably complete
  implementation.

  Highlights:
   - Config guards make it impossible to enable the user namespace and
     code that has not been converted to be user namespace safe.

   - Use of the new kuid_t type ensures the if you somehow get past the
     config guards the kernel will encounter type errors if you enable
     user namespaces and attempt to compile in code whose permission
     checks have not been updated to be user namespace safe.

   - All uids from child user namespaces are mapped into the initial
     user namespace before they are processed.  Removing the need to add
     an additional check to see if the user namespace of the compared
     uids remains the same.

   - With the user namespaces compiled out the performance is as good or
     better than it is today.

   - For most operations absolutely nothing changes performance or
     operationally with the user namespace enabled.

   - The worst case performance I could come up with was timing 1
     billion cache cold stat operations with the user namespace code
     enabled.  This went from 156s to 164s on my laptop (or 156ns to
     164ns per stat operation).

   - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
     Most uid/gid setting system calls treat these value specially
     anyway so attempting to use -1 as a uid would likely cause
     entertaining failures in userspace.

   - If setuid is called with a uid that can not be mapped setuid fails.
     I have looked at sendmail, login, ssh and every other program I
     could think of that would call setuid and they all check for and
     handle the case where setuid fails.

   - If stat or a similar system call is called from a context in which
     we can not map a uid we lie and return overflowuid.  The LFS
     experience suggests not lying and returning an error code might be
     better, but the historical precedent with uids is different and I
     can not think of anything that would break by lying about a uid we
     can't map.

   - Capabilities are localized to the current user namespace making it
     safe to give the initial user in a user namespace all capabilities.

  My git tree covers all of the modifications needed to convert the core
  kernel and enough changes to make a system bootable to runlevel 1."

Fix up trivial conflicts due to nearby independent changes in fs/stat.c

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
  userns:  Silence silly gcc warning.
  cred: use correct cred accessor with regards to rcu read lock
  userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
  userns: Convert cgroup permission checks to use uid_eq
  userns: Convert tmpfs to use kuid and kgid where appropriate
  userns: Convert sysfs to use kgid/kuid where appropriate
  userns: Convert sysctl permission checks to use kuid and kgids.
  userns: Convert proc to use kuid/kgid where appropriate
  userns: Convert ext4 to user kuid/kgid where appropriate
  userns: Convert ext3 to use kuid/kgid where appropriate
  userns: Convert ext2 to use kuid/kgid where appropriate.
  userns: Convert devpts to use kuid/kgid where appropriate
  userns: Convert binary formats to use kuid/kgid where appropriate
  userns: Add negative depends on entries to avoid building code that is userns unsafe
  userns: signal remove unnecessary map_cred_ns
  userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
  userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
  userns: Convert stat to return values mapped from kuids and kgids
  userns: Convert user specfied uids and gids in chown into kuids and kgid
  userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
  ...
2012-05-23 17:42:39 -07:00
David Howells
fd75815f72 KEYS: Add invalidation support
Add support for invalidating a key - which renders it immediately invisible to
further searches and causes the garbage collector to immediately wake up,
remove it from keyrings and then destroy it when it's no longer referenced.

It's better not to do this with keyctl_revoke() as that marks the key to start
returning -EKEYREVOKED to searches when what is actually desired is to have the
key refetched.

To invalidate a key the caller must be granted SEARCH permission by the key.
This may be too strict.  It may be better to also permit invalidation if the
caller has any of READ, WRITE or SETATTR permission.

The primary use for this is to evict keys that are cached in special keyrings,
such as the DNS resolver or an ID mapper.

Signed-off-by: David Howells <dhowells@redhat.com>
2012-05-11 10:56:56 +01:00
David Howells
1eb1bcf5bf KEYS: Announce key type (un)registration
Announce the (un)registration of a key type in the core key code rather than
in the callers.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
2012-05-11 10:56:56 +01:00
Eric W. Biederman
c4a4d60379 userns: Use cred->user_ns instead of cred->user->user_ns
Optimize performance and prepare for the removal of the user_ns reference
from user_struct.  Remove the slow long walk through cred->user->user_ns and
instead go straight to cred->user_ns.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-04-07 16:55:51 -07:00
Bryan Schumaker
59e6b9c113 Created a function for setting timeouts on keys
The keyctl_set_timeout function isn't exported to other parts of the
kernel, but I want to use it for the NFS idmapper.  I already have the
key, but I wanted a generic way to set the timeout.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-01 16:50:31 -05:00
Jeff Layton
9f6ed2ca25 keys: add a "logon" key type
For CIFS, we want to be able to store NTLM credentials (aka username
and password) in the keyring. We do not, however want to allow users
to fetch those keys back out of the keyring since that would be a
security risk.

Unfortunately, due to the nuances of key permission bits, it's not
possible to do this. We need to grant search permissions so the kernel
can find these keys, but that also implies permissions to read the
payload.

Resolve this by adding a new key_type. This key type is essentially
the same as key_type_user, but does not define a .read op. This
prevents the payload from ever being visible from userspace. This
key type also vets the description to ensure that it's "qualified"
by checking to ensure that it has a ':' in it that is preceded by
other characters.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2012-01-17 22:39:40 -06:00
David Howells
7845bc3964 KEYS: Give key types their own lockdep class for key->sem
Give keys their own lockdep class to differentiate them from each other in case
a key of one type has to refer to a key of another type.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-11-17 09:35:32 +11:00
David Howells
0c061b5707 KEYS: Correctly destroy key payloads when their keytype is removed
unregister_key_type() has code to mark a key as dead and make it unavailable in
one loop and then destroy all those unavailable key payloads in the next loop.
However, the loop to mark keys dead renders the key undetectable to the second
loop by changing the key type pointer also.

Fix this by the following means:

 (1) The key code has two garbage collectors: one deletes unreferenced keys and
     the other alters keyrings to delete links to old dead, revoked and expired
     keys.  They can end up holding each other up as both want to scan the key
     serial tree under spinlock.  Combine these into a single routine.

 (2) Move the dead key marking, dead link removal and dead key removal into the
     garbage collector as a three phase process running over the three cycles
     of the normal garbage collection procedure.  This is tracked by the
     KEY_GC_REAPING_DEAD_1, _2 and _3 state flags.

     unregister_key_type() then just unlinks the key type from the list, wakes
     up the garbage collector and waits for the third phase to complete.

 (3) Downgrade the key types sem in unregister_key_type() once it has deleted
     the key type from the list so that it doesn't block the keyctl() syscall.

 (4) Dead keys that cannot be simply removed in the third phase have their
     payloads destroyed with the key's semaphore write-locked to prevent
     interference by the keyctl() syscall.  There should be no in-kernel users
     of dead keys of that type by the point of unregistration, though keyctl()
     may be holding a reference.

 (5) Only perform timer recalculation in the GC if the timer actually expired.
     If it didn't, we'll get another cycle when it goes off - and if the key
     that actually triggered it has been removed, it's not a problem.

 (6) Only garbage collect link if the timer expired or if we're doing dead key
     clean up phase 2.

 (7) As only key_garbage_collector() is permitted to use rb_erase() on the key
     serial tree, it doesn't need to revalidate its cursor after dropping the
     spinlock as the node the cursor points to must still exist in the tree.

 (8) Drop the spinlock in the GC if there is contention on it or if we need to
     reschedule.  After dealing with that, get the spinlock again and resume
     scanning.

This has been tested in the following ways:

 (1) Run the keyutils testsuite against it.

 (2) Using the AF_RXRPC and RxKAD modules to test keytype removal:

     Load the rxrpc_s key type:

	# insmod /tmp/af-rxrpc.ko
	# insmod /tmp/rxkad.ko

     Create a key (http://people.redhat.com/~dhowells/rxrpc/listen.c):

	# /tmp/listen &
	[1] 8173

     Find the key:

	# grep rxrpc_s /proc/keys
	091086e1 I--Q--     1 perm 39390000     0     0 rxrpc_s   52:2

     Link it to a session keyring, preferably one with a higher serial number:

	# keyctl link 0x20e36251 @s

     Kill the process (the key should remain as it's linked to another place):

	# fg
	/tmp/listen
	^C

     Remove the key type:

	rmmod rxkad
	rmmod af-rxrpc

     This can be made a more effective test by altering the following part of
     the patch:

	if (unlikely(gc_state & KEY_GC_REAPING_DEAD_2)) {
		/* Make sure everyone revalidates their keys if we marked a
		 * bunch as being dead and make sure all keyring ex-payloads
		 * are destroyed.
		 */
		kdebug("dead sync");
		synchronize_rcu();

     To call synchronize_rcu() in GC phase 1 instead.  That causes that the
     keyring's old payload content to hang around longer until it's RCU
     destroyed - which usually happens after GC phase 3 is complete.  This
     allows the destroy_dead_key branch to be tested.

Reported-by: Benjamin Coddington <bcodding@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-23 09:57:37 +10:00
David Howells
b072e9bc2f KEYS: Make the key reaper non-reentrant
Make the key reaper non-reentrant by sticking it on the appropriate system work
queue when we queue it.  This will allow it to have global state and drop
locks.  It should probably be non-reentrant already as it may spend a long time
holding the key serial spinlock, and so multiple entrants can spend long
periods of time just sitting there spinning, waiting to get the lock.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-23 09:57:36 +10:00
David Howells
8bc16deabc KEYS: Move the unreferenced key reaper to the keys garbage collector file
Move the unreferenced key reaper function to the keys garbage collector file
as that's a more appropriate place with the dead key link reaper.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-08-23 09:57:36 +10:00
David Howells
fdd1b94581 KEYS: Add a new keyctl op to reject a key with a specified error code
Add a new keyctl op to reject a key with a specified error code.  This works
much the same as negating a key, and so keyctl_negate_key() is made a special
case of keyctl_reject_key().  The difference is that keyctl_negate_key()
selects ENOKEY as the error to be reported.

Typically the key would be rejected with EKEYEXPIRED, EKEYREVOKED or
EKEYREJECTED, but this is not mandatory.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-08 11:17:18 +11:00
David Howells
b9fffa3877 KEYS: Add a key type op to permit the key description to be vetted
Add a key type operation to permit the key type to vet the description of a new
key that key_alloc() is about to allocate.  The operation may reject the
description if it wishes with an error of its choosing.  If it does this, the
key will not be allocated.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-08 11:17:15 +11:00
David Howells
ceb73c1204 KEYS: Fix __key_link_end() quota fixup on error
Fix __key_link_end()'s attempt to fix up the quota if an error occurs.

There are two erroneous cases: Firstly, we always decrease the quota if
the preallocated replacement keyring needs cleaning up, irrespective of
whether or not we should (we may have replaced a pointer rather than
adding another pointer).

Secondly, we never clean up the quota if we added a pointer without the
keyring storage being extended (we allocate multiple pointers at a time,
even if we're not going to use them all immediately).

We handle this by setting the bottom bit of the preallocation pointer in
__key_link_begin() to indicate that the quota needs fixing up, which is
then passed to __key_link() (which clears the whole thing) and
__key_link_end().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-26 08:58:20 +10:00
David Howells
973c9f4f49 KEYS: Fix up comments in key management code
Fix up comments in the key management code.  No functional changes.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-21 14:59:30 -08:00
David Howells
a8b17ed019 KEYS: Do some style cleanup in the key management code.
Do a bit of a style clean up in the key management code.  No functional
changes.

Done using:

  perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c
  perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c
  sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c

To remove /*****/ lines, remove comments on the closing brace of a
function to name the function and remove blank lines before the closing
brace of a function.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-21 14:59:29 -08:00
David Howells
f70e2e0619 KEYS: Do preallocation for __key_link()
Do preallocation for __key_link() so that the various callers in request_key.c
can deal with any errors from this source before attempting to construct a key.
This allows them to assume that the actual linkage step is guaranteed to be
successful.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-05-06 22:25:02 +10:00
Justin P. Mattock
c5b60b5e67 security: whitespace coding style fixes
Whitespace coding style fixes.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-23 10:10:23 +10:00
David Howells
c08ef808ef KEYS: Fix garbage collector
Fix a number of problems with the new key garbage collector:

 (1) A rogue semicolon in keyring_gc() was causing the initial count of dead
     keys to be miscalculated.

 (2) A missing return in keyring_gc() meant that under certain circumstances,
     the keyring semaphore would be unlocked twice.

 (3) The key serial tree iterator (key_garbage_collector()) part of the garbage
     collector has been modified to:

     (a) Complete each scan of the keyrings before setting the new timer.

     (b) Only set the new timer for keys that have yet to expire.  This means
         that the new timer is now calculated correctly, and the gc doesn't
         get into a loop continually scanning for keys that have expired, and
         preventing other things from happening, like RCU cleaning up the old
         keyring contents.

     (c) Perform an extra scan if any keys were garbage collected in this one
     	 as a key might become garbage during a scan, and (b) could mean we
     	 don't set the timer again.

 (4) Made key_schedule_gc() take the time at which to do a collection run,
     rather than the time at which the key expires.  This means the collection
     of dead keys (key type unregistered) can happen immediately.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-15 09:11:02 +10:00
David Howells
5d135440fa KEYS: Add garbage collection for dead, revoked and expired keys. [try #6]
Add garbage collection for dead, revoked and expired keys.  This involved
erasing all links to such keys from keyrings that point to them.  At that
point, the key will be deleted in the normal manner.

Keyrings from which garbage collection occurs are shrunk and their quota
consumption reduced as appropriate.

Dead keys (for which the key type has been removed) will be garbage collected
immediately.

Revoked and expired keys will hang around for a number of seconds, as set in
/proc/sys/kernel/keys/gc_delay before being automatically removed.  The default
is 5 minutes.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:11 +10:00
David Howells
f041ae2f99 KEYS: Flag dead keys to induce EKEYREVOKED [try #6]
Set the KEY_FLAG_DEAD flag on keys for which the type has been removed.  This
causes the key_permission() function to return EKEYREVOKED in response to
various commands.  It does not, however, prevent unlinking or clearing of
keyrings from detaching the key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:09 +10:00
David Howells
5593122eec KEYS: Deal with dead-type keys appropriately [try #6]
Allow keys for which the key type has been removed to be unlinked.  Currently
dead-type keys can only be disposed of by completely clearing the keyrings
that point to them.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:04 +10:00
Serge E. Hallyn
1d1e97562e keys: distinguish per-uid keys in different namespaces
per-uid keys were looked by uid only.  Use the user namespace
to distinguish the same uid in different namespaces.

This does not address key_permission.  So a task can for instance
try to join a keyring owned by the same uid in another namespace.
That will be handled by a separate patch.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-27 12:35:06 +11:00
David Howells
d84f4f992c CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management.  This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

	struct cred *new = prepare_creds();
	int ret = blah(new);
	if (ret < 0) {
		abort_creds(new);
		return ret;
	}
	return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.

To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const.  The purpose of this is compile-time
discouragement of altering credentials through those pointers.  Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:

  (1) Its reference count may incremented and decremented.

  (2) The keyrings to which it points may be modified, but not replaced.

The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

 (1) execve().

     This now prepares and commits credentials in various places in the
     security code rather than altering the current creds directly.

 (2) Temporary credential overrides.

     do_coredump() and sys_faccessat() now prepare their own credentials and
     temporarily override the ones currently on the acting thread, whilst
     preventing interference from other threads by holding cred_replace_mutex
     on the thread being dumped.

     This will be replaced in a future patch by something that hands down the
     credentials directly to the functions being called, rather than altering
     the task's objective credentials.

 (3) LSM interface.

     A number of functions have been changed, added or removed:

     (*) security_capset_check(), ->capset_check()
     (*) security_capset_set(), ->capset_set()

     	 Removed in favour of security_capset().

     (*) security_capset(), ->capset()

     	 New.  This is passed a pointer to the new creds, a pointer to the old
     	 creds and the proposed capability sets.  It should fill in the new
     	 creds or return an error.  All pointers, barring the pointer to the
     	 new creds, are now const.

     (*) security_bprm_apply_creds(), ->bprm_apply_creds()

     	 Changed; now returns a value, which will cause the process to be
     	 killed if it's an error.

     (*) security_task_alloc(), ->task_alloc_security()

     	 Removed in favour of security_prepare_creds().

     (*) security_cred_free(), ->cred_free()

     	 New.  Free security data attached to cred->security.

     (*) security_prepare_creds(), ->cred_prepare()

     	 New. Duplicate any security data attached to cred->security.

     (*) security_commit_creds(), ->cred_commit()

     	 New. Apply any security effects for the upcoming installation of new
     	 security by commit_creds().

     (*) security_task_post_setuid(), ->task_post_setuid()

     	 Removed in favour of security_task_fix_setuid().

     (*) security_task_fix_setuid(), ->task_fix_setuid()

     	 Fix up the proposed new credentials for setuid().  This is used by
     	 cap_set_fix_setuid() to implicitly adjust capabilities in line with
     	 setuid() changes.  Changes are made to the new credentials, rather
     	 than the task itself as in security_task_post_setuid().

     (*) security_task_reparent_to_init(), ->task_reparent_to_init()

     	 Removed.  Instead the task being reparented to init is referred
     	 directly to init's credentials.

	 NOTE!  This results in the loss of some state: SELinux's osid no
	 longer records the sid of the thread that forked it.

     (*) security_key_alloc(), ->key_alloc()
     (*) security_key_permission(), ->key_permission()

     	 Changed.  These now take cred pointers rather than task pointers to
     	 refer to the security context.

 (4) sys_capset().

     This has been simplified and uses less locking.  The LSM functions it
     calls have been merged.

 (5) reparent_to_kthreadd().

     This gives the current thread the same credentials as init by simply using
     commit_thread() to point that way.

 (6) __sigqueue_alloc() and switch_uid()

     __sigqueue_alloc() can't stop the target task from changing its creds
     beneath it, so this function gets a reference to the currently applicable
     user_struct which it then passes into the sigqueue struct it returns if
     successful.

     switch_uid() is now called from commit_creds(), and possibly should be
     folded into that.  commit_creds() should take care of protecting
     __sigqueue_alloc().

 (7) [sg]et[ug]id() and co and [sg]et_current_groups.

     The set functions now all use prepare_creds(), commit_creds() and
     abort_creds() to build and check a new set of credentials before applying
     it.

     security_task_set[ug]id() is called inside the prepared section.  This
     guarantees that nothing else will affect the creds until we've finished.

     The calling of set_dumpable() has been moved into commit_creds().

     Much of the functionality of set_user() has been moved into
     commit_creds().

     The get functions all simply access the data directly.

 (8) security_task_prctl() and cap_task_prctl().

     security_task_prctl() has been modified to return -ENOSYS if it doesn't
     want to handle a function, or otherwise return the return value directly
     rather than through an argument.

     Additionally, cap_task_prctl() now prepares a new set of credentials, even
     if it doesn't end up using it.

 (9) Keyrings.

     A number of changes have been made to the keyrings code:

     (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
     	 all been dropped and built in to the credentials functions directly.
     	 They may want separating out again later.

     (b) key_alloc() and search_process_keyrings() now take a cred pointer
     	 rather than a task pointer to specify the security context.

     (c) copy_creds() gives a new thread within the same thread group a new
     	 thread keyring if its parent had one, otherwise it discards the thread
     	 keyring.

     (d) The authorisation key now points directly to the credentials to extend
     	 the search into rather pointing to the task that carries them.

     (e) Installing thread, process or session keyrings causes a new set of
     	 credentials to be created, even though it's not strictly necessary for
     	 process or session keyrings (they're shared).

(10) Usermode helper.

     The usermode helper code now carries a cred struct pointer in its
     subprocess_info struct instead of a new session keyring pointer.  This set
     of credentials is derived from init_cred and installed on the new process
     after it has been cloned.

     call_usermodehelper_setup() allocates the new credentials and
     call_usermodehelper_freeinfo() discards them if they haven't been used.  A
     special cred function (prepare_usermodeinfo_creds()) is provided
     specifically for call_usermodehelper_setup() to call.

     call_usermodehelper_setkeys() adjusts the credentials to sport the
     supplied keyring as the new session keyring.

(11) SELinux.

     SELinux has a number of changes, in addition to those to support the LSM
     interface changes mentioned above:

     (a) selinux_setprocattr() no longer does its check for whether the
     	 current ptracer can access processes with the new SID inside the lock
     	 that covers getting the ptracer's SID.  Whilst this lock ensures that
     	 the check is done with the ptracer pinned, the result is only valid
     	 until the lock is released, so there's no point doing it inside the
     	 lock.

(12) is_single_threaded().

     This function has been extracted from selinux_setprocattr() and put into
     a file of its own in the lib/ directory as join_session_keyring() now
     wants to use it too.

     The code in SELinux just checked to see whether a task shared mm_structs
     with other tasks (CLONE_VM), but that isn't good enough.  We really want
     to know if they're part of the same thread group (CLONE_THREAD).

(13) nfsd.

     The NFS server daemon now has to use the COW credentials to set the
     credentials it is going to use.  It really needs to pass the credentials
     down to the functions it calls, but it can't do that until other patches
     in this series have been applied.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:23 +11:00
David Howells
47d804bfa1 CRED: Wrap task credential accesses in the key management code
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:11 +11:00
David Howells
0b77f5bfb4 keys: make the keyring quotas controllable through /proc/sys
Make the keyring quotas controllable through /proc/sys files:

 (*) /proc/sys/kernel/keys/root_maxkeys
     /proc/sys/kernel/keys/root_maxbytes

     Maximum number of keys that root may have and the maximum total number of
     bytes of data that root may have stored in those keys.

 (*) /proc/sys/kernel/keys/maxkeys
     /proc/sys/kernel/keys/maxbytes

     Maximum number of keys that each non-root user may have and the maximum
     total number of bytes of data that each of those users may have stored in
     their keys.

Also increase the quotas as a number of people have been complaining that it's
not big enough.  I'm not sure that it's big enough now either, but on the
other hand, it can now be set in /etc/sysctl.conf.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <kwc@citi.umich.edu>
Cc: <arunsr@cse.iitk.ac.in>
Cc: <dwalsh@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:17 -07:00
David Howells
69664cf16a keys: don't generate user and user session keyrings unless they're accessed
Don't generate the per-UID user and user session keyrings unless they're
explicitly accessed.  This solves a problem during a login process whereby
set*uid() is called before the SELinux PAM module, resulting in the per-UID
keyrings having the wrong security labels.

This also cures the problem of multiple per-UID keyrings sometimes appearing
due to PAM modules (including pam_keyinit) setuiding and causing user_structs
to come into and go out of existence whilst the session keyring pins the user
keyring.  This is achieved by first searching for extant per-UID keyrings
before inventing new ones.

The serial bound argument is also dropped from find_keyring_by_name() as it's
not currently made use of (setting it to 0 disables the feature).

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <kwc@citi.umich.edu>
Cc: <arunsr@cse.iitk.ac.in>
Cc: <dwalsh@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:17 -07:00
Arun Raghavan
6b79ccb514 keys: allow clients to set key perms in key_create_or_update()
The key_create_or_update() function provided by the keyring code has a default
set of permissions that are always applied to the key when created.  This
might not be desirable to all clients.

Here's a patch that adds a "perm" parameter to the function to address this,
which can be set to KEY_PERM_UNDEF to revert to the current behaviour.

Signed-off-by: Arun Raghavan <arunsr@cse.iitk.ac.in>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:16 -07:00
David Howells
e231c2ee64 Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using:

perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.*)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]*PTR_ERR' fs crypto net security`

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:26 -08:00
David Howells
76181c134f KEYS: Make request_key() and co fundamentally asynchronous
Make request_key() and co fundamentally asynchronous to make it easier for
NFS to make use of them.  There are now accessor functions that do
asynchronous constructions, a wait function to wait for construction to
complete, and a completion function for the key type to indicate completion
of construction.

Note that the construction queue is now gone.  Instead, keys under
construction are linked in to the appropriate keyring in advance, and that
anyone encountering one must wait for it to be complete before they can use
it.  This is done automatically for userspace.

The following auxiliary changes are also made:

 (1) Key type implementation stuff is split from linux/key.h into
     linux/key-type.h.

 (2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does
     not need to call key_instantiate_and_link() directly.

 (3) Adjust the debugging macros so that they're -Wformat checked even if
     they are disabled, and make it so they can be enabled simply by defining
     __KDEBUG to be consistent with other code of mine.

 (3) Documentation.

[alan@lxorguk.ukuu.org.uk: keys: missing word in documentation]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:57 -07:00
Paul Mundt
20c2df83d2 mm: Remove slab destructors from kmem_cache_create().
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-20 10:11:58 +09:00
David Howells
9ad0830f30 [PATCH] Keys: Fix key serial number collision handling
Fix the key serial number collision avoidance code in key_alloc_serial().

This didn't use to be so much of a problem as the key serial numbers were
allocated from a simple incremental counter, and it would have to go through
two billion keys before it could possibly encounter a collision.  However, now
that random numbers are used instead, collisions are much more likely.

This is fixed by finding a hole in the rbtree where the next unused serial
number ought to be and using that by going almost back to the top of the
insertion routine and redoing the insertion with the new serial number rather
than trying to be clever and attempting to work out the insertion point
pointer directly.

This fixes kernel BZ #7727.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-06 14:45:00 -08:00
Eric Sesterhenn
48ad504ee7 [PATCH] security/keys/*: user kmemdup()
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Christoph Lameter
e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Christoph Lameter
e94b176609 [PATCH] slab: remove SLAB_KERNEL
SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00