kernel-fxtec-pro1x/security
Eric Paris 8549164143 IMA: use rbtree instead of radix tree for inode information cache
The IMA code needs to store the number of tasks which have an open fd
granting permission to write a file even when IMA is not in use.  It
needs this information in order to be enabled at a later point in time
without losing its integrity guarantees.

At the moment that means we store a little bit of data about every inode
in a cache.  We use a radix tree keyed on the inode's memory address.
Dave Chinner pointed out that a radix tree is a terrible data structure
for such a sparse key space.  This patch switches to using an rbtree
which should be more efficient.
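
To make the idea concrete, here is a minimal userspace sketch of a
pointer-keyed search tree, which is the lookup and insert walk an rbtree
performs.  It is not the code from this patch: the names iint_node,
iint_lookup and iint_insert and the writecount field are hypothetical, and
the rebalancing that the kernel rbtree does after linking a node is left
out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

struct inode;                        /* opaque; only its address is used */

/* Hypothetical per-inode record, ordered by the inode's address. */
struct iint_node {
	struct iint_node *left, *right;
	const struct inode *inode;   /* search key */
	long writecount;             /* hypothetical payload */
};

/* Walk down from the root comparing addresses; NULL if not present. */
struct iint_node *iint_lookup(struct iint_node *root,
			      const struct inode *inode)
{
	uintptr_t key = (uintptr_t)inode;

	while (root) {
		uintptr_t cur = (uintptr_t)root->inode;

		if (key < cur)
			root = root->left;
		else if (key > cur)
			root = root->right;
		else
			return root;
	}
	return NULL;
}

/*
 * Link a caller-allocated node into the tree.  Returns the node now in
 * the tree for that key: the new node, or the existing one on a duplicate.
 * A real rbtree would recolor and rotate here; this sketch does not.
 */
struct iint_node *iint_insert(struct iint_node **root,
			      struct iint_node *node)
{
	uintptr_t key = (uintptr_t)node->inode;
	struct iint_node **link = root;

	while (*link) {
		uintptr_t cur = (uintptr_t)(*link)->inode;

		if (key < cur)
			link = &(*link)->left;
		else if (key > cur)
			link = &(*link)->right;
		else
			return *link;
	}
	node->left = node->right = NULL;
	*link = node;
	return node;
}
```

Each cached inode costs exactly one node here, no matter how far apart the
addresses are, which is why a comparison-based tree suits a sparse key
space.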

Bug report from Dave:

 "I just noticed that slabtop was reporting an awfully high usage of
  radix tree nodes:

     OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  4200331 2778082  66%    0.55K 144839       29   2317424K radix_tree_node
  2321500 2060290  88%    1.00K  72581       32   2322592K xfs_inode
  2235648 2069791  92%    0.12K  69864       32    279456K iint_cache

  That is, 2.7M radix tree nodes are allocated, and the cache itself is
  consuming 2.3GB of RAM.  I know that the XFS inode caches are indexed
  by radix tree node, but for 2 million cached inodes that would mean a
  density of 1 inode per radix tree node, which for a system with 16M
  inodes in the filesystems is an impossibly low density.  The worst I've
  seen in a production system like kernel.org is about 20-25% density,
  which would mean about 150-200k radix tree nodes for that many inodes.
  So it's not the inode cache.

  So I looked up what the iint_cache was.  It appears to be used for
  storing per-inode IMA information, and uses a radix tree for indexing.
  It uses the *address* of the struct inode as the indexing key.  That
  means the key space is extremely sparse - for XFS the struct inode
  addresses are approximately 1000 bytes apart, which means the closest
  the radix tree index keys get is ~1000.  Which means that there is a
  single entry per radix tree leaf node, so the radix tree is using
  roughly 550 bytes for every 120byte structure being cached.  For the
  above example, it's probably wasting close to 1GB of RAM...."
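
The one-entry-per-leaf claim in the report can be checked with a little
arithmetic.  The sketch below assumes 64 slots per radix tree node
(MAP_SHIFT of 6, a common kernel configuration but an assumption here) and
counts how many distinct leaf nodes a run of evenly spaced keys lands in.
The function name leaves_used is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_SHIFT 6   /* assumed: 2^6 = 64 slots per radix tree node */

/*
 * Count the distinct leaf nodes used by n keys base, base + stride, ...
 * The keys are visited in ascending order, so a new leaf is needed
 * whenever the leaf index differs from the previous key's leaf index.
 */
size_t leaves_used(uint64_t base, uint64_t stride, size_t n)
{
	size_t leaves = 0;
	uint64_t prev_leaf = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t leaf = (base + i * stride) >> MAP_SHIFT;

		if (i == 0 || leaf != prev_leaf) {
			leaves++;
			prev_leaf = leaf;
		}
	}
	return leaves;
}
```

With a stride of about 1000, as for the XFS struct inode addresses in the
report, leaves_used(base, 1000, n) returns n: every key gets a leaf of its
own.  Densely packed keys (stride 1) would share one leaf per 64 entries.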

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 11:37:17 -07:00
apparmor AppArmor: Fix locking from removal of profile namespace 2010-09-08 09:19:34 +10:00
integrity/ima IMA: use rbtree instead of radix tree for inode information cache 2010-10-26 11:37:17 -07:00
keys KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring 2010-09-10 07:30:00 -07:00
selinux tty: fix fu_list abuse 2010-08-18 08:35:47 -04:00
smack Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2010-08-04 15:31:02 -07:00
tomoyo TOMOYO: Don't abuse sys_getpid(), sys_getppid() 2010-09-27 10:53:18 +10:00
capability.c Merge branch 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux 2010-08-10 12:07:51 -07:00
commoncap.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
device_cgroup.c Merge branch 'master' into next 2010-05-06 10:56:07 +10:00
inode.c securityfs: Drop dentry reference count when mknod fails 2010-08-02 15:34:59 +10:00
Kconfig AppArmor: Enable configuring and building of the AppArmor security module 2010-08-02 15:38:34 +10:00
lsm_audit.c Merge branch 'master' into next 2010-05-06 10:56:07 +10:00
Makefile AppArmor: Enable configuring and building of the AppArmor security module 2010-08-02 15:38:34 +10:00
min_addr.c mmap_min_addr check CAP_SYS_RAWIO only for write 2010-04-23 08:56:31 +10:00
security.c Merge branch 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux 2010-08-10 12:07:51 -07:00