mnt: Refactor the logic for mounting sysfs and proc in a user namespace

Fresh mounts of proc and sysfs are a special case that works much like
a bind mount.  Unfortunately the current structure cannot preserve the
MNT_LOCK... mount flags.  Therefore refactor the logic into a form that
can be modified to preserve those lock bits.

Add a new filesystem flag, FS_USERNS_VISIBLE, that requires some mount
of the filesystem to be fully visible in the current mount namespace
before the filesystem may be mounted.

Move the logic for calling fs_fully_visible() from proc and sysfs into
fs/namespace.c, where it has greater access to mount namespace state.
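
As a rough illustration of the resulting control flow, here is a
userspace model of the new gate.  It is a sketch, not kernel code: the
*_model names, the flag value, and the list-of-names stand-in for mount
namespace state are all hypothetical, and it assumes the enclosing block
in the mount path runs only for mounts made from a user namespace.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative value for this model only; not taken from kernel headers. */
#define FS_USERNS_VISIBLE (1 << 0)

/* Hypothetical stand-in for struct file_system_type. */
struct fs_type_model {
	const char *name;
	int fs_flags;
};

/* Hypothetical stand-in for the mount namespace: the names of the
 * filesystem types that already have a fully visible mount. */
struct ns_model {
	const char *visible[4];
	size_t nr_visible;
};

/* Models fs_fully_visible(): is some mount of this type fully visible? */
static bool fs_fully_visible_model(const struct ns_model *ns,
				   const struct fs_type_model *type)
{
	for (size_t i = 0; i < ns->nr_visible; i++)
		if (strcmp(ns->visible[i], type->name) == 0)
			return true;
	return false;
}

/* Models the check added to the mount path: a filesystem that sets
 * FS_USERNS_VISIBLE is refused with -EPERM unless it is already fully
 * visible.  Returns 0 where the real code would go on to mount. */
static int do_new_mount_model(const struct ns_model *ns,
			      const struct fs_type_model *type,
			      bool in_userns)
{
	if (in_userns) {
		if (type->fs_flags & FS_USERNS_VISIBLE) {
			if (!fs_fully_visible_model(ns, type))
				return -EPERM;
		}
	}
	return 0;
}
```

In this model a filesystem opts in purely by setting the flag, so the
visibility policy lives in one place instead of in each filesystem's
mount routine.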

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
diff --git a/fs/namespace.c b/fs/namespace.c
index 1b9e111..8e7edaf 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2332,6 +2332,8 @@
 	return err;
 }
 
+static bool fs_fully_visible(struct file_system_type *fs_type);
+
 /*
  * create a new mount for userspace and request it to be added into the
  * namespace's tree
@@ -2363,6 +2365,10 @@
 			flags |= MS_NODEV;
 			mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
 		}
+		if (type->fs_flags & FS_USERNS_VISIBLE) {
+			if (!fs_fully_visible(type))
+				return -EPERM;
+		}
 	}
 
 	mnt = vfs_kern_mount(type, flags, name, data);
@@ -3164,7 +3170,7 @@
 	return chrooted;
 }
 
-bool fs_fully_visible(struct file_system_type *type)
+static bool fs_fully_visible(struct file_system_type *type)
 {
 	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
 	struct mount *mnt;