CNK's Blog

User and Group Management for Multitenancy

Once we set up a site, we want to hand over all control to the site admin. This means that they need to be able to add users and give them permissions to do things on their site (and only their site). In Wagtail this means a site admin needs to be able to create and delete users and assign them to predefined groups.

Groups

Wagtail expects permissions to be assigned to users via groups. Users belong to groups and groups come with a set of permissions. Wagtail has a UI for creating groups and editing the permissions but we don’t use it (except sometimes for troubleshooting or one-off groups for Privacy options). The script we use to create a new site creates a standard set of groups (Admin, Editor, and Viewer) and assigns each a specific set of permissions (or in the case of Viewer, no admin privileges at all). Obviously we can’t have 500 groups named Admin so the machine names of these groups are each prefixed with the site’s hostname, e.g. “foo.localhost Admin”.

So far this is all pretty straightforward - except for the fact that there isn’t actually a foreign key relationship between sites and groups. The relationship between sites and groups is entirely based on the group name starting with the site’s hostname. Since we use code to create (and delete) sites, we haven’t had any problems with the lack of database integrity constraints. But that is something you might want to change if you were building a multitenant system where the sites and their associated objects were not so rigidly defined.
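
The naming convention can be sketched in plain Python. The helper names below (`group_name`, `belongs_to_site`) are hypothetical and not part of our codebase. One thing the sketch illustrates: matching on the hostname plus the separating space avoids confusing `foo.localhost` with a longer hostname such as `foo.localhost.example`, which a bare "starts with the hostname" check would also match.

```python
def group_name(hostname: str, role: str) -> str:
    """Build the machine name for a site's group, e.g. 'foo.localhost Admin'."""
    return f"{hostname} {role}"


def belongs_to_site(name: str, hostname: str) -> bool:
    """True if the group name carries this site's hostname prefix.

    Including the separating space keeps 'foo.localhost' from matching
    groups that belong to 'foo.localhost.example'.
    """
    return name.startswith(f"{hostname} ")


names = [
    group_name("foo.localhost", "Admin"),
    group_name("foo.localhost", "Editor"),
    group_name("foo.localhost.example", "Admin"),
]
# Only the first two names belong to the site "foo.localhost".
site_groups = [n for n in names if belongs_to_site(n, "foo.localhost")]
```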

The only other downside of the hostname prefix is that it is a bit ugly. We prefer our site owners see just “Admin” or “Editor”. So any place they would see the group name, we remove the hostname prefix. The main places that happens are the user forms discussed below and in the privacy restrictions forms (which we already customize to add an additional option). In the initialization method of our user forms, we call the following method to configure the groups section of the form. It takes care of filtering the allowed groups by site and cleaning up the displayed names.

    def configure_shared_fields(self):
        error_messages = self.fields['groups'].error_messages.copy()
        if not self.request.user.is_superuser:
            # Only superusers may grant superuser status to other users.
            self.fields.pop('is_superuser')

            # Site admins MUST assign at least one Group. Replace the messages with ones tailored to them.
            self.fields['groups'].required = True
            error_messages['required'] = self.error_messages['group_required']
            self.fields['groups'].help_text = "A user's groups determine their permissions within the site."

            site = Site.find_for_request(self.request)
            # Non-superusers are allowed to see only the Groups that belong to the current Site.
            # This also reduces the displayed Group name from e.g. "hostname.example.com Admins" to just "Admins".
            self.fields['groups'].choices = (
                (g.id, g.name.replace(site.hostname, '').strip())
                for g
                in Group.objects.filter(name__startswith=site.hostname)
            )
            # Changing the queryset alone isn't sufficient to change the available choices on the form (the "choices"
            # setting was created during the field's init). But we have to change the queryset anyway because it's
            # what the validation code uses to determine if the specified inputs are valid choices.
            self.fields['groups'].queryset = Group.objects.filter(name__startswith=site.hostname)
        else:
            self.fields['groups'].help_text = """Normal users require a Group. However, superusers should NOT be
                in any Groups. Thus, this field is required only when the Superuser checkbox is unchecked."""
            # Replace the "Required" error message with one tailored to superusers.
            error_messages['required'] = self.error_messages['group_required_superuser']
        self.fields['groups'].error_messages = error_messages
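
The method above cleans up the display name with `str.replace`, which removes the hostname wherever it occurs in the group name. For our standard groups that is the same as stripping the prefix. If a one-off group name could mention the hostname a second time, a prefix-only variant might be safer. A minimal sketch, where `display_name` is a hypothetical helper rather than code we run:

```python
def display_name(group_name: str, hostname: str) -> str:
    """Strip only a leading '<hostname> ' prefix from a group name."""
    prefix = f"{hostname} "
    if group_name.startswith(prefix):
        return group_name[len(prefix):]
    # Names without the prefix are returned unchanged.
    return group_name


standard = display_name("foo.localhost Admins", "foo.localhost")
one_off = display_name("foo.localhost Friends of foo.localhost", "foo.localhost")
# What the replace-based approach produces for the same one-off name.
one_off_replace = "foo.localhost Friends of foo.localhost".replace("foo.localhost", "").strip()
```

Here `standard` is "Admins" either way, while the one-off name keeps its trailing hostname only with the prefix-based helper.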

Users

I have said that I want a user to see completely different content when logging into Site A vs into Site B. One way to handle that would be to create separate users on the 2 sites. Problem solved, eh? However, most of our services are set up so you can use a standard set of credentials everywhere. Not only is this more convenient for our users (fewer passwords to track) but, since we are using a central store to check usernames and passwords, we can automatically disable access when someone leaves. So we need to use a single user record across all sites.

But we also have to enforce our strict site separation, which means we can’t let Site A know that a user also has permissions on Site B. So we built our own user management views. In addition to allowing the views to behave differently when one is logged in as a superuser vs logged in as the admin of a site, this also made it easy for us to combine the steps of creating a user and assigning them to groups.

When adding new users to a site, we enter their username or name into a search which looks for an active account and returns the information we need to create a user record. Then we present a list of groups you can assign to this user. For site admins, this list only shows the groups for the current site. But what if that user is already an Editor on some other site? First, that user will already exist in our system, so our “create” form actually needs to behave as an edit form. Second, and more importantly, we must not lose the group mappings that already exist, even though they were not included in the form data.

    def save(self, commit=True):
        """
        If a Django User with this username already exists, pull it from the DB and add the specified Groups to it,
        instead of creating a new User object.
        """
        user = super().save(commit=False)
        # Users can access django-admin iff they are a superuser.
        user.is_staff = user.is_superuser

        # Autocompleter note: The "LDAP User" field is actually an autocompleter for ContactInformation objects.
        # Since the form doesn't have a username field, we need to set this manually.
        user.username = self.cleaned_data['contact_info'].uid
        groups = Group.objects.filter(pk__in=self.cleaned_data['groups']).all()
        logger_extras = {
            'target_user': user.username,
            'target_user_superuser': user.is_superuser,
            'groups': ", ".join(str(g) for g in groups)
        }
        try:
            existing_user = get_user_model().objects.get(username=user.username)
        except get_user_model().DoesNotExist:
            existing_user = False
            # This is a brand new LDAPUser, so it needs to have its Django password made unusable, ensuring the user
            # can only log in via their LDAP credentials. We do this here so that the password doesn't change when an
            # LDAPUser is "added", but it already exists and is actually just being placed in a new Site's Group(s).
            user.set_unusable_password()
        else:
            user = existing_user
            # Don't bulk add groups or we disrupt existing group mappings. Add individual groups from the form
            for group in groups:
                user.groups.add(group)

            # Set the is_superadmin flag to True if the form data says to. This is necessary only
            # when there's an existing user because unlike 'user', 'existing_user' will not have had
            # these flags set by super().save(commit=False). We don't just set the flag to the form
            # value, because we never want to remove is_superadmin when this code gets executed.
            if self.cleaned_data.get('is_superadmin'):
                user.is_superadmin = True
        if existing_user:
            logger_extras['target_user_id'] = user.id

        # Populate the identifying information for the User from the ContactInformation object.
        user.first_name = self.cleaned_data['contact_info'].first_name
        user.last_name = self.cleaned_data['contact_info'].last_name
        user.email = self.cleaned_data['contact_info'].email

        if commit:
            user.save()
            if not existing_user:
                # Only call save_m2m() if we're not updating an existing User. It'll try to overwrite the updated
                # user.groups list, AND it'll crash for some reason I haven't figured out.
                self.save_m2m()
                logger.info('user.ldap.create', **logger_extras)
            else:
                logger.info('user.ldap.update', **logger_extras)
        return user
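
The reason for adding groups one at a time rather than assigning the whole list is easier to see with plain sets. This is only an illustration of the semantics, with made-up group names and a hypothetical `add_site_groups` function. It is not the form code itself.

```python
def add_site_groups(existing_groups: set, submitted_groups: set) -> set:
    """Return the user's groups after an 'add to site' submission.

    Mirrors calling user.groups.add() for each submitted group: memberships
    on other sites survive because nothing is ever removed.
    """
    return existing_groups | submitted_groups


# The user is already an Editor on site A and is now being added to site B.
before = {"a.example.com Editor"}
after = add_site_groups(before, {"b.example.com Admin"})
```

Replacing the set with the submitted groups would have silently dropped the site A membership, since site B's form never includes it.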

The clean and save methods in the edit form are a bit more straightforward because we know that we already have a user. But we still have to do a little fooling around with the form data because site admins can only edit users who belong to their site - which means those users must be assigned to one of the site’s groups. And site admins can’t change a user’s superuser or superadmin attributes.

    def clean(self):
        super().clean()

        # Remove any data about groups that may have been included in the form. We need to apply changes to Groups
        # manually, due to how non-superusers get presented with the Groups list.
        if not self.request.user.is_superuser:
            self.new_groups = self.cleaned_data.get('groups', [])
            with suppress(KeyError):
                del self.cleaned_data['groups']
            with suppress(ValueError):
                self.changed_data.remove('groups')

        # Superusers are allowed to be ungrouped.
        if self.cleaned_data.get('is_superuser'):
            if self.errors.get('groups') and self.errors['groups'][0] == self.error_messages['group_required']:
                del self.errors['groups']
                # The clean_groups() function removed self.cleaned_data['groups'] due to the error, but that will cause
                # the groups list to go unchanged upon save. So we need to set it to empty list.
                self.cleaned_data['groups'] = []

        # A superuser must assign a User to a Group, and/or set that User as a Superuser or a Super Admin.
        # If they don't do at least one, throw an informative error.
        if (
            self.request.user.is_superuser and
            not (self.cleaned_data.get('is_superuser') or self.cleaned_data.get('is_superadmin'))
            and not self.cleaned_data.get('groups')
        ):
            self.add_error('groups', forms.ValidationError(self.error_messages['group_required_superuser']))


    def save(self, commit=True):
        """
        In case the data in LDAP has changed, or it failed to populate on the previous create/edit, we override save()
        to re-populate this User's personal info from LDAP.
        """
        user = super().save(commit=False)
        # Users can access django-admin iff they are a superuser.
        user.is_staff = user.is_superuser
        populate_user_from_contact_info(user)
        if commit:
            user.save()
            self.save_m2m()
            if self.has_changed():
                logger_extras = {
                    'target_user': user.username,
                    'target_user_superuser': user.is_superuser,
                }
                for field_name in self.changed_data:
                    logger_extras[field_name] = self.cleaned_data[field_name]
                logger.info('user.ldap.update', **logger_extras)

            # Rather than setting the User's entire Groups list to just what's in this form's POST data, we must ensure
            # that only those Groups which belong to the current Site are affected (unless the current user is a
            # superuser), because Groups on other Sites will never be included in the POST data.
            # So, we check if the list of current-Site-Groups that the user belongs to differs from the list that was
            # set in the form.
            existing_local_groups = user.groups.filter(name__startswith=self.site.hostname).all()
            if not self.request.user.is_superuser and set(existing_local_groups) != set(self.new_groups):
                # If changes were made to the user's Groups, remove the old list of current-Site-groups
                # and apply the new one.
                user.groups.remove(*existing_local_groups)
                user.groups.add(*self.new_groups)

        return user
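
The group handling at the end of that save method boils down to "swap out this site's groups, leave everything else alone". A minimal sketch of that rule using sets of group names (the function name and sample data are hypothetical):

```python
def reconcile_site_groups(current: set, submitted_local: set, hostname: str) -> set:
    """Apply a site admin's group choices without touching other sites' groups."""
    # Groups the user currently holds on this site, matched by hostname prefix.
    local = {g for g in current if g.startswith(hostname)}
    if local == submitted_local:
        # Nothing changed for this site, so leave the memberships as they are.
        return set(current)
    # Remove the old local groups, then add the ones chosen in the form.
    return (current - local) | submitted_local


current = {"a.example.com Editor", "b.example.com Admin"}
updated = reconcile_site_groups(current, {"a.example.com Admin"}, "a.example.com")
```

After the call, the user is an Admin on site A and still an Admin on site B, even though site B's groups never appeared in the form.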

When logged in as a superuser, one sees a more normal list view - showing all users with their name and the full list of groups they belong to. And the superuser’s create/edit forms have all the form fields including is_superadmin and is_superuser. The one minor annoyance for the superuser forms is that the group list can get kind of out of hand since it displays all the groups. We never log into any of the sites as root unless we are creating a new site or doing some troubleshooting so we have never done anything about this.

Permission Patches for Multitenancy

To completely separate sites within the Wagtail admin, we need to make changes to page and collection permissions and do some patching of the user management, workflow, and history systems. My previous post covered the mechanics of how we introduce monkey patches into our project. In this post I am going to explain how we have customized Wagtail 5.1’s new PagePermissionPolicy to preserve our version of multitenancy.

SuperAdmins

It is impractical for us to add our developers to the actual Admin groups of the hundreds of sites on the system, so we invented a concept we call “superadmins”. Superadmins are users who the system pretends are in the “Admin” group for whichever site they’re currently logged in to. In this way, our system presents each site to a superadmin as if it’s the only site on the server and lets us see exactly what an actual admin of the site sees. is_superadmin is a boolean field on our user model:

    class User(AbstractUser):
        """
        Replaces the auth.User model with our customized version.
        """
        is_superadmin = models.BooleanField(
            default=False,
            verbose_name='Super Admin',
            help_text='Enable this flag to make this user a Super Admin, which causes the system to treat them like they '
                    'are an Admin on whatever site they are logged into.'
        )
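
To make the "pretend they are in the Admin group" idea concrete, here is a small sketch that does not depend on Django. `SketchUser` and `effective_site_groups` are hypothetical stand-ins, and the "Admins" suffix is assumed from the group lookups used in the permission patches.

```python
from dataclasses import dataclass, field


@dataclass
class SketchUser:
    """Stand-in for the real user model, for illustration only."""
    username: str
    is_superadmin: bool = False
    group_names: set = field(default_factory=set)


def effective_site_groups(user: SketchUser, hostname: str) -> set:
    """Groups the system should act on for this user on the given site."""
    if user.is_superadmin:
        # Superadmins are treated as members of the current site's admin group.
        return {f"{hostname} Admins"}
    # Everyone else only gets the groups that belong to the current site.
    return {g for g in user.group_names if g.startswith(f"{hostname} ")}


alice = SketchUser("alice", group_names={"a.example.com Editors", "b.example.com Admins"})
dev = SketchUser("dev", is_superadmin=True)
```

With this, `effective_site_groups(alice, "a.example.com")` only contains her site A group, while `dev` gets the admin group of whichever hostname is passed in.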

Page Permission Patches

Prior to Wagtail 5.1 we were patching wagtail.admin.auth.user_has_any_page_permission, wagtail.admin.navigation.get_pages_with_direct_explore_permission, and wagtail.core.models.UserPagePermissionsProxy.__init__.

In Wagtail 5.1, UserPagePermissionsProxy and get_pages_with_direct_explore_permission are both deprecated and permission checking has been consolidated into a new PagePermissionPolicy class. I was initially planning to try subclassing PagePermissionPolicy so I could explicitly initialize it with the current site. However, because PagePermissionPolicy is instantiated in 27 places in 17 different files, switching out the policy class for a subclass is impractical. So I have gone back to our monkey patching strategy.

Method diagram for Wagtail's PagePermissionPolicy

When I diagram the method calls within PagePermissionPolicy, I see that they nearly all go through get_all_page_permissions_for_user - the main method used to query the GroupPagePermissions table. The results of this query are cached and used by other parts of the Wagtail admin interface as needed.

To enforce our site separation requirement, I added a filter for pages on the current site:

    return GroupPagePermission.objects.filter(
        group__user=user,
        page__path__startswith=site.root_page.path
    ).select_related(
        "page", "permission"
    )
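
The `page__path__startswith` filter works because Wagtail stores pages in a treebeard materialized-path tree: every page's path begins with the path of each of its ancestors, and each level adds a fixed-width step (four characters with the default settings). A sketch with made-up paths:

```python
# Made-up materialized paths: the site root plus pages inside and outside the site.
site_root_path = "00010002"
permission_page_paths = [
    "00010002",        # the site's root page
    "000100020001",    # a child of the site's root page
    "00010003",        # another site's root page
    "000100030004",    # a page on that other site
]

# Equivalent of page__path__startswith=site.root_page.path
visible = [p for p in permission_page_paths if p.startswith(site_root_path)]
```

Because every step has the same width, a sibling root such as `00010003` can never be mistaken for a descendant of `00010002`.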

To allow superadmins to behave as site admins, I explicitly filtered for the site admin group:

    # Give them the permissions of the site admin group
    group = Group.objects.filter(name=f'{site.hostname} Admins').first()
    return GroupPagePermission.objects.filter(group=group).select_related(
        "page", "permission"
    )

Combining those two, our full version of get_all_page_permissions_for_user is:

    def mutitenant_get_all_page_permissions_for_user(self, user):
        if not user.is_active or user.is_anonymous or user.is_superuser:
            return GroupPagePermission.objects.none()

        # BEGIN PATCH
        request = get_current_request()
        if not request:
            logger.error(
                'In PagePermissionPolicy.mutitenant_get_all_page_permissions_for_user but could not get the request.'
            )
            return GroupPagePermission.objects.none()

        # So now restrict checks to permissions for the current site
        site = Site.find_for_request(request)
        if user.is_superadmin:
            # Give them the permissions of the site admin group
            group = Group.objects.filter(name=f'{site.hostname} Admins').first()
            return GroupPagePermission.objects.filter(group=group).select_related(
                "page", "permission"
            )
        else:
            # filter for current user and for permissions relevant only to this site
            return GroupPagePermission.objects.filter(
                group__user=user,
                page__path__startswith=site.root_page.path
            ).select_related(
                "page", "permission"
            )
    # Getting this function used is covered below

The behavior changes are both relatively straightforward; the tricky bit is getting the site. In the code above that is taken care of by Site.find_for_request plus our get_current_request method. This could be a problem if get_all_permissions_for_user were called from code that does not have access to the request. Fortunately almost all the places that instantiate PagePermissionPolicy are views or, if the instantiating code is not itself a view, the methods that need the permission policy are only executed from a view. For example, the is_shown method for MenuItem subclasses is only executed when a user is viewing the admin UI.
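
Our get_current_request helper is not shown in this post. One way such a helper could work, assuming a middleware that stashes the request in a context variable, is sketched below. The class and variable names are hypothetical.

```python
import contextvars

# Holds the request being processed; None when we are outside a request.
_current_request = contextvars.ContextVar("current_request", default=None)


def get_current_request():
    """Return the request currently being handled, or None."""
    return _current_request.get()


class CurrentRequestMiddleware:
    """Middleware-style wrapper that records the request for the duration of the call."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        token = _current_request.set(request)
        try:
            return self.get_response(request)
        finally:
            # Always clear the stored request, even if the view raised.
            _current_request.reset(token)
```

Code running outside a request, such as a management command, gets None back, which is why the patches log an error and return an empty queryset in that case.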

Looking at the diagram above, you can see that the next-to-bottom row contains two other methods that query GroupPagePermission in addition to get_all_permissions_for_user. As far as I can tell, one of them is not in use in the current Wagtail codebase and the other is used only when sending moderation notifications. But for the sake of completeness, I have monkey patched them too:

    def mutitenant_users_with_any_permission(self, actions, include_superusers=True):
        """
        2023-07-22 cnk: I patched this because it had a query in it but as of Wagtail 5.1.1 this
        method is not in use, nor is users_with_permission which delegates to this method
        """
        # User with only "add" permission can still edit their own pages
        actions = set(actions)
        if "change" in actions:
            actions.add("add")

        # BEGIN PATCH
        request = get_current_request()
        if not request:
            logger.error('In PagePermissionPolicy.mutitenant_users_with_any_permission but could not get the request.')
            return get_user_model().objects.none()

        # So now restrict checks to permissions for the current site
        site = Site.find_for_request(request)
        groups = GroupPagePermission.objects.filter(
            permission__codename__in=self._get_permission_codenames(actions),
            group__name__startswith=site.hostname
        ).values_list("group", flat=True)

        q = Q(groups__in=groups)
        # Superadmins will have all page permissions because Admins do
        q |= Q(is_superadmin=True)
        # END PATCH
        if include_superusers:
            q |= Q(is_superuser=True)

        return (
            get_user_model()
            ._default_manager.filter(is_active=True)
            .filter(q)
            .distinct()
        )


    def multitenant_users_with_any_permission_for_instance(
        self, actions, instance, include_superusers=True
    ):
        """
        2023-07-22 cnk: I patched this because it had a query in it but as of Wagtail 5.1.1 the only
        place this is used is send_moderation_notification. Since this is for an instance, it naturally
        filters for just one site - but we need to add in superadmins.
        """
        # Find permissions for all ancestors that match any of the actions
        ancestors = instance.get_ancestors(inclusive=True)
        groups = GroupPagePermission.objects.filter(
            permission__codename__in=self._get_permission_codenames(actions),
            page__in=ancestors,
        ).values_list("group", flat=True)

        q = Q(groups__in=groups)

        # BEGIN PATCH
        # Superadmins will have all page permissions because Admins do
        q |= Q(is_superadmin=True)
        # END PATCH
        if include_superusers:
            q |= Q(is_superuser=True)

        # If "change" is in actions but "add" is not, then we need to check for
        # cases where the user has "add" permission on an ancestor, and is the
        # owner of the instance
        if "change" in actions and "add" not in actions:
            add_groups = GroupPagePermission.objects.filter(
                permission__codename=get_permission_codename("add", self.model._meta),
                page__in=ancestors,
            ).values_list("group", flat=True)

            q |= Q(groups__in=add_groups) & Q(pk=instance.owner_id)

        return (
            get_user_model()
            ._default_manager.filter(is_active=True)
            .filter(q)
            .distinct()
        )

And finally, to get our versions of these methods used, we import PagePermissionPolicy and replace the functions:

    from wagtail.permission_policies.pages import PagePermissionPolicy
    PagePermissionPolicy.get_all_permissions_for_user = mutitenant_get_all_page_permissions_for_user
    PagePermissionPolicy.users_with_any_permission = mutitenant_users_with_any_permission
    PagePermissionPolicy.users_with_any_permission_for_instance = multitenant_users_with_any_permission_for_instance
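
Assigning a function to a class attribute changes the method for every instance, no matter where or when it was created, which is why the number of places that instantiate the policy stops mattering. A toy example with a stand-in class (not Wagtail's):

```python
class Policy:
    """Stand-in for a permission policy class."""

    def get_all_permissions_for_user(self, user):
        return ["every-site"]


# An instance created before the patch is applied.
existing_instance = Policy()


def site_scoped_permissions(self, user):
    """Replacement method; receives the instance as 'self' like any method."""
    return ["current-site-only"]


# The monkey patch: replace the method on the class itself.
Policy.get_all_permissions_for_user = site_scoped_permissions

result_existing = existing_instance.get_all_permissions_for_user("alice")
result_new = Policy().get_all_permissions_for_user("alice")
```

Both calls return the patched result, including the one on the instance that existed before the patch.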

Collection Permission Patches

In addition to managing their own pages, site owners need to be able to manage their own images and documents. Permissions for images and documents are controlled by permissions set on the collection that contains them. When we create a new site, we create a collection for it and allow the site’s Admin group the ability to create collections underneath that parent collection. Permissions for managing the collections are managed by the CollectionManagementPermissionPolicy and permissions that control access to images and documents are controlled by the CollectionOwnershipPermissionPolicy. Both of those use the CollectionPermissionLookupMixin to query GroupCollectionPermission. In the diagrams below, methods coming from CollectionPermissionLookupMixin are denoted with a “*”. Prior to Wagtail 5.1 we were patching CollectionPermissionLookupMixin.check_perm and CollectionPermissionLookupMixin.collections_with_perm but as of Wagtail 5.1 most of the collection permission logic goes through CollectionPermissionLookupMixin.get_all_permissions_for_user.

Document and Image Permissions

The more important set of permissions is in the CollectionOwnershipPermissionPolicy class. This class decides what permissions a user has over the images and documents stored in the site’s collections. As you can see in the diagram below, all of the policy’s queries flow through get_all_permissions_for_user, so we can enforce our rules by patching that one method.

Method diagram for Wagtail's CollectionOwnershipPermissionPolicy

As with page permissions, the first time a Collection model is accessed triggers a query to the GroupCollectionPermission model (via get_all_permissions_for_user) and caches the user’s collection permissions on the user object. So we make similar patches to the ones we made above for pages. We add one line to filter the collection tree to restrict it to permissions for this site and a different change to assign superadmins to the site’s Admin group. Our naming convention ensures that we can find that site’s base collection by knowing the site for this request.

    def mutitenant_get_all_collection_permissions_for_user(self, user):
        """
        This method does a lot of the filtering for collections the user has access to. If we can get a
        request here, we can enforce a lot of our special cases right here.
            1. Users should only see collections for the current site - even if they have permissions on
               other sites. So we need to filter permissions for the site's root collection.
            2. If the user is a superadmin, we need to fake assigning them to the site's Admin group.
        """
        # For these users, we can determine the permissions without querying
        # GroupCollectionPermission by checking it directly in _check_perm()
        if not user.is_active or user.is_anonymous or user.is_superuser:
            return GroupCollectionPermission.objects.none()

        # BEGIN PATCH
        request = get_current_request()
        if not request:
            logger.error('In CollectionPermissionLookupMixin.mutitenant_get_all_permissions_for_user but could not get the request.')
            return GroupCollectionPermission.objects.none()

        # So now restrict checks to the collections for the current site
        site = Site.find_for_request(request)
        collection = Collection.objects.filter(name=site.hostname).first()
        if user.is_superadmin:
            group = Group.objects.filter(name=f'{site.hostname} Admins').first()
            return GroupCollectionPermission.objects.filter(
                group=group,
                collection=collection
            ).select_related("permission", "collection")
        else:
            return GroupCollectionPermission.objects.filter(
                group__user=user,
                collection=collection
            ).select_related("permission", "collection")
        # END PATCH


    from wagtail.permission_policies.collections import CollectionPermissionLookupMixin
    CollectionPermissionLookupMixin.get_all_permissions_for_user = mutitenant_get_all_collection_permissions_for_user

Collection Management

Collection management permissions allow admins to create their own nested set of collections. As you can see in the diagram below, the CollectionManagementPermissionPolicy’s permissions also all flow through get_all_permissions_for_user so the patch above that we used for managing items stored in collections takes care of most of the policy changes needed for managing the collections themselves.

Collection Management Permissions

Method diagram for Wagtail's CollectionManagementPermissionPolicy

The one additional thing we need to patch is a helper method used to decide which collections a user may delete: _descendants_with_perm. (If we omit this patch, admins can’t delete any collections).

    def multitenant__descendants_with_perm(self, user, action):
        """
        Return a queryset of collections descended from a collection on which this user has
        a GroupCollectionPermission record for this action. Used for actions, like edit and
        delete where the user cannot modify the collection where they are granted permission.
        """
        # Get the permission object corresponding to this action
        permission = self._get_permission_objects_for_actions([action]).first()

        # BEGIN PATCH
        # Replace the check for permission on the User's full list of Groups to a check for
        # permissions on only the current Site's Groups. Also take SuperAdmins into account.
        request = get_current_request()
        if not request:
            logger.error('In CollectionManagementPermissionPolicy.multitenant__descendants_with_perm but could not get the request.')
            return Collection.objects.none()

        site = Site.find_for_request(request)
        collection = Collection.objects.filter(name=site.hostname).first()

        # Fill in SuperAdmin groups
        if user.is_superadmin:
            groups = Group.objects.filter(name=f'{site.hostname} Admins').all()
        else:
            # user.groups.all() is what is in the original; we could restrict by site but the collection
            # filter will remove permissions not relevant to this site
            groups = user.groups.all()

        # Get the collections that have a GroupCollectionPermission record
        # for this permission and any of the user's groups; create a list of their paths
        # PATCH: restrict to collections belonging to this site
        collection_roots = Collection.objects.descendant_of(collection, inclusive=True).filter(
            group_permissions__group__in=groups,
            group_permissions__permission=permission,
        ).values("path", "depth")
        # END PATCH

        if collection_roots:
            # build a filter expression that will filter our model to just those
            # instances in collections with a path that starts with one of the above
            # but excluding the collection on which permission was granted
            collection_path_filter = Q(
                path__startswith=collection_roots[0]["path"]
            ) & Q(depth__gt=collection_roots[0]["depth"])
            for collection in collection_roots[1:]:
                collection_path_filter = collection_path_filter | (
                    Q(path__startswith=collection["path"])
                    & Q(depth__gt=collection["depth"])
                )
            return Collection.objects.all().filter(collection_path_filter)
        else:
            # no matching collections
            return Collection.objects.none()


    from wagtail.permission_policies.collections import CollectionManagementPermissionPolicy
    CollectionManagementPermissionPolicy._descendants_with_perm = multitenant__descendants_with_perm
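
The Q expression built at the end of that method selects strict descendants: collections whose path starts with a permitted root's path and whose depth is greater than that root's depth. The same rule in plain Python, with made-up paths and a hypothetical `strict_descendants` helper:

```python
def strict_descendants(collections, roots):
    """Return the (path, depth) pairs that sit strictly below any of the roots."""
    return [
        (path, depth)
        for path, depth in collections
        if any(path.startswith(root_path) and depth > root_depth
               for root_path, root_depth in roots)
    ]


collections = [
    ("0001", 1),            # Wagtail's root collection
    ("00010001", 2),        # this site's collection (permission granted here)
    ("000100010001", 3),    # a nested collection created by the site admin
    ("00010002", 2),        # another site's collection
]
roots = [("00010001", 2)]
deletable = strict_descendants(collections, roots)
```

Only the nested collection qualifies. The collection on which the permission was granted is excluded by the depth check, and the other site's collection fails the path check.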

Permissions for other models

We also need per-site permissions to manage other kinds of models - Snippets in Wagtail’s terminology. Please see the last section of Snippets for the code we use in our authentication backend.

Monkey Patching Wagtail

At work we run a large multitenant version of Wagtail (~500 separate websites on a single installation). To achieve this and to make some other changes to the way Wagtail behaves, we have a number of monkey patches. So we have consolidated all of them in their own Django app which we called wagtail_patches. This is loaded into our INSTALLED_APPS after most of our own apps but before any of the Wagtail apps:

    # settings.py
    INSTALLED_APPS = [
        # Multitenant apps. These are ordered with regard to template overrides.
        'core',
        'search',
        'site_creator',
        'calendar',
        'theme_v6_5',
        'theme_v7_0',
        'robots_txt',
        'wagtail_patches',  #####
        'sitemap',
        'features',
        'custom_auth',

        # Wagtail apps.
        'wagtail.embeds',
        'wagtail.sites',
        'wagtail.users',
        'wagtail.snippets',
        'wagtail.documents',
        # We use a custom replacement for wagtail.images that makes it add decoding="async" and loading="lazy" attrs.
        # 'wagtail.images',
        'wagtail_patches.apps.MultitenantImagesAppConfig',
        'wagtail.search',
        'wagtail.admin',
        'wagtail',
        'wagtail.contrib.modeladmin',
        'wagtail.contrib.settings',
        'wagtail.contrib.routable_page',

        # Wagtail dependencies, django, etc.....
    ]

And then in that app, we use the apps.py file to load everything from the patches directory:

    from django.apps import AppConfig
    from wagtail.images.apps import WagtailImagesAppConfig


    class WagtailPatchesConfig(AppConfig):
        name = 'wagtail_patches'
        verbose_name = 'Wagtail Patches'
        ready_is_done = False
        # If there are multiple AppConfigs in a single apps.py, one of them needs to be default=True.
        default = True

        def ready(self):
            """
            This function runs as soon as the app is loaded. It executes our monkey patches to various parts of Wagtail
            that change it to support our architecture of fully separated tenants.
            """
            # As suggested by the Django docs, we need to make absolutely certain that this code runs only once.
            if not self.ready_is_done:
                # The act of performing this import executes all the code in patches/__init__.py.
                from . import patches  # noqa
                self.ready_is_done = True
            else:
                print("{}.ready() executed more than once! This method's code is skipped on subsequent runs.".format(
                    self.__class__.__name__
                ))


    class MultitenantImagesAppConfig(WagtailImagesAppConfig):
        default_attrs = {"decoding": "async", "loading": "lazy"}

You will note that the first of our customizations is right in apps.py. We use this file to configure default HTML attributes for image tags generated by Wagtail, per the instructions in “Adding default attributes to all images”.
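
The contents of the patches package are not shown in this post. As a rough illustration of the technique that import triggers, here is a minimal sketch of a monkey patch. It has the same shape as the CollectionManagementPermissionPolicy assignment shown earlier, but PermissionPolicy, visible_items, and multitenant_visible_items are hypothetical stand-ins and not Wagtail's API.

```python
class PermissionPolicy:
    """Stand-in for a library class we do not control (hypothetical, not Wagtail's)."""

    def visible_items(self, items, site):
        # Original behavior: ignore the site and return everything.
        return list(items)


def multitenant_visible_items(self, items, site):
    """Replacement that only returns the items belonging to the given site."""
    return [item for item in items if item["site"] == site]


# The patch itself: rebind the method on the class. Existing and future
# instances pick up the replacement because methods are looked up on the class.
PermissionPolicy.visible_items = multitenant_visible_items

policy = PermissionPolicy()
items = [{"name": "logo", "site": "foo"}, {"name": "banner", "site": "bar"}]
patched_result = policy.visible_items(items, "foo")
```

Because the assignment happens at import time, importing the module that contains it is enough to apply the patch, which is why ready() only needs the single import.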

Patching views

We have a handful of views that need overrides. Mostly these involve changing querysets or altering filters so the choices are limited to users belonging to the current site. The easiest option is to subclass the existing view, make our changes, then assign our subclass to the same path as the original.

I use the show_urls command from django_extensions to find the existing mapping. And then I map my replacement view to the same pattern. So for replacing the page explorer view, I added the following two lines:

    # patched_urls.py
    from django.urls import path

    from .views.page_explorer import MultitenantPageIndexView

    patched_wagtail_urlpatterns = [
        # This overrides the wagtailadmin_explore_page (aka page listing view) so we can monkey patch the filters
        path('admin/pages/', MultitenantPageIndexView.as_view()),
        path('admin/pages/<int:parent_page_id>/', MultitenantPageIndexView.as_view()),
    ]

Because we have a bunch of overrides, we have a patched_urls.py in our wagtail_patches app. Then, in our main urls.py file, we add that pattern before our other mappings:

    # urls.py
    from django.urls import include, path
    from wagtail import views as wagtailcore_views
    from wagtail_patches.patched_urls import patched_wagtail_urlpatterns

    # We override several /admin/* URLs with our own custom versions
    urlpatterns = patched_wagtail_urlpatterns + [
        # We now include Wagtail's own admin URLs.
        path('admin/', include('wagtail.admin.urls')),
        path('documents/', include('wagtail.documents.urls')),
        # ... our custom urls and the rest of the standard Wagtail url mappings
    ]

I then use show_urls to check my mapping. As long as our version is the second one, then it will get used. If you feel like your changes are getting ignored, start by checking to see that the url pattern for your override exactly matches the original pattern.

Multitenancy with Wagtail

If you want to run several sites from the same Wagtail codebase, you have a couple of options which are summarized in the Wagtail docs.

Wagtail fully supports “multi-site” installations “where content creators go into a single admin interface and manage the content of multiple websites”. But at work, we would like our Wagtail installation to treat every site as if it were completely independent. So if you have permissions on Site A and Site B, when you’re logged in to Site A, you should only see content, images, etc. from Site A. We also want site owners to be able to manage just about everything for their site. This means that they need to be able to configure their own site’s settings, manage their own collections, images, and documents, and manage their own users. This series of blog posts will cover the changes we have made to enforce our version of multitenancy for sites built with the Wagtail CMS.

  1. Monkey Patching Wagtail
  2. Permission Patches for Multitenancy
  3. Users and Groups
  4. Site Creator
  5. Snippets
  6. Snippet Choosers
  7. Reports

These posts were originally written describing our patches while running Wagtail 5.1 (and Django 3.2). I have subsequently updated them for additional patches I made to upgrade to Wagtail 6.0 (and Django 4.2).

Determining the current site

When we first started using Wagtail, it included its own site middleware so request.site was available in all views. When this was removed in Wagtail 2.9, we started using CrequestMiddleware to make the request information available from a variety of contexts. We generally access the request via our own get_current_request method, which allows us to provide a useful error message if the request is not available.

    from crequest.middleware import CrequestMiddleware


    def get_current_request(default=None, silent=True, label='__DEFAULT_LABEL__'):
        """
        Returns the current request.

        You can optionally use ``default`` to pass in a fake request object to act as the default if there
        is no current request, e.g. when ``get_current_request()`` is called during a manage.py command.

        :param default: (optional) a fake request object
        :type default: an object that emulates a Django request object

        :param silent: If ``False``, raise an exception if CrequestMiddleware can't get us a request object.  Default: True
        :type silent: boolean

        :param label: If ``silent`` is ``False``, put this label in our exception message
        :type label: string

        :rtype: a Django request object
        """
        request = CrequestMiddleware.get_request(default)
        if request is None and not silent:
            raise NoCurrentRequestException(
                "{} failed because there is no current request. Try using djunk.utils.FakeCurrentRequest.".format(label)
            )
        return request

NOTE: get_current_request has a default parameter for supplying a fallback request when none is available, but in practice we never pass one in code that is trying to access the request. Instead we use one of the methods below to fake the request, and the site is then determined from whatever get_current_request returns.

Setting current site in scripts and tests

Our data imports, manage.py scripts, and tests do not have a browser context, so get_current_request will fail in those circumstances. We have created a couple of methods to help set the request and site in those circumstances. This is working but it remains a bit of a pain point.

    class FakeRequest:
        """
        FakeRequest takes the place of the django HTTPRequest object in various testing scenarios where
        a real one doesn't exist, but the code under test expects one to be there.

        Wagtail 2.9 now determines the current Site by looking at the hostname and port in the request object,
        which means it calls get_host() on our faked out requests. Thus, we need to emulate it.
        """

        def __init__(self, site=None, user=None, **kwargs):
            self.user = user
            # Include empty GET and POST attrs, so code which expects request.GET or request.POST to exist won't crash.
            # They are separate dicts so that a change to one does not show up in the other.
            self.GET = {}
            self.POST = {}
            # Callers can override GET and POST, or override/add any other attribute using kwargs.
            self.__dict__.update(kwargs)
            self._wagtail_site = site

        def get_host(self):
            if not self._wagtail_site:
                return 'fakehost'
            return self._wagtail_site.hostname

        def get_port(self):
            # It should be safe to pretend all test traffic is on port 443.
            # HTTPRequest.get_port() explicitly returns a string, so we do, too.
            return '443'


    def set_fake_current_request(site=None, user=None, request=None, **kwargs):
        """
        Sets the current request to either a specified request object or a FakeRequest object built from the given Site
        and/or User. Any additional keyword args are added as attributes on the FakeRequest.
        """
        # If the caller didn't provide a request object, create a FakeRequest.
        if request is None:
            request = FakeRequest(site, user, **kwargs)
        # Set the created (or provided) request as the "current request".
        CrequestMiddleware.set_request(request)
        return request


    class FakeCurrentRequest:
        """
        Implements set_fake_current_request() as a context manager. Use like this:
        with FakeCurrentRequest(some_site, some_user):
            # ... do stuff
        OR
        with FakeCurrentRequest(request=some_request):
            # ... do stuff

        When the context manager exits, the current request will be automatically reverted to its previous state.
        """
        NO_CURRENT_REQUEST = 'no_current_request'

        def __init__(self, site=None, user=None, request=None, **kwargs):
            self.site = site
            self.user = user
            self.request = request
            self.kwargs = kwargs

        def __enter__(self):
            # Store a copy of the original current request, so we can restore it when the context manager exits.
            self.old_request = CrequestMiddleware.get_request(default=self.NO_CURRENT_REQUEST)
            return set_fake_current_request(self.site, self.user, self.request, **self.kwargs)

        def __exit__(self, *args):
            if self.old_request == self.NO_CURRENT_REQUEST:
                # If there wasn't a current request when we entered the context manager, remove the current request.
                CrequestMiddleware.del_request()
            else:
                # Otherwise, set the current request back to whatever it was when we entered.
                CrequestMiddleware.set_request(self.old_request)
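
To make the save-and-restore behavior concrete, here is a self-contained sketch that can be run without Django. RequestStore is a hypothetical stand-in for CrequestMiddleware that offers only the three methods used above, and FakeCurrentRequestSketch is a trimmed-down version of the context manager that takes a ready-made request instead of a site and user. Plain strings stand in for request objects.

```python
import threading


class RequestStore:
    """Hypothetical stand-in for CrequestMiddleware: one current request per thread."""

    _local = threading.local()

    @classmethod
    def get_request(cls, default=None):
        return getattr(cls._local, "request", default)

    @classmethod
    def set_request(cls, request):
        cls._local.request = request

    @classmethod
    def del_request(cls):
        if hasattr(cls._local, "request"):
            del cls._local.request


class FakeCurrentRequestSketch:
    """Trimmed-down context manager: set a request on entry, restore on exit."""

    _MISSING = object()  # sentinel meaning "there was no current request"

    def __init__(self, request):
        self.request = request

    def __enter__(self):
        # Remember whatever was current so it can be put back on exit.
        self.old_request = RequestStore.get_request(default=self._MISSING)
        RequestStore.set_request(self.request)
        return self.request

    def __exit__(self, *args):
        if self.old_request is self._MISSING:
            RequestStore.del_request()
        else:
            RequestStore.set_request(self.old_request)


seen = []
with FakeCurrentRequestSketch("request for site A"):
    seen.append(RequestStore.get_request())
    with FakeCurrentRequestSketch("request for site B"):
        seen.append(RequestStore.get_request())
    # Leaving the inner block puts site A's request back.
    seen.append(RequestStore.get_request())
# Leaving the outer block removes the current request entirely.
seen.append(RequestStore.get_request())
```

Nesting is the case that matters in tests and import scripts that hop between sites: each block restores what it found, so an inner block cannot leak its site into the code that runs after it.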

On Campus Middleware

Note this code uses regular expressions to determine if a request comes from one of our allowed IPs. This should really be reworked to use a library that does proper netmask calculations.

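    import re

    from django.conf import settings
    from django.utils.deprecation import MiddlewareMixin

    # get_client_ip() comes from elsewhere and is not shown in this post.


    class OnCampusMiddleware(MiddlewareMixin):
        """
        Middleware sets ON_CAMPUS session variable to True if the request
        came from a campus IP or if the user is authenticated.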

        2022-04-09 Storing ON_CAMPUS in the session is causing us to set a
        cookie for every request which interferes with Cloudflare caching.
        If your site is largely for anonymous users, store ON_CAMPUS in the request
        itself by adding STORE_ON_CAMPUS_IN_SESSION=False to your settings.py
        """

        CAMPUS_ADDRESSES = [
            # redacted
            r'192\.168\.\d{1,3}\.\d{1,3}',
            r'127\.0\.0\.1',
        ]

        def check_ip(self, request):
            client_ip = get_client_ip(request)

            if client_ip:
                for ip_regex in self.CAMPUS_ADDRESSES:
                    if re.match(ip_regex, client_ip):
                        return True
            return False

        def process_request(self, request):
            # A user is considered "on campus" if they are visiting from a campus IP, or are logged in
            # to the site.
            if getattr(settings, 'STORE_ON_CAMPUS_IN_SESSION', True):
                request.session['ON_CAMPUS'] = request.user.is_authenticated or self.check_ip(request)
            else:
                request.on_campus = request.user.is_authenticated or self.check_ip(request)
            return None

Then to use this in a Django project:

    # settings.py
    ...
    MIDDLEWARE = [
        # Normal Django middleware stack
        # Sets request.on_campus = True for logged-in users, and for visitors who come from a campus IP.
        # Set STORE_ON_CAMPUS_IN_SESSION to False to prevent setting cookies for anonymous users.
        'djunk.middleware.OnCampusMiddleware',
    ]

    STORE_ON_CAMPUS_IN_SESSION = False
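
As noted above the middleware, the regular expression check would be better served by proper netmask calculations. The following is a minimal sketch of what that could look like using Python's standard ipaddress module. It is not the code we run: CAMPUS_NETWORKS and is_campus_ip are hypothetical names, and the two networks are placeholders mirroring the unredacted patterns above.

```python
import ipaddress

# Hypothetical networks for illustration; the real campus ranges are redacted.
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.1/32"),
]


def is_campus_ip(client_ip):
    """Return True if client_ip falls inside any configured campus network."""
    if not client_ip:
        return False
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed input is never treated as on campus.
        return False
    return any(address in network for network in CAMPUS_NETWORKS)
```

A function like this could replace the loop in check_ip. One practical difference: re.match only anchors at the start of the string, so a pattern such as r'127\.0\.0\.1' also accepts 127.0.0.10, whereas the network membership test compares the parsed address and rejects it.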