Skip to content

Use fixed size memory allocation in IndicesPermission#77748

Merged
ywangd merged 4 commits into
elastic:masterfrom
tvernum:role-authz-map-size
Sep 24, 2021
Merged

Use fixed size memory allocation in IndicesPermission#77748
ywangd merged 4 commits into
elastic:masterfrom
tvernum:role-authz-map-size

Conversation

@tvernum

@tvernum tvernum commented Sep 15, 2021

Copy link
Copy Markdown
Contributor

This changes the implementation of the IndicesPermission.authorize
method so that data structures can be constructed with a known size.

Previously, when authorizing a request with a large number of indices,
the HashMaps would need to resize themselves multiple times, at a
noticeable performance cost.

In order to know how large these maps need to be, we now expand all
named resource (indices, aliases, data-streams) upfront and then set
the initial size of the maps accordingly.

This changes the implementation of the Role.authorize method so that
data structures can be constructed with a known size.

Previously, when authorizing a request with a large number of indices,
the HashMaps would need to resize themselves multiple times, at a
noticeable performance cost.

In order to know how large these maps need to be, we now expand all
named resource (indices, aliases, data-streams) upfront and then set
the initial size of the maps accordingly.
@tvernum tvernum added >enhancement :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC v8.0.0 v7.16.0 labels Sep 15, 2021
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Sep 15, 2021
@elasticmachine

Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-security (Team:Security)

@tvernum

tvernum commented Sep 15, 2021

Copy link
Copy Markdown
Contributor Author

On a large cluster (250k indices) with a large request (50k indices) this cuts the execution time of Role.authorize by about 20% in my tests (for superuser).

@tvernum tvernum changed the title Use fixed size memory allocation in Role.authorize Use fixed size memory allocation in IndicesPermission Sep 16, 2021
@tvernum tvernum changed the title Use fixed size memory allocation in IndicesPermission Use fixed size memory allocation in Role.authorize Sep 16, 2021
@tvernum tvernum changed the title Use fixed size memory allocation in Role.authorize Use fixed size memory allocation in IndicesPermission Sep 16, 2021

@albertzaharovits albertzaharovits left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I confirm that the interface has not changed.

@ywangd ywangd left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Expanding all concrete indices upfront is likely to increase heap usage to some extend. It may not be worth concerning unless the cluster and request are configured really badly (e.g. tons of aliases pointing to the same indices/data streams). But I wonder whether we could still be better prepared for this situation if we change IndexResource to return concreteIndices as an Iterator. This can be done by retaining a reference of IndexAbstraction in the create method. The change should(?) be relative small. But I'll leave it up to you for consideration.

Comment on lines +276 to +278
assert concreteIndices.isEmpty() || concreteIndices.contains(name) : "An object of type "
+ type
+ " must reference itself";

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we allow both an empty collection and a singleton collection for concrete Index type? I'd prefer to pick either one of them. IIUC, the subtle difference between empty and singleton is that an empty collection indicates the concrete index does not exist in the cluster state yet. But do we care this difference for authorization?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I was trying to avoid implementing behavioural changes, and that's what the existing code does - a missing index is treated like a concrete index with no expanded names.

for (String indexOrAlias : requestedIndicesOrAliases) {
final IndexResource resource = IndexResource.create(indexOrAlias, lookup.get(indexOrAlias));
resources.add(resource);
totalResourceCount += resource.size();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible that concreteIndices of multiple IndexResource overlap. For example:

requestedIndicesOrAliases = Set.of("alias1", "alias2")
where both aliases point to index1.

So the total count of distinct resources is 3 instead of 4.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is possible. The important thing here is to avoid reallocating the maps. If we over-size them that's a tiny heap cost with essentially no performance cost.

Comment on lines +385 to +386
final Map<String, Set<FieldPermissions>> fieldPermissionsByIndex = new HashMap<>(totalResourceCount);
final Map<String, DocumentLevelPermissions> roleQueriesByIndex = new HashMap<>(totalResourceCount);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike grantedBuilder, these two Maps may not be populated for every concrete index because they are only calculated if there is at least one group grants the permission. So we could be allocating more than needed. But when there are indices rejected by all permission groups, maybe it is always the unhappy path (request rejected) and we don't really care that much about a bit more memory consumption?

@tvernum

tvernum commented Sep 23, 2021

Copy link
Copy Markdown
Contributor Author

Expanding all concrete indices upfront is likely to increase heap usage to some extend.

Yes it does, but given we build multiple maps, each of which has those indices as keys, we're already using a lot of heap for this. We can probably get some good gains in heap usage by having a single Map with a richer object as the value.

e.g.

class IndexPermissions {
    public Boolean granted;
    public Set<FieldPermissions> fieldPermissions;
    public DocumentLevelPermissions roleQueries;
}
final Map<String, IndexPermissions> permissionsByIndex = new HashMap<>(totalResourceCount);

I don't know that it's worth it, but if we're worried about heap, that's probably a better place to make savings.

This can be done by retaining a reference of IndexAbstraction in the create method. The change should(?) be relative small.

I did consider that, but it felt like it would be more complex than that. I'll try it out and see.

@tvernum

tvernum commented Sep 23, 2021

Copy link
Copy Markdown
Contributor Author

I did consider that, but it felt like it would be more complex than that. I'll try it out and see.

It is more complex, but it works out OK, so I've merged it to this PR.

@tvernum

tvernum commented Sep 23, 2021

Copy link
Copy Markdown
Contributor Author

@elasticmachine update branch

@ywangd ywangd merged commit 2cfdadc into elastic:master Sep 24, 2021
ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request Sep 25, 2021
This changes the implementation of the Role.authorize method so that
data structures can be constructed with a known size.

Previously, when authorizing a request with a large number of indices,
the HashMaps would need to resize themselves multiple times, at a
noticeable performance cost.

In order to know how large these maps need to be, we now expand all
named resource (indices, aliases, data-streams) upfront and then set
the initial size of the maps accordingly.
ywangd added a commit that referenced this pull request Sep 25, 2021
This changes the implementation of the Role.authorize method so that
data structures can be constructed with a known size.

Previously, when authorizing a request with a large number of indices,
the HashMaps would need to resize themselves multiple times, at a
noticeable performance cost.

In order to know how large these maps need to be, we now expand all
named resource (indices, aliases, data-streams) upfront and then set
the initial size of the maps accordingly.

Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team v7.16.0 v8.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants