For the initial installation of the OSIS application, the following steps must be executed:
- A Super Admin account will be created as part of S3C/Vault setup
- Set OSE properties:
  - S3 URL
  - Admin Vault URL, along with the Super Admin access key and secret key as credentials
Note: "Super Admin" account has full Account Management access and can use "AssumeRoleBackbeat" to assume a role of another Account.
Important Assumptions:
- All the Tenant APIs on OSIS will use Super Admin credentials to manage tenants on Vault.
- A Vault Account is equivalent to an OSE Tenant.
This API creates a Tenant on Vault.
- The Vault create-account API will be called using vaultclient.
  - Tenant Name is stored as Account Name.
  - The cd_tenant_ids list, sent as part of the request, is stored as a property of the Vault account.
  - Account ID is sent back to OSE as the storage tenant ID.
- [F00] will trigger an asynchronous thread to invoke the SetupAssumeRole subroutine. For more, see SetupAssumeRole-subroutine.
This API will list tenants on Vault.
- The List Tenants API can be called with the following parameters:
  - offset: The start index of tenants to return (optional)
  - limit: Maximum number of tenants to return (optional)
- The Vault list-accounts API will be called using vaultclient with the parameters:
  - marker (if it exists in markerCache)
  - max-limit
- markerCache
  - Every time the List Accounts API is called by OSIS, if isTruncated is true in the response, the marker value will be stored in the markerCache with (max-limit + 1) as the key.
  - Each entry in the markerCache will be short-lived (the implementation keeps it for no more than 60 seconds).
  - See Cache-Design for more information.
This API will query tenants on Vault using a filter parameter.
- The Query Tenants API can be called with the following parameters:
  - offset: The start index of tenants to return (optional)
  - limit: Maximum number of tenants to return (optional)
  - filter parameter
    - Usually the OSE passes only the cd_tenant_id field with a value under the filter parameter.
    - If the cd_tenant_id filter parameter is passed, the cd_tenant_id value will be validated for UUID format.
    - If the cd_tenant_id filter value is in UUID format, the list-accounts API will be invoked.
    - If the cd_tenant_id filter value is not in UUID format, the get-account API will be invoked with the provided value.
- The Vault list-accounts API will be called using vaultclient with the parameters:
  - marker (if it exists in markerCache)
  - max-limit
  - filterKey=cd_tenant_id%3D%3D<uuid1> (the filter value from OSE will always be in the cd_tenant_id%3D%3D<uuid1> format)
  - See Cache-Design for the cache design.
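The UUID check that decides between list-accounts and get-account can be sketched as follows. This is a minimal illustration, not the OSIS implementation: the class name TenantFilterRouter, the VaultCall enum, and the assumption that the filter carries exactly one cd_tenant_id==<value> expression are hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper: decides which Vault call a Query Tenants filter maps to.
final class TenantFilterRouter {
    enum VaultCall { LIST_ACCOUNTS, GET_ACCOUNT }

    private static final String PREFIX = "cd_tenant_id==";

    /** Extracts the value of a "cd_tenant_id==<value>" filter (URL-encoded or not). */
    static String extractCdTenantId(String filter) {
        String decoded = URLDecoder.decode(filter, StandardCharsets.UTF_8);
        if (!decoded.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Unsupported filter: " + decoded);
        }
        return decoded.substring(PREFIX.length());
    }

    /** True only for canonical 36-character UUID strings. */
    static boolean isUuid(String value) {
        try {
            // The round trip rejects lenient inputs such as "1-1-1-1-1".
            return UUID.fromString(value).toString().equalsIgnoreCase(value);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** UUID values go to list-accounts; anything else is treated as an account ID. */
    static VaultCall route(String filter) {
        return isUuid(extractCdTenantId(filter)) ? VaultCall.LIST_ACCOUNTS : VaultCall.GET_ACCOUNT;
    }
}
```

For example, a filter of cd_tenant_id%3D%3D followed by a UUID routes to list-accounts, while cd_tenant_id==123456789012 routes to get-account.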
This API will return the tenant on Vault with the given tenantID.
- The Vault get-account API will be called using vaultclient with the provided tenantID as AccountID.
This API will check whether the tenant with the given tenantID exists on Vault.
- The Vault get-account API will be called using vaultclient with the provided tenantID as AccountID.
- Return true if a result is returned, false otherwise.
This API will delete the tenant on Vault.
- The Vault delete-account API will be called using vaultclient with the provided tenantID as AccountID.
- Return an error if the account is not empty.
This API will update the existing storage tenant with the provided cd_tenant_id. Input parameters are:
- tenantId: Tenant ID of the tenant to update
- tenant object: Object that holds the new cd_tenant_ids along with the original tenant properties.

- The Vault update-account API will be called using vaultclient with the provided tenantID as AccountID and the tenant properties.
- Use vaultclient to call generate-account-access-key with durationSeconds for the account.
- Create a role using the generated access key with:
  - Role name: osis (so the role ARN can be generated in the format arn:aws:iam::[account-id]:role/osis)
  - Trust policy:
      {
        "Version": "2012-10-17",
        "Statement": [{
          "Effect": "Allow",
          "Principal": { "Service": "service-name" },
          "Action": "sts:AssumeRole"
        }]
      }
  - Note: This role is visible to and modifiable by the root account user or by an IAM user with the correct permissions (iam:list-roles, iam:deleteRoles...). Do not edit this role. If this role has been edited, delete it. It will be automatically repopulated (see Assume-Role for more).
- Create an IAM managed policy adminPolicy@[account-id] with full S3 and IAM access using the generated access key.
- Invoke attach-role-policy to attach the policy adminPolicy@[account-id] to the osis role using the generated access key.
- Use delete-access-key to delete the account's access key.
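The naming conventions used by this subroutine can be captured in a small helper. This is only a sketch: the class OsisIamNames and its method names are hypothetical, and the policy document (in particular "Resource": "*") is an assumption about what "full S3 and IAM access" looks like rather than the document OSIS actually sends.

```java
// Hypothetical helper collecting the role and policy naming rules described above.
final class OsisIamNames {
    static final String ROLE_NAME = "osis";

    /** arn:aws:iam::[account-id]:role/osis */
    static String roleArn(String accountId) {
        return "arn:aws:iam::" + accountId + ":role/" + ROLE_NAME;
    }

    /** adminPolicy@[account-id] */
    static String adminPolicyName(String accountId) {
        return "adminPolicy@" + accountId;
    }

    /** arn:aws:iam::[account-id]:policy/[policy-name] */
    static String policyArn(String accountId, String policyName) {
        return "arn:aws:iam::" + accountId + ":policy/" + policyName;
    }

    /** Assumed shape of a full S3 + IAM access policy; "Resource": "*" is an assumption. */
    static String adminPolicyDocument() {
        return "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\","
             + "\"Action\":[\"s3:*\",\"iam:*\"],\"Resource\":\"*\"}]}";
    }
}
```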
- Invoke the Assume Role flow before accessing any user APIs.
- If any user API returns an Access Denied error with the message "user [RoleArn] don't have any policies, denied access", then invoke the SetupAssumeRolePolicy flow (see Setup-Assume-Role-Policy).
- Generate an access key for the account by using vaultclient to call generate-account-access-key with durationSeconds.
- Invoke the IAM get-policy API for the policy adminPolicy@[account-id] using the generated access key.
- If no policy is returned, use the generated access key to create an IAM managed policy adminPolicy@[account-id] with full S3 and IAM access ({ s3:*, iam:* }).
- Invoke attach-role-policy to attach the policy adminPolicy@[account-id] to the osis role using the generated access key.
- Invoke delete-access-key for the account.
- Before invoking a User API, you must call the AssumeRole flow. Use the tenant account credentials from assumeRoleCache for the given account ID before each User API.
- If the tenant account credentials are not found in assumeRoleCache, the AssumeRoleBackbeat API must be called as Super Admin and the result added to the assumeRoleCache.
- If the AssumeRoleBackbeat API returns a NoSuchEntity error with a "Role does not exist" description, use vaultclient to invoke get-account with accountID to retrieve the accountName, and then invoke the SetupAssumeRole subroutine.
- Each assumeRoleCache entry will be a [key, value] pair of [Role_Arn, Temporary_Credentials].
  - Each assumeRoleCache entry will have a 50-minute TTL (its session token is valid only for 60 minutes).
  - When accessing credentials in the cache, refresh them when needsRefresh indicates it (using the AWS SDK refresh()), as they are subject to expiration.
  - See Cache-Design for the cache design.
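The lookup-then-assume flow with a 50-minute TTL can be sketched as below. The sketch makes several assumptions: credentials are represented as a plain String, the AssumeRoleBackbeat call is injected as a hypothetical function from role ARN to credentials, and the clock is injected so that expiry can be exercised without waiting. It does not cover the NoSuchEntity fallback to the SetupAssumeRole subroutine.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of the assumeRoleCache lookup; not the OSIS implementation.
final class AssumeRoleCacheSketch {
    static final Duration TTL = Duration.ofMinutes(50); // session token lives 60 minutes

    private static final class CachedCredentials {
        final String credentials;
        final Instant expiresAt;
        CachedCredentials(String credentials, Instant expiresAt) {
            this.credentials = credentials;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, CachedCredentials> cache = new ConcurrentHashMap<>();
    private final Function<String, String> assumeRoleBackbeat; // roleArn -> temporary credentials
    private final Supplier<Instant> clock;

    AssumeRoleCacheSketch(Function<String, String> assumeRoleBackbeat, Supplier<Instant> clock) {
        this.assumeRoleBackbeat = assumeRoleBackbeat;
        this.clock = clock;
    }

    /** Returns cached credentials for the account, assuming the role again when missing or expired. */
    String credentialsFor(String accountId) {
        String roleArn = "arn:aws:iam::" + accountId + ":role/osis";
        Instant now = clock.get();
        CachedCredentials cached = cache.get(roleArn);
        if (cached == null || !now.isBefore(cached.expiresAt)) {
            cached = new CachedCredentials(assumeRoleBackbeat.apply(roleArn), now.plus(TTL));
            cache.put(roleArn, cached);
        }
        return cached.credentials;
    }
}
```

Keeping the TTL ten minutes shorter than the token lifetime means a cached entry handed to a User API call still has time left before the session token expires.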
This API creates a user on Vault.
- The create-user API will be called using assumed role credentials.
  - The tenant user's cdUserId is stored as the username in the corresponding Vault user.
  - The tenant user's username, role enum value, emailAddress, cdTenantID and the account's canonicalID are stored in the Vault user path as /<tenantUsername>/<roleEnumValue>/<email>/<cdTenantID>/<canonicalID>/.
- The get-policy API will be called with the userPolicy@[account-id] policy name using assumed role credentials.
- If the get-policy API does not return any policy, an IAM managed policy named userPolicy@[account-id] with full S3 access is created using the assumed role credentials.
- attach-policy for the new user will be called using the assumed role credentials and the policy ARN formatted as arn:aws:iam::[account-id]:policy/userPolicy@[account-id].
- The IAM generate-access-key API will be called using assumed role credentials.
- Encrypt the secret key (for more, see SecretKey-Encryption-Strategy). Store the encrypted secret key on Redis Sentinel in the hash named "osis:s3credentials", with the hash key formatted as <Username>__<AccessKeyID>.
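The two string formats used when creating a user, the Vault user path and the Redis hash key, can be sketched as follows. The class and method names are hypothetical, and the sketch assumes none of the components contain a "/" character.

```java
// Hypothetical helper for the user path and Redis hash key formats described above.
final class VaultUserNaming {
    static final String REDIS_HASH = "osis:s3credentials";

    /** Builds /<tenantUsername>/<roleEnumValue>/<email>/<cdTenantID>/<canonicalID>/ */
    static String userPath(String tenantUsername, String roleEnumValue, String email,
                           String cdTenantId, String canonicalId) {
        return "/" + String.join("/", tenantUsername, roleEnumValue, email, cdTenantId, canonicalId) + "/";
    }

    /** Builds <Username>__<AccessKeyID>, the key inside the osis:s3credentials hash. */
    static String redisHashKey(String username, String accessKeyId) {
        return username + "__" + accessKeyId;
    }
}
```

Because the tenant username is the first path component, a path-prefix of /<display_name>/ is enough to find a user by display name.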
This API will list users on Vault.
- The list-users API can be called with the following parameters:
  - offset: The start index of users to return (optional)
  - limit: The maximum number of users to return (optional)
- The list-users API will be called using assumed role credentials.
  - If offset is present, return the list-users result by providing the offset value as the marker parameter.
This API will query users on Vault using a filter parameter with the cd_tenant_id and display_name filters.
- Extract the cd_tenant_id filter from the filter parameter.
  - If the cd_tenant_id filter value is in UUID format, it will be used to call list-accounts using vaultclient to retrieve the accountID.
  - If the cd_tenant_id filter value is not in UUID format, it will be treated as the accountID.
- Use the accountID to generate assumed role credentials for that particular account.
- Extract the display_name value from the filter parameter.
- The list-users API will be called using assumed role credentials with the path-prefix set to /<display_name>/.
This API will return the user.
- The get-user API will be called using assumed role credentials, with the tenant user's user ID as the username.
This API will return the user for the provided canonical ID.
- The get-account API will be called using vaultclient with the provided canonical-id, and the account details will be used to fill the tenant details of the response.
- The list-users API will be called with offset as 0 and limit as 1000, and the last user in the response will be used to fill the user details of the response.
  - canonical-id in Scality is defined with respect to an account, so a specific user cannot be retrieved using only the canonicalID.
  - The get-user-with-canonical-id API will be called by OSE only when the user APIs do not return the canonical-id for a user.
This API will return whether the user exists.
- The get-user API will be called using assumed role credentials, with the tenant user's user ID as the username.
- Return true if a result is returned, false otherwise.
This API will delete the user on Vault.
- The delete-user API will be called using assumed role credentials.
This API will enable or disable the user on Vault.
- The updateAccessKey API will be called using assumed role credentials to disable or enable access keys for the user.
S3 Credential APIs Have Common Behavior with User APIs
This API creates S3 credentials for the user.
- The Generate Access Key API will be called using assumed role credentials.
- Encrypt the secret key (for more, see SecretKey-Encryption-Strategy). Store the encrypted secret key on Redis Sentinel in the hash named osis:s3credentials, with the hash key formatted as <Username>__<AccessKeyID>.
Important Notes:
- Redis Sentinel is subject to an ongoing rolling upgrade.
- If the entire Redis Sentinel cluster fails or crashes, the vCloud Director's OSE service must be restarted with the following command:
  $ ose service restart
  - Once the OSE service restart process is finished, OSE will invoke the listS3Credentials API to identify the keys for object storage operations. As no key is available for the user on Redis, OSIS will create a new key for object storage operations and return it along with all the other keys in the Vault DB (for more, see List S3 Credentials).
  - Access keys created by OSIS before the Redis crash will be listed with the secretKey value Not Available. vCloud Director Tenant Users are responsible for cleaning up "Not Available" access keys using the S3 console or IAM API, because only they know whether the access keys are in use.
- Redis Sentinel only supports storing 4294967295 (approx. 4.3 billion) keys in the hash.
This API queries the S3 credentials of the user using a filter parameter.
- The list-access-keys API will be called using assumed role credentials.
- Filter the resulting credentials using the filter parameter in the request.
This API lists the S3 credentials of the user.
- The list-access-keys API will be called using assumed role credentials.
- Retrieve the secret keys, stored under hash keys formatted as <Username>__<AccessKeyID>, from Redis Sentinel.
- Decrypt the secret keys (for more, see SecretKey-Encryption-Strategy) and add them to the response.
- If no key is available for the user on Redis, OSIS will invoke the createS3Credentials API in the backend to create a new key for object storage operations and add the new key to the response.
- Add all the keys that are not present in Redis Sentinel at the bottom of the list in the response, with the secretKey value Not Available.
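The ordering rule for the response (keys with a known secret first, keys missing from Redis last with Not Available) can be sketched as below. The sketch assumes the access key IDs from list-access-keys and the already decrypted secrets from Redis are passed in as plain collections; the class name and the String[] pair representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of merging Vault access keys with secrets found in Redis.
final class S3CredentialListing {
    static final String NOT_AVAILABLE = "Not Available";

    /**
     * Returns {accessKeyId, secretKey} pairs: keys whose secret is known come first,
     * keys missing from Redis are appended at the bottom with "Not Available".
     */
    static List<String[]> merge(String username, List<String> accessKeyIds,
                                Map<String, String> decryptedSecretsByHashKey) {
        List<String[]> withSecret = new ArrayList<>();
        List<String[]> withoutSecret = new ArrayList<>();
        for (String accessKeyId : accessKeyIds) {
            String secret = decryptedSecretsByHashKey.get(username + "__" + accessKeyId);
            if (secret != null) {
                withSecret.add(new String[] { accessKeyId, secret });
            } else {
                withoutSecret.add(new String[] { accessKeyId, NOT_AVAILABLE });
            }
        }
        withSecret.addAll(withoutSecret);
        return withSecret;
    }
}
```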
This API deletes the S3 credential of the user.
- The delete-access-key API will be called using assumed role credentials.
This API returns the S3 credential of the user with the provided access key.
- The list-access-keys API will be called using assumed role credentials.
- Extract the provided access-key details from the response.
- Retrieve the secret key, stored under the hash key formatted as <Username>__<AccessKeyID>, from Redis Sentinel.
- Decrypt the secret key (for more, see SecretKey-Encryption-Strategy) and add it to the response.
- If the key is not present in Redis Sentinel, return the response with the secretKey value Not Available.
Get the console URI of the platform or platform tenant if tenantId is specified
- Return supervisor URL (Static and needs to be maintained in Config)
- This is a tunable
Get the console URI of the platform or platform tenant if tenantId is specified
- First iteration
  - It is configured in application.yml.
  - It is the URL of the S3 console.
- Optional: we can provide some kind of SSO.
- For Next Gen production, it will be the XDM UI.
Get S3 capabilities of the platform
- An XML file is used to describe the S3 capabilities (it is copied from the Ceph code).
Get the information of the REST services, including platform name, OSIS version, etc. (static details)
Get the bucket list of the platform tenant
- S3: listBucket API
Get the platform usage of global level (without query parameter), tenant level (only with tenant_id) or user level (with tenant_id and user_id).
Platform usage has five metrics (bucket_count, object_count, total_bytes, available_bytes, used_bytes), each reported at the global, tenant, or user level.
- bucket_count
  - global: get all accounts from Vault that are associated with VCD tenants, and call the OSIS getBucketList API for each tenant to get the bucket count
  - tenant: call the OSIS getBucketList API with the provided tenant_id to get the bucket count
  - user: call the OSIS getBucketList API with the provided tenant_id to get all buckets, then filter by bucket owner equal to the provided user_id
- object_count
  - global: get all accounts from Vault that are associated with VCD tenants, then call the UTAPI ListMetrics API with all tenant_ids to get the numberOfObjects field. (Split into multiple requests if the number of accounts is large; the threshold can be 50-100.)
  - tenant: call the UTAPI ListMetrics API with tenant_id to get the numberOfObjects field
  - user: call the UTAPI ListMetrics API with user_id to get the numberOfObjects field
- total_bytes
  - tenant: call the Vault getAccount API to get the quota of the tenant; total_bytes will be the quota of the current tenant if a quota exists, otherwise -1
  - global and user: -1
- used_bytes
  - global: call UTAPI ListMetrics with all tenant_ids to get the storageUtilized field. (Split into multiple requests if the number of accounts is large; the threshold can be 50-100.)
  - tenant: call UTAPI ListMetrics with tenant_id to get the storageUtilized field
  - user: call UTAPI ListMetrics with user_id to get the storageUtilized field
- available_bytes
  - tenant: can be calculated from total_bytes and used_bytes if total_bytes exists; otherwise -1 as well
  - global and user: -1
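Two pieces of the metric logic above lend themselves to a short sketch: splitting a large list of tenant IDs into ListMetrics batches, and deriving available_bytes. The class name UsageMath is hypothetical, and computing available_bytes as total_bytes minus used_bytes is an assumption, since the text only says it is calculated from the two values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for the usage calculations described above.
final class UsageMath {
    static final long UNKNOWN = -1L;

    /** Assumed formula: total - used, or -1 when either input is unknown. */
    static long availableBytes(long totalBytes, long usedBytes) {
        if (totalBytes == UNKNOWN || usedBytes == UNKNOWN) {
            return UNKNOWN;
        }
        return totalBytes - usedBytes;
    }

    /** Splits ids into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> batches(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(ids.size(), start + batchSize);
            result.add(new ArrayList<>(ids.subList(start, end)));
        }
        return result;
    }
}
```

With a batch size in the suggested 50-100 range, each batch would map to one ListMetrics request, and the per-batch numberOfObjects or storageUtilized values would be summed for the global figure.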
As with the Vault/Cloudserver integration in OSIS, add the UTAPI host and port to the application.properties file, create a UtapiClient class, and add a listMetrics function to send UTAPI ListMetrics requests.
We will have a config field osis.scality.utapi.enabled in application.properties, which can be set automatically by reading from the S3C or Zenko cluster whether UTAPI is enabled; it can also be configured manually.
If UTAPI is enabled, it will be added to the deep health check for OSIS. As UTAPI is not critical in the data path, it will only be included in /_/healthcheck/deep.
If UTAPI is unhealthy or unavailable, -1 will be returned for all metrics.
Currently, the policy assigned to OSIS users has full permissions for all IAM and S3 requests; we will need to add the utapi:ListMetrics permission to give OSIS users access to ListMetrics calls.
Also, the ListMetrics calls are sent with the OSIS user's accessKey/secretKey/sessionToken.
UTAPI in an S3C or Zenko cluster may have expiration enabled, which disposes of old metrics on a rolling schedule. This will not affect the OSIS use case, because the metrics we need are cumulative rather than periodic; we only need the most recent values.
- Implement caches using the following principles:
  - The cache must have a maximum capacity.
  - The cache must have a Least Recently Used (LRU) eviction policy.
  - Each cache entry must have an expiration time (TTL).
    - The cache put call must include the ttl for the entry.
    - Use ScheduledFuture in Java for cache entry removal after the TTL.
  - Caches must handle concurrency.
    - Use read locks for get calls and write locks for put calls.
  - Caches can be configured using the following properties in application.properties:
    - ttl for invalidating cache entries
    - maxCapacity for the maximum number of entries in the cache
    - A flag to enable or disable the cache
- Implement the following caches for this project:
  - ListAccountsMarkerCache for the List Tenants and Query Tenants APIs
    - key: offset and value: marker
  - AssumeRoleCache for all the User and S3 Credential APIs
    - key: RoleArn and value: Credentials
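A minimal sketch of a capacity-bounded LRU cache with per-entry TTL is shown below. It deliberately differs from the design above in two ways, both chosen to keep the example small and deterministic: expiry is checked lazily on get against an injected clock instead of being scheduled with ScheduledFuture, and get takes the write lock because an access-ordered LinkedHashMap reorders entries on every read. The class name TtlLruCache is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;

// Hypothetical sketch: LRU eviction via access-ordered LinkedHashMap, lazy TTL expiry.
final class TtlLruCache<K, V> {
    /** A value together with its absolute expiry time in milliseconds. */
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;
        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final int maxCapacity;
    private final LongSupplier clockMillis;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final LinkedHashMap<K, Timed<V>> map;

    TtlLruCache(int maxCapacity, LongSupplier clockMillis) {
        this.maxCapacity = maxCapacity;
        this.clockMillis = clockMillis;
        // accessOrder=true keeps the least recently used entry at the head.
        this.map = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > TtlLruCache.this.maxCapacity;
            }
        };
    }

    /** Stores the value with its own TTL; may evict the least recently used entry. */
    void put(K key, V value, long ttlMillis) {
        lock.writeLock().lock();
        try {
            map.put(key, new Timed<>(value, clockMillis.getAsLong() + ttlMillis));
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns the value, or null when absent or expired (expired entries are removed). */
    V get(K key) {
        lock.writeLock().lock(); // get reorders the access-ordered map, so it needs the write lock
        try {
            Timed<V> timed = map.get(key);
            if (timed == null) {
                return null;
            }
            if (clockMillis.getAsLong() >= timed.expiresAtMillis) {
                map.remove(key);
                return null;
            }
            return timed.value;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Number of entries currently held, including expired ones not yet read. */
    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A production cache could pass System::currentTimeMillis as the clock; an implementation that follows the read-lock guidance literally would need a structure whose reads do not mutate ordering.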
- The secret-key value in plaintext form will be encrypted and decrypted using the cipher algorithm (property: cipher) and the corresponding cryptographic key (property: secretKey) of the latest key slot in the osis.security.keys list provided in the crypto.yml config file.
- An id property will be provided for each key slot entry in the osis.security.keys list in the crypto.yml config file to maintain the version of the key; it will be used in key rotation.
  - Example crypto.yml:
      keys:
        - id: 2
          cipher: AES256GCM
          secretKey: YW5vdGhlcmxpbmVvZnBhc3N3b3JkZm9yYW5vdG==
        - id: 1
          cipher: AES256GCM
          secretKey: dGhpc2lzYXJlYWxseWxvbmdhbmRzdHJvbmdrZXk=
- A Java object of class SecretKeyRepoData will be created with the variables keyId, encryptedData and cipherInformation.
  - The keyId variable will be of type String and stores the id property of the key used.
  - encryptedData will be a byte array and stores the encrypted data bytes.
  - cipherInformation will be an encapsulation that stores all the additional information used by the cipher algorithm.
    - Example: For the AES256GCM algorithm, cipherInformation encapsulates the nonce variable, which will be a byte array and stores the nonce bytes.
- The Java object of class SecretKeyRepoData will be serialized and stored as binary in Redis.
- During decryption, the keyId in the meta information will be used to identify the cipher algorithm that was applied.
- Initially, only the AES256GCM cipher algorithm is supported. (The Secretbox algorithm is suggested for future releases.)
- An ability to change the cipher algorithm will be provided.
- Key Rotation:
  - Key rotation is used to replace a key with another one once the original key is compromised or too old to use.
  - The crypto.yml file will be updated with a new key slot at the top of the keys list, with an incremented id value.
  - A key rotation application jar file will be provided to rotate the key for all the encrypted secret-key data in the Redis database.
    - The key rotation application will use the meta-information of the encrypted data to identify the keyId used.
    - Each secret-key will be decrypted using the old cipher algorithm (identified using the keyId).
    - The secret-key value will be re-encrypted using the latest cipher algorithm and updated on Redis.
    - The key rotation application needs to be re-run until it returns success, which means all the secret-key values on Redis have been re-encrypted.
  - During decryption, if a new keyId is found, the OSIS application will reload the crypto.yml file.
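The encrypt, decrypt-by-keyId, and rotate steps can be sketched with the standard javax.crypto AES-GCM cipher. This is an illustration under stated assumptions rather than the OSIS code: RepoData is a simplified stand-in for SecretKeyRepoData (the nonce is held directly instead of inside cipherInformation), keys are assumed to be raw 32-byte AES keys looked up by id, a 12-byte random nonce and 128-bit tag are assumed, and how the secretKey strings in crypto.yml are turned into key bytes is not covered.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of AES-256-GCM encryption tagged with the id of the key used.
final class SecretKeyCryptoSketch {
    /** Simplified stand-in for SecretKeyRepoData. */
    static final class RepoData {
        final String keyId;
        final byte[] encryptedData;
        final byte[] nonce;
        RepoData(String keyId, byte[] encryptedData, byte[] nonce) {
            this.keyId = keyId;
            this.encryptedData = encryptedData;
            this.nonce = nonce;
        }
    }

    private static final SecureRandom RANDOM = new SecureRandom();

    private static byte[] run(int mode, byte[] key, byte[] nonce, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, nonce));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-GCM operation failed", e);
        }
    }

    /** Encrypts with the given key and a fresh random nonce, recording the key id. */
    static RepoData encrypt(String keyId, byte[] key, byte[] plaintext) {
        byte[] nonce = new byte[12];
        RANDOM.nextBytes(nonce);
        return new RepoData(keyId, run(Cipher.ENCRYPT_MODE, key, nonce, plaintext), nonce);
    }

    /** Looks up the key by the stored keyId and decrypts. */
    static byte[] decrypt(RepoData data, Map<String, byte[]> keysById) {
        byte[] key = keysById.get(data.keyId);
        if (key == null) {
            throw new IllegalStateException("Unknown keyId: " + data.keyId);
        }
        return run(Cipher.DECRYPT_MODE, key, data.nonce, data.encryptedData);
    }

    /** Re-encrypts under the latest key unless the data already uses it. */
    static RepoData rotate(RepoData old, Map<String, byte[]> keysById, String latestKeyId) {
        if (old.keyId.equals(latestKeyId)) {
            return old;
        }
        return encrypt(latestKeyId, keysById.get(latestKeyId), decrypt(old, keysById));
    }
}
```

Because rotate returns data already on the latest key unchanged, running it repeatedly over every entry is safe, which matches the requirement that the rotation application can be re-run until it succeeds.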