This article shows you how to resolve common issues you may encounter with managed feature storage in Azure Machine Learning.
Feature store creation and update errors
When you create or update a feature store, you might encounter the following errors:
- ARM throttling error
- RBAC permission error
- Duplicate materialization identity ARM ID issue
ARM throttling error
Symptoms
Creating or updating the feature store fails. The error may look like this:
{ "error": { "code": "TooManyRequests", "message": "The request is being throttled as the limit has been reached for operation type - 'Write'. ...", "details": [ { "code": "TooManyRequests", "target": "Microsoft.MachineLearningServices/workspaces", "message": "..." } ] } }
Solution
Retry the feature store create/update operation at a later time. Because the deployment is a multi-step process, the second attempt might fail because some of the resources already exist. Delete those resources and resume the operation.
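Because ARM throttling is transient, retrying with exponential backoff is usually enough. The helper below is a generic sketch, not part of any Azure SDK; the `ml_client.feature_stores.begin_create` call in the usage comment is only illustrative.

```python
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=2.0):
    """Retry `operation` when it raises an exception whose message
    indicates ARM throttling (TooManyRequests); re-raise otherwise."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            if "TooManyRequests" not in str(exc) or attempt == max_attempts - 1:
                raise
            # Exponential backoff: 2s, 4s, 8s, ...
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage (assumes an azure-ai-ml client and FeatureStore entity):
# retry_with_backoff(lambda: ml_client.feature_stores.begin_create(fs).result())
```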
RBAC permission error
To create a feature store, the user needs the Contributor and User Access Administrator roles (or a custom role that covers the same or a larger set of actions).
Symptoms
If the user doesn't have the required roles, the deployment fails. The error response may look like the following:
{ "error": { "code": "AuthorizationFailed", "message": "The client '{client_id}' with object id '{object_id}' does not have authorization to perform action '{action_name}' over scope '{scope}' or the scope is invalid. If access was recently granted, please refresh your credentials." } }
Solution
Grant the Contributor and User Access Administrator roles to the user on the resource group where the feature store is to be created, and then instruct the user to run the deployment again.
For more details, see Permissions required for the feature store materialization managed identity role.
Duplicate materialization identity ARM ID issue
Once the feature store is updated to enable materialization for the first time, some later updates on the feature store might encounter this error.
Symptoms
When the feature store is updated using the SDK/CLI, the update fails with this error message:
Error:
{ "error": { "code": "InvalidRequestContent", "message": "The request content contains duplicate JSON property names creating ambiguity in paths 'identity.userAssignedIdentities['/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{your-uai}']'. Update the request content to eliminate duplicates and try again." } }
Solution
The issue is the ARM ID format of the materialization_identity.
In the Azure UI or SDK, the ARM ID of a user-assigned managed identity uses lowercase resourcegroups. See this example:
- (A): /subscriptions/{sub-id}/resourcegroups/{rg}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{your-uai}
When the feature store uses the user-assigned managed identity as its materialization_identity, its ARM ID is normalized and stored with resourceGroups. See this example:
- (B): /subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{your-uai}
The next time the user updates the feature store with the same user-assigned managed identity as the materialization identity, while using the (A) ARM ID format, the update fails with the error above.
To fix the issue, replace the string resourcegroups with resourceGroups in the user-assigned managed identity ARM ID, and run the feature store update again.
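As a quick illustration, the casing fix is a plain string replacement on the identity's ARM ID; this is a sketch, with {sub-id}, {rg}, and {your-uai} kept as placeholders:

```python
def normalize_uai_arm_id(arm_id: str) -> str:
    """Rewrite the lowercase 'resourcegroups' segment to the normalized
    'resourceGroups' form that the feature store stores (format B)."""
    return arm_id.replace("/resourcegroups/", "/resourceGroups/")

arm_id_a = ("/subscriptions/{sub-id}/resourcegroups/{rg}/providers/"
            "Microsoft.ManagedIdentity/userAssignedIdentities/{your-uai}")
print(normalize_uai_arm_id(arm_id_a))  # now uses the (B) 'resourceGroups' form
```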
Feature set specification create errors
- Invalid schema in feature set specification
- The transformation class could not be found
- FileNotFoundError in the code folder
Invalid schema in feature set specification
Before a feature set is registered with the feature store, users should first define the feature set specification and run it locally to validate it.
Symptoms
When a user runs the feature set specification locally, various schema validation errors can occur if the feature set DataFrame schema doesn't match the schema defined in the feature set specification.
For example:
- Error message:
azure.ai.ml.exceptions.ValidationException: Schema validation error, timestamp column: timestamp is not in output dataframe
- Error message:
Exception: Schema validation error, no index column: AccountID in output data frame
- Error message:
ValidationException: schema validation failed, feature column: transaction_7d_count has data type: ColumnType.long, expected: ColumnType.string
Solution
Check the schema validation error, and update the feature set specification definition accordingly for both column names and types. For example:
- Update the source.timestamp_column.name property to correctly define the timestamp column name.
- Update the index_columns property to correctly define the index columns.
- Update the features property to correctly define the feature column names and types.
Then, run the feature set specification validation again to check whether it passes.
If the feature set specification is defined using the SDK, the infer_schema option is also recommended as the way to autofill the features, instead of typing them in manually. The timestamp_column and index_columns can't be autofilled.
For more details, see the feature set specification schema document.
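To make the three error shapes above concrete, here is a minimal, hypothetical re-implementation of the checks the validation performs. It is not the actual azureml-featurestore code; it applies the same logic to plain dictionaries mapping column name to a type string:

```python
def validate_schema(df_schema: dict, spec: dict) -> list:
    """Return a list of schema validation errors, mimicking the
    timestamp / index / feature-type checks described above."""
    errors = []
    ts = spec["timestamp_column"]
    if ts not in df_schema:
        errors.append(f"timestamp column: {ts} is not in output dataframe")
    for col in spec["index_columns"]:
        if col not in df_schema:
            errors.append(f"index column: {col} is not in output dataframe")
    for name, expected in spec["features"].items():
        actual = df_schema.get(name)
        if actual is not None and actual != expected:
            errors.append(f"feature column: {name} has data type: {actual}, "
                          f"expected: {expected}")
    return errors

# A dataframe whose transaction_7d_count is long, while the spec says string:
df_schema = {"timestamp": "datetime", "accountID": "string",
             "transaction_7d_count": "long"}
spec = {"timestamp_column": "timestamp", "index_columns": ["accountID"],
        "features": {"transaction_7d_count": "string"}}
print(validate_schema(df_schema, spec))
```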
The transformation class could not be found
Symptoms
When a user runs the feature set specification, this error returns: AttributeError: module '<...>' has no attribute '<...>'
For example:
AttributeError: module '7780d27aa8364270b6b61fed2a43b749.transaction_transform' has no attribute 'TransactionFeatureTransformer1'
Solution
The feature transformation class is expected to be defined in a Python file under the root of the code folder. (The code folder can contain other files or subfolders.)
Set the value of the feature_transformation_code.transformation_class property to <py file name of the transformation class>.<transformation class name>.
For example, if the code folder looks like this:

code/
└── my_transformation_class.py

and the MyFeatureTransformer class is defined in the my_transformation_class.py file, then set feature_transformation_code.transformation_class to my_transformation_class.MyFeatureTransformer.
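The AttributeError comes from the loader importing the module and then looking the class name up as a module attribute, roughly as in this sketch. The real loader in azureml-featurestore is more involved; a synthetic module stands in for code/my_transformation_class.py here:

```python
import types

def load_transformation_class(module, class_name):
    """Look the transformation class up on the imported module object --
    the step that raises AttributeError when the name is wrong."""
    if not hasattr(module, class_name):
        raise AttributeError(
            f"module {module.__name__!r} has no attribute {class_name!r}")
    return getattr(module, class_name)

# Stand-in for the module imported from code/my_transformation_class.py:
mod = types.ModuleType("my_transformation_class")
class MyFeatureTransformer:  # would normally live inside the .py file
    pass
mod.MyFeatureTransformer = MyFeatureTransformer

print(load_transformation_class(mod, "MyFeatureTransformer"))
# load_transformation_class(mod, "MyFeatureTransformer1") -> AttributeError
```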
FileNotFoundError in the code folder
Symptoms
This error can happen when the feature set specification YAML is created manually, rather than generated by the SDK. When a user runs the feature set specification, this error returns: FileNotFoundError: [Errno 2] No such file or directory: ....
Solution
Check the code folder. It should be a subfolder under the feature set specification folder. Then, in the feature set specification, set feature_transformation_code.path as a relative path to the feature set specification folder. For example:

feature set spec folder/
├── code/
│   ├── my_transformer.py
│   └── my_other_folder
└── FeatureSetSpec.yaml

In this example, the feature_transformation_code.path property in the YAML should be ./code.
Note
When you use the create_feature_set_spec function in azureml-featurestore to create a FeatureSetSpec Python object, feature_transformation_code.path can be any local folder. When the FeatureSetSpec object is dumped to form a feature set specification YAML in a target folder, the code path is copied into the target folder, and the feature_transformation_code.path property is updated in the specification YAML.
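A quick way to catch this error before running the specification is to resolve feature_transformation_code.path against the spec folder yourself. This is only a sketch with pathlib; the folder names come from the example above:

```python
from pathlib import Path

def resolve_code_path(spec_folder: str, code_path: str) -> Path:
    """Resolve the relative feature_transformation_code.path against the
    feature set spec folder, as the loader does; raise if it's missing."""
    resolved = (Path(spec_folder) / code_path).resolve()
    if not resolved.is_dir():
        raise FileNotFoundError(f"[Errno 2] No such file or directory: {resolved}")
    return resolved

# With the example layout, './code' resolves to an existing subfolder:
# resolve_code_path("feature set spec folder", "./code")
```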
Feature retrieval job and query errors
- Feature retrieval specification resolution errors
- The feature_retrieval_spec.yaml file isn't found when using a model as input to the feature retrieval job
- Observation data isn't joined with feature values
- The user or managed identity doesn't have proper RBAC permission on the feature store
- The user or managed identity doesn't have proper RBAC permission to read from the source storage or offline store
- The training job fails to read data generated by the built-in feature retrieval component
If a feature retrieval job fails, check the error details: go to the run detail page, select the Outputs + logs tab, and examine the logs/azureml/driver/stdout file.
If the user runs the get_offline_features() query in a notebook, the error displays directly as cell output.
Feature retrieval specification resolution errors
Symptoms
The feature retrieval query/job shows these errors:
- Invalid feature:
Code: "UserError" Message: "Feature '' not found in this featureset."
- Invalid feature store URI:
Message: "Resource 'Microsoft.MachineLearningServices/workspaces/' under resource group '<resource group name>' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix", Code: "ResourceNotFound"
- Invalid feature set:
Code: "UserError" Message: "Featureset with name: and version: not found."
Solution
Check the content of the feature_retrieval_spec.yaml that the job uses. Make sure all of the feature store URIs, feature set names/versions, and feature names are valid and exist in the feature store.
To select features from a feature store and generate the feature retrieval specification YAML file, use of the utility function is recommended.
This code snippet uses the generate_feature_retrieval_spec utility function:
from azureml.featurestore import FeatureStoreClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential

featurestore = FeatureStoreClient(
    credential = AzureMLOnBehalfOfCredential(),
    subscription_id = featurestore_subscription_id,
    resource_group_name = featurestore_resource_group_name,
    name = featurestore_name
)

transactions_featureset = featurestore.feature_sets.get(name="transactions", version="1")

features = [
    transactions_featureset.get_feature('transaction_amount_7d_sum'),
    transactions_featureset.get_feature('transaction_amount_3d_sum')
]

feature_retrieval_spec_folder = "./project/fraud_model/feature_retrieval_spec"
featurestore.generate_feature_retrieval_spec(feature_retrieval_spec_folder, features)
The feature_retrieval_spec.yaml file isn't found when using a model as input to the feature retrieval job
Symptoms
If a registered model is used as input to the feature retrieval job, the job fails with this error:
ValueError: Visit Error: Execution Error: Streaming error from input data sourceVisitError(ExecutionError(StreamError(NotFound)))=> Execution Error: Streaming error from input data sourceExecutionError(StreamError(NotFound)); Could not find path: azureml://subscriptions/{sub_id}/resourcegroups/{rg}/workspaces/{ws}/datastores/workspaceblobstore/paths/LocalUpload/{guid}/feature_retrieval_spec.yaml
Solution:
When you provide a model as input to the feature retrieval step, the retrieval specification YAML file is expected to exist under the model's artifact folder. The job fails if that file isn't there.
To fix the issue, package the feature_retrieval_spec.yaml file in the root folder of the model artifact folder before registering the model.
Observation data isn't joined with feature values
Symptoms
After a user runs a feature retrieval query/job, the output data contains no feature values.
For example, a user runs the feature retrieval job to retrieve the transaction_amount_3d_avg and transaction_amount_7d_avg features, with these results:

transactionID | accountID | timestamp | is_fraud | transaction_amount_3d_avg | transaction_amount_7d_avg
---|---|---|---|---|---
83870774-7A98-43B... | A1055520444618950 | 2023-02-28 04:34:27 | 0 | null | null
25144265-F68B-4FD... | A1055520444618950 | 2023-02-28 10:44:30 | 0 | null | null
8899ED8C-B295-43F... | A1055520444812380 | 2023-03-06 00:36:30 | 0 | null | null
Solution
Feature retrieval runs a point-in-time join query. If the join result turns up empty, try these possible solutions:
- Extend the temporal_join_lookback range in the feature set specification definition, or temporarily remove it. This allows the point-in-time join to look back further (or infinitely) into the past, before the observation event timestamp, to find the feature values.
- If source.source_delay is also set in the feature set specification definition, make sure that temporal_join_lookback > source.source_delay.

If none of these solutions work, get the feature set from the feature store and run it locally to manually examine the feature index and timestamp columns. The failure could happen because:
- the index values in the observation data don't exist in the feature set DataFrame
- no feature value exists with a timestamp before the observation event timestamp

In these cases, if the feature enabled offline materialization, you might need to backfill more feature data.
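To see why temporal_join_lookback matters, here is a toy point-in-time lookup in plain Python. It is not the actual Spark implementation, only the same idea: for each observation, pick the latest feature value at or before the observation timestamp, but only within the lookback window.

```python
def point_in_time_lookup(obs_ts, feature_rows, lookback=None):
    """Return the value of the latest feature row whose timestamp is
    <= obs_ts and within `lookback` seconds of it; None if no match.
    feature_rows: list of (timestamp, value) with numeric timestamps."""
    candidates = [
        (ts, v) for ts, v in feature_rows
        if ts <= obs_ts and (lookback is None or obs_ts - ts <= lookback)
    ]
    return max(candidates)[1] if candidates else None

rows = [(100, "old"), (200, "newer")]
print(point_in_time_lookup(250, rows, lookback=60))   # 'newer': within 60s
print(point_in_time_lookup(250, rows, lookback=10))   # None: lookback too short
print(point_in_time_lookup(250, rows))                # 'newer': unlimited lookback
```

A too-small lookback is exactly the null-feature-values situation above: matching rows exist in the past, but outside the window.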
The user or managed identity doesn't have proper RBAC permission on the feature store
Symptoms:
The feature retrieval job/query fails with the following error message in logs/azureml/driver/stdout:
Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/ai/ml/_restclient/v2022_12_01_preview/operations/_workspaces_operations.py", line 633, in get raise HttpResponseError(response=response, model=error, error_format=ARMErrorFormat) azure.core.exceptions.HttpResponseError: (AuthorizationFailed) The client 'XXXX' with object id 'XXXX' does not have authorization to perform action 'Microsoft.MachineLearningServices/workspaces/read' over scope '/subscriptions/XXXX/resourceGroups/XXXX/providers/Microsoft.MachineLearningServices/workspaces/XXXX' or the scope is invalid. If access was recently granted, please refresh your credentials. Code: AuthorizationFailed
Solution:
- If the job uses a managed identity to retrieve features, assign the AzureML Data Scientist role on the feature store to the identity.
- If the error happens when a user runs code in an Azure Machine Learning Spark notebook, which uses the user's own identity to access the Azure Machine Learning service, assign the AzureML Data Scientist role on the feature store to the user's Azure Active Directory identity.

AzureML Data Scientist is a recommended role. The user can create a custom role with these actions:
- Microsoft.MachineLearningServices/workspaces/datastores/listsecrets/action
- Microsoft.MachineLearningServices/workspaces/featuresets/read
- Microsoft.MachineLearningServices/workspaces/read

For more information about RBAC setup, see the Manage access control for managed feature store document.
The user or managed identity doesn't have proper RBAC permission to read from the source storage or offline store
Symptoms
The feature retrieval job/query fails with the following error message in logs/azureml/driver/stdout:
An error occurred while calling o1025.parquet: java.nio.file.AccessDeniedException: Operation failed: "This request is not authorized to perform this operation using this permission.", 403, GET, https://{storage}.dfs.core.windows.net/test?upn=false&resource=filesystem&maxResults=5000&directory=datasources&timeout=90&recursive=false, AuthorizationPermissionMismatch, "This request is not authorized to perform this operation using this permission. RequestId:63013315-e01f-005e-577b-7c63b8000000 Time:2023-05-01T22:20:51.1064935Z" at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1203) at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:408) at org.apache.hadoop.fs.Globber.listStatus(Globber.java:128) at org.apache.hadoop.fs.Globber.doGlob(Globber.java:291) at org.apache.hadoop.fs.Globber.glob(Globber.java:202) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2124)
Solution:
- If the job uses a managed identity to retrieve features, assign the Storage Blob Data Reader role on the source storage and offline store storage to the identity.
- If the error happens when a user runs queries in an Azure Machine Learning Spark notebook, which uses the user's own identity to access the Azure Machine Learning service, assign the Storage Blob Data Reader role on the source storage and offline store storage to the user's identity.

Storage Blob Data Reader is the minimum recommended access requirement. The user can also assign roles with more privileges, such as Storage Blob Data Contributor or Storage Blob Data Owner.
For more information about RBAC setup, see the Manage access control for managed feature store document.
The training job fails to read data generated by the built-in feature retrieval component
Symptoms
A training job fails with an error message stating that:
- the training data doesn't exist:
FileNotFoundError: [Errno 2] No such file or directory
- the format is incorrect:
ParserError:
Solution
The built-in feature retrieval component has one output, output_data. The output data is a uri_folder data asset. It always has this folder structure:

<parent folder>/
├── data/
│   ├── xxxxx.parquet
│   └── xxxxx.parquet
└── feature_retrieval_spec.yaml

The output data is always in Parquet format. Update the training script to read from the "data" subfolder, and read the data as Parquet.
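For instance, a training script can collect the Parquet files from the "data" subfolder with a minimal helper like this sketch, where output_data_path stands for the mounted uri_folder input (a hypothetical name):

```python
from pathlib import Path

def list_parquet_files(output_data_path: str) -> list:
    """Collect the Parquet files that the feature retrieval component
    wrote under the 'data' subfolder of its output."""
    data_dir = Path(output_data_path) / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"[Errno 2] No such file or directory: {data_dir}")
    return sorted(str(p) for p in data_dir.glob("*.parquet"))

# Hypothetical usage with pandas (requires a Parquet engine such as pyarrow):
# import pandas as pd
# df = pd.concat(pd.read_parquet(f) for f in list_parquet_files(output_data_path))
```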
Feature materialization job failures
- Invalid offline store configuration
- The materialization identity doesn't have proper RBAC permission on the feature store
- The materialization identity doesn't have proper RBAC permission to read from the storage
- The materialization identity doesn't have proper RBAC permission to write data to the offline store
If a feature materialization job fails, follow these steps to check the job failure details:
- Navigate to the feature store page: https://ml.azure.com/featureStore/{your-feature-store-name}.
- Go to the Feature sets tab, select the relevant feature set, and navigate to the feature set detail page.
- On the feature set detail page, select the Materialization jobs tab, then select the failed job to open its details view.
- On the job details view, under the Properties card, review the job status and error message.
- You can also go to the Outputs + logs tab and find the stdout file at logs/azureml/driver/stdout.

After a fix is applied, the user can trigger a backfill materialization job to verify that the fix works.
Invalid offline store configuration
Symptoms
The materialization job fails with the following error message in logs/azureml/driver/stdout:
Error message:
Caused by: Status code: -1 error code: null error message: InvalidAbfsRestOperationException java.net.UnknownHostException: adlgen23.dfs.core.windows.net
java.util.concurrent.ExecutionException: Operation failed: "The specified resource name contains invalid characters.", 400, HEAD, https://{storage}.dfs.core.windows.net/{container-name}/{fs-id}/transactions/1/_delta_log?upn=false&action=getStatus&timeout=90
Solution
Use the SDK to check the offline store target defined in the feature store:

from azure.ai.ml import MLClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential

fs_client = MLClient(
    AzureMLOnBehalfOfCredential(),
    featurestore_subscription_id,
    featurestore_resource_group_name,
    featurestore_name
)

featurestore = fs_client.feature_stores.get(name=featurestore_name)
featurestore.offline_store.target

The user can also check the target on the feature store UI overview page. Verify that the target has this format, and that both the storage and the container exist:

/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/{storage}/blobServices/default/containers/{container-name}
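A lightweight sanity check of the target string can be scripted. This regex mirrors the format above and is only a sketch: it checks the shape of the ARM ID, not whether the storage account or container actually exist.

```python
import re

OFFLINE_STORE_TARGET = re.compile(
    r"^/subscriptions/[^/]+"
    r"/resourceGroups/[^/]+"
    r"/providers/Microsoft\.Storage/storageAccounts/[^/]+"
    r"/blobServices/default/containers/[^/]+$"
)

def is_valid_offline_store_target(target: str) -> bool:
    """Check only the shape of the offline store target ARM ID."""
    return OFFLINE_STORE_TARGET.match(target) is not None

good = ("/subscriptions/sub/resourceGroups/rg/providers/Microsoft.Storage/"
        "storageAccounts/mystorage/blobServices/default/containers/offlinestore")
print(is_valid_offline_store_target(good))  # True
```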
The materialization identity doesn't have proper RBAC permission on the feature store
Symptoms:
The materialization job fails with the following error message in logs/azureml/driver/stdout:
Traceback (most recent call last): File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/ai/ml/_restclient/v2022_12_01_preview/operations/_workspaces_operations.py", line 633, in get raise HttpResponseError(response=response, model=error, error_format=ARMErrorFormat) azure.core.exceptions.HttpResponseError: (AuthorizationFailed) The client 'XXXX' with object id 'XXXX' does not have authorization to perform action 'Microsoft.MachineLearningServices/workspaces/read' over scope '/subscriptions/XXXX/resourceGroups/XXXX/providers/Microsoft.MachineLearningServices/workspaces/XXXX' or the scope is invalid. If access was recently granted, please refresh your credentials. Code: AuthorizationFailed
Solution:
Assign the AzureML Data Scientist role on the feature store to the materialization identity (a user-assigned managed identity) of the feature store.
AzureML Data Scientist is a recommended role. You can create a custom role with these actions:
- Microsoft.MachineLearningServices/workspaces/datastores/listsecrets/action
- Microsoft.MachineLearningServices/workspaces/featuresets/read
- Microsoft.MachineLearningServices/workspaces/read
For more information, see Permissions required for the feature store materialization managed identity role.
The materialization identity doesn't have proper RBAC permission to read from the storage
Symptoms
The materialization job fails with the following error message in logs/azureml/driver/stdout:
An error occurred while calling o1025.parquet: java.nio.file.AccessDeniedException: Operation failed: "This request is not authorized to perform this operation using this permission.", 403, GET, https://{storage}.dfs.core.windows.net/test?upn=false&resource=filesystem&maxResults=5000&directory=datasources&timeout=90&recursive=false, AuthorizationPermissionMismatch, "This request is not authorized to perform this operation using this permission. RequestId:63013315-e01f-005e-577b-7c63b8000000 Time:2023-05-01T22:20:51.1064935Z" at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1203) at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:408) at org.apache.hadoop.fs.Globber.listStatus(Globber.java:128) at org.apache.hadoop.fs.Globber.doGlob(Globber.java:291) at org.apache.hadoop.fs.Globber.glob(Globber.java:202) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2124)
Solution:
Assign the Storage Blob Data Reader role on the source storage to the feature store's materialization identity (a user-assigned managed identity).
Storage Blob Data Reader is the minimum recommended access requirement. You can also assign roles with more privileges, such as Storage Blob Data Contributor or Storage Blob Data Owner.
For more information about RBAC configuration, see Permissions required for the feature store materialization managed identity role.
The materialization identity doesn't have proper RBAC permission to write data to the offline store
Symptoms
The materialization job fails with the following error message in logs/azureml/driver/stdout:
An error occurred while calling o1162.load: java.util.concurrent.ExecutionException: java.nio.file.AccessDeniedException: Operation failed: "This request is not authorized to perform this operation using this permission.", 403, HEAD, https://featuresotrestorage1.dfs.core.windows.net/offlinestore/fs_xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx_fsname/transactions/1/_delta_log?upn=false&action=getStatus&timeout=90 at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306) at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293) at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135) at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380) at com.google.common.cache.LocalCache$S
Solution
Assign the Storage Blob Data Contributor role on the offline store storage to the feature store's materialization identity (a user-assigned managed identity).
Storage Blob Data Contributor is the minimum recommended access requirement. You can also assign roles with more privileges, such as Storage Blob Data Owner.
For more information about RBAC configuration, see Permissions required for the feature store materialization managed identity role.