Real-Time What-If Analysis with Model Catalog and Model Deployment Integration #1041
Conversation
```
self.test_mode = os.environ.get("TEST_MODE", False)
self.deployment_info = {}

def _satiny_test(self):
```
Typo: `_satiny_test` should be `_sanity_test`.
Awesome work! I just have a few comments I'd like some clarity on.
```
@@ -168,6 +168,7 @@ def get_data_multi_indexed(self):
        self.additional_data.data,
    ],
    axis=1,
    join='inner'
```
What triggered this change? If I have an additional data column with a lot of missing data, won't this delete rows from the historical dataset?
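For context, a minimal pandas sketch (illustrative data only, not from this PR) of the behavior being questioned: with `join='inner'`, `pd.concat` keeps only index values present in every frame, so historical rows without matching additional data are dropped.

```python
import pandas as pd

# Four days of historical target data.
historical = pd.DataFrame(
    {"target": [10.0, 12.0, 11.0, 13.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Additional data covering only the first two days.
additional = pd.DataFrame(
    {"promo": [0, 1]},
    index=pd.date_range("2024-01-01", periods=2, freq="D"),
)

# join='inner' keeps only the intersection of the indexes: 2 rows remain.
inner = pd.concat([historical, additional], axis=1, join="inner")
print(len(inner))  # 2

# The default join='outer' keeps all 4 historical rows and fills the
# missing additional values with NaN instead of dropping rows.
outer = pd.concat([historical, additional], axis=1)
print(len(outer))  # 4
```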
Thanks for flagging. I've reverted this change and tested the new development with it.
```
required: false
meta: "If model_deployment_id is not specified, a new model deployment is created; otherwise, the model is linked to the specified model deployment."
schema:
  model_deployment_id:
```
We could probably simplify this to "id", since it's already nested under "model_deployment". Or we could move it up a level.
Done, changed it to `id`.
""" | ||
Function perform sanity test for saved artifact | ||
""" | ||
sys.path.insert(0, f"{self.path_to_artifact}") |
Do we undo this command later?
We should save a copy of the original sys.path and restore it.
How do we think about concurrency issues?
In a single operator run, this code executes only once. I'm restoring sys.path now.
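A minimal sketch of one way to save and restore the path defensively (function and payload names are hypothetical; the actual change in this PR may differ):

```python
import sys

def run_sanity_test(path_to_artifact: str, sample_payload: dict) -> None:
    """Import score.py from the saved artifact directory and run one prediction."""
    original_sys_path = list(sys.path)  # keep a copy so the mutation is reversible
    sys.path.insert(0, path_to_artifact)
    try:
        import score  # the artifact's inference script, resolved via the path above
        score.predict(sample_payload)  # hypothetical payload; shape depends on the model
    finally:
        # Restore the interpreter path even if the import or prediction fails.
        sys.path = original_sys_path
```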
```
Inference script. This script is used for prediction by the scoring server when the schema is known.
"""
AUTOTS = "autots"
```
Can we pull these in from the const.py file, now that we're importing ads anyway?
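Something along these lines might work, assuming the constants live where the rest of the forecast operator keeps them (the module path and enum name below are guesses, not confirmed by this PR):

```python
# Reuse the operator's model-name constants instead of redefining them in score.py.
from ads.opctl.operator.lowcode.forecast.const import SupportedModels  # assumed path

model_name = "autots"
if model_name == SupportedModels.AutoTS:
    print("dispatching to the AutoTS prediction branch")
```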
```
logger_pred = logging.getLogger('model-prediction')
logger_pred.setLevel(logging.INFO)
logger_feat = logging.getLogger('input-features')
logger_feat.setLevel(logging.INFO)
```
Interesting logger categories... Maybe we should do this throughout the Operator.
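One hypothetical way to standardize those categories across the operator (helper name and logger namespace are made up for illustration):

```python
import logging

def get_operator_logger(category: str) -> logging.Logger:
    """Return a namespaced logger so each log stream can be filtered independently."""
    logger = logging.getLogger(f"forecast-operator.{category}")
    logger.setLevel(logging.INFO)
    return logger

logger_pred = get_operator_logger("model-prediction")
logger_feat = get_operator_logger("input-features")
```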
```
    data_type[col['name']] = col['dtype']
else:
    print(
        "input_schema has to be passed in in order to recover the same data type. pass `X_sample` in `ads.model.framework.sklearn_model.SklearnModel.prepare` function to generate the input_schema. Otherwise, the data type might be changed after serialization/deserialization.")
```
Do we want to keep this error message, or change it to something operator-specific?
If we can reuse the version from score.py-jinja2, let's just pull it in directly.
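For reference, a hedged sketch of the `X_sample` flow the message refers to, based on the public ADS `SklearnModel` API (the conda pack slug and data are illustrative; exact parameters may vary by ADS version):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from ads.model.framework.sklearn_model import SklearnModel

# Illustrative training data; the operator would use the prepared forecast frame.
X = pd.DataFrame({"lag_1": [1.0, 2.0, 3.0], "promo": [0, 1, 0]})
y = pd.Series([1.5, 2.5, 3.5])

estimator = LinearRegression().fit(X, y)

sklearn_model = SklearnModel(estimator=estimator, artifact_dir="./artifact")
# Passing X_sample lets ADS generate input_schema.json, which is what score.py
# later uses to restore the original dtypes after serialization/deserialization.
sklearn_model.prepare(
    inference_conda_env="generalml_p311_cpu_x86_64_v1",  # assumed conda pack slug
    force_overwrite=True,
    X_sample=X,
    y_sample=y,
)
```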
```
    )
    return forecast[target_column].tolist()
else:
    raise Exception(f"Invalid model object type: {type(model_object).__name__}.")
```
Awesome work here! Very easy to read and debug
This PR introduces real-time "What-If" analysis, which includes functionality for saving models to the model catalog and automatically creating a model deployment when the what_if_analysis section and additional_data are provided in the input YAML.
The forecast operator's input YAML now supports a what_if_analysis configuration section. project_id and compartment_id are optional; if not provided, the session's environment variables will be used.
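For readers of the PR, a hedged sketch of what the new section might look like, assembled from the fields discussed in this review (key names, nesting, and the surrounding spec layout are assumptions, not the final schema):

```yaml
spec:
  what_if_analysis:
    # Optional; the session's environment variables are used when omitted.
    project_id: ocid1.datascienceproject.oc1..<unique_id>
    compartment_id: ocid1.compartment.oc1..<unique_id>
    model_deployment:
      # If id is omitted, a new model deployment is created; otherwise the
      # model is linked to the specified existing deployment.
      id: ocid1.datasciencemodeldeployment.oc1..<unique_id>
```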
deployment_info.json contains deployment metadata such as the Model Catalog ID, Model Deployment ID, Model Deployment Endpoint, and the series IDs for which the model is applicable.