odsc-65115 : Transform columns names to make them compatible across models #1046

codeloop · 2025-01-24T05:29:19Z

odsc-65115
Add column name transformations, and save a mapping so the transformation can be reversed where a model requires it.
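
For readers skimming the thread, here is a minimal sketch of the idea in the description, not the code from this PR. The helper names `strip_spaces` and `restore` are hypothetical, and plain lists stand in for DataFrame columns. The mapping goes from original name to transformed name, matching how `raw_column_names` is used in the diff hunks quoted in the review.

```python
def strip_spaces(columns):
    """Return (new_columns, mapping); mapping is original -> transformed, changed names only."""
    mapping = {col: col.replace(" ", "") for col in columns if " " in col}
    new_columns = [mapping.get(col, col) for col in columns]
    return new_columns, mapping


def restore(columns, mapping):
    """Undo strip_spaces by inverting the saved mapping."""
    reverse = {new: old for old, new in mapping.items()}
    return [reverse.get(col, col) for col in columns]


original = ["Sales Amount", "Date", "Store ID"]
transformed, name_map = strip_spaces(original)
round_trip = restore(transformed, name_map)
```

One caveat under this scheme: if the input already contains both "Store ID" and "StoreID", the two collapse to the same name and the reverse mapping is ambiguous, so a real implementation would probably want to detect that case.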

ahosler · 2025-01-28T08:17:23Z

ads/opctl/operator/lowcode/common/transformations.py

@@ -59,7 +60,8 @@ def run(self, data):

 """
 clean_df = self._remove_trailing_whitespace(data)
-# clean_df = self._normalize_column_names(clean_df)
+if hasattr(self.dataset_info, 'horizon'):


Is this a proxy for "if it's forecasting"?
If so it may be better to use kind/type for future compatibility

The transformations use OperatorSpec; in the case of forecasting it is ForecastOperatorSpec. The type is available in OperatorConfig, which is not required here, and adding type to the spec would make it redundant, so I have avoided doing that. Updated this to use an instance type check against the operator spec for better readability.
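
To illustrate the difference between the two checks discussed here, a small sketch follows. The classes are stand-ins written for this example: only the names `OperatorSpec` and `ForecastOperatorSpec` come from the comment above, while their fields, `OtherOperatorSpec`, and `is_forecasting` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OperatorSpec:
    """Stand-in base spec (fields are assumptions for this sketch)."""
    target_column: str = "y"


@dataclass
class ForecastOperatorSpec(OperatorSpec):
    """Stand-in forecasting spec."""
    horizon: int = 1


@dataclass
class OtherOperatorSpec(OperatorSpec):
    """Hypothetical non-forecasting spec that happens to define `horizon` too."""
    horizon: int = 0


def is_forecasting(spec: OperatorSpec) -> bool:
    # Checks the spec's type rather than the presence of one attribute.
    return isinstance(spec, ForecastOperatorSpec)


forecast_spec = ForecastOperatorSpec()
other_spec = OtherOperatorSpec()
```

In this sketch `hasattr(other_spec, "horizon")` is true even though the spec is not a forecasting spec, which is the kind of future-compatibility problem the attribute check could run into.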

ahosler · 2025-01-28T08:22:51Z

ads/opctl/operator/lowcode/common/transformations.py

+self.target_category_columns = [
+self.raw_column_names.get(col, col) for col in self.target_category_columns
+]
+df.columns = df.columns.str.replace(" ", "")


Should we be doing this replace twice? Is there a way to do it once, so we ensure it's done the same way everywhere?
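
One possible shape for "doing it once", sketched under assumptions: the naming rule lives in a single function, and every caller goes through it. `normalize_name` and `normalize_names` are hypothetical names, and plain lists stand in for `df.columns` and `target_category_columns`; this is not necessarily how the follow-up commit unified the operation.

```python
def normalize_name(name: str) -> str:
    """The single place where the naming rule lives (here: drop spaces)."""
    return name.replace(" ", "")


def normalize_names(names):
    """Apply the same rule to any collection of column names."""
    return [normalize_name(n) for n in names]


df_columns = ["Date", "Store ID", "Sales Amount"]
target_category_columns = ["Store ID"]

# Both call sites share one rule, so they cannot drift apart.
clean_columns = normalize_names(df_columns)
clean_categories = normalize_names(target_category_columns)
```

If the rule later changes (for example, replacing spaces with underscores), only `normalize_name` needs to be edited.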

ahosler · 2025-01-28T08:24:26Z

ads/opctl/operator/lowcode/common/transformations.py

@@ -226,6 +254,10 @@ def _check_historical_dataset(self, df):
 expected_names = [self.target_column_name, self.dt_column_name] + (
 self.target_category_columns if self.target_category_columns else []
 )
+
+if self.raw_column_names:
+expected_names.extend(list(self.raw_column_names.values()))


Are these guaranteed to be in historical data?

Yes. Because we apply this transformation to the historical data columns, the expected columns have to be extended with the same post-transformation column names.
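
The reasoning in this reply can be sketched as a standalone check. `missing_columns` is a hypothetical helper written for illustration; only the shape of `expected_names` and the use of `raw_column_names.values()` follow the diff hunk above.

```python
def missing_columns(df_columns, target, datetime_col,
                    category_columns=None, raw_column_names=None):
    """Return expected names that are absent from df_columns, in order."""
    expected = [target, datetime_col] + list(category_columns or [])
    if raw_column_names:
        # Values are the post-transformation names (original -> transformed map).
        expected.extend(raw_column_names.values())
    present = set(df_columns)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return [name for name in dict.fromkeys(expected) if name not in present]


raw_column_names = {"Sales Amount": "SalesAmount", "Store ID": "StoreID"}
after_transform = ["Date", "SalesAmount", "StoreID"]
before_transform = ["Date", "Sales Amount", "Store ID"]

ok = missing_columns(after_transform, "SalesAmount", "Date",
                     ["StoreID"], raw_column_names)
not_ok = missing_columns(before_transform, "SalesAmount", "Date",
                         ["StoreID"], raw_column_names)
```

The check only passes when it runs on the already transformed frame, which is why the expected names have to use the transformed spelling as well.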

updates

transform columns with space to without space and preserve a map

7f29766

oracle-contributor-agreement bot added the OCA Verified label (all contributors have signed the Oracle Contributor Agreement) Jan 24, 2025

codeloop and others added 3 commits January 24, 2025 05:59

update

53dc395

update the testcaes

7c08cdc

Merge branch 'main' into feature/odsc-65115

efe3e35

codeloop changed the title ~~Make column name LGB compatible~~ Transform columns names to make them compatible across models Jan 27, 2025

codeloop changed the title ~~Transform columns names to make them compatible across models~~ odsc-65115 : Transform columns names to make them compatible across models Jan 27, 2025

Merge branch 'main' into feature/odsc-65115

a20b90f

codeloop marked this pull request as ready for review January 28, 2025 04:50

codeloop requested review from darenr, mayoor, mrDzurb, VipulMascarenhas, qiuosier and ahosler as code owners January 28, 2025 04:50

codeloop enabled auto-merge January 28, 2025 04:50

codeloop requested a review from prasankh January 28, 2025 04:51

ahosler previously requested changes Jan 28, 2025

codeloop and others added 2 commits January 30, 2025 05:50

unify replace op, check for forecastoperatorspec

043fe73

Merge branch 'main' into feature/odsc-65115

178d922

codeloop requested a review from ahosler January 31, 2025 07:07

codeloop and others added 2 commits January 31, 2025 12:38

Merge branch 'main' into feature/odsc-65115

8947e3f

Merge branch 'main' into feature/odsc-65115

9b457c7

ahosler approved these changes Feb 3, 2025

ahosler and others added 2 commits February 3, 2025 13:06

Merge branch 'main' into feature/odsc-65115

8f21350

Merge branch 'main' into feature/odsc-65115

fa5f53d

prasankh approved these changes Feb 4, 2025

codeloop merged commit ccabc04 into main Feb 4, 2025
9 checks passed
