
Added API to get multi model deployment config #1055

Conversation

lu-ohai
Member

@lu-ohai lu-ohai commented Feb 4, 2025

Added API to get multi model deployment config

Input/Output format

  • The input format for the multi-model deployment config is as follows:
{
    "shape": [
        "VM.GPU.A10.2",
        "VM.GPU.A10.4",
        "BM.GPU.A100-v2.8",
        "BM.GPU.H100.8"
    ],
    "configuration": {
        "VM.GPU.A10.4": {
            "parameters": {
                "VLLM_PARAMS": "--trust-remote-code --max-model-len 60000"
            },
            "multi_model_deployment": [
                {
                    "gpu_count": 1,
                    "parameters": {
                        "VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"
                    }
                },
                {
                    "gpu_count": 2,
                    "parameters": {
                        "VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"
                    }
                }
            ]
        }
    }
}
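To make the shape of this payload concrete, here is a small sketch that walks a config like the one above. The `gpu_options` helper is hypothetical, not part of the API; it just maps each supported `gpu_count` under a shape to its `VLLM_PARAMS` string:

```python
import json

# Hypothetical helper (not part of the API): for a given shape, map each
# supported gpu_count in "multi_model_deployment" to its VLLM_PARAMS string.
def gpu_options(config: dict, shape: str) -> dict:
    entry = config.get("configuration", {}).get(shape, {})
    return {
        item["gpu_count"]: item["parameters"]["VLLM_PARAMS"]
        for item in entry.get("multi_model_deployment", [])
    }

# Trimmed-down version of the input config shown above.
config = json.loads("""
{
    "shape": ["VM.GPU.A10.2", "VM.GPU.A10.4"],
    "configuration": {
        "VM.GPU.A10.4": {
            "parameters": {"VLLM_PARAMS": "--trust-remote-code --max-model-len 60000"},
            "multi_model_deployment": [
                {"gpu_count": 1, "parameters": {"VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"}},
                {"gpu_count": 2, "parameters": {"VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"}}
            ]
        }
    }
}
""")

print(gpu_options(config, "VM.GPU.A10.4"))
# {1: '--trust-remote-code --max-model-len 32000', 2: '--trust-remote-code --max-model-len 32000'}
```

Shapes listed in "shape" but absent from "configuration" (like VM.GPU.A10.2 here) simply yield no per-GPU-count options.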
  • The output response format is as follows:
{
    "deployment_config": {
        "model_ocid_1": {
            "shape": [
                "BM.GPU.A10.4",
                "BM.GPU4.8",
                "BM.GPU.L40S-NC.4",
                "BM.GPU.A100-v2.8",
                "BM.GPU.H100.8"
            ],
            "configuration": {
                "BM.GPU.A10.4": {
                    "parameters": {
                        "VLLM_PARAMS": "--enforce-eager --max-num-seqs 16 --max-model-len 65536"
                    },
                    "multi_model_deployment": [
                      {
                          "gpu_count": 1,
                          "parameters": {
                              "VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"
                          }
                      },
                      {
                          "gpu_count": 2,
                          "parameters": {
                              "VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"
                          }
                      }
                  ]
                }
            }
        },
        "model_ocid_2": {
            "shape": [
                "BM.GPU.A10.4",
                "BM.GPU4.8",
                "BM.GPU.L40S-NC.4",
                "BM.GPU.A100-v2.8",
                "BM.GPU.H100.8"
            ],
            "configuration": {
                "BM.GPU.A10.4": {
                    "parameters": {
                        "VLLM_PARAMS": "--enforce-eager --max-num-seqs 16 --max-model-len 65536"
                    },
                    "multi_model_deployment": [
                      {
                          "gpu_count": 1,
                          "parameters": {
                              "VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"
                          }
                      },
                      {
                          "gpu_count": 2,
                          "parameters": {
                              "VLLM_PARAMS": "--trust-remote-code --max-model-len 32000"
                          }
                      }
                  ]
                }
            }
        }
    },
    "gpu_allocation": {
        "VM.GPU.A10.4": {
            "models": [
                {
                    "ocid": "model_ocid_1",
                    "gpu_count": 2
                },
                {
                    "ocid": "model_ocid_2",
                    "gpu_count": 2
                }
            ],
            "total_gpus_available": 4
        }
    }
}
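The gpu_allocation block above can be reproduced with a simple brute-force feasibility check. This is a hedged sketch only, not the actual allocation heuristic the API implements; the `allocate_gpus` name and its tie-breaking rules (prefer the primary model's count, then total GPUs used) are assumptions:

```python
from itertools import product

def allocate_gpus(model_counts, total_gpus, primary=None):
    """model_counts: {model_ocid: [allowed gpu counts]} for one shape.
    Returns {model_ocid: gpu_count} fitting within total_gpus, or None
    when no combination fits. Prefers the primary model's gpu_count
    first (when given), then the total number of GPUs used."""
    models = list(model_counts)
    best_key, best_alloc = None, None
    for combo in product(*(model_counts[m] for m in models)):
        if sum(combo) > total_gpus:
            continue  # combination does not fit on this shape
        alloc = dict(zip(models, combo))
        key = (alloc[primary] if primary else 0, sum(combo))
        if best_key is None or key > best_key:
            best_key, best_alloc = key, alloc
    return best_alloc

# Mirrors the example output: both models on VM.GPU.A10.4 with 4 GPUs total.
print(allocate_gpus(
    {"model_ocid_1": [1, 2], "model_ocid_2": [1, 2]}, total_gpus=4))
# {'model_ocid_1': 2, 'model_ocid_2': 2}
```

When `primary` is set, a tighter budget still gives the primary model the maximum count it supports, matching the "primary model id provided" behavior shown in the notebook screenshots.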

Notebook

  • No possible GPU allocations available
  • No primary model id provided
  • Primary model id provided (id ending with jwsq); it gets the maximum GPU count.

oracle-contributor-agreement bot added the OCA Verified label (all contributors have signed the Oracle Contributor Agreement) on Feb 4, 2025
github-actions bot commented Feb 4, 2025

📌 Cov diff with main: 87%

📌 Overall coverage: 56.84%

github-actions bot commented Feb 4, 2025

📌 Cov diff with main: 95%

📌 Overall coverage: 56.84%

lu-ohai changed the base branch from main to feature/multi_model_deployment on February 4, 2025 at 19:37
lu-ohai marked this pull request as ready for review on February 4, 2025 at 19:48
@mrDzurb
Member

Hi @lu-ohai, can you add more description to the PR? Also add the test and validation details. Share what the expected input data is and what the output would be; just provide a couple of use cases.

@VipulMascarenhas (Member) left a comment

Overall, the get_multimodel_compatible_shapes API might be a very slow operation, given that we have to get the config file from Object Storage for each model. On average, the get_deployment_config API call takes 4-6 seconds per model. It could result in a bad experience if a user selects 2-3 models and waits 10-15 seconds only to see a message saying the combination is not feasible. We could cache the result for each model so that subsequent calls are faster, or send parallel async requests to fetch multiple configs instead of reading them sequentially. Some testing will be required to confirm which optimizations are needed.

cc: @mrDzurb

@mrDzurb
Member

> (quoting @VipulMascarenhas's comment above)

Totally agree. We should use both techniques: caching and a thread pool.

4-6 seconds to read a file from an Object Storage bucket, this is insane :)
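The two suggestions combine naturally. Below is a minimal sketch of caching plus a thread pool; `get_deployment_config` here is a stand-in for the real Object Storage read, not the actual ADS function body:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

@lru_cache(maxsize=128)
def get_deployment_config(model_id: str) -> dict:
    # Stand-in for the slow Object Storage read discussed above;
    # lru_cache makes repeat lookups for the same model id free.
    return {"model_id": model_id, "shape": ["VM.GPU.A10.4"]}

def fetch_configs(model_ids):
    # Fetch all configs concurrently instead of sequentially, so total
    # latency approaches the slowest single read rather than the sum.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(zip(model_ids, pool.map(get_deployment_config, model_ids)))
```

One caveat with this pattern: lru_cache has no TTL, so a long-lived process would keep serving stale configs after the file changes in the bucket; a time-bounded cache may be the safer choice.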

@lu-ohai
Member Author

@mrDzurb @VipulMascarenhas Based on the testing, fetching configs for three model ids takes roughly 5-6 microseconds, so I'm wondering in which case the get_deployment_config API takes 6 seconds to complete. I think we can add the cache layer in a follow-up PR if needed.

@darenr (Member) left a comment

Very nice code

@VipulMascarenhas (Member) left a comment

lgtm 👍

VipulMascarenhas merged commit 0f08a64 into feature/multi_model_deployment on Feb 6, 2025
1 check passed