diff --git a/README.md b/README.md
index 5c6c9bb..4e2c9c8 100644
--- a/README.md
+++ b/README.md
@@ -19,6 +19,99 @@ An embedding is a vector, or a list, of floating-point numbers. The distance bet
 
 This embedding API is created for [Magda](https://github.com/magda-io/magda)'s vector / hybrid search solution. The API interface is compatible with OpenAI's `embeddings` API to make it easier to reuse existing tools & libraries.
 
+### Resource Requirements
+
+Due to [this issue of ONNX runtime](https://github.com/microsoft/onnxruntime/issues/15080), the peak memory usage of the service is much higher than the model file size.
+e.g. for the default 500MB model file, the peak memory usage can reach 1.8GB - 2GB.
+However, memory usage drops back to a much lower level (around 800MB - 900MB for the default model) once the model is loaded.
+Please make sure your Kubernetes cluster has enough resources to run the service.
+
+## Requirements
+
+Kubernetes: `>= 1.21.0`
+
+| Repository | Name | Version |
+| ----------------------------- | ------------ | ------- |
+| oci://ghcr.io/magda-io/charts | magda-common | 4.2.1 |
+
+## Values
+
+| Key | Type | Default | Description |
+| ---------------------------------- | ------ | ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| affinity | object | `{}` | |
+| autoscaling.hpa.enabled | bool | `false` | |
+| autoscaling.hpa.maxReplicas | int | `3` | |
+| autoscaling.hpa.minReplicas | int | `1` | |
+| autoscaling.hpa.targetCPU | int | `90` | |
+| autoscaling.hpa.targetMemory | string | `""` | |
+| bodyLimit | int | Defaults to 10485760 (10MB). | Defines the maximum payload, in bytes, that the server is allowed to accept. |
+| closeGraceDelay | int | Defaults to 25000 (25s). | The maximum amount of time before forcefully closing pending requests. This should be set to a value lower than the Pod's termination grace period (which defaults to 30s). |
+| debug | bool | `false` | Start the Fastify app in debug mode with the Node.js inspector (inspector port 9320). |
+| defaultImage.imagePullSecret | bool | `false` | |
+| defaultImage.pullPolicy | string | `"IfNotPresent"` | |
+| defaultImage.repository | string | `"ghcr.io/magda-io"` | |
+| deploymentAnnotations | object | `{}` | |
+| envFrom | list | `[]` | |
+| extraContainers | string | `""` | |
+| extraEnvs | list | `[]` | |
+| extraInitContainers | string | `""` | |
+| extraVolumeMounts | list | `[]` | |
+| extraVolumes | list | `[]` | |
+| fullnameOverride | string | `""` | |
+| global.image | object | `{}` | |
+| global.rollingUpdate | object | `{}` | |
+| hostAliases | list | `[]` | |
+| image.name | string | `"magda-embedding-api"` | |
+| lifecycle | object | `{}` | Pod lifecycle policies as outlined here: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks |
+| livenessProbe.failureThreshold | int | `10` | |
+| livenessProbe.httpGet.path | string | `"/status/liveness"` | |
+| livenessProbe.httpGet.port | int | `3000` | |
+| livenessProbe.initialDelaySeconds | int | `10` | |
+| livenessProbe.periodSeconds | int | `20` | |
+| livenessProbe.successThreshold | int | `1` | |
+| livenessProbe.timeoutSeconds | int | `5` | |
+| logLevel | string | `info` | The log level of the application. One of 'fatal', 'error', 'warn', 'info', 'debug', 'trace'; 'silent' is also supported to disable logging. Any other value defines a custom level and requires supplying a level value via levelVal. |
+| nameOverride | string | `""` | |
+| nodeSelector | object | `{}` | |
+| pluginTimeout | int | Defaults to 10000 (10 seconds). | The maximum amount of time, in milliseconds, in which a Fastify plugin can load. If a plugin does not load within this time, ready will complete with an Error with code 'ERR_AVVIO_PLUGIN_TIMEOUT'. |
+| podAnnotations | object | `{}` | |
+| podSecurityContext.runAsUser | int | `1000` | |
+| priorityClassName | string | `"magda-9"` | |
+| rbac.automountServiceAccountToken | bool | `false` | Controls whether or not the Service Account token is automatically mounted to /var/run/secrets/kubernetes.io/serviceaccount |
+| rbac.create | bool | `false` | |
+| rbac.serviceAccountAnnotations | object | `{}` | |
+| rbac.serviceAccountName | string | `""` | |
+| readinessProbe.failureThreshold | int | `10` | |
+| readinessProbe.httpGet.path | string | `"/status/readiness"` | |
+| readinessProbe.httpGet.port | int | `3000` | |
+| readinessProbe.initialDelaySeconds | int | `10` | |
+| readinessProbe.periodSeconds | int | `20` | |
+| readinessProbe.successThreshold | int | `1` | |
+| readinessProbe.timeoutSeconds | int | `5` | |
+| replicas | int | `1` | |
+| resources.limits.memory | string | `"2000M"` | The memory limit of the container. Due to [this issue of ONNX runtime](https://github.com/microsoft/onnxruntime/issues/15080), the peak memory usage of the service is much higher than the model file size. When changing the default model, be sure to test the peak memory usage of the service before setting the memory limit. |
+| resources.requests.cpu | string | `"100m"` | |
+| resources.requests.memory | string | `"850M"` | |
+| service.annotations | object | `{}` | |
+| service.httpPortName | string | `"http"` | |
+| service.labels | object | `{}` | |
+| service.loadBalancerIP | string | `""` | |
+| service.loadBalancerSourceRanges | list | `[]` | |
+| service.name | string | `"magda-embedding-api"` | |
+| service.nodePort | string | `""` | |
+| service.port | int | `80` | |
+| service.targetPort | int | `3000` | |
+| service.type | string | `"ClusterIP"` | |
+| startupProbe.failureThreshold | int | `30` | |
+| startupProbe.httpGet.path | string | `"/status/startup"` | |
+| startupProbe.httpGet.port | int | `3000` | |
+| startupProbe.initialDelaySeconds | int | `10` | |
+| startupProbe.periodSeconds | int | `10` | |
+| startupProbe.successThreshold | int | `1` | |
+| startupProbe.timeoutSeconds | int | `5` | |
+| tolerations | list | `[]` | |
+| topologySpreadConstraints | list | `[]` | The pod topology spread constraints; see https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ |
+
 ### Build & Run for Local Development
 
 > Please note: for production deployment, please use the released [Docker images](https://github.com/magda-io/magda-embedding-api/pkgs/container/magda-embedding-api) & [helm charts](https://github.com/magda-io/magda-embedding-api/pkgs/container/charts%2Fmagda-embedding-api).
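+Once the service is built and running locally, you can sanity-check it over HTTP. This is a sketch only: it assumes the service listens on port 3000 (the chart's default probe and `service.targetPort` value); the `/status/readiness` path comes from the chart's probe configuration above, while the `/v1/embeddings` path and the `model` value are assumptions based on the OpenAI-compatible interface, so confirm them against your deployment:
+
+```bash
+# Readiness check (path taken from the chart's readinessProbe settings)
+curl http://localhost:3000/status/readiness
+
+# Request an embedding via the OpenAI-compatible interface; treat the
+# /v1/embeddings path and the model name as placeholders to verify
+curl http://localhost:3000/v1/embeddings \
+  -H "Content-Type: application/json" \
+  -d '{"input": "Hello, world!", "model": "default"}'
+```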
@@ -53,89 +146,3 @@ Deploy to minikube Cluster
 
 ```bash
 helm -n test upgrade --install test ./deploy/magda-embedding-api -f ./deploy/test-deploy.yaml
 ```
-
-## Requirements
-
-Kubernetes: `>= 1.21.0`
-
-| Repository | Name | Version |
-| ----------------------------- | ------------ | ------- |
-| oci://ghcr.io/magda-io/charts | magda-common | 4.2.1 |
-
-## Values
-
-| Key | Type | Default | Description |
-| ---------------------------------- | ------ | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| affinity | object | `{}` | |
-| autoscaling.hpa.enabled | bool | `false` | |
-| autoscaling.hpa.maxReplicas | int | `3` | |
-| autoscaling.hpa.minReplicas | int | `1` | |
-| autoscaling.hpa.targetCPU | int | `90` | |
-| autoscaling.hpa.targetMemory | string | `""` | |
-| bodyLimit | int | Default to 10485760 (10MB). | Defines the maximum payload, in bytes, that the server is allowed to accept |
-| closeGraceDelay | int | Default to 25000 (25s). | The maximum amount of time before forcefully closing pending requests. This should set to a value lower than the Pod's termination grace period (which is default to 30s) |
-| debug | bool | `false` | Start Fastify app in debug mode with nodejs inspector inspector port is 9320 |
-| defaultImage.imagePullSecret | bool | `false` | |
-| defaultImage.pullPolicy | string | `"IfNotPresent"` | |
-| defaultImage.repository | string | `"ghcr.io/magda-io"` | |
-| deploymentAnnotations | object | `{}` | |
-| envFrom | list | `[]` | |
-| extraContainers | string | `""` | |
-| extraEnvs | list | `[]` | |
-| extraInitContainers | string | `""` | |
-| extraVolumeMounts | list | `[]` | |
-| extraVolumes | list | `[]` | |
-| fullnameOverride | string | `""` | |
-| global.image | object | `{}` | |
-| global.rollingUpdate | object | `{}` | |
-| hostAliases | list | `[]` | |
-| image.name | string | `"magda-embedding-api"` | |
-| lifecycle | object | `{}` | pod lifecycle policies as outlined here: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks |
-| livenessProbe.failureThreshold | int | `10` | |
-| livenessProbe.httpGet.path | string | `"/status/liveness"` | |
-| livenessProbe.httpGet.port | int | `3000` | |
-| livenessProbe.initialDelaySeconds | int | `10` | |
-| livenessProbe.periodSeconds | int | `20` | |
-| livenessProbe.successThreshold | int | `1` | |
-| livenessProbe.timeoutSeconds | int | `5` | |
-| logLevel | string | `info`. | The log level of the application. one of 'fatal', 'error', 'warn', 'info', 'debug', 'trace'; also 'silent' is supported to disable logging. Any other value defines a custom level and requires supplying a level value via levelVal. |
-| nameOverride | string | `""` | |
-| nodeSelector | object | `{}` | |
-| pluginTimeout | int | Default to 10000 (10 seconds). | The maximum amount of time in milliseconds in which a fastify plugin can load. If not, ready will complete with an Error with code 'ERR_AVVIO_PLUGIN_TIMEOUT'. |
-| podAnnotations | object | `{}` | |
-| podSecurityContext.runAsUser | int | `1000` | |
-| priorityClassName | string | `"magda-9"` | |
-| rbac.automountServiceAccountToken | bool | `false` | Controls whether or not the Service Account token is automatically mounted to /var/run/secrets/kubernetes.io/serviceaccount |
-| rbac.create | bool | `false` | |
-| rbac.serviceAccountAnnotations | object | `{}` | |
-| rbac.serviceAccountName | string | `""` | |
-| readinessProbe.failureThreshold | int | `10` | |
-| readinessProbe.httpGet.path | string | `"/status/readiness"` | |
-| readinessProbe.httpGet.port | int | `3000` | |
-| readinessProbe.initialDelaySeconds | int | `10` | |
-| readinessProbe.periodSeconds | int | `20` | |
-| readinessProbe.successThreshold | int | `1` | |
-| readinessProbe.timeoutSeconds | int | `5` | |
-| replicas | int | `1` | |
-| resources.limits.memory | string | `"2000M"` | |
-| resources.requests.cpu | string | `"100m"` | |
-| resources.requests.memory | string | `"850M"` | |
-| service.annotations | object | `{}` | |
-| service.httpPortName | string | `"http"` | |
-| service.labels | object | `{}` | |
-| service.loadBalancerIP | string | `""` | |
-| service.loadBalancerSourceRanges | list | `[]` | |
-| service.name | string | `"magda-embedding-api"` | |
-| service.nodePort | string | `""` | |
-| service.port | int | `80` | |
-| service.targetPort | int | `3000` | |
-| service.type | string | `"ClusterIP"` | |
-| startupProbe.failureThreshold | int | `30` | |
-| startupProbe.httpGet.path | string | `"/status/startup"` | |
-| startupProbe.httpGet.port | int | `3000` | |
-| startupProbe.initialDelaySeconds | int | `10` | |
-| startupProbe.periodSeconds | int | `10` | |
-| startupProbe.successThreshold | int | `1` | |
-| startupProbe.timeoutSeconds | int | `5` | |
-| tolerations | list | `[]` | |
-| topologySpreadConstraints | list | `[]` | This is the pod topology spread constraints https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ |
diff --git a/README.md.gotmpl b/README.md.gotmpl
index 3a76bdc..1671c23 100644
--- a/README.md.gotmpl
+++ b/README.md.gotmpl
@@ -19,6 +19,21 @@ An embedding is a vector, or a list, of floating-point numbers. The distance bet
 
 This embedding API is created for [Magda](https://github.com/magda-io/magda)'s vector / hybrid search solution. The API interface is compatible with OpenAI's `embeddings` API to make it easier to reuse existing tools & libraries.
 
+### Resource Requirements
+
+Due to [this issue of ONNX runtime](https://github.com/microsoft/onnxruntime/issues/15080), the peak memory usage of the service is much higher than the model file size.
+e.g. for the default 500MB model file, the peak memory usage can reach 1.8GB - 2GB.
+However, memory usage drops back to a much lower level (around 800MB - 900MB for the default model) once the model is loaded.
+Please make sure your Kubernetes cluster has enough resources to run the service.
+
+{{ template "chart.maintainersSection" . }}
+
+{{ template "chart.requirementsSection" . }}
+
+{{ template "chart.valuesHeader" . }}
+
+{{ template "chart.valuesTable" . }}
+
 ### Build & Run for Local Development
 
 > Please note: for production deployment, please use the released [Docker images](https://github.com/magda-io/magda-embedding-api/pkgs/container/magda-embedding-api) & [helm charts](https://github.com/magda-io/magda-embedding-api/pkgs/container/charts%2Fmagda-embedding-api).
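+When swapping in a larger model, the default memory settings may need to be raised. Below is a hedged sketch of overriding them at install time, reusing the chart's documented `resources.*` keys; the specific numbers are illustrative placeholders, not recommendations:
+
+```bash
+# Raise the memory request/limit for a larger model; measure the model's
+# actual peak memory usage (see the note above) before settling on values
+helm -n test upgrade --install test ./deploy/magda-embedding-api \
+  --set resources.requests.memory=1200M \
+  --set resources.limits.memory=3000M
+```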
@@ -53,12 +68,4 @@ Deploy to minikube Cluster
 
 ```bash
 helm -n test upgrade --install test ./deploy/magda-embedding-api -f ./deploy/test-deploy.yaml
-```
-
-{{ template "chart.maintainersSection" . }}
-
-{{ template "chart.requirementsSection" . }}
-
-{{ template "chart.valuesHeader" . }}
-
-{{ template "chart.valuesTable" . }}
\ No newline at end of file
+```
\ No newline at end of file
diff --git a/deploy/magda-embedding-api/values.yaml b/deploy/magda-embedding-api/values.yaml
index ec3d5a7..a6d8c54 100644
--- a/deploy/magda-embedding-api/values.yaml
+++ b/deploy/magda-embedding-api/values.yaml
@@ -161,4 +161,7 @@ resources:
     cpu: "100m"
     memory: "850M"
   limits:
+    # -- (string) The memory limit of the container.
+    # Due to [this issue of ONNX runtime](https://github.com/microsoft/onnxruntime/issues/15080), the peak memory usage of the service is much higher than the model file size.
+    # When changing the default model, be sure to test the peak memory usage of the service before setting the memory limit.
    memory: "2000M"
\ No newline at end of file
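To test the peak memory usage before committing to a limit, one rough approach is to watch the pod while exercising the service. A sketch only: it assumes the `test` namespace used in the deploy example above and requires metrics-server to be available in the cluster:

```bash
# Observe current memory usage of the pods; run this while the model is
# loading and while serving embedding requests to capture the peak
kubectl -n test top pods
```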