Add sf merlion for time-based and forecast AD #956

Open
wants to merge 5 commits into base: main

Conversation

codeloop
Member

@codeloop codeloop commented Sep 29, 2024

Add support for Salesforce Merlion models for time-based and forecast-based anomaly detection.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Sep 29, 2024

⚠️ This PR changed pyproject.toml file. ⚠️

  • PR Creator must update 📃 THIRD_PARTY_LICENSES.txt, if any 📚 library added/removed in pyproject.toml.
  • PR Approver must confirm 📃 THIRD_PARTY_LICENSES.txt updated, if any 📚 library added/removed in pyproject.toml.

📌 Cov diff with main:

Coverage-0%

📌 Overall coverage:

Coverage-60.43%

📌 Cov diff with main:

Coverage-0%

📌 Overall coverage:

Coverage-60.44%

@codeloop codeloop changed the title Add sf merlion for time-based ad Add sf merlion for time-based and forecast ad Sep 30, 2024
@codeloop codeloop changed the title Add sf merlion for time-based and forecast ad Add sf merlion for time-based and forecast AD Sep 30, 2024

📌 Cov diff with main:

Coverage-0%

📌 Overall coverage:

Coverage-60.43%

Member

@ahosler ahosler left a comment

  1. In testing, I see that the anomaly score is no longer normalized to [0, 1]. Does Merlion interpret this score differently? (CSV uploaded below.)
    I believe we should stick with the [0, 1] convention unless we have a good reason to change.

  2. Merlion also seemed over-reactive on a few of the tests I ran. Are there any default settings we should be tweaking?

  3. Could Merlion be made to work for non-time-series AD, or is it primarily time-based?

outliers.csv
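
On point 1, here is a minimal sketch of two ways raw scores could be mapped back to [0, 1]. Both helper names are hypothetical and neither is part of this PR or of Merlion. The first is a plain min-max rescale. The second assumes the scores behave like z-scores (my understanding of Merlion's default calibration, which I have not verified against this operator's output) and converts each one to a two-sided normal probability.

```python
import math
from typing import List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Linearly rescale scores so the smallest maps to 0 and the largest to 1."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # No spread to rescale: report 0 for every point.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def zscore_to_unit(z: float) -> float:
    """Map a z-score-like value to [0, 1] via P(|Z| <= |z|) for a standard normal.

    Assumption: the input is approximately standard-normal under "no anomaly".
    """
    return math.erf(abs(z) / math.sqrt(2.0))
```

Min-max depends on the extremes of each batch, so scores are not comparable across runs; the z-score mapping is stable across runs but only meaningful if the calibration assumption holds.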

@@ -179,7 +179,8 @@ anomaly = [
  "oracledb",
  "report-creator==1.0.9",
  "rrcf==0.4.4",
- "scikit-learn"
+ "scikit-learn",
+ "salesforce-merlion[all]==2.0.4"
Member

What features does the "all" extra get us?
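
One way to answer this locally is to read the installed distribution's metadata. The sketch below uses only the standard-library `importlib.metadata`; the helper names are hypothetical, and it assumes the package is already installed in the current environment (it returns an empty list otherwise).

```python
from importlib.metadata import PackageNotFoundError, metadata, requires
from typing import List


def list_extras(dist_name: str) -> List[str]:
    """Return the extras a distribution declares, or [] if it is not installed."""
    try:
        return sorted(metadata(dist_name).get_all("Provides-Extra") or [])
    except PackageNotFoundError:
        return []


def requirements_for_extra(dist_name: str, extra: str) -> List[str]:
    """Return the requirement strings that are gated behind a given extra."""
    try:
        reqs = requires(dist_name) or []
    except PackageNotFoundError:
        return []
    # Environment markers may quote the extra name with either quote style.
    markers = (f'extra == "{extra}"', f"extra == '{extra}'")
    return [r for r in reqs if any(m in r for m in markers)]
```

Calling `requirements_for_extra("salesforce-merlion", "all")` in an environment with the package installed should list the additional dependencies the extra pulls in, which is what would need to be reflected in THIRD_PARTY_LICENSES.txt.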

@@ -364,6 +364,7 @@ spec:
  - oneclasssvm
  - isolationforest
  - randomcutforest
+ - merlion_ad
Member

Can we change this to "merlion"?

@@ -36,7 +36,7 @@ def _build_model(self) -> AnomalyOutput:
  # Set tree parameters
  num_trees = model_kwargs.get("num_trees", 200)
  shingle_size = model_kwargs.get("shingle_size", None)
- anomaly_threshold = model_kwargs.get("anamoly_threshold", 95)
+ anomaly_threshold = model_kwargs.get("anomaly_threshold", 95)
Member

great catch
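
Because `dict.get` silently falls back to its default, the misspelled key meant a user-supplied `anomaly_threshold` was ignored and 95 was always used. A small guard could surface this class of bug earlier. The sketch below is only an illustration: `find_unknown_kwargs` is a hypothetical helper, and the set of known keys is taken from the three lookups visible in this hunk, not from the full operator.

```python
import difflib
from typing import Dict, Iterable, Optional

# Assumption: only the keys visible in the diff hunk; the real model may accept more.
KNOWN_KEYS = ("num_trees", "shingle_size", "anomaly_threshold")


def find_unknown_kwargs(
    model_kwargs: Dict[str, object],
    known_keys: Iterable[str] = KNOWN_KEYS,
) -> Dict[str, Optional[str]]:
    """Map each unrecognized key to its closest known key (or None if nothing is close)."""
    known = sorted(known_keys)
    suggestions: Dict[str, Optional[str]] = {}
    for key in model_kwargs:
        if key not in known:
            close = difflib.get_close_matches(key, known, n=1)
            suggestions[key] = close[0] if close else None
    return suggestions
```

For example, `find_unknown_kwargs({"anamoly_threshold": 90})` flags the misspelled key and suggests `anomaly_threshold`, which the caller could log as a warning or raise on.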
