
Add zoom functionality #20

Merged: 18 commits merged into main from add-zoom-levels-and-panning on Aug 2, 2023
Conversation

@andersy005 (Member) commented Jul 25, 2023

This PR adds configurable zoom functionality to the benchmarks. When executing the benchmarks, users can now specify the desired operation (zoom_in or zoom_out) along with the corresponding zoom level.

When dealing with the request data, should we filter out requests that do not correspond to the pyramid level? For example, should we exclude requests that are irrelevant to the specified zoom level?

python main.py --operation zoom_out --zoom-level 2

produces:

[
    {
        "browser_name": "chromium",
        "browser_version": "114.0.5735.35",
        "operation": "zoom_out",
        "playwright_python_version": "1.34.0",
        "provider": "unknown",
        "request_data": [
            {
                "method": "GET",
                "request_start": 427544151.336,
                "response_end": 427544379.397,
                "total_response_time_ms": 228.061,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/.zmetadata"
            },
            {
                "method": "GET",
                "request_start": 427544381.87,
                "response_end": 427544540.098,
                "total_response_time_ms": 158.228,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/0/month/0"
            },
            {
                "method": "GET",
                "request_start": 427544382.041,
                "response_end": 427544540.734,
                "total_response_time_ms": 158.693,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/0/band/0"
            },
            {
                "method": "GET",
                "request_start": 427544575.472,
                "response_end": 427544865.467,
                "total_response_time_ms": 289.995,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/1/climate/0.0.0.1"
            },
            {
                "method": "GET",
                "request_start": 427544575.691,
                "response_end": 427544798.217,
                "total_response_time_ms": 222.526,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/1/climate/0.0.1.1"
            },
            {
                "method": "GET",
                "request_start": 427544575.859,
                "response_end": 427544851.985,
                "total_response_time_ms": 276.126,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/1/climate/0.0.0.0"
            },
            {
                "method": "GET",
                "request_start": 427544576.387,
                "response_end": 427544859.685,
                "total_response_time_ms": 283.298,
                "url": "https://carbonplan-maps.s3.us-west-2.amazonaws.com/v2/demo/4d/tavg-prec-month/1/climate/0.0.1.0"
            }
        ],
        "timer_end": 1613.4000000357628,
        "timer_start": 671.1000000238419,
        "total_duration_in_ms": 942.3000000119209,
        "zoom_level": 2
    }
]
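
For illustration, a minimal sketch of the per-level filtering the question above refers to; the function name is hypothetical, and the pyramid level is read from the URL path segment (e.g. /1/climate/... in the output above):

import re

def filter_requests_by_level(request_data, level):
    # Hypothetical helper: keep only requests whose pyramid-level path
    # segment matches `level`, e.g. .../tavg-prec-month/<level>/climate/...
    pattern = re.compile(rf'/tavg-prec-month/{level}/')
    return [r for r in request_data if pattern.search(r['url'])]

Note that a filter like this would also drop metadata requests such as .zmetadata, which is part of the question of which requests count as relevant.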

@katamartin (Member) left a comment


One question about how this interacts with the onIdle listener; otherwise this is looking good to me.

How does this interact with the onIdle listener? It seems like onIdle might be invoked multiple times while zooming through the map depending on how quickly tiles are loaded (e.g., on initial load, after zooming to 1st level, after zooming to 2nd level, etc.). Or are we basically fine relying on the fact that the listener is instantiated after the zooming is kicked off?

@andersy005 (Member, Author) replied

> How does this interact with the onIdle listener? It seems like onIdle might be invoked multiple times while zooming through the map depending on how quickly tiles are loaded (e.g., on initial load, after zooming to 1st level, after zooming to 2nd level, etc.). Or are we basically fine relying on the fact that the listener is instantiated after the zooming is kicked off?

Good catch, I had forgotten about this. Looking at it now, I'm wondering if moving the check for the idle state into the for loop that performs the zoom operations could provide more granular performance measurements. Specifically, it would let us measure the time it takes for the map to become idle after each individual zoom operation, rather than after all zoom operations have been performed.

@andersy005 (Member, Author) commented Jul 25, 2023

I'm thinking of something along these lines:

if zoom_level:
    for _ in range(zoom_level):
        if operation == 'zoom_in':
            page.keyboard.press('=')
        elif operation == 'zoom_out':
            page.keyboard.press('-')

        # Start timer
        page.evaluate('window._timerStart = performance.now();')

        # Wait for the map to become idle, then stop the timer
        page.evaluate(
            """
            () => {
                window._error = null;
                if (!window._map) {
                    window._error = 'window._map does not exist'
                    console.error(window._error)
                    window._timerEnd = performance.now()
                    return
                }

                return new Promise((resolve, reject) => {
                    const THRESHOLD = 5000
                    // time out after THRESHOLD ms if no idle event is seen
                    const timeout = setTimeout(() => {
                        window._error = `No idle events seen after ${THRESHOLD}ms`;
                        reject(window._error)
                    }, THRESHOLD)
                    window._map.onIdle(() => {
                        console.log('window._map.onIdle callback called')
                        clearTimeout(timeout)
                        window._timerEnd = performance.now()
                        resolve()
                    })
                }).catch((error) => {
                    window._error = 'Error in page.evaluate: ' + error;
                    console.error(window._error);
                    window._timerEnd = performance.now()
                })
            }
            """
        )

        if error := page.evaluate('window._error'):
            raise Exception(error)

        timer_end = page.evaluate('window._timerEnd')
        timer_start = page.evaluate('window._timerStart')

        # Record metrics for this zoom operation
        # (filtered_request_data is assumed to be collected elsewhere)
        data = {
            'request_data': filtered_request_data,
            'timer_start': timer_start,
            'timer_end': timer_end,
            'total_duration_in_ms': timer_end - timer_start,
            'playwright_python_version': playwright_python_version,
            'provider': provider_name,
            'browser_name': playwright.chromium.name,
            'browser_version': browser.version,
            'operation': operation,
            'zoom_level': zoom_level,
        }

        all_data.append(data)

With this change, we would have a separate set of metrics for each zoom operation, which might affect how we analyze and interpret the performance data. Do we want this level of granularity?

@katamartin (Member) replied

> With this change, we would have a separate set of metrics for each zoom operation, which might affect how we analyze and interpret the performance data. Do we want this level of granularity?

If we go this route, should we also separately listen for the initial (pre-zoom) onIdle invocation?

Would it be possible to make this choice (metrics per operation vs. metrics based on sum of operations) easy to configure? I could imagine a helper function like evaluateAction(action) that returns metrics. action would be a function that gets invoked and could either contain a batch of actions (e.g., mount page, zoom in, zoom in, etc.) or a single action (e.g., mount page).
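
A minimal sketch of what such a helper could look like on the Python side, assuming the async Playwright API used in main.py; evaluate_action, wait_for_map_idle, zoom, and mount_page are hypothetical names, and the idle-waiting step would reuse the onIdle/timeout logic shown above:

async def evaluate_action(page, action):
    # Hypothetical helper: time an arbitrary `action` (a single interaction
    # or a batch of interactions) from start until the map next goes idle
    await page.evaluate('window._timerStart = performance.now();')
    await action()
    await wait_for_map_idle(page)  # the onIdle/timeout logic shown earlier
    timer_start = await page.evaluate('window._timerStart')
    timer_end = await page.evaluate('window._timerEnd')
    return {
        'timer_start': timer_start,
        'timer_end': timer_end,
        'total_duration_in_ms': timer_end - timer_start,
    }

# usage: a batch of actions ...
#   metrics = await evaluate_action(page, lambda: zoom(page, 'zoom_in', levels=2))
# ... or a single action
#   metrics = await evaluate_action(page, lambda: mount_page(page))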

@maxrjones (Contributor) commented

> When dealing with the request data, should we filter out requests that do not correspond to the pyramid level? For example, should we exclude requests that are irrelevant to the specified zoom level?

> Would it be possible to make this choice (metrics per operation vs. metrics based on sum of operations) easy to configure? I could imagine a helper function like evaluateAction(action) that returns metrics. action would be a function that gets invoked and could either contain a batch of actions (e.g., mount page, zoom in, zoom in, etc.) or a single action (e.g., mount page).

Regarding both of these comments, I think it would be best to handle the data filtering outside of the main data collection script. This would make it easy to change which metrics are considered important without needing to re-run the benchmarks. For the original requests data, I separated the request extraction from main.py in 37b1cf7 as part of #18.

Specifically for the choice of per operation vs sum of operations, this would be simple to configure as a post-processing step if the data produced directly from running the benchmarks were something like:

all_data = {
    'playwright_python_version': playwright_python_version,
    'provider': provider_name,
    'browser_name': playwright.chromium.name,
    'browser_version': browser.version,
    'operation': operation,
    '0': {'timer_start': timer_start, 'timer_end': timer_end},
    '1': {'timer_start': timer_start, 'timer_end': timer_end},
}

where the number of numeric keys equals the number of zoom levels.
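
As a hedged illustration, choosing per-operation vs. summed metrics could then be a small post-processing function over that record (names hypothetical):

def summarize(record, per_operation=True):
    # Numeric keys ('0', '1', ...) hold the timings for each zoom level
    timings = {k: v for k, v in record.items() if k.isdigit()}
    durations = {
        int(k): v['timer_end'] - v['timer_start'] for k, v in timings.items()
    }
    if per_operation:
        return durations            # one duration per zoom operation
    return sum(durations.values())  # total across all operations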

@andersy005 (Member, Author) commented

@maxrjones, I've enabled storing Chrome traces in S3.

The RMSE values I am getting are not close to zero when I run the benchmarks on my local Linux machine. Is this expected? If so, does this mean the baselines are tied to the specific machine they were generated on?

[Screenshot: RMSE plots, 2023-07-31 15:41:42]

You can view these exact plots by running:

python scripts/analysis.py --timestamp 2023-07-31T22-32-38 --run 1 --s3-bucket s3://carbonplan-scratch

main.py (outdated), review comment on lines 73 to 81:
chrome_args = [
    '--enable-features=Vulkan,UseSkiaRenderer',
    '--use-vulkan=swiftshader',
    '--enable-unsafe-webgpu',
    '--disable-vulkan-fallback-to-gl-for-testing',
    '--ignore-gpu-blocklist',
    '--use-angle=vulkan',
]
browser = await playwright.chromium.launch(args=chrome_args)
@andersy005 (Member, Author) left a comment

I was able to ensure GPU rasterization. I confirmed this by launching Chromium with headless=False and visiting the chrome://gpu page, which reports the following:

Graphics Feature Status
Canvas: Hardware accelerated
Canvas out-of-process rasterization: Disabled
Direct Rendering Display Compositor: Disabled
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
OpenGL: Enabled
Rasterization: Hardware accelerated
Raw Draw: Disabled
Skia Graphite: Disabled
Video Decode: Hardware accelerated
Video Encode: Software only. Hardware acceleration disabled
Vulkan: Enabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
WebGPU: Software only, hardware acceleration unavailable
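
For reference, a rough sketch of that manual check, assuming the chrome_args list from the diff above (inspect_gpu_status is a hypothetical name):

from playwright.async_api import async_playwright

async def inspect_gpu_status(chrome_args):
    # Launch headful with the same flags and inspect chrome://gpu by eye
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False, args=chrome_args)
        page = await browser.new_page()
        await page.goto('chrome://gpu')
        await page.pause()  # keep the browser open for manual inspection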

@maxrjones (Contributor) replied

@andersy005, can you please clarify the source of this report? Is this local or on an EC2 instance?

@andersy005 (Member, Author) replied

This is from my local Linux machine.

@andersy005 (Member, Author) added

Unfortunately, I'm getting different results on an instance provisioned via JupyterHub testing.

@maxrjones (Contributor) replied

> @maxrjones, I've enabled storing Chrome traces in S3.
>
> The RMSE values I am getting are not close to zero when I run the benchmarks on my local Linux machine. Is this expected? If so, does this mean the baselines are tied to the specific machine they were generated on?
>
> [Screenshot: RMSE plots, 2023-07-31 15:41:42]
>
> You can view these exact plots by running:
>
> python scripts/analysis.py --timestamp 2023-07-31T22-32-38 --run 1 --s3-bucket s3://carbonplan-scratch

This seems possible. There are two options for progressing with this approach:

  1. Document how to generate the reference snapshots on the same platform that will be used to run the benchmarks.
  2. Define the end time based on the time at which the RMSE reaches a minimum, rather than the time at which it reaches 0.

I will explore option 2; a rough sketch of the idea is below.
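
A rough sketch of option 2, assuming per-frame screenshots and timestamps are captured during the run (names are hypothetical):

import numpy as np

def end_time_from_rmse(frames, timestamps, baseline):
    # Option 2: take the timestamp at which the RMSE against the reference
    # snapshot is minimized, rather than requiring it to reach exactly 0
    rmse = [
        np.sqrt(np.mean((frame.astype(float) - baseline.astype(float)) ** 2))
        for frame in frames
    ]
    return timestamps[int(np.argmin(rmse))]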

@andersy005 (Member, Author) commented

I'm going to merge this shortly. I'm open to addressing any additional feedback in separate pull requests.

@andersy005 merged commit 5101d75 into main on Aug 2, 2023 (2 checks passed)
@andersy005 deleted the add-zoom-levels-and-panning branch on August 2, 2023 02:39