Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 [firestore-bigquery-export] Server returned 502 when mirroring data to BQ #2133

Open
boywijnmaalen opened this issue Jul 23, 2024 · 14 comments
Labels
extension: firestore-bigquery-export Related to firestore-bigquery-export extension type: bug Something isn't working

Comments

@boywijnmaalen
Copy link

Describe your configuration

  • Extension name: firestore-bigquery-export
  • Extension version: 0.1.51
  • Configuration values:
    BigQuery Dataset location: europe-****
    BigQuery Project ID: ***
    Collection path: ***
    Enable Wildcard Column field with Parent Firestore Document IDs: true
    Dataset ID: ***
    Table ID: ***
    BigQuery SQL table Time Partitioning option type: DAY
    BigQuery Time Partitioning column name: timestamp
    Firestore Document field name for BigQuery SQL Time Partitioning field option: timestamp
    BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
    BigQuery SQL table clustering: ***
    Maximum number of synced documents per second: 100
    Backup Collection Name: Parameter not set
    Transform function URL: Parameter not set
    Use new query syntax for snapshots: no
    Exclude old data payloads: no
    Use Collection Group query: no
    Cloud KMS key name: Parameter not set

Describe the problem

in the last 30 days I can find 4 occurrences of this error:

Error when mirroring data to BigQuery FirebaseFunctionsError: Unexpected response with status: 502

image

on/around these timestamps we have updates missing for Firestore documents in BQ.

fix: a manual edit of the FS doc in question resulted in a sync to BQ (as expected).

the error contains a HTML response;

image

stack trace mentioned in the prntscrn;

at FunctionsApiClient.toFirebaseError (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:297:20) at FunctionsApiClient.enqueue (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:146:32) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async /workspace/lib/index.js:108:9 { errorInfo: { code: 'functions/unknown-error', message: 'Unexpected response with status: 502 }

Steps to reproduce:

unsure how to reproduce as I do not fully understand the problem, however, as indicated by the error, a 502, it seems the error occurs in the function running the firestore-bigquery-export business logic.

Expected result

to have every update for relevant Firestore docs synced to BQ

Actual result

missing updates for Firestore docs in BQ

@boywijnmaalen boywijnmaalen added the type: bug Something isn't working label Jul 23, 2024
@cabljac cabljac added the extension: firestore-bigquery-export Related to firestore-bigquery-export extension label Jul 23, 2024
@Nushio
Copy link

Nushio commented Jul 24, 2024

I get these on a daily basis. We're beyond frustrated with this extension.

Screenshot 2024-07-24 at 9 34 30 AM

@cabljac
Copy link
Contributor

cabljac commented Jul 25, 2024

Hi, thanks for raising this. I'm looking into it. It may an intermittent error on the BQ side, which would be difficult to resolve from extensions. I'll keep you updated with my progress

@cabljac
Copy link
Contributor

cabljac commented Jul 25, 2024

So the log is occurring here

When a cloud task fails to enqueue, the function makes this log. Then if more than 10 seconds have passed since the original firestore write event, it returns. Otherwise it will throw the error.

I'll have to look into the reasons for implementing it like this.

Maybe we should be retrying the enqueue at this point instead, to prevent data loss.

@boywijnmaalen
Copy link
Author

appreciate your time looking into this.

the link you provided speaks of a '1mb limit' payload limit for cloud tasks.

I can confirm our data object is no where near 1mb in size.

@cabljac
Copy link
Contributor

cabljac commented Jul 25, 2024

Yeah the code comment about payload size isn't relevant here, but is another existing issue with this extension for some users with large payload sizes.

The relevant link in the link is the catch block at line 143, if you're interested.

I'll see if we can get this issue prioritised.

@Nushio
Copy link

Nushio commented Jul 25, 2024

So the log is occurring here

When a cloud task fails to enqueue, the function makes this log. Then if more than 10 seconds have passed since the original firestore write event, it returns. Otherwise it will throw the error.

I'll have to look into the reasons for implementing it like this.

Maybe we should be retrying the enqueue at this point instead, to prevent data loss.

Speaking of dataloss, I found another huge issue with the cloud tasks and data loss and a way to replicate:

If there's any pending tasks on the cloud task queue when an upgrade or configuration change is made to the extension, the cloud tasks are cleared.

As a way to test this, set up the extension, and navigate to the extension's task list
https://console.cloud.google.com/cloudtasks ending in "syncBigQuery",

  1. Pause the queue, and make a few modifications on firestore so that they're queued on the cloud task queue.
  2. Check on BigQuery that the changes aren't replicating (they shouldn't, queue is paused)
  3. Change the extension configuration, and check that the queue is now clear
  4. Check BigQuery to see that the tasks are indeed gone and the data never made it to BQ.

@cabljac
Copy link
Contributor

cabljac commented Jul 25, 2024

thanks for bringing that up @Nushio

I think same might happen even if they're not paused as well, if the queues are backed up enough. I'll open a new issue to track this, and discuss it with the extensions team. I don't have an immediate viable solution for that in mind either, as the task queue name is pretty much fixed across reconfigurations (i think).

@Nushio
Copy link

Nushio commented Jul 25, 2024

Yes, our queue wasn't paused when I ran an update and lost data as a result.

(It's not lost, it's still on firestore, but now I need to ping each document to trigger a sync)

Pausing was a simple way to replicate the issue.

@boywijnmaalen
Copy link
Author

@cabljac I was wondering if there is an update on this?

this weekend we had a super busy weekend (business wise) and it turned out several Firestore document updates did not land in BQ.

we've added alerting when errors happen in the extension, however the errors do not include the document ID in question, so we only know the sync mechanism failed, but nog for which document (we have to look that up manually).

our alerting is triggered on; resource.labels.function_name=~"ext-firestore-bigquery-export" severity=ERROR

image

as you can see the errors occured more or less at the same time, with one outlier.

I'm aware one error type is not in scope of this specific issue. however I wanted it share it with you anyway as the general issue we're facing is; missing/failing firestore>biguery events.
Please let me know if I should raise a separate issue for this one

Error 1; (perhaps slightly off (issue) topic)

{ "textPayload": "Error when mirroring data to BigQuery FirebaseAppError: Credential implementation provided to initializeApp() via the \"credential\" property failed to fetch a valid Google OAuth2 access token with the following error: \"Error fetching access token: Could not fetch URI /computeMetadata/v1/instance/service-accounts/default/token\n\".\n at /workspace/node_modules/firebase-admin/lib/app/firebase-app.js:85:19\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async FunctionsApiClient.enqueue (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:138:13)\n at async /workspace/lib/index.js:108:9 {\n errorInfo: {\n code: 'app/invalid-credential',\n message: 'Credential implementation provided to initializeApp() via the \"credential\" property failed to fetch a valid Google OAuth2 access token with the following error: \"Error fetching access token: Could not fetch URI /computeMetadata/v1/instance/service-accounts/default/token\\n' +\n '\".'\n },\n codePrefix: 'app'\n}", "resource": { "type": "cloud_function", "labels": { "project_id": "XXXX", "region": "europe-west2", "function_name": "ext-firestore-bigquery-export-XXXX-fsexportbigquery" } }, "timestamp": "2024-08-23T19:51:53.216910Z", "receiveTimestamp": "2024-08-23T19:51:53.338847621Z" }

Error 2; (same as original issue topic)

{ "textPayload": "Error when mirroring data to BigQuery FirebaseFunctionsError: Unexpected response with status: 502 and body: <!DOCTYPE html>\n<html lang=en>\n <meta charset=utf-8>\n <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n <title>Error 502 (Server Error)!!1</title>\n <style>\n *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n </style>\n <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n <p><b>502.</b> <ins>That’s an error.</ins>\n <p>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds. <ins>That’s all we know.</ins>\n\n at FunctionsApiClient.toFirebaseError (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:297:20)\n at FunctionsApiClient.enqueue (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:146:32)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async /workspace/lib/index.js:108:9 {\n errorInfo: {\n code: 'functions/unknown-error',\n message: 'Unexpected response with status: 502 and body: <!DOCTYPE html>\\n' +\n '<html lang=en>\\n' +\n ' <meta charset=utf-8>\\n' +\n ' <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\\n' +\n ' <title>Error 502 (Server Error)!!1</title>\\n' +\n ' <style>\\n' +\n ' *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\\n' +\n ' </style>\\n' +\n ' <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\\n' +\n ' <p><b>502.</b> <ins>That’s an error.</ins>\\n' +\n ' <p>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds. <ins>That’s all we know.</ins>\\n'\n },\n codePrefix: 'functions'\n}", "resource": { "type": "cloud_function", "labels": { "region": "europe-west2", "project_id": "XXXX", "function_name": "ext-firestore-bigquery-export-XXXX-fsexportbigquery" } }, "timestamp": "2024-08-23T15:36:11.806726Z", "receiveTimestamp": "2024-08-23T15:36:12.078665762Z", }

@boywijnmaalen
Copy link
Author

@cabljac

a kind reminder 🙏🏾

do you have an update on a potential fix?

@cabljac
Copy link
Contributor

cabljac commented Sep 17, 2024

Hi @boywijnmaalen

Just looking to get prioritisation on this, which we may now have

@cabljac
Copy link
Contributor

cabljac commented Sep 17, 2024

The issue with my fix idea is that it's not really going to fix it - it will reduce the frequency of this error but we're still going to get data loss...

@cabljac
Copy link
Contributor

cabljac commented Sep 17, 2024

Out of interest @boywijnmaalen do you have EventArc events enabled for the extension? Trying to rule things out

@boywijnmaalen
Copy link
Author

@cabljac

Unsure what you mean if we have EventArc enabled for the extension?
(We use the Google EventArc API because of our (v2) Firebase functions/handlers, but probably this is not what you are referring too I think)

I just had a look at our firestore>bigquery extension properties, but from what I can tell there is an option to enable events if you want use custom event handlers. however we do not make use of custom event handlers.

as far as I can tell only the BigQuery API and Cloud Tasks API are required for this extension. I'm not aware of any specific event related APIs that need to be enabled as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension: firestore-bigquery-export Related to firestore-bigquery-export extension type: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants