Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing Power BI queries that contain comments (aka --) causes it to ignore anything after #11251

Open
aaverbec opened this issue Aug 27, 2024 · 1 comment
Labels
bug Bug report

Comments

@aaverbec
Copy link

aaverbec commented Aug 27, 2024

Describe the bug
When we ingest a power bi report that has datasets that contain -- (2 hyphens),
the SQL parser ignores everything after it in the upstream lineage. I am using version 14.0.3 of the CLI

EDIT: It also doesn't seem to parse SQL that has hard new lines instead of #(lf) (second screenshot)

To Reproduce

Example query

select#(lf)upper(cs.customercode) as customercode#(lf), cs.ear2id as ear2id#(lf), db.branch_rollup_name#(lf) , db.subregion1#(lf) , db.subregion2#(lf)from nast_truckload_domain.broker.dim_customer cs#(lf)--join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true#(lf)join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true#(lf)join nast_customer_domain.broker.ref_branch db on db.Branch_code = so.branch_code#(lf)-- join nast_customer_domain.broker.customer_long_term_value as clv on cs.ear2id = clv.ear2_id#(lf)-- join nast_customer_domain.broker.customer_order_load cl on cl.customer_code = cs.customercode and cl.revenue>0 and cl.activity_date>= '2023-01-01'#(lf)where cs.customerstatusid = 1 --active#(lf)-- and cs.customercode = 'C8817970'#(lf)and db.primary_business_line_id in ('62','73')#(lf)-- and cs.ear2id = 199

It's a little easier to read if you replace the #(lf) with a new line.

select upper(cs.customercode) as customercode , cs.ear2id as ear2id , db.branch_rollup_name , db.subregion1 , db.subregion2 from nast_truckload_domain.broker.dim_customer cs --join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true join nast_customer_domain.broker.ref_branch db on db.Branch_code = so.branch_code -- join nast_customer_domain.broker.customer_long_term_value as clv on cs.ear2id = clv.ear2_id -- join nast_customer_domain.broker.customer_order_load cl on cl.customer_code = cs.customercode and cl.revenue>0 and cl.activity_date>= '2023-01-01' where cs.customerstatusid = 1 --active -- and cs.customercode = 'C8817970' and db.primary_business_line_id in ('62','73') -- and cs.ear2id = 199

Only the first table shows up because the line after it is commented out. The 3rd and 4th lines of the FROM clause are ignored.
image

I have tested removing the comments and they show up in the lineage. If there is a comment in the SELECT clause, none of the tables show upstream in the lineage.

Expected behavior
I would expect that the parser removes any lines of code from a -- to a new line or #(lf) character as #(lf) is how power bi represents new lines.

Screenshots

Example of query with new lines where lineage doesn't work. If I remove the new lines, it works fine.
image

Desktop (please complete the following information):

  • OS: [e.g. iOS] Windows 11
  • Browser [e.g. chrome, safari] Chrome
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@aaverbec aaverbec added the bug Bug report label Aug 27, 2024
@aaverbec
Copy link
Author

aaverbec commented Aug 27, 2024

@Bongton Thanks for the quick response. What does that file/application do? (Figured I should ask before I go download a random file)

Edit: Looks like Bongton's comment was removed so perhaps it was a phishing attempt to get me to download a file with a virus.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug report
Projects
None yet
Development

No branches or pull requests

2 participants
@aaverbec and others