You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
When we ingest a power bi report that has datasets that contain -- (2 hyphens),
the SQL parser ignores everything after it in the upstream lineage. I am using version 14.0.3 of the CLI
EDIT: It also doesn't seem to parse SQL that has hard new lines instead of #(lf) (second screenshot)
To Reproduce
Example query
select#(lf)upper(cs.customercode) as customercode#(lf), cs.ear2id as ear2id#(lf), db.branch_rollup_name#(lf) , db.subregion1#(lf) , db.subregion2#(lf)from nast_truckload_domain.broker.dim_customer cs#(lf)--join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true#(lf)join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true#(lf)join nast_customer_domain.broker.ref_branch db on db.Branch_code = so.branch_code#(lf)-- join nast_customer_domain.broker.customer_long_term_value as clv on cs.ear2id = clv.ear2_id#(lf)-- join nast_customer_domain.broker.customer_order_load cl on cl.customer_code = cs.customercode and cl.revenue>0 and cl.activity_date>= '2023-01-01'#(lf)where cs.customerstatusid = 1 --active#(lf)-- and cs.customercode = 'C8817970'#(lf)and db.primary_business_line_id in ('62','73')#(lf)-- and cs.ear2id = 199
It's a little easier to read if you replace the #(lf) with a new line.
select upper(cs.customercode) as customercode , cs.ear2id as ear2id , db.branch_rollup_name , db.subregion1 , db.subregion2 from nast_truckload_domain.broker.dim_customer cs --join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true join nast_customer_domain.broker.ref_branch db on db.Branch_code = so.branch_code -- join nast_customer_domain.broker.customer_long_term_value as clv on cs.ear2id = clv.ear2_id -- join nast_customer_domain.broker.customer_order_load cl on cl.customer_code = cs.customercode and cl.revenue>0 and cl.activity_date>= '2023-01-01' where cs.customerstatusid = 1 --active -- and cs.customercode = 'C8817970' and db.primary_business_line_id in ('62','73') -- and cs.ear2id = 199
Only the first table shows up because the line after it is commented out. The 3rd and 4th lines of the FROM clause are ignored.
I have tested removing the comments and they show up in the lineage. If there is a comment in the SELECT clause, none of the tables show upstream in the lineage.
Expected behavior
I would expect that the parser removes any lines of code from a -- to a new line or #(lf) character as #(lf) is how power bi represents new lines.
Screenshots
Example of query with new lines where lineage doesn't work. If I remove the new lines, it works fine.
Desktop (please complete the following information):
OS: [e.g. iOS] Windows 11
Browser [e.g. chrome, safari] Chrome
Version [e.g. 22]
Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered:
Describe the bug
When we ingest a power bi report that has datasets that contain -- (2 hyphens),
the SQL parser ignores everything after it in the upstream lineage. I am using version 14.0.3 of the CLI
EDIT: It also doesn't seem to parse SQL that has hard new lines instead of #(lf) (second screenshot)
To Reproduce
Example query
select#(lf)upper(cs.customercode) as customercode#(lf), cs.ear2id as ear2id#(lf), db.branch_rollup_name#(lf) , db.subregion1#(lf) , db.subregion2#(lf)from nast_truckload_domain.broker.dim_customer cs#(lf)--join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true#(lf)join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true#(lf)join nast_customer_domain.broker.ref_branch db on db.Branch_code = so.branch_code#(lf)-- join nast_customer_domain.broker.customer_long_term_value as clv on cs.ear2id = clv.ear2_id#(lf)-- join nast_customer_domain.broker.customer_order_load cl on cl.customer_code = cs.customercode and cl.revenue>0 and cl.activity_date>= '2023-01-01'#(lf)where cs.customerstatusid = 1 --active#(lf)-- and cs.customercode = 'C8817970'#(lf)and db.primary_business_line_id in ('62','73')#(lf)-- and cs.ear2id = 199
It's a little easier to read if you replace the #(lf) with a new line.
select upper(cs.customercode) as customercode , cs.ear2id as ear2id , db.branch_rollup_name , db.subregion1 , db.subregion2 from nast_truckload_domain.broker.dim_customer cs --join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true join nast_customer_domain.broker.dim_customer_ear2_single_ownership_history as so on cs.ear2id = so.ear2_id and is_current = true join nast_customer_domain.broker.ref_branch db on db.Branch_code = so.branch_code -- join nast_customer_domain.broker.customer_long_term_value as clv on cs.ear2id = clv.ear2_id -- join nast_customer_domain.broker.customer_order_load cl on cl.customer_code = cs.customercode and cl.revenue>0 and cl.activity_date>= '2023-01-01' where cs.customerstatusid = 1 --active -- and cs.customercode = 'C8817970' and db.primary_business_line_id in ('62','73') -- and cs.ear2id = 199
Only the first table shows up because the line after it is commented out. The 3rd and 4th lines of the FROM clause are ignored.
I have tested removing the comments and they show up in the lineage. If there is a comment in the SELECT clause, none of the tables show upstream in the lineage.
Expected behavior
I would expect that the parser removes any lines of code from a -- to a new line or #(lf) character as #(lf) is how power bi represents new lines.
Screenshots
Example of query with new lines where lineage doesn't work. If I remove the new lines, it works fine.
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: