Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

INSERT errors when using default values #445

Open
2 tasks done
dpxcc opened this issue Nov 19, 2024 · 2 comments
Open
2 tasks done

INSERT errors when using default values #445

dpxcc opened this issue Nov 19, 2024 · 2 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@dpxcc
Copy link
Contributor

dpxcc commented Nov 19, 2024

What happens?

INSERT errors when using default values because pgduckdb_get_querydef() results in invalid SQL query

To Reproduce

CREATE TEMP TABLE t (a int DEFAULT 3, b int) USING duckdb;
INSERT INTO t (b) VALUES (123), (456);

throws error

2024-11-19 00:15:50.950 UTC [3577] ERROR:  (PGDuckDB/CreatePlan) Prepared query returned an error: 'Binder Error: Column name/value mismatch for insert on t: expected 2 columns but 1 values were supplied

because pgduckdb_get_querydef() returns

INSERT INTO pg_temp.main.t (a, b) VALUES (123), (456)

OS:

Linux

pg_duckdb Version (if built from source use commit hash):

0.1.0

Postgres Version (if built from source use commit hash):

17.0

Hardware:

No response

Full Name:

Cheng Chen

Affiliation:

Mooncake Labs

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?

  • Yes, I have
@JelteF JelteF added the bug Something isn't working label Nov 19, 2024
@JelteF
Copy link
Collaborator

JelteF commented Nov 19, 2024

Okay I took a quick look at why this is happening. The relevant part of pgduckdb_get_querydef that generates the broken column list can be found here: https://github.com/duckdb/pg_duckdb/blob/main/src/vendor/pg_ruleutils_17.c#L6677-L6707

It uses the targetList from the query to create that list, which seems sensible. Surprisingly the a column is part of that list though. Somewhere it gets added in with a Const node for the expr field:

      {TARGETENTRY
      :expr
         {CONST
         :consttype 23
         :consttypmod -1
         :constcollid 0
         :constlen 4
         :constbyval true
         :constisnull false
         :location -1
         :constvalue 4 [ 3 0 0 0 0 0 0 0 ]
         }
      :resno 1
      :resname a
      :ressortgroupref 0
      :resorigtbl 0
      :resorigcol 0
      :resjunk false
      }

Interestingly this is not a problem for places where postgres creates an INSERT INTO string. So I guess in this case it doesn't

> CREATE TABLE test_rule(a int DEFAULT 3, b int);
CREATE TABLE
> CREATE RULE test_rule_insert AS ON UPDATE TO test_rule DO INSERT INTO test_rule(b) VALUES (123), (456);
CREATE RULE
> SELECT * from pg_rules where rulename = 'test_rule_insert';
 schemaname │ tablename │     rulename     │                                      definition
────────────┼───────────┼──────────────────┼──────────────────────────────────────────────────────────────────────────────────────
 public     │ test_rule │ test_rule_insert │ CREATE RULE test_rule_insert AS                                                     ↵
            │           │                  │     ON UPDATE TO public.test_rule DO  INSERT INTO test_rule (b) VALUES (123), (456);
(1 row)

I can think of 2 possible ways of handling this:

  1. Make sure the Const node does not get added to the query, like I expect is done for the CREATE RULE.
  2. Change our vendored ruleutils file to not write a target entry if the expression is of type Const.

I feel like 1 would be the cleanest, since it seems like we're calling the function with an incorrect type of query tree. I'm not sure if we can fix that though, because this field is already Const node is already present as soon as we get the parse tree in our planner hook. So I don't know what other query tree we could pass it.

So 2 is probably easier. The main downside of 2 is that there might be an issue if Const nodes are generated in the targetList for other reasons. I wasn't able to do that though with simple queries, like a VALUES list with a single item or a SELECT 1;

@JelteF
Copy link
Collaborator

JelteF commented Nov 19, 2024

Okay, a bit more info. The function that adds these Const nodes is rewriteTargetListIU, and there's a hook we can use to get the query tree before that function is called: post_parse_analyze_hook.

Using the post_parse_analyze_hook to store the original query tree seems like it could be a bit expensive though. At the very least we'd have to call NeedsDuckdbExecution there.

Looking at rewriteTargetListIU it doesn't seem to do any other things that would impact the query returned by pgduckdb_get_querydef in a meaningful way (i.e. where it breaks the query like reported here). So I think the best approach is to go for option 2, and make our vendored ruleutils ignore target entries of type Const.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants