Skip to content

ClickHouse: Support scalar expressions in WITH clause#2333

Open
alrevuelta wants to merge 1 commit intoapache:mainfrom
alrevuelta:with-scalar-clickhouse
Open

ClickHouse: Support scalar expressions in WITH clause#2333
alrevuelta wants to merge 1 commit intoapache:mainfrom
alrevuelta:with-scalar-clickhouse

Conversation

@alrevuelta
Copy link
Copy Markdown

Fixes #2221.

ClickHouse's WITH clause supports a reversed-order form <expression> AS <identifier> (alongside, and freely interleaved with, traditional CTEs). Currently the parser hard-errors on the leading non-identifier token:

WITH 42 AS answer SELECT answer FROM t
-- Expected: identifier, found: 42

Changes

  • New WithItem enum (Cte(Cte) | Named { expr, alias }); With::cte_tables: Vec<Cte> becomes With::items: Vec<WithItem>.
  • New dialect flag Dialect::supports_with_clause_scalar_expression (default false, true for ClickHouseDialect only).
  • New parse_with_item routes each WITH-list element: when the leading token can't begin a CTE name, it parses <expr> AS <ident> directly; otherwise it tries the CTE form via maybe_parse and falls back. parse_cte itself is unchanged.

Tests

Added in tests/sqlparser_clickhouse.rs, covering: literal scalar (the issue's example), string literal, aggregate, scalar subquery, bare-identifier disambiguation, mixing scalar + CTE in one list, lambda, and a negative test confirming non-ClickHouse dialects still reject the form.

Comment on lines +1901 to +1910
#[test]
fn parse_with_clause_named_expression_unsupported_in_other_dialects() {
// The named-expression form is only enabled for ClickHouse; other
// dialects should still reject `WITH 42 AS answer …`.
let res = sqlparser::parser::Parser::parse_sql(
&GenericDialect {},
"WITH 42 AS answer SELECT answer FROM t",
);
assert!(res.is_err(), "expected parse error, got {res:?}");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can skip this, no need to test what the parser does for an invalid query for unsupported dialects

}

#[test]
fn parse_with_clause_named_expression_ast() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can merge this with the test above since they cover the same feature

#[test]
fn parse_with_clause_named_expression() {
// Plain literal scalar.
clickhouse().verified_stmt("WITH 42 AS answer SELECT answer FROM t");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the tests, lets use all_dialects_where instead of hardcoding it to clickhouse

Comment thread src/parser/mod.rs
Comment on lines +14654 to +14661
// CTE form must start with an identifier. If the leading token
// can't begin one (e.g. `42`, `(SELECT …)`, `(x, y) -> …`), this
// is unambiguously the named-expression form.
if matches!(self.peek_token().token, Token::Word(_)) {
if let Some(cte) = self.maybe_parse(|p| p.parse_cte())? {
return Ok(WithItem::Cte(cte));
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be simplified to use peek_subquery_or_cte_start? e.g.

that we do if self.dialect.supports() and !self.peek_subquery_or_cte_start { parse expr } else { parse cte }

Comment thread src/parser/mod.rs
Comment on lines +14642 to +14648
/// Parse a single item in a `WITH` clause.
///
/// In standard SQL this is always a CTE (`name [(cols)] AS (query)`).
/// Dialects that enable [`Dialect::supports_with_clause_scalar_expression`]
/// — currently only ClickHouse — also accept the reversed form
/// `<expression> AS <identifier>`, which can be freely interleaved with
/// CTEs in the same comma-separated list.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Parse a single item in a `WITH` clause.
///
/// In standard SQL this is always a CTE (`name [(cols)] AS (query)`).
/// Dialects that enable [`Dialect::supports_with_clause_scalar_expression`]
/// — currently only ClickHouse — also accept the reversed form
/// `<expression> AS <identifier>`, which can be freely interleaved with
/// CTEs in the same comma-separated list.
/// Parse a single item in a `WITH` clause.

Comment thread src/ast/query.rs
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub enum WithItem {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repr wise can we do something like?

pub enum WithExpression {
    Cte(...),
    Cse(ExprWithAlias), // can we reuse exprwithalias?
}

Comment thread src/ast/query.rs
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub enum WithItem {
/// A traditional common table expression: `name [(cols)] AS [MATERIALIZED] (query)`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// A traditional common table expression: `name [(cols)] AS [MATERIALIZED] (query)`.
/// A common table expression.

Comment thread src/ast/query.rs
Comment on lines +780 to +785
/// `<expr> AS <alias>` — binds an expression (literal, scalar subquery,
/// lambda, …) to a name visible in the surrounding query.
///
/// See ClickHouse's [common scalar expressions][1].
///
/// [1]: https://clickhouse.com/docs/sql-reference/statements/select/with#common-scalar-expressions
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// `<expr> AS <alias>` — binds an expression (literal, scalar subquery,
/// lambda, …) to a name visible in the surrounding query.
///
/// See ClickHouse's [common scalar expressions][1].
///
/// [1]: https://clickhouse.com/docs/sql-reference/statements/select/with#common-scalar-expressions
/// A common scalar expression.
///
/// [Clickhouse]: https://clickhouse.com/docs/sql-reference/statements/select/with#common-scalar-expressions

Comment thread src/ast/query.rs
pub cte_tables: Vec<Cte>,
/// The items declared by this `WITH` clause: traditional CTEs and,
/// for dialects that support it, named expressions.
pub items: Vec<WithItem>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub items: Vec<WithItem>,
pub exprs: Vec<WithExpression>,

Comment thread src/ast/query.rs
Comment on lines +757 to +758
/// The items declared by this `WITH` clause: traditional CTEs and,
/// for dialects that support it, named expressions.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The items declared by this `WITH` clause: traditional CTEs and,
/// for dialects that support it, named expressions.
/// The expressions declared by this `WITH` clause.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ClickHouse: Parsing error for WITH statement using scalar expressions

2 participants