Commit

Version 0.11 - Upgrade to libpg_query 15 (#18)
* First cut of upgrading to pg15

* Fix A_Const generation

* Tweak with a_const changes

* Fixes A_Const deserialization

* Further fixes for updated structures

* Numeric column mods in table test

* Parse tests

* Update str tests

* Tweaking generation of nodes

* Further work to get constants parsing properly

* Further progress towards pg 15 update

* Compiling tests

* Fixes functions requiring bool nodes

* Alter table additions

* Switch to unsupported instead of unimplemented

* Break into features

* Readme updates

* Linting
paupino authored Jul 27, 2023
1 parent 93c1d43 commit d28a96c
Showing 18 changed files with 1,850 additions and 1,203 deletions.
175 changes: 106 additions & 69 deletions Cargo.lock

Large diffs are not rendered by default.

10 changes: 6 additions & 4 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,16 +1,18 @@
[package]
name = "pg_parse"
description = "PostgreSQL parser that uses the actual PostgreSQL server source to parse SQL queries and return the internal PostgreSQL parse tree."
version = "0.10.0"
version = "0.11.0"
authors = ["Paul Mason <[email protected]>"]
edition = "2018"
edition = "2021"
documentation = "https://docs.rs/pg_parse/"
build = "build.rs"
license = "MIT"
readme = "./README.md"
repository = "https://github.com/paupino/pg_parse"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[features]
default = ["str"]
str = []

[dependencies]
serde = { version = "1", features = ["derive"] }
@@ -21,7 +23,7 @@ regex = "1.7"
version-sync = "0.9"

[build-dependencies]
bindgen = "0.63"
bindgen = "0.66"
heck = "0.4"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
15 changes: 7 additions & 8 deletions README.md
@@ -17,7 +17,7 @@ Add the following to your `Cargo.toml`

```toml
[dependencies]
pg_parse = "0.10"
pg_parse = "0.11"
```

## Example: Parsing a query
@@ -30,23 +30,22 @@ assert!(result.is_ok());
let result = result.unwrap();
assert!(matches!(*&result[0], Node::SelectStmt(_)));

// We can also convert back to a string
// We can also convert back to a string, if the `str` feature is enabled (enabled by default).
#[cfg(feature = "str")]
assert_eq!(result[0].to_string(), "SELECT * FROM contacts");
```

## What's the difference between pg_parse and pg_query.rs?

The [`pganalyze`](https://github.com/pganalyze/) organization will maintain the official implementation called [`pg_query.rs`](https://github.com/pganalyze/pg_query.rs). This
closely resembles the name of the C library also published by the team (`libpg_query`). This implementation will use the protobuf
The [`pganalyze`](https://github.com/pganalyze/) organization maintains the official implementation: [`pg_query.rs`](https://github.com/pganalyze/pg_query.rs). This
closely resembles the name of the C library also published by the team (`libpg_query`). This implementation uses the protobuf
interface introduced with version 13 of `libpg_query`.

This library similarly consumes `libpg_query` but uses the older JSON interface to manage parsing. The intention of this library
is to maintain a dependency "light" implementation with `serde` being the only required runtime dependency. While this was originally called
`pg_query.rs` it makes sense to decouple itself from the official naming convention and go on it's own. Hence `pg_parse`.
is to maintain a dependency "light" implementation with `serde` and `serde_json` being the only required runtime dependencies.

So which one should you use? You probably want to use the official `pg_query.rs` library as that will continue to be
kept closely up to date with `libpg_query` updates. This library will continue to be maintained however may not be as up
to date as the official implementation.
kept closely up to date with `libpg_query` updates. This library will continue to be maintained; however, it may not be as up-to-date as the official implementation.

## Credits

22 changes: 21 additions & 1 deletion VERSION.md
@@ -1,7 +1,27 @@
# Version 0.11

**This release upgrades `libpg_query` and therefore contains breaking changes.**

New:
* Introduced a `parse_debug` function so that callers can inspect the raw JSON output from `libpg_query`.
This is likely to be useful mainly to library authors.

Modified:
* Updated `libpg_query` to [15-4.2.2](https://github.com/pganalyze/libpg_query/tree/15-4.2.2). This required a lot of refactoring to support the modified
AST being generated.
* String generation is now feature-gated under `str`. String generation is not yet complete,
so it should be used with caution. Please note that the `str` feature is currently enabled by default.

Other:
* Bumped the Rust edition to `2021` and updated syntax accordingly.

Please note that some syntax support has been dropped between Postgres version releases. For example,
the `?` placeholder is no longer supported. For a full list, please see the `libpg_query` changelog.

# Version 0.10

Modified:
* Updated `libpg_query` to [13.2.2](https://github.com/pganalyze/libpg_query/releases/tag/13-2.2.0).
* Updated `libpg_query` to [13-2.2.0](https://github.com/pganalyze/libpg_query/releases/tag/13-2.2.0).
* Build optimization to prevent rebuilding when no changes [#16](https://github.com/paupino/pg_parse/pull/16).

Thank you [@haileys](https://github.com/haileys) for your contribution!
45 changes: 30 additions & 15 deletions build.rs
@@ -208,10 +208,11 @@ fn make_aliases(
node_types: &HashSet<String>,
type_resolver: &mut TypeResolver,
) -> std::io::Result<()> {
const IGNORE: [&str; 5] = [
const IGNORE: [&str; 6] = [
"BlockId",
"ExpandedObjectHeader",
"Name",
"ParallelVacuumState",
"ParamListInfo",
"VacAttrStatsP",
];
@@ -354,19 +355,27 @@ fn make_nodes(
writeln!(out, " {} {{ }},", name)?;
continue;
}
// Only one
let field = &def.fields[0];

// If this is an A_Const we handle this specially
if name.eq("A_Const") {
writeln!(out, " {name}(ConstValue),")?;
continue;
}

// These may have many fields, though we may want to handle that explicitly
writeln!(out, " {} {{", name)?;
writeln!(
out,
" #[serde(rename = \"{}\")]",
field.name.as_ref().unwrap(),
)?;
writeln!(
out,
" value: {}",
type_resolver.resolve(field.c_type.as_ref().unwrap())
)?;
for field in &def.fields {
let field_name = field.name.as_ref().unwrap();
let resolved_type = type_resolver.resolve(field.c_type.as_ref().unwrap());
writeln!(out, " #[serde(default)]")?;
// We force each of these as an Option so we can be explicit about when we
// want to handle absence of a field.
if resolved_type.starts_with("Option<") {
writeln!(out, " {field_name}: {resolved_type},")?;
} else {
writeln!(out, " {field_name}: Option<{resolved_type}>,")?;
}
}
writeln!(out, " }},")?;
}

@@ -389,7 +398,7 @@ fn make_nodes(

for field in &def.fields {
let (name, c_type) = match (&field.name, &field.c_type) {
(&Some(ref name), &Some(ref c_type)) => (name, c_type),
(Some(name), Some(c_type)) => (name, c_type),
_ => continue,
};

@@ -420,7 +429,7 @@ fn make_nodes(
deserializer,
if optional { ", default" } else { "" }
)?;
} else if type_resolver.is_primitive(c_type) {
} else if type_resolver.is_optional(c_type) {
if has_data {
write!(out, ", ")?;
}
@@ -506,6 +515,7 @@ impl TypeResolver {
primitive.insert("[]Node", "Vec<Node>");
primitive.insert("Node*", "Option<Box<Node>>");
primitive.insert("Expr*", "Option<Box<Node>>");
primitive.insert("String*", "Option<String>");

// Bitmapset is defined in bitmapset.h and is roughly equivalent to a vector of u32's.
primitive.insert("Bitmapset*", "Option<Vec<u32>>");
@@ -543,10 +553,15 @@ impl TypeResolver {
self.primitive.contains_key(ty) || self.aliases.get(ty).copied().unwrap_or_default()
}

pub fn is_optional(&self, ty: &str) -> bool {
self.is_primitive(ty) || ty.ends_with('*')
}

pub fn custom_deserializer(ty: &str) -> Option<(&str, bool)> {
match ty {
"[]Node" => Some(("crate::serde::deserialize_node_array", false)),
"List*" => Some(("crate::serde::deserialize_node_array_opt", true)),
"String*" => Some(("crate::serde::deserialize_nested_string_opt", true)),
_ => None,
}
}
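
The two build-script changes above (the new `is_optional` check and the `Option` wrapping of struct-variant fields) can be sketched in isolation. This is a simplified stand-in rather than the actual build script: the `TypeResolver` here holds only two of the mappings, its `resolve` fallback is an assumption, and `render_field` is a hypothetical helper that returns the generated line instead of writing it to a file.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the build script's `TypeResolver`.
/// Only two mappings are included; the real table is much larger.
pub struct TypeResolver {
    primitive: HashMap<&'static str, &'static str>,
}

impl TypeResolver {
    pub fn new() -> Self {
        let mut primitive = HashMap::new();
        primitive.insert("Node*", "Option<Box<Node>>");
        primitive.insert("String*", "Option<String>");
        TypeResolver { primitive }
    }

    pub fn is_primitive(&self, ty: &str) -> bool {
        self.primitive.contains_key(ty)
    }

    /// Same rule as in the diff: known primitives and any C pointer type
    /// are treated as optional.
    pub fn is_optional(&self, ty: &str) -> bool {
        self.is_primitive(ty) || ty.ends_with('*')
    }

    /// Assumption for this sketch: unknown types pass through unchanged.
    pub fn resolve(&self, ty: &str) -> String {
        self.primitive
            .get(ty)
            .map(|s| s.to_string())
            .unwrap_or_else(|| ty.to_string())
    }
}

/// Hypothetical helper: renders one generated field, wrapping the type in
/// `Option` unless the resolved type is already an `Option`.
pub fn render_field(resolver: &TypeResolver, name: &str, c_type: &str) -> String {
    let resolved = resolver.resolve(c_type);
    if resolved.starts_with("Option<") {
        format!("{name}: {resolved},")
    } else {
        format!("{name}: Option<{resolved}>,")
    }
}
```

With this sketch, `render_field(&TypeResolver::new(), "sval", "String*")` yields `sval: Option<String>,` without double wrapping, while a non-optional type such as `bool` is emitted as `Option<bool>`.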
2 changes: 1 addition & 1 deletion lib/libpg_query
Submodule libpg_query updated 666 files
27 changes: 27 additions & 0 deletions src/ast.rs
@@ -4,6 +4,8 @@
#![allow(unused)]
#![allow(clippy::all)]

use serde::Deserializer;

// Type aliases
pub type bits32 = u32;

@@ -13,6 +15,31 @@ include!(concat!(env!("OUT_DIR"), "/ast.rs"));
#[derive(Debug, serde::Deserialize)]
pub struct Value(pub Node);

#[derive(Debug, Clone, PartialEq)]
pub enum ConstValue {
Bool(bool),
Integer(i64),
Float(String),
String(String),
BitString(String),
Null,
NotNull,
}

impl ConstValue {
pub fn name(&self) -> &'static str {
match self {
ConstValue::Bool(_) => "ConstBool",
ConstValue::Integer(_) => "ConstInteger",
ConstValue::Float(_) => "ConstFloat",
ConstValue::String(_) => "ConstString",
ConstValue::BitString(_) => "ConstBitString",
ConstValue::Null => "ConstNull",
ConstValue::NotNull => "ConstNotNull",
}
}
}

impl Value {
pub fn inner(&self) -> &Node {
&self.0
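
As a rough illustration of how the new `ConstValue` enum might be consumed, the sketch below matches on each variant to produce a SQL literal. The enum is copied from the diff; `to_sql_literal` is a hypothetical helper and not part of `pg_parse`, and passing `BitString` through unchanged is an assumption about how that value is stored.

```rust
/// Copied from the diff above.
#[derive(Debug, Clone, PartialEq)]
pub enum ConstValue {
    Bool(bool),
    Integer(i64),
    Float(String),
    String(String),
    BitString(String),
    Null,
    NotNull,
}

/// Hypothetical helper (not part of pg_parse): renders a constant as a SQL literal.
pub fn to_sql_literal(value: &ConstValue) -> String {
    match value {
        ConstValue::Bool(true) => "true".to_string(),
        ConstValue::Bool(false) => "false".to_string(),
        ConstValue::Integer(i) => i.to_string(),
        // The variant carries the float as a string, so the text is kept as-is.
        ConstValue::Float(f) => f.clone(),
        // Escape embedded single quotes by doubling them.
        ConstValue::String(s) => format!("'{}'", s.replace('\'', "''")),
        // Assumption: the bit string is already in a printable form.
        ConstValue::BitString(b) => b.clone(),
        ConstValue::Null => "NULL".to_string(),
        ConstValue::NotNull => "NOT NULL".to_string(),
    }
}
```

Because the match has no wildcard arm, adding a variant to `ConstValue` later would make this helper fail to compile until the new case is handled.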
4 changes: 4 additions & 0 deletions src/error.rs
@@ -5,6 +5,7 @@ use std::fmt::{Display, Formatter};
pub enum Error {
ParseError(String),
InvalidAst(String),
InvalidAstWithDebug(String, String),
InvalidJson(String),
}

@@ -13,6 +14,9 @@ impl Display for Error {
match self {
Error::ParseError(value) => write!(f, "Parse Error: {}", value),
Error::InvalidAst(value) => write!(f, "Invalid AST: {}", value),
Error::InvalidAstWithDebug(value, debug) => {
write!(f, "Invalid AST: {}. Debug: {}", value, debug)
}
Error::InvalidJson(value) => write!(f, "Invalid JSON: {}", value),
}
}
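
To show what the new `InvalidAstWithDebug` variant gives a caller, here is a self-contained sketch. The enum and `Display` implementation follow the diff; the `#[derive(Debug)]` is an assumption (the derive line is outside the hunk), and `debug_payload` is a hypothetical helper that is not part of the crate.

```rust
use std::fmt::{Display, Formatter};

/// Mirrors the enum in the diff; the derive is an assumption.
#[derive(Debug)]
pub enum Error {
    ParseError(String),
    InvalidAst(String),
    InvalidAstWithDebug(String, String),
    InvalidJson(String),
}

impl Display for Error {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        match self {
            Error::ParseError(value) => write!(f, "Parse Error: {}", value),
            Error::InvalidAst(value) => write!(f, "Invalid AST: {}", value),
            Error::InvalidAstWithDebug(value, debug) => {
                write!(f, "Invalid AST: {}. Debug: {}", value, debug)
            }
            Error::InvalidJson(value) => write!(f, "Invalid JSON: {}", value),
        }
    }
}

/// Hypothetical helper: returns the raw parser output attached to an error, if any.
pub fn debug_payload(err: &Error) -> Option<&str> {
    match err {
        Error::InvalidAstWithDebug(_, debug) => Some(debug.as_str()),
        _ => None,
    }
}
```

A caller could log `err.to_string()` for the combined message, or use something like `debug_payload` to store the raw JSON separately when it is large.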
4 changes: 3 additions & 1 deletion src/lib.rs
@@ -25,7 +25,8 @@
//! let result = result.unwrap();
//! assert!(matches!(*&result[0], Node::SelectStmt(_)));
//!
//! // We can also convert back to a string
//! // We can also convert back to a string, if the `str` feature is enabled (enabled by default).
//! #[cfg(feature = "str")]
//! assert_eq!(result[0].to_string(), "SELECT * FROM contacts");
//! ```
//!
@@ -36,6 +37,7 @@ mod bindings;
mod error;
mod query;
mod serde;
#[cfg(feature = "str")]
mod str;

pub use error::*;
40 changes: 39 additions & 1 deletion src/query.rs
@@ -60,7 +60,45 @@ pub fn parse(stmt: &str) -> Result<Vec<crate::ast::Node>> {
}
}

/// Normalizes the given SQL statement, returning a parametized version.
/// Similar to `parse`: parses the given SQL statement into an abstract syntax tree,
/// but also returns the raw JSON output generated by the Postgres parser.
///
/// # Example
///
/// ```rust
/// use pg_parse::ast::Node;
///
/// let result = pg_parse::parse_debug("SELECT * FROM contacts");
/// assert!(result.is_ok());
/// let (stmt, raw) = result.unwrap();
/// let el: &Node = &stmt[0];
/// assert!(matches!(*el, Node::SelectStmt(_)));
/// assert!(raw.contains("\"SelectStmt\""));
/// ```
pub fn parse_debug(stmt: &str) -> Result<(Vec<crate::ast::Node>, String)> {
unsafe {
let c_str = CString::new(stmt).unwrap();
let result = pg_query_parse(c_str.as_ptr() as *const c_char);

// Capture any errors first
if !result.error.is_null() {
let error = &*result.error;
let message = CStr::from_ptr(error.message).to_string_lossy().into();
pg_query_free_parse_result(result);
return Err(Error::ParseError(message));
}

// Parse the JSON into the AST
let raw = CStr::from_ptr(result.parse_tree);
let debug = raw.to_string_lossy().to_string();
let parsed: Result<ParseResult> = serde_json::from_slice(raw.to_bytes())
.map_err(|e| Error::InvalidAstWithDebug(e.to_string(), debug.clone()));
// Free the C result before propagating any error, so a failed deserialization does not leak it
pg_query_free_parse_result(result);
let parsed = parsed?;
Ok((parsed.stmts.into_iter().map(|s| s.stmt).collect(), debug))
}
}

/// Normalizes the given SQL statement, returning a parameterized version.
///
/// # Example
///
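
Functions such as `parse_debug` have to release the C parse result on every exit path, including early returns. One common way to do that in Rust is a guard type whose `Drop` implementation performs the cleanup. The sketch below illustrates that pattern with stand-in types only: `RawResult`, `ResultGuard`, and `decode` are hypothetical, and a counter takes the place of the real `pg_query_free_parse_result` call. It is not how the crate is written.

```rust
use std::cell::Cell;

/// Stand-in for a C-allocated parse result (hypothetical; the real type
/// comes from the generated bindings).
pub struct RawResult<'a> {
    pub json: &'a str,
    freed: &'a Cell<u32>,
}

/// Guard that releases the raw result when dropped, including on early returns.
pub struct ResultGuard<'a>(RawResult<'a>);

impl Drop for ResultGuard<'_> {
    fn drop(&mut self) {
        // Real code would call the C free function here; this sketch only
        // counts how many times cleanup ran.
        self.0.freed.set(self.0.freed.get() + 1);
    }
}

/// Hypothetical decoding step that fails on empty input and otherwise
/// returns the length of the payload.
pub fn decode(json: &str, freed: &Cell<u32>) -> Result<usize, String> {
    let guard = ResultGuard(RawResult { json, freed });
    if guard.0.json.is_empty() {
        // Early return: the guard is dropped here, so cleanup still runs.
        return Err("empty parse tree".to_string());
    }
    // Normal return: the guard is dropped at the end of the function.
    Ok(guard.0.json.len())
}
```

Calling `decode("", &counter)` and then `decode("{}", &counter)` leaves the counter at 2, showing that cleanup ran once on the error path and once on the success path.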