fix: remove legacy bnf

2026-03-12 22:24:52 -07:00
parent 31a6c8b91b
commit 944326f114
8 changed files with 46 additions and 1997 deletions

@@ -12,42 +12,24 @@ Language reference: https://www.ibm.com/docs/en/i/7.5.0?topic=introduction-overv
cargo build --release
```
### Running
The compiler ships as a standalone binary that loads the embedded BNF grammar, builds a parser, and runs a suite of RPG IV snippet examples to demonstrate the grammar in action:
### Compiling an RPG IV program
```rust-langrpg/README.md
cargo run --bin demo
cargo run --release -- -o hello hello.rpg
./hello
```
You will see output similar to:
```rust-langrpg/README.md
=== RPG IV Free-Format Parser ===
[grammar] Loaded successfully.
[parser] Built successfully (all non-terminals resolved).
=== Parsing Examples ===
┌─ simple identifier (identifier) ─────────────────────
│ source : "myVar"
│ result : OK
└──────────────────────────────────────────────
...
=== Summary ===
total : 42
matched : 42
failed : 0
All examples parsed successfully.
DSPLY Hello, World!
```
### Hello World in RPG IV
The following is a complete Hello World program written in RPG IV free-format syntax, as understood by this parser:
The following is a complete Hello World program written in RPG IV free-format syntax:
hello.rpg:
`hello.rpg`:
```rust-langrpg/README.md
CTL-OPT DFTACTGRP(*NO);
@@ -68,10 +50,22 @@ Breaking it down:
- `DSPLY greeting;` — displays the value of `greeting` to the operator message queue
- `RETURN;` — returns from the procedure
To validate this program, execute the compiler to build the data:
### Compiler options
```sh
cargo run --release -- -o main hello.rpg
```
rust-langrpg [OPTIONS] <SOURCES>...
Arguments:
<SOURCES>... RPG IV source file(s) to compile
Options:
-o <OUTPUT> Output executable path [default: a.out]
--emit-ir Print LLVM IR to stdout instead of producing a binary
-O <LEVEL> Optimisation level 0-3 [default: 0]
--no-link Produce a .o object file, skip linking
--runtime <PATH> Path to librpgrt.so [default: auto-detect]
-h, --help Print help
-V, --version Print version
```
## Architecture
@@ -90,38 +84,28 @@ RPG IV source (.rpg)
┌─────────────────────────────────────────┐
│ 1. BNF validation (bnf crate) │
│ src/rpg.bnf — embedded at compile │
│ time via include_str! │
└────────────────┬────────────────────────┘
│ parse tree (validation only)
┌─────────────────────────────────────────┐
│ 2. Lowering pass (src/lower.rs) │
│ Hand-written recursive-descent │
│ tokenizer + parser → typed AST │
│ 1. Parsing + lowering (src/lower.rs) │
│ Hand-written tokenizer + │
│ recursive-descent parser │
│ → typed AST (src/ast.rs) │
└────────────────┬────────────────────────┘
│ ast::Program
┌─────────────────────────────────────────┐
│ 3. LLVM code generation (src/codegen.rs│
│ 2. LLVM code generation (src/codegen.rs│
│ inkwell bindings → LLVM IR module │
└────────────────┬────────────────────────┘
│ .o object file
┌─────────────────────────────────────────┐
│ 4. Linking (cc + librpgrt.so) │
│ 3. Linking (cc + librpgrt.so) │
│ Produces a standalone Linux ELF │
└─────────────────────────────────────────┘
```
### Stage 1 — BNF validation (`src/rpg.bnf` + `bnf` crate)
### Stage 1 — Parsing and lowering to a typed AST (`src/lower.rs`)
The RPG IV free-format grammar is encoded in BNF notation in `src/rpg.bnf` and embedded at compile time with `include_str!`. At startup the compiler parses the grammar with the [`bnf`](https://docs.rs/bnf/latest/bnf/) crate to build a `GrammarParser`. Each source file is validated against the top-level `<program>` rule before any further processing. This stage acts as a gate: malformed source is rejected early with a clear parse error.
### Stage 2 — Lowering to a typed AST (`src/lower.rs`)
The BNF parser only validates structure; it does not produce a typed tree suitable for code generation. A hand-written tokenizer and recursive-descent parser in `lower.rs` converts the raw source text into the typed `Program` AST defined in `src/ast.rs`.
A hand-written tokenizer and recursive-descent parser converts the raw source text directly into the typed `Program` AST defined in `src/ast.rs`. RPG IV keywords are case-insensitive and the parser handles mixed-case source naturally.
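As a rough illustration of case-insensitive keyword handling, the sketch below normalises keywords to upper case while leaving identifiers as written. It is not the code in `src/lower.rs`: the `Token` enum, the `KEYWORDS` table, and the `tokenize` function are hypothetical names chosen for this example, and treating `-` as a word character is a simplification that would not survive real arithmetic expressions.

```rust
/// Hypothetical token type for this sketch; the real lexer may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    /// A keyword, stored upper-cased (e.g. "DSPLY").
    Keyword(String),
    /// An identifier, stored exactly as written in the source.
    Ident(String),
    Semicolon,
}

/// Illustrative subset of RPG IV free-format keywords (assumed list).
const KEYWORDS: &[&str] = &["CTL-OPT", "DCL-S", "DSPLY", "RETURN"];

/// Split source text into tokens, matching keywords case-insensitively.
pub fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c == ';' {
            chars.next();
            tokens.push(Token::Semicolon);
        } else if c.is_ascii_alphabetic() {
            // Collect one word; '-' is allowed so that "DCL-S" stays whole.
            let mut word = String::new();
            while let Some(&w) = chars.peek() {
                if w.is_ascii_alphanumeric() || w == '-' || w == '_' {
                    word.push(w);
                    chars.next();
                } else {
                    break;
                }
            }
            if KEYWORDS.iter().any(|k| k.eq_ignore_ascii_case(&word)) {
                tokens.push(Token::Keyword(word.to_ascii_uppercase()));
            } else {
                tokens.push(Token::Ident(word));
            }
        } else {
            // Whitespace and anything else is skipped in this sketch.
            chars.next();
        }
    }
    tokens
}
```

With this approach `dsply`, `Dsply`, and `DSPLY` all yield the same `Token::Keyword("DSPLY")`, so later parsing stages only ever compare against one spelling.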
The AST covers the full language surface that the compiler handles:
@@ -133,7 +117,7 @@ The AST covers the full language surface that the compiler handles:
Unrecognised constructs produce `Statement::Unimplemented` or placeholder declaration variants rather than hard errors, so the compiler continues to lower the parts it understands.
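The fallback behaviour can be pictured with a minimal sketch. Only the variant name `Statement::Unimplemented` comes from the text above; its `String` payload, the other variants, and the `lower_statement` / `lower_program` helpers are assumptions made for illustration and are much simpler than a real recursive-descent parser.

```rust
/// Hypothetical, heavily reduced statement type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Statement {
    Dsply(String),
    Return,
    /// Holds the raw text of a statement the sketch does not understand
    /// (the payload type is an assumption).
    Unimplemented(String),
}

/// Lower one statement (without its trailing ';').
/// Unknown constructs fall through to `Unimplemented` instead of failing.
pub fn lower_statement(text: &str) -> Statement {
    let text = text.trim();
    let mut parts = text.splitn(2, char::is_whitespace);
    // Upper-case the leading word so matching is case-insensitive.
    let head = parts.next().unwrap_or("").to_ascii_uppercase();
    let rest = parts.next().unwrap_or("").trim();
    match head.as_str() {
        "DSPLY" if !rest.is_empty() => Statement::Dsply(rest.to_string()),
        "RETURN" if rest.is_empty() => Statement::Return,
        _ => Statement::Unimplemented(text.to_string()),
    }
}

/// Lower a whole source string by splitting on ';'.
pub fn lower_program(src: &str) -> Vec<Statement> {
    src.split(';')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(lower_statement)
        .collect()
}
```

Because the unknown statement is kept as data rather than raised as an error, the statements before and after it are still lowered, and a later stage can decide whether to warn or abort.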
### Stage 3 — LLVM code generation (`src/codegen.rs`)
### Stage 2 — LLVM code generation (`src/codegen.rs`)
The typed `Program` is handed to the code generator, which uses [`inkwell`](https://crates.io/crates/inkwell) (safe Rust bindings to LLVM 21) to build an LLVM IR module:
@@ -147,7 +131,7 @@ The typed `Program` is handed to the code generator, which uses [`inkwell`](http
The module is then compiled to a native `.o` object file for the host target via LLVM's target machine API, with optional optimisation passes (`-O0` through `-O3`).
### Stage 4 — Linking
### Stage 3 — Linking
The object file is linked into a standalone ELF executable by invoking the system C compiler (`cc`). The executable is linked against `librpgrt.so`.
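A minimal sketch of what such a linker invocation might look like with `std::process::Command` is shown below. The `link_command` helper and its parameters are hypothetical, and the exact flags the compiler passes to `cc` are not stated here; `-o`, `-L`, and `-l` are the standard C-compiler options for the output path, a library search directory, and a library name.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) a `cc` command that links `object` against
/// librpgrt.so found in `runtime_dir`. Hypothetical helper for illustration.
pub fn link_command(object: &Path, output: &Path, runtime_dir: &Path) -> Command {
    let mut cmd = Command::new("cc");
    cmd.arg(object)
        .arg("-o")
        .arg(output)
        // Tell the linker where to look for librpgrt.so ...
        .arg("-L")
        .arg(runtime_dir)
        // ... and ask it to link against that library.
        .arg("-lrpgrt");
    cmd
}
```

A caller would then run `link_command(...).status()` and treat a non-zero exit status as a link failure. Note that `-L` only affects link time; finding `librpgrt.so` at run time needs a separate mechanism such as an rpath or `LD_LIBRARY_PATH`, which this sketch leaves out.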
@@ -180,17 +164,14 @@ DSPLY Hello, World!
```
rust-langrpg/
├── src/
│ ├── rpg.bnf — RPG IV free-format BNF grammar (embedded at compile time)
│ ├── lib.rs — Grammar loader and demo helpers
│ ├── lib.rs — Library root (re-exports ast, lower, codegen)
│ ├── ast.rs — Typed AST node definitions
│ ├── lower.rs — Tokenizer + recursive-descent lowering pass
│ ├── lower.rs — Tokenizer + recursive-descent parser + lowering pass
│ ├── codegen.rs — LLVM IR code generation (inkwell)
│ ├── main.rs — Compiler CLI (clap) + linker invocation
│ └── bin/
│ └── demo.rs — Grammar demo binary
│ └── main.rs — Compiler CLI (clap) + linker invocation
├── rpgrt/
│ └── src/
│ └── lib.rs — Runtime library (librpgrt.so)
├── hello.rpg — Hello World example program
└── count.rpg — Counting loop example program
└── fib.rpg — Fibonacci sequence example program
```