Report Generator in Rust

Developing a Backend App in Rust

In this blog post, we will see how easy it is to build a backend application in Rust using a couple of community libraries.

To make it more practical, we are going to implement a service which converts an HTML report into a PDF file.

Rust Programming Language

Before we begin, let’s briefly look at Rust from a programming language perspective. Here are some of its main characteristics:

  1. System programming language
  2. Statically typed language
  3. Focused on performance and stability
  4. No garbage collector and only a minimal runtime
  5. Safe concurrency through compiler checks, no data races
  6. Memory safety is enforced at compile time through
    ownership, the borrow checker, lifetimes, the separation of mutable and immutable references, etc.
  7. Originally sponsored by Mozilla and developed in the open on GitHub

Application Requirements

Here is what needs to be covered by our Rust implementation:

  1. An HTTP endpoint to send dynamic user data and get a PDF report back
  2. The report template must be stored on disk, so that one can modify it when needed
  3. The template must be in HTML format
  4. The template must support transformation of input data before it is rendered into the final PDF

Solution

In general, our solution will be based on the following Rust crates [1]:

  1. An HTTP server built with the Rocket web framework
  2. A template engine, handlebars-rust
  3. The wkhtmltopdf command-line tool to generate the PDF (not a crate)
  4. Utility crates for logging, configuration via environment variables and UUID generation

Implementation

A common approach when developing an application in Rust is to use its package manager, Cargo. It downloads dependencies, compiles packages, publishes packages to a central crate registry for collaboration, and much more. Cargo has to be installed first; here is the link to the installation guide: Installation.

A Cargo project is configured via a TOML file, which describes project dependencies and more. It also has a convenient command-line option to create a draft package, which we are going to use as well. Run the following command in your terminal:

cargo new pdf-generator

It should create a new folder with a predefined file structure as well as the Cargo.toml file. We are going to replace the auto-generated Cargo.toml file with the content below.

Cargo.toml:

[package]
name = "pdf-generator"
version = "0.1.0"

[dependencies]
rocket = "0.4.0"
rocket_codegen = "0.4.0"
rocket_contrib = { version = "0.4.0", features = ["json"]}

handlebars = "1.1.0"

serde = "1.0.79"
serde_json = "*"
serde_derive = "*"

dotenv = "0.13.0"

uuid = {version = "0.7.1", features = ["serde", "v4"]}

log = "0.4.5"
env_logger = "0.5.13"

The Cargo.toml file lists the crates, together with their versions, that we are going to use to implement the application.

Workflow

The service workflow, which we are going to implement, can be described as a synchronous call over HTTP and looks like this:

HTTP Route -> Service -> Template Engine -> wkhtmltopdf -> PDF

The PDF is sent back with an application/pdf HTTP header value.

HTTP endpoint

First, we define a Rocket route at the URI “/generate” and mount it into the Rocket framework.

src/routes.rs:

// here come package and crate imports 

#[post("/generate", format = "application/json", data = "<req>")]
pub fn generate_report(service: State<ReportService>, req: Json<GetReport>)
                       -> Result<NamedFile, BadRequest<String>> {
    let params = req.0.user_params;
    let report = service.render(&req.0.template_name, params);

    report
        .map_err(|e| 
            BadRequest(Some(format!("Failed to generate report: {:?}", e))))
        .and_then(|path| 
            NamedFile::open(path).map_err(|e| BadRequest(Some(e.to_string()))))
}

pub fn mount_routes(service: ReportService) -> Rocket {
    rocket::ignite()
        .manage(service)
        .mount(
            "/api/v1",
            routes![generate_report],
        )
}

Later on, we will launch the Rocket server using the mount_routes function.

Note that we omit module import statements everywhere (i.e. use/extern) to focus on the main package code. Please go to the source code repository for these details.

Macros

You have probably noticed the special syntax in the above code snippet, such as ! and #. These symbols come from the Rust macros feature.

The #[...] syntax marks an attribute. Here, post is an attribute-like macro, which the compiler expands into additional Rust code. For the route above, Rocket generates code to handle HTTP requests according to parameters such as the URI, the body format and the name of the variable that the payload is bound to.

The ! suffix marks the invocation of a function-like macro. Examples above are routes! and format!.

To read more on the Rust macros feature: The Rust Programming Language -> Macros
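
As a small illustration that is independent of Rocket, the sketch below defines its own function-like macro with macro_rules! and uses the built-in #[derive(Debug)] attribute. The names Point, max_of, largest_coordinate and describe are made up for this example and are not part of the pdf-generator code.

```rust
// `#[derive(Debug)]` is an attribute: the compiler generates the
// `Debug` implementation for `Point` at compile time.
#[derive(Debug)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// A function-like macro (hypothetical, for illustration only).
// It accepts one or more expressions and expands into nested
// `std::cmp::max` calls.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => { std::cmp::max($x, max_of!($($rest),+)) };
}

pub fn largest_coordinate(p: &Point) -> i32 {
    // Expands to `std::cmp::max(p.x, p.y)`.
    max_of!(p.x, p.y)
}

pub fn describe(p: &Point) -> String {
    // `format!` is a function-like macro from the standard library;
    // `{:?}` uses the derived `Debug` implementation.
    format!("{:?}", p)
}
```

Both kinds of macros are expanded at compile time, so largest_coordinate compiles down to a plain std::cmp::max call.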

Input Parameters

There are two parameters in the generate_report function. The first one is a Rocket State wrapper, which is how a handler accesses in-memory application state managed by Rocket. The second parameter contains the user data for the report. In the above case, it comes from the JSON payload of the HTTP body, which is decoded into the GetReport struct, defined as follows:

#[derive(Deserialize)]
pub struct GetReport {
    template_name: String,
    user_params: JsonValue,
}

The second field is deserialized into a generic JSON value, which covers the case of dynamic user data. It does not make sense to parse it into a specific struct, because its shape is not known beforehand. Basically, user_params is the dynamic user data from requirement #1. We will pass it as JSON to the template engine, which renders it using handlebars-rust.
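
To make the idea of a dynamic value more tangible, here is a simplified sketch of such a type that uses only the standard library. It is a stand-in for illustration, not the actual JSON value type used by the service; the names Value, get and sample_params are assumptions of this example.

```rust
use std::collections::BTreeMap;

// Simplified stand-in for a dynamic JSON value. Every node of the
// tree decides at run time which kind of data it holds.
#[derive(Debug, PartialEq)]
pub enum Value {
    Null,
    Number(f64),
    Text(String),
    List(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

impl Value {
    // Looks up a key if this value is an object; returns None for
    // missing keys and for every other kind of value.
    pub fn get(&self, key: &str) -> Option<&Value> {
        match self {
            Value::Object(fields) => fields.get(key),
            _ => None,
        }
    }
}

// Builds a small object by hand (hypothetical sample data).
pub fn sample_params() -> Value {
    let mut fields = BTreeMap::new();
    fields.insert("customer_name".to_string(), Value::Text("Frank Smith".to_string()));
    fields.insert("ordered_books".to_string(), Value::List(Vec::new()));
    Value::Object(fields)
}
```

Because the shape is only checked when a field is looked up, the same Rust code can carry the data for any template.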

Rendering

ReportService contains the main service logic.

src/service.rs:

pub struct ReportService {
    template_engine: TemplateEngine,
    work_dir: String,
}

There is a render function, which will be called from the HTTP route. Here are the functions of impl ReportService:

pub fn render<T>(&self, template_name: &str, data: T) -> Result<PdfPath, RenderingError>
where
    T: Serialize + std::fmt::Debug,
{
    debug!("rendering report for data {:?}", &data);
    // Render the Handlebars template into an HTML string.
    let html = self.template_engine.render(template_name, &data)
        .map_err(|e| RenderingError(format!("Failed to render, error: {:?}", e)))?;

    let destination_pdf = self.dest_name(template_name);

    debug!("destination PDF {}", &destination_pdf);
    // Convert the HTML into a PDF file by running wkhtmltopdf.
    let output = ReportService::run_blocking(html, &destination_pdf)?;

    debug!("status: {}", output.status);
    debug!("stdout: {}", String::from_utf8_lossy(&output.stdout));
    debug!("stderr: {}", String::from_utf8_lossy(&output.stderr));

    if output.status.success() {
        Ok(destination_pdf)
    } else {
        Err(RenderingError(
            format!("Failed to render template: {:?}", template_name))
        )
    }
}

Eventually, it runs an OS command to execute the wkhtmltopdf tool, passing the rendered HTML as bytes to its standard input:

fn run_blocking(html: String, destination_pdf: &str) -> Result<Output, RenderingError> {
    // Spawn wkhtmltopdf with a piped stdin, so that we can feed it the HTML.
    let mut child = Command::new(WKHTMLTOPDF_CMD)
        .stdin(Stdio::piped())
        .stdout(Stdio::null())
        .arg(USE_STDIN_MARKER)
        .arg(destination_pdf)
        .spawn()
        .map_err(|e|
            RenderingError(format!("Failed to spawn child process: {}", e)))?;
    {
        let stdin = child.stdin.as_mut()
            .ok_or_else(|| RenderingError("Failed to open stdin".to_string()))?;

        stdin.write_all(html.as_bytes())
            .map_err(|e|
                RenderingError(format!("Failed to write HTML into stdin, error: {}", e)))?;
    }

    // wait_with_output closes stdin before waiting for the process to exit.
    let output = child.wait_with_output()
        .map_err(|e|
            RenderingError(format!("Failed to read stdout, error: {}", e)))?;

    Ok(output)
}

Note that we use the ? postfix operator throughout the package code. Applied to a Result, it returns early from the enclosing function with the error if the value is Err, and unwraps the inner value if it is Ok. In other words, ? short-circuits error handling by passing the error straight back to the caller; in the non-error case, the program simply continues with the unwrapped value.
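
Here is a minimal, standard-library-only sketch of the same pattern. The ConfigError type and the parse_port function are hypothetical and exist only for this illustration.

```rust
// A simple error type wrapping a message (hypothetical).
#[derive(Debug, PartialEq)]
pub struct ConfigError(pub String);

// Extracts the port from a "host:port" string.
pub fn parse_port(address: &str) -> Result<u16, ConfigError> {
    // `ok_or_else` turns the Option into a Result; `?` returns early
    // with the error if there is no text after the colon.
    let port_text = address
        .split(':')
        .nth(1)
        .ok_or_else(|| ConfigError(format!("No port in {:?}", address)))?;

    // `map_err` converts the parse error into our own error type;
    // `?` either unwraps the number or returns the error to the caller.
    let port = port_text
        .parse::<u16>()
        .map_err(|e| ConfigError(format!("Invalid port {:?}: {}", port_text, e)))?;

    Ok(port)
}
```

Without ?, each of the two steps would need its own match expression with an explicit return in the Err arm.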

Here is a function to generate the file name of the final PDF report. The name will be unique:

fn dest_name(&self, template_name: &str) -> PdfPath {
    format!("{}/{}-{}.pdf", self.work_dir, Uuid::new_v4(), template_name)
}
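
If you want to experiment with the naming scheme without the uuid crate, the following sketch swaps the random UUID for a timestamp plus a process-wide counter. This is a different technique from the one used above: it only guarantees distinct names within a single running process. The function name dest_name_sketch is an assumption of this example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Process-wide sequence number; every call gets a different value.
static COUNTER: AtomicU64 = AtomicU64::new(0);

// Builds "<work_dir>/<nanos>-<seq>-<template_name>.pdf".
pub fn dest_name_sketch(work_dir: &str, template_name: &str) -> String {
    // Nanoseconds since the Unix epoch; falls back to 0 if the system
    // clock is set before the epoch.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("{}/{}-{}-{}.pdf", work_dir, nanos, seq, template_name)
}
```

A random UUID remains the better choice when several service instances write into the same directory.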

Template Engine

We abstract the Handlebars crate by wrapping it into a separate struct that renders HTML templates in memory. The output of TemplateEngine is HTML text, which is used by ReportService.

src/templates.rs:

pub struct TemplateEngine {
    handlebars: Handlebars
}

#[derive(Debug)]
pub struct TemplateError(String);

Implementation:

impl TemplateEngine {
    pub fn new() -> Result<Self, TemplateError> {
        let handlebars = TemplateEngine::init_template_engine()?;
        Ok(TemplateEngine { handlebars })
    }
    
    fn init_template_engine() -> Result<Handlebars, TemplateError> {
        let mut handlebars = Handlebars::new();
        let path = Path::new("./templates");
        handlebars
            .register_templates_directory(".html", path)
            .map_err(|e| 
                TemplateError(format!("Failed to register templates dir {}", e)))?;       

        Ok(handlebars)
    }
    
    pub fn render<T>(&self, template_name: &str, data: T)
                     -> Result<String, TemplateError> 
                     where T: Serialize + std::fmt::Debug {
        debug!("render template: {:?}", template_name);
        self.handlebars.render(&template_name, &data)
            .map_err(|e| TemplateError(e.to_string()))
    }
}

As we can see, the render function of TemplateEngine takes a generic data type T, which must implement the traits Serialize and Debug. In our case, T is going to be a JSON value by the time the handlebars.render function is called.
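
Trait bounds like these can be tried out with the standard library alone. In the sketch below, Display stands in for Serialize (which comes from the serde crate), and the function name label is made up for this example.

```rust
use std::fmt::{Debug, Display};

// Accepts any type that can be printed both for users (`Display`)
// and for developers (`Debug`). The compiler rejects calls with
// types that do not implement both traits.
pub fn label<T>(name: &str, value: T) -> String
where
    T: Display + Debug,
{
    format!("{} = {} (debug: {:?})", name, value, value)
}
```

The function body may only use what the bounds promise, which is why the debug! call in render requires the Debug bound.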

Launch Application

Now, we have all the pieces to start the application with the Rocket framework. Here are our main.rs functions:

fn main() {
    init_logger();
    info!("Starting pdf-generator...");

    match ReportService::new() {
        Ok(s) => {
            let error = mount_routes(s).launch();
            drop(error);
        }
        Err(e) => {
            error!("Failed to start pdf-generator service, error: {:?}", e);
            // Format the error explicitly; a bare `panic!(e)` is rejected by newer Rust editions.
            panic!("{:?}", e)
        }
    }
}

fn init_logger() {
    let mut builder = Builder::new();
    builder.target(Target::Stdout);
    env::var("RUST_LOG").iter().for_each(|s| { builder.parse(s.as_str()); });
    builder.init();
}

The main function creates a new ReportService. If it is initialised successfully, then the .launch() function is called.

At this point, the main application thread is blocked by Rocket, which is waiting to handle incoming requests.

In order to launch the report generator locally, let’s use the Cargo build tool. Its run command builds the application binary in debug (unoptimized) mode and executes it:

cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.24s
     Running `target/debug/pdf-generator`
 INFO 2019-01-09T20:35:32Z: pdf_generator: Starting pdf-generator... 
 INFO 2019-01-09T20:35:32Z: pdf_generator::templates: Number of registered templates: 1
 INFO 2019-01-09T20:35:32Z: pdf_generator::templates: Templates registered:
 INFO 2019-01-09T20:35:32Z: pdf_generator::templates: book-order-report

As you might have noticed, there is one template registered, which I prepared to test the service. This template is an HTML file with embedded Handlebars expressions. Let’s look at the important part of the original HTML file:

<p class="p1"><span class="s1"><b>Book Order</b></span></p>
<p class="p2"><span class="s1"></span><br></p>
<p class="p1"><span class="s1">Customer Name: <b>{{customer_name}}</b></span></p>
.....
<table cellspacing="0" cellpadding="0" class="t1">
  <tbody>
    <tr>
      <td valign="top" class="td1">
        <p class="p3"><span class="s1"><b>Book</b></span></p>
      </td>
      <td valign="top" class="td2">
        <p class="p4"><span class="s1"><b>Amount, EUR</b></span></p>
      </td>
    </tr>

    {{#each ordered_books as |b| ~}}
    <tr>
      <td valign="top" class="td3">
        <p class="p5"><span class="s1"><b>{{b.book_name}}</b></span></p>
      </td>
      <td valign="top" class="td4">
        <p class="p6"><span class="s1">{{b.amount}}</span></p>
      </td>
    </tr>
    {{/each~}}

    ...
  </tbody>
</table>

Handlebars code is written inside double curly braces {{ }}. It supports loops and a few other programming constructs, which is enough to create a simple template. This template renders a hypothetical customer order at a book store. There are a couple of customer fields, like name and address, as well as an array of the books they purchased.

The above template prints:

  • simple user fields in HTML paragraphs <p>
  • the array of books inside an HTML table, one table row <tr> per book
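
To get an intuition for what the template engine does with simple fields such as {{customer_name}}, here is a deliberately naive sketch that uses only the standard library. It is not how handlebars-rust works internally: it supports neither loops nor helpers nor HTML escaping, and render_simple is a name made up for this example.

```rust
use std::collections::HashMap;

// Toy renderer: replaces every `{{key}}` with the matching value.
// Placeholders without a matching key are left untouched.
pub fn render_simple(template: &str, values: &HashMap<&str, &str>) -> String {
    let mut html = template.to_string();
    for (key, value) in values {
        // `{{{{` and `}}}}` are escaped braces, so this yields "{{key}}".
        let placeholder = format!("{{{{{}}}}}", key);
        html = html.replace(&placeholder, value);
    }
    html
}
```

A real template engine parses the template once and then walks the data, which is what makes constructs like each possible.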

Acceptance Test

The idea of a pdf-generator is that the HTML template is designed for a specific user data JSON structure. If the user data structure is changed, then the HTML template needs to be adjusted accordingly. However, the application Rust code stays the same.

In order to print something meaningful based on the HTML template above, we need to send the following JSON structure in an HTTP POST request:

{
  "template_name": "book-order-report",
  "user_params": {
    "customer_name": "Frank Smith",
    "address": "Frankfurt am Main, Mainzer str. 100",
    "ordered_books": [
      {
        "book_name": "Getting Things Done: The Art of Stress-Free Productivity. Authors ...",
        "amount": 9.51
      },
      {
        "book_name": "Funky Business - Talent Makes Capital Dance. Authors ...",
        "amount": 14.99
      },
      {
        "book_name": "The Rust Programming Language (Manga Guide). Authors ...",
        "amount": 23.99
      }
    ],
    "total": 48.49
  }
}

Finally, the result looks like this:

[Image: the generated PDF report]

Improvements

There are a couple of things which could be improved in the implementation above:

  • The user data could be received as a String. The template name could be moved to the URI parameters, so that the request body could be passed as is to the template engine. This would save the memory allocations and CPU cycles spent on parsing the JSON text into a JSON object.
  • If service throughput were more important than the latency of a single request, we could design an asynchronous flow. For example, the report generator could accept HTTP requests as tasks. One more endpoint could be added to show task statuses and the links where the prepared PDFs are stored.
  • The working directory where generated PDF files are stored could be cleaned up. This could be done either by an external scheduler, or the application itself could spawn an asynchronous delayed task to delete each generated file.
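
The clean-up idea can be sketched with the standard library alone. The function below is an assumption of this post-hoc example, not part of the published source code: clean_work_dir deletes the .pdf files in a directory that were last modified longer ago than a given maximum age.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

// Deletes `.pdf` files in `work_dir` that were last modified at least
// `max_age` ago. Returns the number of removed files.
pub fn clean_work_dir(work_dir: &Path, max_age: Duration) -> io::Result<usize> {
    let now = SystemTime::now();
    let mut removed = 0;
    for entry in fs::read_dir(work_dir)? {
        let path = entry?.path();
        let is_pdf = path.extension().map_or(false, |ext| ext == "pdf");
        if !is_pdf || !path.is_file() {
            continue;
        }
        let modified = fs::metadata(&path)?.modified()?;
        // `duration_since` fails if the file is newer than `now`
        // (e.g. after a clock adjustment); treat such files as fresh.
        let age = now
            .duration_since(modified)
            .unwrap_or(Duration::from_secs(0));
        if age >= max_age {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

Such a function could be called periodically from a background thread or from an external scheduler; which of the two fits better depends on how the service is deployed.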

Summary

We have seen that writing a backend application in Rust is easy and great fun, thanks to community libraries and a nice build tool. More and more crates are becoming available and, more importantly, some of them have reached a stable 1.x version, so that one can use them as building blocks for something bigger. Familiar C-style syntax plus modern features make Rust a good choice for implementing backend and cloud-native applications, development tools or command-line utilities.

Although Rust is in active development, it provides stable, beta and nightly release channels. The Rust team follows a six-week schedule to release a new stable compiler version. More on the release cycle is here.

Rust can be attractive from a functional programming perspective as well. It provides closures, the Iterator trait, pattern matching, the built-in Result enum for error handling, the Option enum for optional values, a clear separation of mutable and immutable data, and a separation of data and logic via structs and functions.
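
A few of these features fit into a short, standard-library-only sketch. The Book struct and the three functions are made up for this example; amounts are kept in euro cents to avoid floating-point rounding.

```rust
pub struct Book {
    pub name: &'static str,
    // Amount in euro cents.
    pub amount_cents: u32,
}

// An iterator chain with a closure: map every book to its amount
// and sum the results.
pub fn total_cents(books: &[Book]) -> u32 {
    books.iter().map(|b| b.amount_cents).sum()
}

// `Option` models a value that may be missing: an empty order has
// no most expensive book.
pub fn most_expensive(books: &[Book]) -> Option<&Book> {
    books.iter().max_by_key(|b| b.amount_cents)
}

// Pattern matching forces us to handle both cases.
pub fn summary(books: &[Book]) -> String {
    match most_expensive(books) {
        Some(book) => format!("{} books, most expensive: {}", books.len(), book.name),
        None => String::from("no books ordered"),
    }
}
```

None of the functions mutates its input, and the compiler guarantees that, because they only receive shared references.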

Links

  1. The Rust Programming Language book
  2. Source code of the Report Generator on GitHub
  3. Rocket framework
  4. wkhtmltopdf tool
  5. Central registry of Rust crates: crates.io

[1] A Rust dependency/library is called a crate.
