Last Updated: 2024-08-25

Background

Some components require a dataflow developer to reference specific files in a locally accessible file location. When you need to bring an external asset (i.e. a file) to Datavolo Cloud, you cannot simply copy it onto the host as you would on a classic bare-metal installation. Our SaaS model limits your ability to place arbitrary files in a directory path of your choosing, so an alternative solution is needed.

Scope of the tutorial

In this tutorial, you will learn how to manage assets via the UI and how to reference them as properties in components such as processors and controller services.

Learning objectives

Once you've completed this tutorial, you will be able to:

Prerequisites

Tutorial video

As an option, you can watch a video version of this tutorial.

Review the use case

For this tutorial, you will upload two different assets and use each of them in a distinct scenario.

Utilize a driver

Upload a Postgres database driver jar file and reference it within a database connection pooling controller service.

View file contents

Upload an image file and create a flow that ingests the binary data into a FlowFile.

Download the files

Save the two files listed below to your local workstation for use later in the tutorial.

Initialize the canvas

Access a Runtime

Log into Datavolo Cloud and access (create if necessary) a Runtime.

Create a Parameter Context

Click on the navigation drawer and then select Parameter Contexts.

Select the plus icon located just below the navigation drawer on the right side.

Enter myAssets for Name and click Apply.

In the list of Parameter Contexts, notice the one you just created, then click on the Datavolo logo in the upper left corner to return to the canvas.

Drag and drop the Process Group icon from the components toolbar to the canvas.

Release the mouse button, set the Name to Asset Tutorial, select myAssets in the Parameter Context pulldown, and then click on Add.

Double click on the newly created process group and notice the following Breadcrumbs in the lower left corner.

The remainder of this tutorial takes place inside this process group rather than at the top level of the canvas.

Prepare the asset

Right click on a blank area of the canvas and select Parameters on the contextual menu that surfaces.

In the list of Parameter Contexts, click on the vertical ellipsis on the far right of the myAssets item and select Manage Assets.

In the Manage Assets box, either click on the upload icon to open a file chooser selection window OR drag and drop a file into the FILENAME pane to upload the postgresql-42.7.3.jar file you saved on your workstation at the beginning of this tutorial. Once uploaded, you will see it in the list as shown below.

Click on Close to return to the Parameter Contexts list. Use the vertical ellipsis to select Edit.

In the Parameters tab of Edit Parameter Context, click on the blue plus icon.

Enter pgdriver for Name, click the Reference Assets checkbox, and then check the previously added jar file. To save, click Ok.

Exit the Edit Parameter Context window by clicking Apply followed by Close.

Exit the Parameter Contexts list by clicking on Back to Process Group in the upper left corner.

Add processor and controller service

Add an ExecuteSQL processor to the canvas.

Double click on the new Processor and navigate to the Properties tab of the Edit Processor pop up. Click on the vertical ellipsis in the VALUE of the Database Connection Pooling Service PROPERTY and select Create a new service.

Choose DBCPConnectionPool in the Add Controller Service pop up before clicking Add.

You are returned to Edit Processor > Properties and the new connection pool controller service is populated for the Database Connection Pooling Service. Click on the vertical ellipsis again and this time select Go to Service.

Use the asset

In the Controller Services list, notice the DBCPConnectionPool previously created. For that list item, use the vertical ellipsis and choose Edit.

Edit the VALUE for Database Driver Location(s) to be #{pgdriver}, then click Ok and then Apply to save the property's value.

That's all there is to it. When the controller service needs to use that parameter value, it receives a string containing the fully qualified path of the file where the database driver is physically stored.
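
To make the idea concrete, here is a minimal Python sketch of what a parameter reference amounts to: the text #{pgdriver} is swapped for the parameter's value, which for an asset is a file path. This is an illustration only, not how the Runtime is implemented. The function name resolve_parameters, the set of characters allowed in a name, and the /hypothetical/assets path are all assumptions made for the example.

```python
import re

# Matches a parameter reference such as #{pgdriver}. The set of characters
# allowed in a name is an assumption made for this sketch.
PARAM_REF = re.compile(r"#\{([A-Za-z0-9_.\- ]+)\}")


def resolve_parameters(value: str, parameters: dict[str, str]) -> str:
    """Replace each #{name} in value with the matching parameter value."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in parameters:
            raise KeyError(f"no parameter named {name!r}")
        return parameters[name]

    return PARAM_REF.sub(substitute, value)


# Hypothetical storage path; the real location is chosen by the Runtime.
params = {"pgdriver": "/hypothetical/assets/postgresql-42.7.3.jar"}
driver_location = resolve_parameters("#{pgdriver}", params)
print(driver_location)
```

The component never sees the #{...} syntax in this model; it only receives the resolved path string.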

The previous section covered the canonical use case for uploaded assets. In this section, let's explore visualizing the fully qualified name of an uploaded asset. Additionally, let's verify that the contents of the file are actually present in the uploaded copy of it.

While we are at it, let's see a shortcut that allows you to upload an asset as you are creating a parameter.

Create another parameter

Return to the Edit Parameter Context pop up (hint: right click on the canvas) and click the blue plus icon to create a second Parameter.

Enter billyPic for the Name in the Add Parameter window. Notice that when you check the Reference Assets checkbox, you are presented with a list of previously uploaded assets as well as an upload icon.

Upload the file

Click the upload icon (or use the drag and drop feature) and, as you did in the previous section, upload a file from your workstation. This time, use the william-shakespeare-portrait.jpg file. Both files now appear in the list; click on Close.

You are returned to the Add Parameter window and can now check the box for the new file before you click on Ok to assign it.

Add a processor

Back on the canvas, add a Processor of TYPE GenerateFlowFile.

In its Properties list, update Custom Text to be #{billyPic}.

Add a Funnel to the canvas and Create Connection from GenerateFlowFile to it for the success Relationship as shown below.

Run it once

Right click on GenerateFlowFile and select Run Once (i.e. do NOT click on Start).

After the canvas is refreshed, verify the GenerateFlowFile processor is NOT running AND that there is a single FlowFile queued up in the success Connection.

View the file name

Right click on the Connection and select List Queue. On the list of FlowFiles that surfaces, click on the vertical ellipsis on the far right of the first (and only) item to select View Content.

Because the GenerateFlowFile processor used the Custom Text value (which was set to the billyPic Parameter) to represent the content for the FlowFile it generated, you will see the fully qualified file name of the uploaded asset displayed to you. Your path will be different, but the file name at the end will be the same as shown below.

You can close the new browser tab that was created to return to the FlowFile list. Click on Back to Connection to return to the canvas.

Add a processor

Add a Processor of TYPE FetchFile. Create a Connection to it from the Funnel used in the last step.

Create another Funnel and Create Connection to it from FetchFile for the success Relationship.

Go to the Properties tab of the Edit Processor screen for FetchFile and set File to Fetch to #{billyPic}. Then, in the Relationships tab auto-terminate the failure, not.found, and permission.denied Relationships.
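
If it helps to picture what the four Relationships mean, the sketch below models them in plain Python: read a file and report which outcome applies. It is a rough analogy under stated assumptions, not the processor's actual code. The function fetch_file and the temporary file standing in for the uploaded asset are hypothetical.

```python
import os
import tempfile


def fetch_file(path: str) -> tuple[str, bytes]:
    """Return (relationship, content) for an attempt to read path."""
    try:
        with open(path, "rb") as handle:
            return "success", handle.read()
    except FileNotFoundError:
        return "not.found", b""
    except PermissionError:
        return "permission.denied", b""
    except OSError:
        # Any other I/O problem is treated as a generic failure.
        return "failure", b""


# A temporary file stands in for the uploaded asset in this sketch.
with tempfile.TemporaryDirectory() as folder:
    asset = os.path.join(folder, "example.bin")
    with open(asset, "wb") as handle:
        handle.write(b"binary payload")

    found = fetch_file(asset)
    missing = fetch_file(os.path.join(folder, "no-such-file"))

print(found[0], missing[0])
```

Auto-terminating the three non-success Relationships, as in the step above, corresponds to discarding anything that does not come back as success.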

Start the processor

Start FetchFile and see that the single FlowFile is now queued up in the new success Connection as shown below.

View the file contents

Use List Queue on this new Connection, then View Content for the FlowFile it contains, to verify that the contents of the uploaded asset are indeed a striking image of Willy Shakespeare himself.
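
If you would rather check the bytes than trust your eyes, a signature test is one option. The sketch below assumes you have the FlowFile content available as bytes (for example, from a downloaded copy); the helper name looks_like_jpeg is hypothetical. It only inspects the first three bytes, so it confirms the content starts like a JPEG and nothing more.

```python
# A JPEG stream begins with the SOI marker (FF D8) followed by the FF byte
# that starts the next marker.
JPEG_START = b"\xff\xd8\xff"


def looks_like_jpeg(data: bytes) -> bool:
    """Cheap signature check; it does not decode or validate the image."""
    return data.startswith(JPEG_START)


# Synthetic samples: a JPEG-style header versus a plain-text path string.
image_like = JPEG_START + b"\xe0" + b"\x00" * 16
path_like = b"/hypothetical/assets/william-shakespeare-portrait.jpg"

print(looks_like_jpeg(image_like), looks_like_jpeg(path_like))
```

The second sample mirrors the earlier GenerateFlowFile step, where the content was the path text rather than the image itself.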

For this section, you will remove the postgresql-42.7.3.jar file.

Remove references

Delete ExecuteSQL from the dataflow. Right click on the canvas and select Controller Services. On the DBCPConnectionPool row, click on the vertical ellipsis and select Delete from the choices that surface.

Click on Back to Process Group to return to the canvas.

Right click on the canvas and select Parameters. Click on the pgdriver parameter and notice that it lists No referencing components. Go ahead and Delete it; doing so also removes the reference to the database driver file.

Notice that the billyPic parameter that references the image file is still in use by Referencing Components.

Click Apply to save the changes and return to the list of Parameter Contexts.

Remove a file

Click on the vertical ellipsis and then Manage Assets for the Parameter Context named myAssets. Notice the delete icons look slightly different for the two files previously uploaded.

Hover over the delete icon for the .jar file and notice the tooltip that surfaces.

Click on the delete icon for the .jar file to remove it, then attempt to remove the .jpg file. The UI indicates that the .jpg file is still referenced and cannot be deleted.
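
The behavior you just saw can be modeled as a simple guard: an asset may be deleted only when no parameter references it. The Python sketch below is a toy model of that rule, not the product's implementation. The names AssetInUseError, referencing_parameters, and delete_asset are hypothetical.

```python
class AssetInUseError(Exception):
    """Raised when an asset is still referenced by a parameter."""


def referencing_parameters(asset: str, parameters: dict[str, list[str]]) -> list[str]:
    """Return the names of parameters whose value references the asset."""
    return sorted(name for name, assets in parameters.items() if asset in assets)


def delete_asset(asset: str, assets: set[str], parameters: dict[str, list[str]]) -> None:
    """Remove the asset unless a parameter still references it."""
    users = referencing_parameters(asset, parameters)
    if users:
        raise AssetInUseError(f"{asset} is referenced by: {', '.join(users)}")
    assets.discard(asset)


# State at this point in the tutorial: pgdriver was deleted, billyPic remains.
assets = {"postgresql-42.7.3.jar", "william-shakespeare-portrait.jpg"}
parameters = {"billyPic": ["william-shakespeare-portrait.jpg"]}

delete_asset("postgresql-42.7.3.jar", assets, parameters)  # no references left

try:
    delete_asset("william-shakespeare-portrait.jpg", assets, parameters)
    blocked = False
except AssetInUseError as error:
    blocked = True
    print(error)
```

In this model, deleting the billyPic parameter first would clear the last reference and allow the image to be removed as well.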

Click Close.

Congratulations, you've completed the Managing assets tutorial!

What you learned

What's next?

Check out some of these codelabs...

Further reading

Reference docs