WP3-28

From COMP4DRONES
Revision as of 10:40, 25 February 2022 by Uniss (talk | contribs) (Created page with "=Accelerator Design Methodology for OOCP= {|class="wikitable" | ID|| WP3-28 |- | Contributor || UNISS |- | Levels || System |- | Require || Application definition and FPGA-based System-on-Chip |- | Provide || Ready-to-use reconfigurable HW accelerator |- | Input | * Dataflow application specification(s) * HDL actor definition(s) * Communication protocol * Target architecture |- | Input | * Multi-dataflow network * Coarse-grained reconfigurable accelera...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Accelerator Design Methodology for OOCP

ID WP3-28
Contributor UNISS
Levels System
Require Application definition and FPGA-based System-on-Chip
Provide Ready-to-use reconfigurable HW accelerator
Input
  • Dataflow application specification(s)
  • HDL actor definition(s)
  • Communication protocol
  • Target architecture
Input
  • Multi-dataflow network
  • Coarse-grained reconfigurable accelerator RTL
  • Co-processor RTL
  • Programming tables
C4D building block The methodology is generic and applicable to generate accelerators for different tasks and scenarios. With respect to C4D, it could POTENTIALLY be used to implement HW accelerators related to perception, actuation, flight-control, payload management or data management.
TRL 3 to 4
Building Block diagram for WP3-28

Detailed Description

Although FPGA technology has the potential to satisfy the many performances, energy and predictability requirements of drone systems and applications, FPGA development is notoriously a complex task.

To deal with this problematic, the baseline feature of this component revolves around the composition of coarse-grained reconfigurable HW accelerators (CGRA) starting from a set of dataflow applications. The baseline feature involves two main components:

  • Multi-Dataflow Generator (MDG): it merges together different dataflows into one unique reconfigurable multi-dataflow by the insertion of switching modules. Currently, two merging algorithms are supported: empiric and Moreano. The former is more suitable for non-recursive dataflows but less optimized than the latter.
  • Platform composer (PC): it derives the RTL description of the CGRA from the multi-dataflow. It requires the user to define the communication protocol between actors in hardware (XML) and the RTL description of the actors involved in the dataflows (HDL Components Library, HCL).

This component also provides an automatic coprocessor generation, which automatically embeds the generated CGRA into a ready-to-use Xilinx IP. The user can choose among different options:

  • Processor: soft-core (Microblaze) or hardcore (ARM)
  • Processor-Coprocessor coupling: Memory-mapped or FIFO-based
  • Direct Access Memory Module: enable or not the usage of DMA

Specifications and contribution

The purpose of this component is to provide a methodology to generate application-specific HW accelerators that can be directly plugged in the final system. Additionally, this component automatically enables coarse-grained reconfiguration capabilities, which enables the capability of having a HW accelerator that can work at different working points (i.e. trade-offs among Quality of Service and Energy consumption) or functionalities (i.e. different implementations related to the same C4D building block).

Regarding the contribution associated to C4D, this component will be extended to be able to automatically generate plug-and-play coarse-grained reconfigurable HW accelerators that can be used by WP3-22 component. To do so, a unified methodology will be provided so as to combine the FPGA overlay provided in WP3-22 with the CGRA generation supported by this component.

Design and Implementation

Considering that the HW accelerators that are generated using this component are application specific, the required steps to generate the multi-dataflow networks and its associated CGRA are the following:

  • Define the three inputs that are required:
    • Implement the task(s) to be accelerated using a dataflow approach.
    • Define the HDL version of the actors composing the tasks (manually or with HLS tools).
    • Define the communication protocol to be used inside and outside the accelerator.
  • Using the MDG functionality, if more than one task has been specified, merge the tasks to be accelerated into a reconfigurable multi-dataflow.
  • Depending on the target architecture, the user must select the files to be generated, that could include the accelerator RTL description and a wrapping logic surrounding the accelerator itself that could allow the user to 1) use the coarse-grained reconfigurable accelerator as a co-processor or 2) to plug the accelerator with an FPGA overlay, as in the case of this project.
  • Run the automatically generated scripts to port the code to Vivado.
  • Synthesize and implement it on the target FPGA device..