Rick O'Connor
Welcome & Foundation Overview
MediaTek RISC-V Processor on Sensorhub Application
New Members of AndeStar V5 Processor IPs
Andes Technology Corporation
Our Passion on the Popularization of RISC-V
Nuclei System Technology
RISC-V is an emerging instruction set architecture. Because of its openness, compactness, modularity, and extensibility, RISC-V is being quickly adopted in applications such as AIoT and many more. Its ecosystem has explosive growth potential, and its future development looks limitless. The panel will focus on the opportunities and challenges RISC-V faces today.
The RISC-V P (DSP) extension task group was formed in August 2018. The chair of the task group is Chuan-Hua Chang from Andes Technology. The co-chair is Eric Flamand from GreenWaves Technologies. The charter of the P extension task group is as follows: Define and ratify Packed-SIMD DSP extension instructions operating on XLEN-bit integer registers for embedded RISC-V processors. The TG will also define compiler intrinsic functions that can be used directly in high-level programming languages. The scope of XLEN is expected to be 32, 64, and 128. The task group will use the AndeStar V3 DSP ISA extension and the Pulp DSP ISA extension as references for defining the P ISA extension. Both ISA extensions are supported by the gcc compiler and have been implemented in several silicon chips. The proposed instructions have been documented in a spreadsheet that is under review by the task group participants. This talk will report the progress and current status of the P extension task group in designing the extension, along with the issues discussed and decided by the task group participants. More performance data on general DSP functions and audio/speech applications will also be given to help determine the usefulness of the proposed instructions.
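As a rough illustration of what "packed SIMD on XLEN-bit integer registers" means, the sketch below emulates a lane-wise 16-bit add on a 32-bit register value. The function name `packed_add16` and the constants are hypothetical and chosen for this example only; they are not taken from the task group's spreadsheet or from any ratified mnemonic.

```python
XLEN = 32        # register width assumed for this sketch
LANE_BITS = 16   # two 16-bit lanes fit in one 32-bit register


def packed_add16(a: int, b: int) -> int:
    """Add each 16-bit lane of a and b independently, wrapping on overflow.

    Hypothetical helper: it only models the lane-wise behaviour of a
    packed-SIMD add, not any specific P extension instruction.
    """
    lane_mask = (1 << LANE_BITS) - 1
    result = 0
    for shift in range(0, XLEN, LANE_BITS):
        lane_a = (a >> shift) & lane_mask
        lane_b = (b >> shift) & lane_mask
        # The carry out of a lane is discarded instead of leaking
        # into the neighbouring lane, unlike a plain 32-bit add.
        result |= ((lane_a + lane_b) & lane_mask) << shift
    return result
```

With inputs `0x0001FFFF` and `0x00010001`, the low lane wraps to zero and the high lane becomes 2, so the packed result differs from an ordinary 32-bit addition, where the carry would propagate into the upper half.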
RISC-V is designed to scale from low-power microcontrollers to high-performance supercomputers. The key to this scalability is its vector extension ISA. Unlike the conventional wide-word SIMD ISAs common in microprocessors, such as x86 SSE/AVX and ARM NEON, RISC-V adopts a vector register based design. Vector register based computing is more scalable because the same vector instructions take fewer cycles on an implementation with more vector pipelines. Because a vector register can be long, a vector instruction may take multiple or even many cycles. To avoid waiting for a producing vector instruction to complete, vector chaining is often implemented so that a dependent vector instruction can start execution as soon as the first element it needs is ready. Full chaining in the Cray-1/Cray X-MP/Y-MP includes memory chaining as well as functional unit chaining. However, Cray vector supercomputers did not adopt virtual memory systems or data caches. For modern microprocessors, which are built heavily around cache hierarchies, implementing memory chaining may be challenging. We use an in-house RISC-V simulator to evaluate the performance tradeoffs between full chaining and various restricted chaining implementations.
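The effect of chaining on a sequence of dependent vector instructions can be sketched with a simple cycle-count model. This is a textbook-style approximation written for illustration, not the in-house simulator mentioned in the abstract; the function names and the parameters `startup` (pipeline latency before the first result) and `lanes` (elements produced per cycle) are assumptions.

```python
from math import ceil


def cycles_unchained(n_instr: int, vl: int, lanes: int, startup: int) -> int:
    """Each dependent instruction waits for its producer to finish entirely."""
    per_instr = startup + ceil(vl / lanes)
    return n_instr * per_instr


def cycles_chained(n_instr: int, vl: int, lanes: int, startup: int) -> int:
    """Each dependent instruction starts once the producer's first element is ready.

    Instruction i starts at i * startup, so the last one finishes after
    n_instr * startup cycles of pipeline fill plus one pass over the vector.
    """
    return n_instr * startup + ceil(vl / lanes)
```

Under these assumptions, three dependent instructions over 64 elements with 4 lanes and a 5-cycle startup take 63 cycles unchained and 31 cycles chained. Doubling the lanes to 8 shortens both figures without changing the instruction stream, which is the scalability argument made above.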
Traditional operating systems use virtual memory to create protection domains for processes. Each process has its own private virtual address space (protection domain), so processes are protected from each other. The drawbacks of this multiple address space model are that context switch overhead is high and data sharing is complex and difficult. To avoid these drawbacks, the single address space operating system (SASOS) has been proposed. The biggest problem with a SASOS is protection. There are several ways to create multiple protection domains in a single address space, and a capability system is one of them. There are also several ways to implement a capability system, and segmentation is one of them. Besides serving as a mechanism for building a SASOS, segmentation has several other benefits. It can be used to point to I/O address space, so a user-level device driver can access its hardware device directly without kernel intervention. It can translate a segment address directly to a physical address, so TLB pollution from segments with very low or no locality of reference can be avoided. It can help implement a software-managed TLB. It can also be used to make upcalls from the kernel to user space.
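The direct segment-to-physical translation described above can be sketched as a base-plus-offset computation guarded by a bounds and permission check. The `Segment` fields, the `SegmentFault` exception, and the `translate` function are hypothetical names for this illustration and do not describe the speaker's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int       # physical base address of the segment
    limit: int      # segment size in bytes
    writable: bool  # whether stores through this segment are allowed


class SegmentFault(Exception):
    """Raised when an access violates the segment's bounds or permissions."""


def translate(seg: Segment, offset: int, write: bool = False) -> int:
    """Translate a segment-relative offset to a physical address.

    No page table or TLB lookup is involved: the check and the addition
    are the whole translation in this simplified model.
    """
    if not 0 <= offset < seg.limit:
        raise SegmentFault("offset outside segment")
    if write and not seg.writable:
        raise SegmentFault("write to read-only segment")
    return seg.base + offset
```

A segment describing a device's register window, for example `Segment(base=0x10000000, limit=0x100, writable=True)`, would let a user-level driver reach offset `0x10` at physical address `0x10000010`, while any offset at or beyond `0x100` raises a fault.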
MediaTek's first RISC-V based processor will appear in a 2019 smartphone SoC for low-power sensor hub applications. The processor improves performance by 2X and power efficiency by 50% on voice wakeup (VoW) at the same time. The RISC-V based LLVM and LLDB toolchain is open to external parties, enabling MediaTek customers to differentiate their products and third-party vendors to optimize their algorithms on MediaTek SoCs. The processor will enter mass production in MediaTek's 2019 SoC products and is expected to ship more than 1 billion instances within 3 years.
AndeStar V5 is an extension architecture based on and fully compliant with RISC-V. The AndeStar V5 processor family already has 6 members, covering 32-bit and 64-bit cores with fast-and-compact integer control, high-performance floating-point, and Linux support. In this talk, we will give an update on AndeStar V5 processor solutions with new features and more benchmarking data. In addition, we will introduce a couple of new V5 processor IPs that further broaden AndeStar support for the RISC-V community.
Nuclei System Technology Co. Ltd. is a leading RISC-V core IP company in China. We focus on the popularization and commercialization of RISC-V and work with our partners and the local community to grow the ecosystem in China. Bob Hu, the founder of the company, is a well-known RISC-V technology evangelist. He is the creator of the open source RISC-V core Hummingbird E203 and the author of the first and second RISC-V books in Chinese. Together with him, we continuously share our passion for and thoughts on RISC-V with others. Happily, more and more people not only know what RISC-V is but are also starting to adopt cores based on it in their products. Currently, we have released the N200 series of ultra-low-power RISC-V core IP for IoT applications and are actively extending the product line. Our vision is to work closely with partners to help our customers succeed by accelerating innovation. In this talk, we will discuss the status of RISC-V in China, introduce our contribution to the popularization of RISC-V in China, and share our thoughts on the next steps.
ARM's TrustZone has become a reference for security in microcontrollers, but exactly what TrustZone is and what use cases it enables are often misunderstood by end users. RISC-V offers a simpler, more core-centric approach to MCU security that enables a simpler implementation for designers. This talk compares the hardware features of TrustZone A, TrustZone M and PSA with the RISC-V Privileged Architecture Specification v1.1. It then delves into the software stacks available for TrustZone and RISC-V to discuss the design point, use cases, complexity and attack vectors mitigated in each architecture. The listener will come away with a detailed understanding of how RISC-V security compares to TrustZone security in order to inform a platform migration decision.
Cryptospec is a security system that protects sensitive information from unauthorized access. It consists of a macro cell module, a secure RTOS, service boxes, and a server application. A cryptospec module is an isolated macro cell that manages the security of a 64-bit RISC-V system. Its macro cell is also a cryptographic boundary. The module prevents the CPUs (RV64GC) from running unauthorized software by controlling their clock, reset and activation-related signals. The Cryptocell module interfaces to the RISC-V Coreplex via wide TileLink interconnects and a hardware mailbox to achieve higher performance, particularly in symmetric crypto. The module is designed to be FIPS 140-2 certifiable. A cryptospec module consists of one E51 (RV64IMAC) secure MCU and one SH-2 real-time MCU. The secure RTOS cross-references the bootloader and OSes against stored keys and previous boot records held in nonvolatile memories to establish the CPUs' root of trust. The module's RTOS handles electric, magnetic and physical system tears to ensure system integrity. The mailbox isolates the cryptographic module's memory workspace from the CPUs' memories. The cryptographic memories consist of RAM, flash memories, and mask ROM. The Cryptospec secure RTOS manages initial and subsequent key injection, accepts CPU host commands, downloads user-defined secure applications, and handles all events.
This talk presents an energy-efficient face detection (FD) model with real-time processing on an Andes RISC-V processor. Both region-of-interest (ROI) detection and shallow pipelined models have been developed to reduce overall computational complexity. In addition, algorithm transformation has been applied to further reduce complex operations while maintaining acceptable detection accuracy. Finally, computation-intensive parts are identified and explored to investigate the feasibility of parallelism, taking available memory bandwidth into account. For real-time face detection, our proposal achieves at least a 16-fold speedup over the well-known MTCNN model with the model size reduced from 1.95 M to 0.19 M, making it very suitable for hardware-constrained systems. The proposed FD solution has also been successfully demonstrated on an Andes RISC-V processor with the Andes Neural Network Library using the proposed P extension (DSP/SIMD instructions) on an FPGA-based platform, with the ability to detect multiple faces in 640x480 video sequences. The speedups at various steps, including the application of DSP instructions, will be presented.
TVM is an open deep-learning compiler framework that can handle various AI models and frameworks, such as MXNet, PyTorch, TensorFlow, CoreML, and ONNX, for MPUs, GPUs and specialized accelerators. In this work, we investigate enabling the TVM flow and stack on RISC-V with SIMD instructions. RISC-V has two vector proposals, superword vector (V) and subword SIMD (P) instructions, which can be used as a fall-back engine for AI computing. TVM is built on TVM IR, which derives from Halide IR, so it allows scheduling to be deployed for various operators with various architecture configurations. For TVM on superword vectors, we lower TVM IR to LLVM vector IR for RISC-V optimizations. For packed instructions, we need quantization scheduling methods that quantize AI models into fixed-point form for packed subword SIMD computation on RISC-V. In addition, we add schedulers that generate SIMD intrinsics when TVM lowers to the LLVM IR layer. Our experiments are done by integrating P and V instructions into the Spike simulator. Early experiments with MatMul operators from AI models in TVM on RISC-V with subword SIMD show that 104 of the 229 generated instructions are SIMD instructions.
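The quantization step mentioned above can be pictured as mapping floating-point values to small fixed-point integers that then share one register. The sketch below uses a simple symmetric scale and packs four signed 8-bit values into a 32-bit word; the helper names `quantize_int8` and `pack_int8x4` and the choice of scheme are assumptions for illustration and are not the TVM scheduling method described in the talk.

```python
def quantize_int8(values, scale):
    """Map floats to signed 8-bit integers: q = clamp(round(v / scale)).

    Python's round() resolves ties to the nearest even integer; the
    result is clamped to the int8 range [-128, 127].
    """
    return [max(-128, min(127, round(v / scale))) for v in values]


def pack_int8x4(q):
    """Pack four signed 8-bit values into one 32-bit word, lane 0 in the low byte."""
    if len(q) != 4:
        raise ValueError("expected exactly four lanes")
    word = 0
    for i, v in enumerate(q):
        # Masking with 0xFF stores negative values in two's complement form.
        word |= (v & 0xFF) << (8 * i)
    return word
```

For instance, quantizing `[0.5, -0.25, 1.0, 100.0]` with a scale of `0.25` gives `[2, -1, 4, 127]`, where the last value saturates, and packing those lanes yields the word `0x7F04FF02`. A packed subword SIMD instruction would then operate on all four lanes of such a word at once.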
Please join us for 3-minute previews of the poster sessions that will be displayed during the evening networking reception on Tuesday, March 12.