## arm

## Trusted Firmware-M: FP support in TF-M (Update)

2022 March

Feder Liang, Arm

#### FP support in TF-M (Update)

- Armv8.0 FP support (IPC, SFN)
- Armv8.1 FP support (IPC, SFN)
- M-Profile Vector Extension (MVE) support vs. FP support

#### **FP Context**

#### FP context is shared between S and NS.

| Name             | Description                                                                 | Banked by security state |
|------------------|-----------------------------------------------------------------------------|--------------------------|
| CPACR            | Coprocessor Access Control Register                                         | Yes                      |
| CPPWR            | Coprocessor Power Control Register                                          | No                       |
| FP register bank | FP caller-saved registers (S0–S15)<br>FP callee-saved registers (S16–S31)   | No                       |
| FPSCR            | Floating-point Status and Control Register                                  | No                       |
| FPCCR            | Floating-point Context Control Register                                     | Partial                  |
| FPCCR.LSPEN      | Enable/disable lazy stacking                                                | No                       |
| FPCAR            | Floating-point Context Address Register                                     | Yes                      |
| FPDSCR           | Floating-point Default Status Control Register                              | Yes                      |
| MVFR0, 1, 2      | Media and FP Feature Register 0, 1, 2                                       | No                       |

FP context to be protected:

- FP registers (S0–S31)
- FPSCR

#### **FPSCR**

Floating-point Status and Control Register

- Provides status information about floating-point operation results.
- Defines some floating-point operation behaviors.
- Holds the vector element size used when applying low-overhead-loop tail predication to vector instructions.
- Provides exception bits that software can use to detect abnormalities in floating-point operations.



The FPSCR bit assignments are listed in the FPSCR table under "FPCXT in Armv8.1".

## Armv8.0 FP support for IPC Model

| Items                           | Details (Release 1.5.0)                                                                                                                           | Details (5519438)                                                                                                                                 |
|---------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
| Submit                          | Release 1.5.0                                                                                                                                     | 5519438                                                                                                                                           |
| Scope                           | SPE only                                                                                                                                          | SPE and <b>NSPE</b>                                                                                                                               |
| FP context protection mechanism | FP context saved in secure partition's stack.                                                                                                     | FP context saved in secure partition's stack.                                                                                                     |
|                                 | FP context is saved (also invalidated) and restored automatically by the Cortex-M processor hardware during exception entry and exception return. | FP context is saved (also invalidated) and restored automatically by the Cortex-M processor hardware during exception entry and exception return. |
|                                 | Prevent non-secure from modifying the FPU's power setting (CPPWR).                                                                                | Prevent non-secure from modifying the FPU's power setting (CPPWR).                                                                                |
| Toolchain                       | GNU Arm embedded toolchain                                                                                                                        | GNU Arm embedded toolchain                                                                                                                        |
| ABI                             | soft, <b>softfp</b>, hard                                                                                                                         | soft, hard                                                                                                                                        |
| NS FPU usage                    | Disable non-secure access to the Floating-point Unit (FPU).                                                                                       | Permit non-secure access to the FPU.                                                                                                              |
| Lazy stacking                   | Enable/disable lazy stacking in SPE only.                                                                                                         | Lazy stacking can only be enabled or disabled for the <b>whole system</b> from <b>SPE</b> (LSPENS).                                               |
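The lazy stacking row can be illustrated with a small host-side sketch. It only computes the value that secure code might write to FPCCR. It does not touch hardware, and it is not taken from the TF-M sources. The helper name `fpccr_with_lazy_stacking` is hypothetical. The bit positions (LSPEN at bit 30, LSPENS at bit 29) follow the Armv8-M Mainline FPCCR layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* FPCCR bit masks (Armv8-M Mainline layout). */
#define FPCCR_LSPEN_BIT   (UINT32_C(1) << 30)  /* lazy state preservation enable */
#define FPCCR_LSPENS_BIT  (UINT32_C(1) << 29)  /* LSPEN writable from Secure only */

/*
 * Hypothetical helper: return the FPCCR value the SPE could write so that
 * the lazy stacking choice applies to the whole system and the NSPE can no
 * longer change it.
 */
uint32_t fpccr_with_lazy_stacking(uint32_t fpccr, bool enable_lazy)
{
    fpccr |= FPCCR_LSPENS_BIT;        /* NS writes to LSPEN are ignored */
    if (enable_lazy) {
        fpccr |= FPCCR_LSPEN_BIT;     /* lazy stacking enabled */
    } else {
        fpccr &= ~FPCCR_LSPEN_BIT;    /* lazy stacking disabled */
    }
    return fpccr;
}
```

On a target, the returned value would be written to the secure alias of FPCCR during SPE initialisation. That step is left out here so the sketch stays portable.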

#### Armv8.0 FP support for IPC model in TF-M

Lazy stacking disabled

- Veneer code is written in assembly in TF-M.
- The NS FP context is saved and restored on the NS agent thread's stack. FPSCR is consistent for NS.
- The secure FP context is saved (also invalidated) and restored on the SP's stack before the context switch to the NS agent thread. FPSCR is also consistent for each SP.



## Armv8.0 FP support for SFN Model (Under development)

For processors that support armv8-m.main, at isolation level 1.

| Items                           | Details                                                                                                                          |
|---------------------------------|----------------------------------------------------------------------------------------------------------------------------------|
| Scope                           | SPE and NSPE                                                                                                                     |
| FP context protection mechanism | Veneer implemented in assembly code.                                                                                             |
|                                 | The secure FP context is invalidated before the function returns to NS if the secure FP context is active.                       |
|                                 | FPSCR is saved on the NS agent thread's stack before the secure function call and is restored before the function returns to NS. |
|                                 | Prevent non-secure from modifying the FPU's power setting (CPPWR).                                                               |
| Toolchain                       | GNU Arm embedded toolchain                                                                                                       |
| ABI                             | soft, hard                                                                                                                       |
| NS FPU usage                    | Permit non-secure access to the FPU.                                                                                             |
| Lazy stacking                   | Lazy stacking can only be enabled or disabled for the whole system from SPE (LSPENS).                                            |

#### Armv8.0 FP support for SFN model (Demo)

    psa_status_t tfm_psa_call_veneer(psa_handle_t handle, uint32_t ctrl_param,
                                     const psa_invec *in_vec, psa_outvec *out_vec)
    {
        __ASM volatile(
            ".syntax unified                                  \n"
            "   push   {r2, r3}                               \n"
            "   ldr    r2, [sp, #8]                           \n"
            "   ldr    r3, ="M2S(STACK_SEAL_PATTERN)"         \n"
            "   cmp    r2, r3                                 \n"
            "   bne    reent_panic4                           \n"
            "   pop    {r2, r3}                               \n"
            "   mov    r12, r3                                \n"
            "   mrs    r3, control                            \n"
            "   push   {r2, r3}                               \n"
            "   mov    r3, r12                                \n"
            "   push   {lr}                                   \n"
            /* Save FPSCR (NS) */
            "   vmrs   r4, fpscr                              \n"
            "   push   {r4}                                   \n"
            "   bl     psa_call_pack_sfn                      \n"
            /* Restore FPSCR (NS) */
            "   pop    {r4}                                   \n"
            "   vmsr   fpscr, r4                              \n"
            /* Check SFPA (Secure Floating-point active) */
            "   mrs    r4, control                            \n"
            "   tst.w  r4, #8                                 \n"
            "   beq    no_sfpa2                               \n"
            /* Clear FP caller registers */
            "   eor    r2, r2, r2                             \n"
            "   vmov   d0, r2, r2                             \n"
            "   vmov   d1, r2, r2                             \n"
            "   vmov   d2, r2, r2                             \n"
            "   vmov   d3, r2, r2                             \n"
            "   vmov   d4, r2, r2                             \n"
            "   vmov   d5, r2, r2                             \n"
            "   vmov   d6, r2, r2                             \n"
            "   vmov   d7, r2, r2                             \n"
            "no_sfpa2:                                        \n"
            "   pop    {r2, r3}                               \n"
            "   mov    lr, r3                                 \n"
            "   pop    {r2, r3}                               \n"
            "   msr    control, r3                            \n"
            "   bxns   lr                                     \n"
            "reent_panic4:                                    \n"
            "   svc    "M2S(TFM_SVC_PSA_PANIC)"               \n"
            "   b      .                                      \n"
        );
    }
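The exit path of the demo veneer makes one decision: it tests CONTROL.SFPA (the `#8` mask, bit 3) and zeroes the caller-saved FP registers only when a secure FP context is active. The following host-side model sketches that decision in plain C. The struct `fp_caller_regs` and both function names are hypothetical stand-ins for illustration, not TF-M code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CONTROL_SFPA_BIT  (UINT32_C(1) << 3)   /* the #8 tested by tst.w */

/* Hypothetical stand-in for the caller-saved FP registers S0-S15. */
struct fp_caller_regs {
    uint32_t s[16];
};

/* True when CONTROL.SFPA reports an active secure FP context. */
bool secure_fp_active(uint32_t control)
{
    return (control & CONTROL_SFPA_BIT) != 0;
}

/*
 * Model of the veneer's exit path: zero S0-S15 only when a secure FP
 * context is active. Returns true if the registers were cleared.
 */
bool clear_fp_on_return(uint32_t control, struct fp_caller_regs *regs)
{
    if (!secure_fp_active(control)) {
        return false;                       /* the "beq no_sfpa2" path */
    }
    memset(regs->s, 0, sizeof regs->s);     /* the eor plus eight vmov */
    return true;
}
```

Skipping the clear when SFPA is zero matters because executing a `vmov` with no FP context active would itself create one.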

### Armv8.1 Features for FP

- FPCXT
  - Usage for function call.
  - Avoids corrupting FPSCR, or exposing an inconsistent FPSCR, between S and NS.
- VSCCLRM
  - Avoids inadvertently creating an FP context when the secure FP context is invalidated before returning to NS.
  - It does not clear any registers if no secure FP context is active (as indicated by CONTROL\_S.SFPA).
  - Side effect of inadvertently creating an FP context:
    - The FP context then has to be saved and restored on every context switch, which wastes time, stack space, power, etc. for the rest of that thread's lifetime.
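To put a number on the stack-space cost, the sketch below compares the hardware-stacked exception frame with and without an active FP context. It assumes the standard Armv8-M frame layouts: 8 words for the basic frame, plus 18 words (S0–S15, FPSCR, and one further word) for the extended frame. The function name is hypothetical. The figure excludes S16–S31 and any additional state that software or the secure state may also save.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BASIC_FRAME_WORDS    = 8,   /* R0-R3, R12, LR, PC, xPSR */
    FP_FRAME_EXTRA_WORDS = 18   /* S0-S15, FPSCR, one further word */
};

/* Bytes pushed by hardware on exception entry for one thread. */
size_t exception_frame_bytes(bool fp_context_active)
{
    size_t words = BASIC_FRAME_WORDS;

    if (fp_context_active) {
        words += FP_FRAME_EXTRA_WORDS;
    }
    return words * sizeof(uint32_t);
}
```

Under these assumptions a thread that never touches FP stacks 32 bytes per exception, while a thread with an FP context (including one created by accident) stacks 104 bytes.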

#### FPCXT in Armv8.1

#### FPSCR

| Bit   | Field   | Descriptions                                                                                           |
|-------|---------|--------------------------------------------------------------------------------------------------------|
| 31    | N       | Negative condition flag.                                                                               |
| 30    | Z       | Zero condition flag.                                                                                   |
| 29    | C       | Carry condition flag.                                                                                  |
| 28    | V       | Overflow condition flag.                                                                               |
| 27    | QC      | Cumulative saturation bit.                                                                             |
| 26    | AHP     | Alternative half-precision control bit.                                                                |
| 25    | DN      | Default NaN mode control bit.                                                                          |
| 24    | FZ      | Flush-to-zero mode control for single and double precision floating-point.                             |
| 23:22 | RMode   | Rounding mode control field.                                                                           |
| 21:20 |         | Reserved.                                                                                              |
| 19    | FZ16    | Flush-to-zero mode control bit on half-precision data-processing instructions.                         |
| 18:16 | LTPSIZE | The vector element size used when applying low-overhead-loop tail predication to vector instructions.  |
| 15:8  |         | Reserved.                                                                                              |
| 7     | IDC     | Input Denormal cumulative exception bit.                                                               |
| 6:5   |         | Reserved.                                                                                              |
| 4     | IXC     | Inexact cumulative exception bit.                                                                      |
| 3     | UFC     | Underflow cumulative exception bit.                                                                    |
| 2     | OFC     | Overflow cumulative exception bit.                                                                     |
| 1     | DZC     | Divide by Zero cumulative exception bit.                                                               |
| 0     | IOC     | Invalid Operation cumulative exception bit.                                                            |
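As a minimal sketch of how software might read the fields in the FPSCR table, the helpers below decode a raw 32-bit FPSCR value on the host. The function and macro names are hypothetical. The bit positions are taken directly from the table.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPSCR_RMODE_SHIFT     22
#define FPSCR_RMODE_MASK      UINT32_C(0x3)   /* bits 23:22 */
#define FPSCR_LTPSIZE_SHIFT   16
#define FPSCR_LTPSIZE_MASK    UINT32_C(0x7)   /* bits 18:16 */
/* IDC (bit 7) plus IXC, UFC, OFC, DZC, IOC (bits 4:0). */
#define FPSCR_CUMULATIVE_EXC  UINT32_C(0x9F)

/* Rounding mode control field. */
uint32_t fpscr_rmode(uint32_t fpscr)
{
    return (fpscr >> FPSCR_RMODE_SHIFT) & FPSCR_RMODE_MASK;
}

/* Vector element size used for tail predication. */
uint32_t fpscr_ltpsize(uint32_t fpscr)
{
    return (fpscr >> FPSCR_LTPSIZE_SHIFT) & FPSCR_LTPSIZE_MASK;
}

/* True if any cumulative exception bit is set. */
bool fpscr_any_exception(uint32_t fpscr)
{
    return (fpscr & FPSCR_CUMULATIVE_EXC) != 0;
}
```

On a target the raw value would come from a `vmrs` read of FPSCR. Here it is passed in as a plain integer so the decoding can be checked anywhere.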

Bits 27:0 are the same in FPSCR and FPCXT.

#### FPCXT (Floating-point context payload)

- Introduced in Armv8.1-M to give both S and NS a consistent FPSCR.
- Saved and restored on security state changes, so that FPSCR does not become inconsistent.

| Bit   | Field   | Descriptions                                                                                           |
|-------|---------|--------------------------------------------------------------------------------------------------------|
| 31    | SFPA    | Secure Floating-point active. CONTROL.SFPA                                                             |
| 30    |         | Reserved.                                                                                              |
| 29    |         | Reserved.                                                                                              |
| 28    |         | Reserved.                                                                                              |
| 27    | QC      | Cumulative saturation bit.                                                                             |
| 26    | AHP     | Alternative half-precision control bit.                                                                |
| 25    | DN      | Default NaN mode control bit.                                                                          |
| 24    | FZ      | Flush-to-zero mode control for single and double precision floating-point.                             |
| 23:22 | RMode   | Rounding mode control field.                                                                           |
| 21:20 |         | Reserved.                                                                                              |
| 19    | FZ16    | Flush-to-zero mode control bit on half-precision data-processing instructions.                         |
| 18:16 | LTPSIZE | The vector element size used when applying low-overhead-loop tail predication to vector instructions.  |
| 15:8  |         | Reserved.                                                                                              |
| 7     | IDC     | Input Denormal cumulative exception bit.                                                               |
| 6:5   |         | Reserved.                                                                                              |
| 4     | IXC     | Inexact cumulative exception bit.                                                                      |
| 3     | UFC     | Underflow cumulative exception bit.                                                                    |
| 2     | OFC     | Overflow cumulative exception bit.                                                                     |
| 1     | DZC     | Divide by Zero cumulative exception bit.                                                               |
| 0     | IOC     | Invalid Operation cumulative exception bit.                                                            |
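The relationship between the two tables can be sketched as a packing function: the FPCXT payload carries FPSCR bits 27:0 unchanged, leaves bits 30:28 reserved (the N, Z, C, V flags are not part of it), and puts SFPA in bit 31. This is a host-side model of the layout only, with hypothetical function names. On hardware the payload is produced and consumed by the FPCXT accesses themselves.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPCXT_SFPA_BIT    (UINT32_C(1) << 31)
#define FPCXT_FPSCR_MASK  UINT32_C(0x0FFFFFFF)   /* bits 27:0 */

/* Build an FPCXT payload from an FPSCR value and the SFPA flag. */
uint32_t fpcxt_pack(uint32_t fpscr, bool sfpa)
{
    uint32_t payload = fpscr & FPCXT_FPSCR_MASK;   /* drop N, Z, C, V */

    if (sfpa) {
        payload |= FPCXT_SFPA_BIT;
    }
    return payload;
}

/* SFPA flag carried in the payload. */
bool fpcxt_sfpa(uint32_t fpcxt)
{
    return (fpcxt & FPCXT_SFPA_BIT) != 0;
}

/* FPSCR bits 27:0 carried in the payload. */
uint32_t fpcxt_fpscr_bits(uint32_t fpcxt)
{
    return fpcxt & FPCXT_FPSCR_MASK;
}
```

Because one word holds both the FPSCR settings and the SFPA flag, a single store and load around a secure call is enough to preserve both.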


### Armv8.1 FP support in TF-M for IPC model

| Items                                                           | Details                                                                                                                                           |
|-----------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
| Scope                                                           | SPE and NSPE                                                                                                                                      |
| FP context protection mechanism<br>(Same as Armv8.0 FP support) | FP context saved in secure partition's stack.                                                                                                     |
|                                                                 | FP context is saved (also invalidated) and restored automatically by the Cortex-M processor hardware during exception entry and exception return. |
|                                                                 | Prevent non-secure from modifying the FPU's power setting (CPPWR).                                                                                |
| Toolchain                                                       | GNU Arm embedded toolchain:<br>add <b>__ARM_ARCH_8_1M_MAIN__</b> manually.                                                                        |
| ABI                                                             | soft, hard                                                                                                                                        |
| NS FPU usage                                                    | Permit non-secure access to the FPU.                                                                                                              |
| Lazy stacking                                                   | Lazy stacking can only be enabled or disabled for the whole system from SPE (LSPENS).                                                             |

FPCXT and VSCCLRM are not used.

## Armv8.1 FP support for SFN model (Under development)

| Items                           | Details                                                                                                                                        |
|---------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------|
| Scope                           | SPE and NSPE                                                                                                                                   |
| FP context protection mechanism | Veneer implemented in assembly code.                                                                                                           |
|                                 | The secure FP context is invalidated before the function returns to NS if the secure FP context is active: by <b>VSCCLRM</b>.                  |
|                                 | FPSCR is saved before the secure function call: vstr FPCXTNS, [sp, #-4]!<br>and is restored before the function returns to NS: vldr FPCXTNS, [sp], #4 |
|                                 | Prevent non-secure from modifying the FPU's power setting (CPPWR).                                                                             |
| Toolchain                       | GNU Arm embedded toolchain                                                                                                                     |
| ABI                             | soft, hard                                                                                                                                     |
| NS FPU usage                    | Permit non-secure access to the FPU.                                                                                                           |
| Lazy stacking                   | Lazy stacking can only be enabled or disabled for the whole system from SPE (LSPENS).                                                          |

#### Armv8.1 FP support for SFN model (Demo)

    psa_status_t tfm_psa_call_veneer(psa_handle_t h, uint32_t ctrl,
                                     const psa_invec *in_vec, psa_outvec *out_vec)
    {
        __ASM volatile(
            ".syntax unified                       \n"
            "   push    {lr}                       \n"
            /* Save FPCXTNS (FPSCR NS) */
            "   vstr    FPCXTNS, [sp, #-4]!        \n"
            "   bl      psa_call_pack_sfn          \n" /* ABI to psa framework */
            /* Clear FP caller registers if secure context active
             * (as indicated by CONTROL_S.SFPA) */
            "   vscclrm {s0-s15, vpr}              \n"
            /* Restore FPCXTNS (FPSCR NS) */
            "   vldr    FPCXTNS, [sp], #4          \n"
            "   pop     {r3}                       \n"
            "   mov     lr, r3                     \n"
            "   bxns    lr                         \n"
        );
    }


#### MVE support vs. FP support

- M-Profile Vector Extension (MVE) is an optional vector architectural extension introduced as part of the Armv8.1-M architecture for the Arm Cortex-M processor series.
- MVE delivers a significant performance uplift for machine learning and digital signal processing applications on small, embedded devices.
- MVE reuses the FP registers, and MVE is enabled in the same way as FP.
- MVE works through the same mechanisms as FP.
  - For example CONTROL.FPCA, CONTROL.SFPA, FPCCR.LSPACT, lazy stacking, etc.
  - Besides the FP registers and FPSCR, the VPR register is also pushed to the stack by hardware during exception entry.
- MVE is designed to reuse all the security mechanisms that already exist for FP.


The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

www.arm.com/company/policies/trademarks

© 2021 Arm