ARM Releases Machine Readable Architecture Specification

https://alastairreid.github.io/alastairreid.github.io/ARM-v8a-xml-release/

The companies that design processors all provide specifications of their products. These specifications are usually in the form of books or PDF documents. 
It is then up to diligent engineers and academics around the world to read those documents and transcribe the relevant parts into computer languages such as C, C++, Verilog, O’Caml, Coq, Isabelle, … to implement tools that generate, dissect or analyse programs for those processors. This is a very tedious, error-prone process. 

ARM has released version 8.2 of the ARM v8-A processor specification in machine readable form.

The semantics is written in ARM’s ASL Specification Language so it is all executable and has been tested very thoroughly.

If this is for real, what you think you can you do with this?

7 Comments

  1. Tomi Engdahl says:

    ARM Releases Machine Readable Architecture Specification (alastairreid.github.io)
    https://news.ycombinator.com/item?id=14174533

    I work at a chip company. All of our register definitions and other specifications are created in a machine-readable but human-friendly YAML format.

    The specs drive scripts which generate the hardware, test interfaces, C headers for software and the PDF manual we give to our customers. I assume everyone is doing this: how else can you guarantee thousands of pages of documentation and a large API match the product perfectly?

    What’s new is ARM is giving that internal specification to customers. Makes a lot of sense given their customers are also chip companies with similar practices. I think it’s a good idea and everyone should do it.

    What’s new is ARM is giving that internal specification to customers.

    Not just customers but everyone, judging by the fact that I could just click the links in the post and download the files.

    Yes, ARM’s architecture licensees (companies allowed to design their own ARM-compatible processors) have had access to the v8-A specs for years. The new bit is making it available so that researchers, companies and individuals can use it in software projects.

    This is a lot more than register specifications. It is a specification of the complete functionality of the architecture, written in a format that allows you to generate simulators, do formal verification of RTL code, or probably even generate synthesizable RTL code as well.

    Licensing aside it is still pretty bold for a fabless IP-only company like ARM to release this level of detailed specification; unlike e.g. Intel or AMD who could still have an advantage even if others made competing implementations, because of their fab technology.

    I’d bet there already are clones/semi-clones of ARMs hidden in various obscure far-East ICs which have no user-accessible firmware… we just don’t hear about them because the ones we do, would’ve been sued out of existence. I’ve seen presumably-unlicensed MIPS cores a lot in those.

    Reply
  2. Tomi Engdahl says:

    ARM Releases Machine Readable Architecture Specification (alastairreid.github.io)
    https://www.reddit.com/r/programming/comments/66kyez/arm_releases_machine_readable_architecture/

    Look in the file ISA_v82A_A64_xml_00bet3.1/xhtml/notice.html (or in ISA_v82A_A64_xml_00bet3.1/notice.xml). I am not a lawyer so I will not attempt to interpret it but it talks about copyright, IP rights, export regulations, trademarks, damages and the possibility of inaccuracies and typos.

    Look in the file ISA_v82A_A64_xml_00bet3.1/xhtml/notice.html (or in ISA_v82A_A64_xml_00bet3.1/notice.xml). I am not a lawyer so I will not attempt to interpret it but it talks about copyright, IP rights, export regulations, trademarks, damages and the possibility of inaccuracies and typos.

    There has already been a formalization of the 32 bit Arm architecture in Isabelle/HOL that was succesfully used with SeL4. It would be really cool to adapt the proofs to use this architecture specification instead.

    The place where that is desperately needed is for the CompCert formally verified compiler. It contains an ARM spec that looks nothing like ARM’s specs for the same instructions so it is very hard to be sure it is right. In fact, they have had bugs because it is not.

    It is important to remember that formal verification still has to trust something and that the processor specification is a huge part of your Trusted Computing Base – as this great paper points out: https://locore.cs.washington.edu/papers/fonseca-dsbugs.pdf

    Reply
  3. Tomi Engdahl says:

    ASL is ARM’s Architecture Specification Language. See https://alastairreid.github.io/specification_languages/ for a previous description.

    It is a strongly typed imperative language designed to be good for defining hardware semantics so it has things like the ability to talk about values of any particular number of bits (“bits(32)”, “bits(5)”, etc.)

    It is also designed to avoid the problems you run into if you use C/C++ to define your semantics. So when defining instructions, it typically reads the 64-bit input registers, convert them to unbounded precision integers, performs the operation and then explicitly converts the result back to a 64-bit result to write back to the register file. This avoids errors because you fail to use a 65-bit intermediate result or you forget that signed and unsigned will give different results.

    (See https://alastairreid.github.io/specification_languages/ for more about why ASL is the way it is.)

    Reply
  4. Tomi Engdahl says:

    From https://www.reddit.com/r/programming/comments/66kyez/arm_releases_machine_readable_architecture/

    I have written an experimental translator that can convert the machine readable spec to synthesizable Verilog. One of the hardware engineers tried to synthesize it (i.e., turn it into a hardware design) just for a laugh. The tool reported that it would need 1TB of memory to finish the job so we didn’t see it finish but our estimate is that it would require approximately 1 square metre of silicon and run at 1Hz. Suffice to say that no hardware engineers feel threatened.

    But if you want an ARM processor in an FPGA, you could get an ARM Cortex-M1 specifically optimised for FPGAs from Actel http://www.actel.com/eZone/ESC/p2.html, Intel/Altera https://www.altera.com/products/design-software/design-software/embedded-software-developers/soc-eds/overview.html, Xilinx, etc.

    I think that both the ARM Cortex-M0 and the Cortex-M1 are free under some circumstances https://www.arm.com/products/designstart/

    Reply
  5. Tomi Engdahl says:

    From https://www.reddit.com/r/programming/comments/66kyez/arm_releases_machine_readable_architecture/

    Any thoughts about using this kind of spec to generate compiler backends? It seems to me that it would be fairly doable (if not now, then eventually, or perhaps soon but very slowly) to use this kind of input to generate an efficient LLVM backend, to automate adapting an existing compiler to a new target.

    Are you thinking of something like the Bansal-Aiken work on binary translation? https://theory.stanford.edu/~aiken/publications/papers/osdi08.pdf

    Hmmm, there is a formal spec for LLVM, this is a formal spec for ARM. How hard can it be?

    There is a long tradition of people scanning or retyping information from ARM and other architecture specs.

    And within ARM there was an effort to build a formal spec about 15-20 years ago. But it didn’t get enough momentum in time for it to really stick. So, given the enormous effort of testing and maintaining a spec, it bitrotted pretty fast.

    Reply
  6. Tomi Engdahl says:

    Trustworthy specifications of ARM® v8-A and v8-M system level architecture
    http://ieeexplore.ieee.org/document/7886675/

    Processor specifications are of critical importance for verifying programs, compilers, operating systems/hypervisors, and, of course, for verifying microprocessors themselves. But to be useful, the scope of these specifications must be sufficient for the task, the specification must be applicable to processors of interest and the specification must be trustworthy. This paper describes a 5 year project to change ARM’s existing architecture specification process so that machine-readable, executable specifications can be automatically generated from the same materials used to generate ARM’s conventional architecture documentation. We have developed executable specifications of both ARM’s A-class and M-class processor architectures

    Our focus has been on enabling the formal verification of ARM processors but, recognising the value of this specification for verifying software, we are currently preparing a public release of the machine-readable specification.

    Reply
  7. Tomi Engdahl says:

    ARM Releases Machine Readable Architecture Specification
    https://alastairreid.github.io/alastairreid.github.io/ARM-v8a-xml-release/

    The device you are reading this post on consists of a very tall stack of layers – all the way from transistors and NAND gates all the way up to processors, C, Linux/ Android/ iOS/ Windows to the browser. Each of these layers may be written by a different team possibly in a different company and the interface between these layers is documented and specified so that each team knows what it can assume and what it must provide. One of the most important interfaces in this stack is the one between hardware and software that says what a processor will do if it is configured a certain way, provided with page tables, interrupt handlers, put in a certain privilege level and finallyprovided with a program to execute. If you want to write a compiler, operating system, hypervisor or security layer, then you need to know how the processor will behave. If someone gives you a virus and asks you to figure out how it works and how to defend against it, then you need to know how the processor will behave.

    oday ARM released version 8.2 of the ARM v8-A processor specification in machine readable form. This specification describes almost all of the architecture: instructions, page table walks, taking interrupts, taking synchronous exceptions such as page faults, taking asynchronous exceptions such as bus faults, user mode, system mode, hypervisor mode, secure mode, debug mode. It details all the instruction formats and system register formats. The semantics is written in ARM’s ASL Specification Language so it is all executable and has been tested very thoroughly using the same architecture conformance tests that ARM uses to test its processors. (See my paper “Trustworthy Specifications of ARM v8-A and v8-M System Level Architecture”.)

    Reply

Leave a Comment

Your email address will not be published. Required fields are marked *

*

*