- Info file internals, produced by Makeinfo, -*- Text -*-
- from input file internals.texinfo.
- This file documents the internals of the GNU compiler.
- Copyright (C) 1988 Free Software Foundation, Inc.
- Permission is granted to make and distribute verbatim copies of
- this manual provided the copyright notice and this permission notice
- are preserved on all copies.
- Permission is granted to copy and distribute modified versions of this
- manual under the conditions for verbatim copying, provided also that the
- section entitled ``GNU CC General Public License'' is included exactly as
- in the original, and provided that the entire resulting derived work is
- distributed under the terms of a permission notice identical to this one.
- Permission is granted to copy and distribute translations of this manual
- into another language, under the above conditions for modified versions,
- except that the section entitled ``GNU CC General Public License'' and
- this permission notice may be included in translations approved by the
- Free Software Foundation instead of in the original English.
- File: internals, Node: Sharing, Prev: Calls, Up: RTL
- Structure Sharing Assumptions
- =============================
- The compiler assumes that certain kinds of RTL expressions are unique;
- there do not exist two distinct objects representing the same value. In
- other cases, it makes an opposite assumption: that no RTL expression object
- of a certain kind appears in more than one place in the containing structure.
- These assumptions refer to a single function; except for the RTL objects
- that describe global variables and external functions, no RTL objects are
- common to two functions.
- * Each pseudo-register has only a single `reg' object to represent it,
- and therefore only a single machine mode.
- * For any symbolic label, there is only one `symbol_ref' object
- referring to it.
- * There is only one `const_int' expression with value zero, and only one
- with value one.
- * There is only one `pc' expression.
- * There is only one `cc0' expression.
- * There is only one `const_double' expression with mode `SFmode' and
- value zero, and only one with mode `DFmode' and value zero.
- * No `label_ref' appears in more than one place in the RTL structure; in
- other words, it is safe to do a tree-walk of all the insns in the
- function and assume that each time a `label_ref' is seen it is
- distinct from all others that are seen.
- * Only one `mem' object is normally created for each static variable or
- stack slot, so these objects are frequently shared in all the places
- they appear. However, separate but equal objects for these variables
- are occasionally made.
- * No RTL object appears in more than one place in the RTL structure
- except as described above. Many passes of the compiler rely on this
- by assuming that they can modify RTL objects in place without unwanted
- side-effects on other insns.
- * During initial RTL generation, shared structure is freely introduced.
- After all the RTL for a function has been generated, all shared
- structure is copied by `unshare_all_rtl' in `emit-rtl.c', after which
- the above rules are guaranteed to be followed.
- * During the combiner pass, shared structure with an insn can exist
- temporarily. However, the shared structure is copied before the
- combiner is finished with the insn. This is done by
- `copy_substitutions' in `combine.c'.
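- The effect of removing shared structure can be sketched in C. This is a
- minimal illustration, not the code of `unshare_all_rtl': the `node' type,
- its `used' flag, and the functions `make_node', `copy_node' and `unshare'
- are hypothetical names invented here, and only `reg' objects are treated
- as legitimately shared.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression codes; real RTL has many more. */
enum code { REG, MEM, PLUS };

struct node {
    enum code code;
    int value;            /* register number for REG, unused otherwise */
    int used;             /* set once the walk has seen this object */
    struct node *op[2];   /* operands; NULL when absent */
};

/* Allocate a node (hypothetical helper). */
static struct node *make_node(enum code code, int value,
                              struct node *a, struct node *b)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);    /* the sketch does not recover from exhaustion */
    n->code = code;
    n->value = value;
    n->used = 0;
    n->op[0] = a;
    n->op[1] = b;
    return n;
}

/* Deep-copy X, except that REG objects stay shared (they are unique). */
static struct node *copy_node(struct node *x)
{
    if (x == NULL || x->code == REG)
        return x;
    struct node *n = make_node(x->code, x->value,
                               copy_node(x->op[0]), copy_node(x->op[1]));
    n->used = 1;
    return n;
}

/* Walk the structure.  Any non-REG object reached a second time is
   replaced by a private copy, so a later in-place change affects
   only one use. */
static struct node *unshare(struct node *x)
{
    if (x == NULL || x->code == REG)
        return x;
    if (x->used)
        return copy_node(x);
    x->used = 1;
    x->op[0] = unshare(x->op[0]);
    x->op[1] = unshare(x->op[1]);
    return x;
}
```

- After `unshare', a `plus' whose two operands were the same `mem' object
- has two distinct `mem' objects, while the `reg' inside them is still the
- single shared object.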
- File: internals, Node: Machine Desc, Next: Machine Macros, Prev: RTL, Up: Top
- Machine Descriptions
- ********************
- A machine description has two parts: a file of instruction patterns (`.md'
- file) and a C header file of macro definitions.
- The `.md' file for a target machine contains a pattern for each instruction
- that the target machine supports (or at least each instruction that is
- worth telling the compiler about). It may also contain comments. A
- semicolon causes the rest of the line to be a comment, unless the semicolon
- is inside a quoted string.
- See the next chapter for information on the C header file.
- * Menu:
- * Patterns:: How to write instruction patterns.
- * Example:: An explained example of a `define_insn' pattern.
- * RTL Template:: The RTL template defines what insns match a pattern.
- * Output Template:: The output template says how to make assembler code
- from such an insn.
- * Output Statement:: For more generality, write C code to output
- the assembler code.
- * Constraints:: When not all operands are general operands.
- * Standard Names:: Names mark patterns to use for code generation.
- * Pattern Ordering:: When the order of patterns makes a difference.
- * Dependent Patterns:: Having one pattern may make you need another.
- * Jump Patterns:: Special considerations for patterns for jump insns.
- * Peephole Definitions::Defining machine-specific peephole optimizations.
- * Expander Definitions::Generating a sequence of several RTL insns
- for a standard operation.
- File: internals, Node: Patterns, Next: Example, Prev: Machine Desc, Up: Machine Desc
- Everything about Instruction Patterns
- =====================================
- Each instruction pattern contains an incomplete RTL expression, with pieces
- to be filled in later, operand constraints that restrict how the pieces can
- be filled in, and an output pattern or C code to generate the assembler
- output, all wrapped up in a `define_insn' expression.
- A `define_insn' is an RTL expression containing four operands:
- 1. An optional name. The presence of a name indicates that this instruction
- pattern can perform a certain standard job for the RTL-generation pass
- of the compiler. This pass knows certain names and will use the
- instruction patterns with those names, if the names are defined in the
- machine description.
- The absence of a name is indicated by writing an empty string where
- the name should go. Nameless instruction patterns are never used for
- generating RTL code, but they may permit several simpler insns to be
- combined later on.
- Names that are not thus known and used in RTL-generation have no
- effect; they are equivalent to no name at all.
- 2. The "RTL template" (*Note RTL Template::.) is a vector of incomplete RTL
- expressions which show what the instruction should look like. It is
- incomplete because it may contain `match_operand' and `match_dup'
- expressions that stand for operands of the instruction.
- If the vector has only one element, that element is what the
- instruction should look like. If the vector has multiple elements,
- then the instruction looks like a `parallel' expression containing
- that many elements as described.
- 3. A condition. This is a string which contains a C expression that is the
- final test to decide whether an insn body matches this pattern.
- For a named pattern, the condition (if present) may not depend on the
- data in the insn being matched, but only the target-machine-type
- flags. The compiler needs to test these conditions during
- initialization in order to learn exactly which named instructions are
- available in a particular run.
- For nameless patterns, the condition is applied only when matching an
- individual insn, and only after the insn has matched the pattern's
- recognition template. The insn's operands may be found in the vector
- `operands'.
- 4. The "output template": a string that says how to output matching insns
- as assembler code. `%' in this string specifies where to substitute
- the value of an operand. *note Output Template::.
- When simple substitution isn't general enough, you can specify a piece
- of C code to compute the output. *note Output Statement::.
- File: internals, Node: Example, Next: RTL Template, Prev: Patterns, Up: Machine Desc
- Example of `define_insn'
- ========================
- Here is an actual example of an instruction pattern, for the 68000/68020.
-      (define_insn "tstsi"
-        [(set (cc0)
-              (match_operand:SI 0 "general_operand" "rm"))]
-        ""
-        "*
-      { if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
-          return \"tstl %0\";
-        return \"cmpl #0,%0\"; }")
- This is an instruction that sets the condition codes based on the value of
- a general operand. It has no condition, so any insn whose RTL description
- has the form shown may be handled according to this pattern. The name
- `tstsi' means ``test a `SImode' value'' and tells the RTL generation pass
- that, when it is necessary to test such a value, an insn to do so can be
- constructed using this pattern.
- The output control string is a piece of C code which chooses which output
- template to return based on the kind of operand and the specific type of
- CPU for which code is being generated.
- `"rm"' is an operand constraint. Its meaning is explained below.
- File: internals, Node: RTL Template, Next: Output Template, Prev: Example, Up: Machine Desc
- RTL Template for Generating and Recognizing Insns
- =================================================
- The RTL template is used to define which insns match the particular pattern
- and how to find their operands. For named patterns, the RTL template also
- says how to construct an insn from specified operands.
- Construction involves substituting specified operands into a copy of the
- template. Matching involves determining the values that serve as the
- operands in the insn being matched. Both of these activities are
- controlled by special expression types that direct matching and
- substitution of the operands.
- `(match_operand:M N TESTFN CONSTRAINT)'
- This expression is a placeholder for operand number N of the insn.
- When constructing an insn, operand number N will be substituted at
- this point. When matching an insn, whatever appears at this position
- in the insn will be taken as operand number N; but it must satisfy
- TESTFN or this instruction pattern will not match at all.
- Operand numbers must be chosen consecutively counting from zero in
- each instruction pattern. There may be only one `match_operand'
- expression in the pattern for each operand number, and they must
- appear in order of increasing operand number.
- TESTFN is a string that is the name of a C function that accepts two
- arguments, a machine mode and an expression. During matching, the
- function will be called with M as the mode argument and the putative
- operand as the other argument. If it returns zero, this instruction
- pattern fails to match. TESTFN may be an empty string; then it means
- no test is to be done on the operand.
- Most often, TESTFN is `"general_operand"'. It checks that the
- putative operand is either a constant, a register or a memory
- reference, and that it is valid for mode M.
- For an operand that must be a register, TESTFN should be
- `"register_operand"'. This prevents GNU CC from creating insns that
- have memory references in these operands, insns which would only have
- to be taken apart in the reload pass.
- For an operand that must be a constant, either TESTFN should be
- `"immediate_operand"', or the instruction pattern's extra condition
- should check for constants, or both.
- CONSTRAINT is explained later (*Note Constraints::.).
- `(match_dup N)'
- This expression is also a placeholder for operand number N. It is
- used when the operand needs to appear more than once in the insn.
- In construction, `match_dup' behaves exactly like `match_operand': the
- operand is substituted into the insn being constructed. But in
- matching, `match_dup' behaves differently. It assumes that operand
- number N has already been determined by a `match_operand' appearing
- earlier in the recognition template, and it matches only an
- identical-looking expression.
- `(address (match_operand:M N "address_operand" ""))'
- This complex of expressions is a placeholder for an operand number N
- in a ``load address'' instruction: an operand which specifies a memory
- location in the usual way, but for which the actual operand value used
- is the address of the location, not the contents of the location.
- `address' expressions never appear in RTL code, only in machine
- descriptions. And they are used only in machine descriptions that do
- not use the operand constraint feature. When operand constraints are
- in use, the letter `p' in the constraint serves this purpose.
- M is the machine mode of the *memory location being addressed*, not
- the machine mode of the address itself. That mode is always the same
- on a given target machine (it is `Pmode', which normally is `SImode'),
- so there is no point in mentioning it; thus, no machine mode is
- written in the `address' expression. If some day support is added for
- machines in which addresses of different kinds of objects appear
- differently or are used differently (such as the PDP-10), different
- formats would perhaps need different machine modes and these modes
- might be written in the `address' expression.
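- The difference between `match_operand' and `match_dup' during matching
- can be sketched in C. This is a simplified model, not the compiler's
- recognizer: the `expr' type and the functions `same_expr', `is_register'
- and `match' are hypothetical, machine modes and constraints are left out,
- and the test function receives only the expression.

```c
#include <assert.h>
#include <stddef.h>

enum kind { REG, PLUS, SET, MATCH_OPERAND, MATCH_DUP };

struct expr {
    enum kind kind;
    int n;                              /* register or operand number */
    const struct expr *op[2];           /* subexpressions; NULL if absent */
    int (*testfn)(const struct expr *); /* MATCH_OPERAND only; may be NULL */
};

#define MAX_OPERANDS 4

/* Structural equality: "identical-looking" expressions. */
static int same_expr(const struct expr *a, const struct expr *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    if (a->kind != b->kind || a->n != b->n)
        return 0;
    return same_expr(a->op[0], b->op[0]) && same_expr(a->op[1], b->op[1]);
}

/* A test function in the spirit of "register_operand". */
static int is_register(const struct expr *x)
{
    return x != NULL && x->kind == REG;
}

/* Match insn body X against pattern PAT, recording operands as they
   are found.  Returns nonzero on success. */
static int match(const struct expr *pat, const struct expr *x,
                 const struct expr *operands[MAX_OPERANDS])
{
    if (pat == NULL || x == NULL)
        return pat == NULL && x == NULL;
    switch (pat->kind) {
    case MATCH_OPERAND:
        assert(pat->n >= 0 && pat->n < MAX_OPERANDS);
        if (pat->testfn != NULL && !pat->testfn(x))
            return 0;
        operands[pat->n] = x;   /* whatever appears here is operand N */
        return 1;
    case MATCH_DUP:
        assert(pat->n >= 0 && pat->n < MAX_OPERANDS);
        /* Operand N must already have been found by a match_operand. */
        return operands[pat->n] != NULL && same_expr(operands[pat->n], x);
    default:
        return pat->kind == x->kind && pat->n == x->n
            && match(pat->op[0], x->op[0], operands)
            && match(pat->op[1], x->op[1], operands);
    }
}
```

- With a pattern of the form `(set OP0 (plus DUP0 OP1))', this sketch
- rejects an insn body that adds register 6 to register 109 into register
- 3, and accepts one that adds register 109 to register 3 in place.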
- File: internals, Node: Output Template, Next: Output Statement, Prev: RTL Template, Up: Machine Desc
- Output Templates and Operand Substitution
- =========================================
- The "output template" is a string which specifies how to output the
- assembler code for an instruction pattern. Most of the template is a fixed
- string which is output literally. The character `%' is used to specify
- where to substitute an operand; it can also be used to identify places
- where different variants of the assembler require different syntax.
- In the simplest case, a `%' followed by a digit N says to output operand N
- at that point in the string.
- `%' followed by a letter and a digit says to output an operand in an
- alternate fashion. Four letters have standard, built-in meanings described
- below. The machine description macro `PRINT_OPERAND' can define additional
- letters with nonstandard meanings.
- `%cDIGIT' can be used to substitute an operand that is a constant value
- without the syntax that normally indicates an immediate operand.
- `%nDIGIT' is like `%cDIGIT' except that the value of the constant is
- negated before printing.
- `%aDIGIT' can be used to substitute an operand as if it were a memory
- reference, with the actual operand treated as the address. This may be
- useful when outputting a ``load address'' instruction, because often the
- assembler syntax for such an instruction requires you to write the operand
- as if it were a memory reference.
- `%lDIGIT' is used to substitute a `label_ref' into a jump instruction.
- `%' followed by a punctuation character specifies a substitution that does
- not use an operand. Only one case is standard: `%%' outputs a `%' into the
- assembler code. Other nonstandard cases can be defined in the
- `PRINT_OPERAND' macro.
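- The two simplest substitutions, `%' followed by a digit and `%%', can be
- sketched in C as follows. This is an illustration under stated
- assumptions, not the compiler's output routine: `expand_template' is a
- hypothetical function, the operands are assumed to be already printed as
- strings, and the letter forms and `PRINT_OPERAND' cases are not modelled
- (they are reported as errors).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expand TMPL into OUT (capacity OUTSZ, including the terminating NUL).
   OPERANDS holds NOPERANDS already-printed operand strings.
   Returns 0 on success, -1 on an unsupported `%' sequence, a bad
   operand number, or insufficient space. */
static int expand_template(const char *tmpl, const char *const *operands,
                           size_t noperands, char *out, size_t outsz)
{
    size_t len = 0;

    assert(tmpl != NULL && out != NULL && outsz > 0);
    for (const char *p = tmpl; *p != '\0'; p++) {
        const char *piece = p;  /* text to append; default: one literal char */
        size_t n = 1;

        if (*p == '%') {
            p++;
            if (*p == '%') {
                piece = "%";            /* `%%' outputs a single `%' */
            } else if (*p >= '0' && *p <= '9') {
                size_t idx = (size_t)(*p - '0');
                if (idx >= noperands)
                    return -1;
                piece = operands[idx];  /* `%N' outputs operand N */
                n = strlen(piece);
            } else {
                return -1;              /* letters, punctuation: not modelled */
            }
        }
        if (n >= outsz - len)           /* keep room for the NUL */
            return -1;
        memcpy(out + len, piece, n);
        len += n;
    }
    out[len] = '\0';
    return 0;
}
```

- For example, with operand 0 printed as `d0', the template `tstl %0'
- expands to `tstl d0' in this sketch.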
- The template may generate multiple assembler instructions. Write the text
- for the instructions, with `\;' between them.
- When the RTL contains two operands which are required by constraint to match
- each other, the output template must refer only to the lower-numbered
- operand. Matching operands are not always identical, and the rest of the
- compiler arranges to put the proper RTL expression for printing into the
- lower-numbered operand.
- One use of nonstandard letters or punctuation following `%' is to
- distinguish between different assembler languages for the same machine; for
- example, Motorola syntax versus MIT syntax for the 68000. Motorola syntax
- requires periods in most opcode names, while MIT syntax does not. For
- example, the opcode `movel' in MIT syntax is `move.l' in Motorola syntax.
- The same file of patterns is used for both kinds of output syntax, but the
- character sequence `%.' is used in each place where Motorola syntax wants a
- period. The `PRINT_OPERAND' macro for Motorola syntax defines the sequence
- to output a period; the macro for MIT syntax defines it to do nothing.
- File: internals, Node: Output Statement, Next: Constraints, Prev: Output Template, Up: Machine Desc
- C Statements for Generating Assembler Output
- ============================================
- Often a single fixed template string cannot produce correct and efficient
- assembler code for all the cases that are recognized by a single
- instruction pattern. For example, the opcodes may depend on the kinds of
- operands; or some unfortunate combinations of operands may require extra
- machine instructions.
- If the output control string starts with a `*', then it is not an output
- template but rather a piece of C program that should compute a template.
- It should execute a `return' statement to return the template-string you
- want. Most such templates use C string literals, which require doublequote
- characters to delimit them. To include these doublequote characters in the
- string, prefix each one with `\'.
- The operands may be found in the array `operands', whose C data type is
- `rtx []'.
- It is possible to output an assembler instruction and then go on to output
- or compute more of them, using the subroutine `output_asm_insn'. This
- receives two arguments: a template-string and a vector of operands. The
- vector may be `operands', or it may be another array of `rtx' that you
- declare locally and initialize yourself.
- When an insn pattern has multiple alternatives in its constraints, often
- the appearance of the assembler code is determined mostly by which alternative
- was matched. When this is so, the C code can test the variable
- `which_alternative', which is the ordinal number of the alternative that
- was actually satisfied (0 for the first, 1 for the second alternative, etc.).
- For example, suppose there are two opcodes for storing zero, `clrreg' for
- registers and `clrmem' for memory locations. Here is how a pattern could
- use `which_alternative' to choose between them:
-      (define_insn ""
-        [(set (match_operand:SI 0 "general_operand" "r,m")
-              (const_int 0))]
-        ""
-        "*
-        return (which_alternative == 0
-                ? \"clrreg %0\" : \"clrmem %0\");
-        ")
- File: internals, Node: Constraints, Next: Standard Names, Prev: Output Statement, Up: Machine Desc
- Operand Constraints
- ===================
- Each `match_operand' in an instruction pattern can specify a constraint for
- the type of operands allowed. Constraints can say whether an operand may
- be in a register, and which kinds of register; whether the operand can be a
- memory reference, and which kinds of address; whether the operand may be an
- immediate constant, and which possible values it may have. Constraints can
- also require two operands to match.
- * Menu:
- * Simple Constraints:: Basic use of constraints.
- * Multi-Alternative:: When an insn has two alternative constraint-patterns.
- * Class Preferences:: Constraints guide which hard register to put things in.
- * Modifiers:: More precise control over effects of constraints.
- * No Constraints:: Describing a clean machine without constraints.
- File: internals, Node: Simple Constraints, Next: Multi-Alternative, Prev: Constraints, Up: Constraints
- Simple Constraints
- ------------------
- The simplest kind of constraint is a string full of letters, each of which
- describes one kind of operand that is permitted. Here are the letters that
- are allowed:
- `m'
- A memory operand is allowed, with any kind of address that the machine
- supports in general.
- `o'
- A memory operand is allowed, but only if the address is "offsetable".
- This means that a small integer (actually, the width in bytes
- of the operand, as determined by its machine mode) may be added to the
- address and the result is also a valid memory address.
- For example, an address which is constant is offsetable; so is an
- address that is the sum of a register and a constant (as long as a
- slightly larger constant is also within the range of address-offsets
- supported by the machine); but an autoincrement or autodecrement
- address is not offsetable. More complicated indirect/indexed
- addresses may or may not be offsetable depending on the other
- addressing modes that the machine supports.
- Note that in an output operand which can be matched by another
- operand, the constraint letter `o' is valid only when accompanied by
- both `<' (if the target machine has predecrement addressing) and `>'
- (if the target machine has preincrement addressing).
- `<'
- A memory operand with autodecrement addressing (either predecrement or
- postdecrement) is allowed.
- `>'
- A memory operand with autoincrement addressing (either preincrement or
- postincrement) is allowed.
- `r'
- A register operand is allowed provided that it is in a general register.
- `d', `a', `f', ...
- Other letters can be defined in machine-dependent fashion to stand for
- particular classes of registers. `d', `a' and `f' are defined on the
- 68000/68020 to stand for data, address and floating point registers.
- `i'
- An immediate integer operand (one with constant value) is allowed.
- This includes symbolic constants whose values will be known only at
- assembly time.
- `n'
- An immediate integer operand with a known numeric value is allowed.
- Many systems cannot support assembly-time constants for operands less
- than a word wide. Constraints for these operands should use `n'
- rather than `i'.
- `I', `J', `K', ...
- Other letters in the range `I' through `M' may be defined in a
- machine-dependent fashion to permit immediate integer operands with
- explicit integer values in specified ranges. For example, on the
- 68000, `I' is defined to stand for the range of values 1 to 8. This
- is the range permitted as a shift count in the shift instructions.
- `F'
- An immediate floating operand (expression code `const_double') is
- allowed.
- `G', `H'
- `G' and `H' may be defined in a machine-dependent fashion to permit
- immediate floating operands in particular ranges of values.
- `s'
- An immediate integer operand whose value is not an explicit integer is
- allowed.
- This might appear strange; if an insn allows a constant operand with a
- value not known at compile time, it certainly must allow any known
- value. So why use `s' instead of `i'? Sometimes it allows better
- code to be generated.
- For example, on the 68000 in a fullword instruction it is possible to
- use an immediate operand; but if the immediate value is between -32
- and 31, better code results from loading the value into a register and
- using the register. This is because the load into the register can be
- done with a `moveq' instruction. We arrange for this to happen by
- defining the letter `K' to mean ``any integer outside the range -32 to
- 31'', and then specifying `Ks' in the operand constraints.
- `g'
- Any register, memory or immediate integer operand is allowed, except
- for registers that are not general registers.
- `N' (a digit)
- An operand that matches operand number N is allowed. If a digit is
- used together with letters, the digit should come last.
- This is called a "matching constraint" and what it really means is
- that the assembler has only a single operand that fills two roles
- considered separate in the RTL insn. For example, an add insn has two
- input operands and one output operand in the RTL, but on most machines
- an add instruction really has only two operands, one of them an
- input-output operand.
- Matching constraints work only in circumstances like that add insn.
- More precisely, the matching constraint must appear in an input-only
- operand and the operand that it matches must be an output-only operand
- with a lower number.
- For operands to match in a particular case usually means that they are
- identical-looking RTL expressions. But in a few special cases
- specific kinds of dissimilarity are allowed. For example, `*x' as an
- input operand will match `*x++' as an output operand. For proper
- results in such cases, the output template should always use the
- output-operand's number when printing the operand.
- `p'
- An operand that is a valid memory address is allowed. This is for
- ``load address'' and ``push address'' instructions.
- If `p' is used in the constraint, the test-function in the
- `match_operand' must be `address_operand'.
- In order to have valid assembler code, each operand must satisfy its
- constraint. But a failure to do so does not prevent the pattern from
- applying to an insn. Instead, it directs the compiler to modify the code
- so that the constraint will be satisfied. Usually this is done by copying
- an operand into a register.
- Contrast, therefore, the two instruction patterns that follow:
-      (define_insn ""
-        [(set (match_operand:SI 0 "general_operand" "r")
-              (plus:SI (match_dup 0)
-                       (match_operand:SI 1 "general_operand" "r")))]
-        ""
-        "...")
- which has two operands, one of which must appear in two places, and
-      (define_insn ""
-        [(set (match_operand:SI 0 "general_operand" "r")
-              (plus:SI (match_operand:SI 1 "general_operand" "0")
-                       (match_operand:SI 2 "general_operand" "r")))]
-        ""
-        "...")
- which has three operands, two of which are required by a constraint to be
- identical. If we are considering an insn of the form
-      (insn N PREV NEXT
-        (set (reg:SI 3)
-             (plus:SI (reg:SI 6) (reg:SI 109)))
-        ...)
- the first pattern would not apply at all, because this insn does not
- contain two identical subexpressions in the right place. The pattern would
- say, ``That does not look like an add instruction; try other patterns.''
- The second pattern would say, ``Yes, that's an add instruction, but there
- is something wrong with it.'' It would direct the reload pass of the
- compiler to generate additional insns to make the constraint true. The
- results might look like this:
-      (insn N2 PREV N
-        (set (reg:SI 3) (reg:SI 6))
-        ...)
-
-      (insn N N2 NEXT
-        (set (reg:SI 3)
-             (plus:SI (reg:SI 3) (reg:SI 109)))
-        ...)
- Because insns that don't fit the constraints are fixed up by loading
- operands into registers, every instruction pattern's constraints must
- permit the case where all the operands are in registers. It need not
- permit all classes of registers; the compiler knows how to copy registers
- into other registers of the proper class in order to make an instruction
- valid. But if no registers are permitted, the compiler will be stymied: it
- does not know how to save a register in memory in order to make an
- instruction valid. Instruction patterns that reject registers can be made
- valid by attaching a condition-expression that refuses to match an insn at
- all if the crucial operand is a register.
- File: internals, Node: Multi-Alternative, Next: Class Preferences, Prev: Simple Constraints, Up: Constraints
- Multiple Alternative Constraints
- --------------------------------
- Sometimes a single instruction has multiple alternative sets of possible
- operands. For example, on the 68000, a logical-or instruction can combine
- a register or an immediate value into memory, or it can combine any kind of
- operand into a register; but it cannot combine one memory location into
- another.
- These constraints are represented as multiple alternatives. An alternative
- can be described by a series of letters for each operand. The overall
- constraint for an operand is made from the letters for this operand from
- the first alternative, a comma, the letters for this operand from the
- second alternative, a comma, and so on until the last alternative. Here is
- how it is done for fullword logical-or on the 68000:
-      (define_insn "iorsi3"
-        [(set (match_operand:SI 0 "general_operand" "=%m,d")
-              (ior:SI (match_operand:SI 1 "general_operand" "0,0")
-                      (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
-        ...)
- The first alternative has `m' (memory) for operand 0, `0' for operand 1
- (meaning it must match operand 0), and `dKs' for operand 2. The second
- alternative has `d' (data register) for operand 0, `0' for operand 1, and
- `dmKs' for operand 2. The `=' and `%' in the constraint for operand 0 are
- not part of any alternative; their meaning is explained in a later section
- (*Note Modifiers::.).
- If all the operands fit any one alternative, the instruction is valid.
- Otherwise, for each alternative, the compiler counts how many instructions
- must be added to copy the operands so that that alternative applies. The
- alternative requiring the least copying is chosen. If two alternatives
- need the same amount of copying, the one that comes first is chosen. These
- choices can be altered with the `?' and `!' characters:
- `?'
- Disparage slightly the alternative that the `?' appears in, as a
- choice when no alternative applies exactly. The compiler regards this
- alternative as one unit more costly for each `?' that appears in it.
- `!'
- Disparage severely the alternative that the `!' appears in. When
- operands must be copied into registers, the compiler will never choose
- this alternative as the one to strive for.
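The selection rule above can be sketched in C. This is only an illustration under stated assumptions, not the compiler's actual code: `struct alternative' and `choose_alternative' are hypothetical names, and the copy counts are supplied by the caller instead of being computed from real operands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one alternative; not a GNU CC structure.  */
struct alternative
{
  int copies_needed;   /* insns needed to copy operands so that it applies */
  int question_marks;  /* number of `?' characters written in it */
  int has_bang;        /* nonzero if `!' appears in it */
};

/* Return the index of the alternative to use, or -1 if none qualifies.  */
int
choose_alternative (const struct alternative *alt, size_t n)
{
  int best = -1;
  int best_cost = 0;
  size_t i;

  assert (alt != NULL || n == 0);

  /* If the operands already fit some alternative, use the first such one;
     `?' and `!' matter only when no alternative applies exactly.  */
  for (i = 0; i < n; i++)
    if (alt[i].copies_needed == 0)
      return (int) i;

  for (i = 0; i < n; i++)
    {
      int cost;

      if (alt[i].has_bang)
        continue;               /* never a goal when copying is needed */
      cost = alt[i].copies_needed + alt[i].question_marks;
      if (best < 0 || cost < best_cost)   /* strict `<': first wins ties */
        {
          best = (int) i;
          best_cost = cost;
        }
    }
  return best;
}
```

Given three alternatives that need 2 copies, 1 copy plus one `?', and 1 copy with `!', the sketch returns 0: the first two tie at cost 2, so the earlier one wins, and the third is excluded.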
- When an insn pattern has multiple alternatives in its constraints, often
- the appearance of the assembler code is determined mostly by which alternative
- was matched. When this is so, the C code for writing the assembler code
- can use the variable `which_alternative', which is the ordinal number of
- the alternative that was actually satisfied (0 for the first, 1 for the
- second alternative, etc.). For example:
-      (define_insn ""
-        [(set (match_operand:SI 0 "general_operand" "=r,m")
-              (const_int 0))]
-        ""
-        "*
-        return (which_alternative == 0
-                ? \"clrreg %0\" : \"clrmem %0\");
-        ")
- File: internals, Node: Class Preferences, Next: Modifiers, Prev: Multi-Alternative, Up: Constraints
- Register Class Preferences
- --------------------------
- The operand constraints have another function: they enable the compiler to
- decide which kind of hardware register a pseudo register is best allocated
- to. The compiler examines the constraints that apply to the insns that use
- the pseudo register, looking for the machine-dependent letters such as `d'
- and `a' that specify classes of registers. The pseudo register is put in
- whichever class gets the most ``votes''. The constraint letters `g' and
- `r' also vote: they vote in favor of a general register. The machine
- description says which registers are considered general.
- Of course, on some machines all registers are equivalent, and no register
- classes are defined. Then none of this complexity is relevant.
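The voting scheme can be pictured with a small C sketch. It assumes the 68000-style class letters `d' and `a' mentioned above, ignores all modifier characters, and breaks ties in favor of the general class; those choices, and the names `preferred_class' and `enum reg_class_sketch', are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register classes for the sketch.  */
enum reg_class_sketch { CLASS_GENERAL, CLASS_DATA, CLASS_ADDR, N_CLASSES };

/* Tally the votes cast by the constraint strings of every insn that
   uses one pseudo register, and return the winning class.  */
enum reg_class_sketch
preferred_class (const char *const *constraints, size_t n)
{
  int votes[N_CLASSES] = { 0 };
  int best = CLASS_GENERAL;
  int c;
  size_t i;

  assert (constraints != NULL || n == 0);

  for (i = 0; i < n; i++)
    {
      const char *p;

      for (p = constraints[i]; *p != '\0'; p++)
        switch (*p)
          {
          case 'd': votes[CLASS_DATA]++; break;
          case 'a': votes[CLASS_ADDR]++; break;
          case 'g':                      /* `g' and `r' vote for a */
          case 'r': votes[CLASS_GENERAL]++; break;  /* general register */
          default: break;                /* other letters do not vote */
          }
    }

  /* Strict `>' keeps the lowest-numbered class on a tie (an assumption).  */
  for (c = 0; c < N_CLASSES; c++)
    if (votes[c] > votes[best])
      best = c;
  return (enum reg_class_sketch) best;
}
```

For the constraint strings "=d", "dmKs" and "a", the data class collects two votes against one for the address class, so the sketch returns `CLASS_DATA'.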
- File: internals, Node: Modifiers, Next: No Constraints, Prev: Class Preferences, Up: Constraints
- Constraint Modifier Characters
- ------------------------------
- `='
- Means that this operand is write-only for this instruction: the
- previous value is discarded and replaced by output data.
- `+'
- Means that this operand is both read and written by the instruction.
- When the compiler fixes up the operands to satisfy the constraints, it
- needs to know which operands are inputs to the instruction and which
- are outputs from it. `=' identifies an output; `+' identifies an
- operand that is both input and output; all other operands are assumed
- to be input only.
- `&'
- Means (in a particular alternative) that this operand is written
- before the instruction is finished using the input operands.
- Therefore, this operand may not lie in a register that is used as an
- input operand or as part of any memory address.
- `&' applies only to the alternative in which it is written. In
- constraints with multiple alternatives, sometimes one alternative
- requires `&' while others do not. See, for example, the `movdf' insn
- of the 68000.
- `&' does not obviate the need to write `='.
- `%'
- Declares the instruction to be commutative for this operand and the
- following operand. This means that the compiler may interchange the
- two operands if that is the cheapest way to make all operands fit the
- constraints. This is often used in patterns for addition instructions
- that really have only two operands: the result must go in one of the
- arguments. Here, for example, is how the 68000 halfword-add
- instruction is defined:
-      (define_insn "addhi3"
-        [(set (match_operand:HI 0 "general_operand" "=m,r")
-              (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
-                       (match_operand:HI 2 "general_operand" "di,g")))]
-        ...)
- Note that in previous versions of GNU CC the `%' constraint modifier
- always applied to operands 1 and 2 regardless of which operand it was
- written in. The usual custom was to write it in operand 0. Now it
- must be in operand 1 if the operands to be exchanged are 1 and 2.
- `#'
- Says that all following characters, up to the next comma, are to be
- ignored as a constraint. They are significant only for choosing
- register preferences.
- `*'
- Says that the following character should be ignored when choosing
- register preferences. `*' has no effect on the meaning of the
- constraint as a constraint.
- Here is an example: the 68000 has an instruction to sign-extend a
- halfword in a data register, and can also sign-extend a value by
- copying it into an address register. While either kind of register is
- acceptable, the constraints on an address-register destination are
- less strict, so it is best if register allocation makes an address
- register its goal. Therefore, `*' is used so that the `d' constraint
- letter (for data register) is ignored when computing register
- preferences.
-      (define_insn "extendhisi2"
-        [(set (match_operand:SI 0 "general_operand" "=*d,a")
-              (sign_extend:SI
-               (match_operand:HI 1 "general_operand" "0,g")))]
-        ...)
- File: internals, Node: No Constraints, Prev: Modifiers, Up: Constraints
- Not Using Constraints
- ---------------------
- Some machines are so clean that operand constraints are not required. For
- example, on the Vax, an operand valid in one context is valid in any other
- context. On such a machine, every operand constraint would be `g',
- excepting only operands of ``load address'' instructions which are written
- as if they referred to a memory location's contents but actually refer to its
- address. They would have constraint `p'.
- For such machines, instead of writing `g' and `p' for all the constraints,
- you can choose to write a description with empty constraints. Then you
- write `""' for the constraint in every `match_operand'. Address operands
- are identified by writing an `address' expression around the
- `match_operand', not by their constraints.
- When the machine description has just empty constraints, certain parts of
- compilation are skipped, making the compiler faster.
- File: internals, Node: Standard Names, Next: Pattern Ordering, Prev: Constraints, Up: Machine Desc
- Standard Names for Patterns Used in Generation
- ==============================================
- Here is a table of the instruction names that are meaningful in the RTL
- generation pass of the compiler. Giving one of these names to an
- instruction pattern tells the RTL generation pass that it can use the
- pattern to accomplish a certain task.
- `movM'
- Here M is a two-letter machine mode name, in lower case. This
- instruction pattern moves data with that machine mode from operand 1
- to operand 0. For example, `movsi' moves full-word data.
- If operand 0 is a `subreg' with mode M of a register whose natural
- mode is wider than M, the effect of this instruction is to store the
- specified value in the part of the register that corresponds to mode
- M. The effect on the rest of the register is undefined.
- `movstrictM'
- Like `movM' except that if operand 0 is a `subreg' with mode M of a
- register whose natural mode is wider, the `movstrictM' instruction is
- guaranteed not to alter any of the register except the part which
- belongs to mode M.
- `addM3'
- Add operand 2 and operand 1, storing the result in operand 0. All
- operands must have mode M. This can be used even on two-address
- machines, by means of constraints requiring operands 1 and 0 to be the
- same location.
- `subM3', `mulM3', `umulM3', `divM3', `udivM3', `modM3', `umodM3', `andM3', `iorM3', `xorM3'
- Similar, for other arithmetic operations.
- `andcbM3'
- Bitwise logical-and operand 1 with the complement of operand 2 and
- store the result in operand 0.
- `mulhisi3'
- Multiply operands 1 and 2, which have mode `HImode', and store a
- `SImode' product in operand 0.
- `mulqihi3', `mulsidi3'
- Similar widening-multiplication instructions of other widths.
- `umulqihi3', `umulhisi3', `umulsidi3'
- Similar widening-multiplication instructions that do unsigned
- multiplication.
- `divmodM4'
- Signed division that produces both a quotient and a remainder.
- Operand 1 is divided by operand 2 to produce a quotient stored in
- operand 0 and a remainder stored in operand 3.
- `udivmodM4'
- Similar, but does unsigned division.
- `divmodMN4'
- Like `divmodM4' except that only the dividend has mode M; the divisor,
- quotient and remainder have mode N. For example, the Vax has a
- `divmoddisi4' instruction (but it is omitted from the machine
- description, because it is so slow that it is faster to compute
- remainders by the circumlocution that the compiler will use if this
- instruction is not available).
- `ashlM3'
- Arithmetic-shift operand 1 left by a number of bits specified by
- operand 2, and store the result in operand 0. Operand 2 has mode
- `SImode', not mode M.
- `ashrM3', `lshlM3', `lshrM3', `rotlM3', `rotrM3'
- Other shift and rotate instructions.
- Logical and arithmetic left shift are the same. Machines that do not
- allow negative shift counts often have only one instruction for
- shifting left. On such machines, you should define a pattern named
- `ashlM3' and leave `lshlM3' undefined.
- `negM2'
- Negate operand 1 and store the result in operand 0.
- `absM2'
- Store the absolute value of operand 1 into operand 0.
- `sqrtM2'
- Store the square root of operand 1 into operand 0.
- `ffsM2'
- Store into operand 0 one plus the index of the least significant 1-bit
- of operand 1. If operand 1 is zero, store zero. M is the mode of
- operand 0; operand 1's mode is specified by the instruction pattern,
- and the compiler will convert the operand to that mode before
- generating the instruction.
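The result that an `ffsM2' pattern is expected to produce can be modelled in a few lines of C. This is a reference sketch of the semantics stated above, not machine-description code; the name `ffs_model' and the use of `unsigned int' for the operand are assumptions.

```c
#include <assert.h>
#include <limits.h>

/* Return one plus the index of the least significant 1-bit of X,
   or zero if X is zero.  */
int
ffs_model (unsigned int x)
{
  int pos = 1;

  if (x == 0)
    return 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      pos++;
    }
  assert (pos <= (int) (sizeof (unsigned int) * CHAR_BIT));
  return pos;
}
```

For example, `ffs_model (8)' yields 4, because the only 1-bit of 8 has index 3.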
- `one_cmplM2'
- Store the bitwise-complement of operand 1 into operand 0.
- `cmpM'
- Compare operand 0 and operand 1, and set the condition codes. The RTL
- pattern should look like this:
-      (set (cc0) (minus (match_operand:M 0 ...)
-                        (match_operand:M 1 ...)))
- Each such definition in the machine description, for integer mode M,
- must have a corresponding `tstM' pattern, because optimization can
- simplify the compare into a test when operand 1 is zero.
- `tstM'
- Compare operand 0 against zero, and set the condition codes. The RTL
- pattern should look like this:
- (set (cc0) (match_operand:M 0 ...))
- `movstrM'
- Block move instruction. The addresses of the destination and source
- strings are the first two operands, and both are in mode `Pmode'. The
- number of bytes to move is the third operand, in mode M.
- `cmpstrM'
- Block compare instruction, with operands like `movstrM' except that
- the two memory blocks are compared byte by byte in lexicographic
- order. The effect of the instruction is to set the condition codes.
- `floatMN2'
- Convert operand 1 (valid for fixed point mode M) to floating point
- mode N and store in operand 0 (which has mode N).
- `fixMN2'
- Convert operand 1 (valid for floating point mode M) to fixed point
- mode N as a signed number and store in operand 0 (which has mode N).
- This instruction's result is defined only when the value of operand 1
- is an integer.
- `fixunsMN2'
- Convert operand 1 (valid for floating point mode M) to fixed point
- mode N as an unsigned number and store in operand 0 (which has mode
- N). This instruction's result is defined only when the value of
- operand 1 is an integer.
- `ftruncM2'
- Convert operand 1 (valid for floating point mode M) to an integer
- value, still represented in floating point mode M, and store it in
- operand 0 (valid for floating point mode M).
- `fix_truncMN2'
- Like `fixMN2' but works for any floating point value of mode M by
- converting the value to an integer.
- `fixuns_truncMN2'
- Like `fixunsMN2' but works for any floating point value of mode M by
- converting the value to an integer.
- `truncMN'
- Truncate operand 1 (valid for mode M) to mode N and store in operand 0
- (which has mode N). Both modes must be fixed point or both floating
- point.
- `extendMN'
- Sign-extend operand 1 (valid for mode M) to mode N and store in
- operand 0 (which has mode N). Both modes must be fixed point or both
- floating point.
- `zero_extendMN'
- Zero-extend operand 1 (valid for mode M) to mode N and store in
- operand 0 (which has mode N). Both modes must be fixed point.
- `extv'
- Extract a bit-field from operand 1 (a register or memory operand),
- where operand 2 specifies the width in bits and operand 3 the starting
- bit, and store it in operand 0. Operand 0 must have `SImode'.
- Operand 1 may have mode `QImode' or `SImode'; often `SImode' is
- allowed only for registers. Operands 2 and 3 must be valid for
- `SImode'.
- The RTL generation pass generates this instruction only with constants
- for operands 2 and 3.
- The bit-field value is sign-extended to a full word integer before it
- is stored in operand 0.
- `extzv'
- Like `extv' except that the bit-field value is zero-extended.
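The difference between `extv' and `extzv' can be shown with a C model of a 32-bit word. The sketch assumes that bit 0 is the least significant bit; the text above does not fix the bit numbering, which depends on the target machine. The names `extv_model' and `extzv_model' are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extending extraction of WIDTH bits starting at bit START.  */
uint32_t
extzv_model (uint32_t word, unsigned width, unsigned start)
{
  uint32_t mask;

  assert (width >= 1 && width <= 32 && start <= 32 - width);
  mask = width == 32 ? UINT32_MAX : (((uint32_t) 1 << width) - 1);
  return (word >> start) & mask;
}

/* Sign-extending extraction: the top bit of the field is replicated
   through the rest of the full-word result.  */
int32_t
extv_model (uint32_t word, unsigned width, unsigned start)
{
  uint32_t field = extzv_model (word, width, start);
  uint32_t sign = (uint32_t) 1 << (width - 1);

  /* (field ^ sign) - sign maps a field with its top bit set to a
     negative value; the arithmetic is done in 64 bits to stay in range.  */
  return (int32_t) ((int64_t) (field ^ sign) - (int64_t) sign);
}
```

With the word 0xF0, a 4-bit field starting at bit 4 holds the bits 1111, so `extzv_model' returns 15 while `extv_model' returns -1.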
- `insv'
- Store operand 3 (which must be valid for `SImode') into a bit-field in
- operand 0, where operand 1 specifies the width in bits and operand 2
- the starting bit. Operand 0 may have mode `QImode' or `SImode'; often
- `SImode' is allowed only for registers. Operands 1 and 2 must be
- valid for `SImode'.
- The RTL generation pass generates this instruction only with constants
- for operands 1 and 2.
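A matching C model of `insv' is sketched here for a 32-bit word, again assuming that bit 0 is the least significant bit (the actual numbering is machine-dependent). The name `insv_model' is hypothetical; bits of the value that do not fit in the field are simply discarded in this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Return WORD with the WIDTH-bit field starting at bit START replaced
   by the low WIDTH bits of VALUE.  */
uint32_t
insv_model (uint32_t word, unsigned width, unsigned start, uint32_t value)
{
  uint32_t mask;

  assert (width >= 1 && width <= 32 && start <= 32 - width);
  mask = width == 32 ? UINT32_MAX : (((uint32_t) 1 << width) - 1);
  mask <<= start;
  return (word & ~mask) | ((value << start) & mask);
}
```

Storing 0x12 into the 8-bit field at bit 8 of 0xFFFFFFFF gives 0xFFFF12FF; the bits outside the field are left alone.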
- `sCOND'
- Store zero or nonzero in the operand according to the condition codes.
- Value stored is nonzero iff the condition COND is true. COND is the
- name of a comparison operation expression code, such as `eq', `lt' or
- `leu'.
- You specify the mode that the operand must have when you write the
- `match_operand' expression. The compiler automatically sees which
- mode you have used and supplies an operand of that mode.
- The value stored for a true condition must have 1 as its low bit.
- Otherwise the instruction is not suitable and must be omitted from the
- machine description. You must tell the compiler exactly which value
- is stored by defining the macro `STORE_FLAG_VALUE'.
- `bCOND'
- Conditional branch instruction. Operand 0 is a `label_ref' that
- refers to the label to jump to. Jump if the condition codes meet
- condition COND.
- `call'
- Subroutine call instruction. Operand 1 is the number of bytes of
- arguments pushed (in mode `SImode'), and operand 0 is the function to
- call. Operand 0 should be a `mem' RTX whose address is the address of
- the function.
- `return'
- Subroutine return instruction. This instruction pattern name should
- be defined only if a single instruction can do all the work of
- returning from a function.
- `tablejump'
- `caseM'
- File: internals, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc
- When the Order of Patterns Matters
- ==================================
- Sometimes an insn can match more than one instruction pattern. Then the
- pattern that appears first in the machine description is the one used.
- Therefore, more specific patterns (patterns that will match fewer things)
- and faster instructions (those that will produce better code when they do
- match) should usually go first in the description.
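The first-match rule is easy to picture as a linear search over a table. The C sketch below is an illustration only: `struct pattern_sketch', `recognize_first' and the two predicates are hypothetical stand-ins for real instruction patterns, with one pattern that matches only the value zero and one that matches anything.

```c
#include <assert.h>
#include <stddef.h>

/* A pattern reduced to a predicate for the purpose of the sketch.  */
struct pattern_sketch
{
  int (*matches) (long value);
};

/* A specific pattern: matches only zero.  */
int
matches_zero (long value)
{
  return value == 0;
}

/* A general pattern: matches any value.  */
int
matches_any (long value)
{
  (void) value;
  return 1;
}

/* Return the index of the first pattern in TABLE that matches VALUE,
   or -1 if none does.  Later patterns are never consulted once an
   earlier one has matched.  */
int
recognize_first (const struct pattern_sketch *table, size_t n, long value)
{
  size_t i;

  assert (table != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (table[i].matches (value))
      return (int) i;
  return -1;
}
```

If the table lists `matches_zero' before `matches_any', zero is recognized by the specific pattern. With the order reversed, the general pattern shadows the specific one, which then can never be chosen.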
- In some cases the effect of ordering the patterns can be used to hide a
- pattern when it is not valid. For example, the 68000 has an instruction
- for converting a fullword to floating point and another for converting a
- byte to floating point. An instruction converting an integer to floating
- point could match either one. We put the pattern to convert the fullword
- first to make sure that one will be used rather than the other. (Otherwise
- a large integer might be generated as a single-byte immediate quantity,
- which would not work.) Instead of using this pattern ordering it would be
- possible to make the pattern for convert-a-byte smart enough to deal
- properly with any constant value.
- File: internals, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc
- Interdependence of Patterns
- ===========================
- Every machine description must have a named pattern for each of the
- conditional branch names `bCOND'. The recognition template must always
- have the form
-      (set (pc)
-           (if_then_else (COND (cc0) (const_int 0))
-                         (label_ref (match_operand 0 "" ""))
-                         (pc)))
- In addition, every machine description must have an anonymous pattern for
- each of the possible reverse-conditional branches. These patterns look like
-      (set (pc)
-           (if_then_else (COND (cc0) (const_int 0))
-                         (pc)
-                         (label_ref (match_operand 0 "" ""))))
- They are necessary because jump optimization can turn direct-conditional
- branches into reverse-conditional branches.
- The compiler does more with RTL than just create it from patterns and
- recognize the patterns: it can evaluate arithmetic expression codes when
- constant values for their operands can be determined. As a result,
- sometimes having one pattern can require other patterns. For example, the
- Vax has no `and' instruction, but it has `and not' instructions. Here is
- the definition of one of them:
-      (define_insn "andcbsi2"
-        [(set (match_operand:SI 0 "general_operand" "")
-              (and:SI (match_dup 0)
-                      (not:SI (match_operand:SI
-                               1 "general_operand" ""))))]
-        ""
-        "bicl2 %1,%0")
- If operand 1 is an explicit integer constant, an instruction constructed
- using that pattern can be simplified into an `and' like this:
-      (set (reg:SI 41)
-           (and:SI (reg:SI 41)
-                   (const_int 0xffff7fff)))
- (where the integer constant is the one's complement of what appeared in the
- original instruction).
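The arithmetic behind this simplification can be checked with a short C sketch on 32-bit values. The original operand value 0x8000 used in the note below is an assumption, chosen because its one's complement is the constant shown above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* What an `and not' instruction computes: DEST & ~OP1.  */
uint32_t
and_not_model (uint32_t dest, uint32_t op1)
{
  return dest & ~op1;
}

/* The constant that appears once the `not' of a constant operand has
   been folded: the one's complement of the original operand.  */
uint32_t
folded_constant (uint32_t op1)
{
  return ~op1;
}

/* A plain `and' with the folded constant gives the same result.  */
uint32_t
and_via_fold (uint32_t dest, uint32_t op1)
{
  uint32_t result = dest & folded_constant (op1);

  assert (result == and_not_model (dest, op1));
  return result;
}
```

Here `folded_constant (0x8000)' is 0xffff7fff. Complementing that constant once more recovers 0x8000, the operand an `and not' instruction would need.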
- To avoid a fatal error, the compiler must have a pattern that recognizes
- such an instruction. Here is what is used:
-      (define_insn ""
-        [(set (match_operand:SI 0 "general_operand" "")
-              (and:SI (match_dup 0)
-                      (match_operand:SI 1 "general_operand" "")))]
-        "GET_CODE (operands[1]) == CONST_INT"
-        "*
-      { operands[1]
-          = gen_rtx (CONST_INT, VOIDmode, ~INTVAL (operands[1]));
-        return \"bicl2 %1,%0\";
-      }")
- Whereas a pattern to match a general `and' instruction is impossible to
- support on the Vax, this pattern is possible because it matches only a
- constant second argument: a special case that can be output as an `and not'
- instruction.
- A ``compare'' instruction whose RTL looks like this:
- (set (cc0) (minus OPERAND (const_int 0)))
- may be simplified by optimization into a ``test'' like this:
- (set (cc0) OPERAND)
- So in the machine description, each ``compare'' pattern for an integer mode
- must have a corresponding ``test'' pattern that will match the result of
- such simplification.
- In some cases machines support instructions identical except for the
- machine mode of one or more operands. For example, there may be
- ``sign-extend halfword'' and ``sign-extend byte'' instructions whose
- patterns are
-      (set (match_operand:SI 0 ...)
-           (extend:SI (match_operand:HI 1 ...)))
-
-      (set (match_operand:SI 0 ...)
-           (extend:SI (match_operand:QI 1 ...)))
- Constant integers do not specify a machine mode, so an instruction to
- extend a constant value could match either pattern. The pattern it
- actually will match is the one that appears first in the file. For correct
- results, this must be the one for the widest possible mode (`HImode',
- here). If the pattern matches the `QImode' instruction, the results will
- be incorrect if the constant value does not actually fit that mode.
- Such instructions to extend constants are rarely generated because they are
- optimized away, but they do occasionally happen in nonoptimized compilations.
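A small C sketch shows why matching the narrower pattern gives wrong results. It models sign-extension from 8 and from 16 bits on a constant. The function names are hypothetical, and the constant 300 in the note below is merely an example of a value that fits `HImode' but not `QImode'.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 8 bits of C, as a `QImode' extend would.  */
int32_t
extend_as_qi (int32_t c)
{
  uint32_t low = (uint32_t) c & 0xffu;
  int32_t result = (int32_t) (low ^ 0x80u) - 0x80;

  assert (result >= -128 && result <= 127);
  return result;
}

/* Sign-extend the low 16 bits of C, as an `HImode' extend would.  */
int32_t
extend_as_hi (int32_t c)
{
  uint32_t low = (uint32_t) c & 0xffffu;
  int32_t result = (int32_t) (low ^ 0x8000u) - 0x8000;

  assert (result >= -32768 && result <= 32767);
  return result;
}
```

The constant 300 survives `extend_as_hi' unchanged but comes out of `extend_as_qi' as 44, since only its low eight bits are kept. Both functions agree on values that fit in a byte, such as -1.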
- File: internals, Node: Jump Patterns, Next: Peephole Definitions, Prev: Dependent Patterns, Up: Machine Desc
- Defining Jump Instruction Patterns
- ==================================
- GNU CC assumes that the machine has a condition code. A comparison insn
- sets the condition code, recording the results of both signed and unsigned
- comparison of the given operands. A separate branch insn tests the
- condition code and branches or not according to its value. The branch insns
- come in distinct signed and unsigned flavors. Many common machines, such
- as the Vax, the 68000 and the 32000, work this way.
- Some machines have distinct signed and unsigned compare instructions, and
- only one set of conditional branch instructions. The easiest way to handle
- these machines is to treat them just like the others until the final stage
- where assembly code is written. At this time, when outputting code for the
- compare instruction, peek ahead at the following branch using `NEXT_INSN
- (insn)'. (The variable `insn' refers to the insn being output, in the
- output-writing code in an instruction pattern.) If the RTL says that it is an
- unsigned branch, output an unsigned compare; otherwise output a signed
- compare. When the branch itself is output, you can treat signed and
- unsigned branches identically.
- The reason you can do this is that GNU CC always generates a pair of
- consecutive RTL insns, one to set the condition code and one to test it,
- and keeps the pair inviolate until the end.
- To go with this technique, you must define the machine-description macro
- `NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no compare
- instruction is superfluous.
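The peek-ahead technique can be sketched in C with a toy insn chain. Everything here is hypothetical: `struct insn_sketch' stands in for the real insn chain, its `next' field plays the role of `NEXT_INSN (insn)', and the signedness of the branch is stored directly rather than read from the branch's RTL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn kinds; real code would examine the branch's RTL instead.  */
enum insn_kind_sketch
{
  KIND_COMPARE,
  KIND_BRANCH_SIGNED,
  KIND_BRANCH_UNSIGNED,
  KIND_OTHER
};

struct insn_sketch
{
  enum insn_kind_sketch kind;
  struct insn_sketch *next;     /* stands in for NEXT_INSN (insn) */
};

/* Called while outputting the compare INSN: return nonzero if an
   unsigned compare should be output, zero for a signed compare.  */
int
compare_is_unsigned (const struct insn_sketch *insn)
{
  const struct insn_sketch *next;

  assert (insn != NULL && insn->kind == KIND_COMPARE);
  next = insn->next;
  return next != NULL && next->kind == KIND_BRANCH_UNSIGNED;
}
```

The output code for the compare pattern would consult such a function to pick between the machine's signed and unsigned compare instructions; the branch pattern itself needs no such test.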
- Some machines have compare-and-branch instructions and no condition code.
- A similar technique works for them. When it is time to ``output'' a
- compare instruction, record its operands in two static variables. When
- outputting the branch-on-condition-code instruction that follows, actually
- output a compare-and-branch instruction that uses the remembered operands.
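A minimal C sketch of the remembered-operands technique follows. The function names, the use of operand strings, and the assembler syntax `cbCOND op0,op1,label' are all assumptions made for illustration; a real machine description would keep the operand RTL and use its own mnemonics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Operands of the most recent compare, remembered for the branch.  */
static const char *saved_op0;
static const char *saved_op1;

/* ``Output'' a compare: emit nothing, just record the operands.  */
void
output_compare_sketch (const char *op0, const char *op1)
{
  saved_op0 = op0;
  saved_op1 = op1;
}

/* Output the branch that follows as one compare-and-branch instruction
   written into BUF.  Returns what snprintf returns.  */
int
output_branch_sketch (char *buf, size_t size,
                      const char *cond, const char *label)
{
  assert (saved_op0 != NULL && saved_op1 != NULL);
  assert (strlen (cond) > 0);
  return snprintf (buf, size, "cb%s %s,%s,%s",
                   cond, saved_op0, saved_op1, label);
}
```

Calling `output_compare_sketch ("r1", "r2")' and then `output_branch_sketch (buf, sizeof buf, "eq", "L5")' leaves the text `cbeq r1,r2,L5' in the buffer. This works only because each compare is immediately followed by the branch that uses it.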
- It also works to define patterns for compare-and-branch instructions. In
- optimizing compilation, the pair of compare and branch instructions will be
- combined according to these patterns. But this does not happen if
- optimization is not requested. So you must use one of the solutions above
- in addition to any special patterns you define.