123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156 |
- 1. Better optimization.
- * More cse
- The techniques for doing full global cse are described in the
- red dragon book. It is likely to be slow and use a lot of memory,
- but it might be worth offering as an additional option.
- It is probably possible to extend cse to a few very frequent cases
- without so much expense.
- For example, it is not very hard to handle cse through if-then
- statements with no else clauses. Here's how to do it. On reaching a
- label, notice that the label's use-count is 1 and that the last
- preceding jump jumps conditionally to this label. Now you know it
- is a simple if-then statement. Remove from the hash table
- all the expressions that were entered since that jump insn
- and you can continue with cse.
- It is probably not hard to handle cse from the end of a loop
- around to the beginning, and a few loops would be greatly sped
- up by this.
- * Iteration variables and strength reduction.
- The red dragon book describes standard techniques for these kinds
- of loop optimization. But be careful! These optimization techniques
- don't always make the code better. You need to avoid performing
- the standard transformations unless they are greatly worth while.
- In many common cases it is possible to deduce that an iteration
- variable is always positive during the loop. This information
- may make it possible to use decrement-and-branch instructions
- whose branch conditions are inconvenient. For example, the 68000
- `dbra' instruction branches if the value was not equal to zero.
- Therefore, it is not applicable to `for (i = 10; i >= 0; i--)'
- unless the compiler can know that I will never be negative
- before it is decremented.
- * Using constraints on values.
- Many operations could be simplified based on knowledge of the
- minimum and maximum possible values of a register at any particular time.
- These limits could come from the data types in the tree, via rtl generation,
- or they can be deduced from operations that are performed. For example,
- the result of an `and' operation one of whose operands is 7 must be in
- the range 0 to 7. Compare instructions also tell something about the
- possible values of the operand, in the code beyond the test.
- Value constraints can be used to determine the results of a further
- comparison. They can also indicate that certain `and' operations are
- redundant. Constraints might permit a decrement and branch
- instruction that checks zeroness to be used when the user has
- specified to exit if negative.
- * Smarter reload pass.
- The reload pass as currently written can reload values only into registers
- that are reserved for reloading. This means that in order to use a
- register for reloading it must spill everything out of that register.
- It would be straightforward, though complicated, for reload1.c to keep
- track, during its scan, of which hard registers were available at each
- point in the function, and use for reloading even registers that were
- free only at the point they were needed. This would avoid much spilling
- and make better code.
- * Change the type of a variable.
- Sometimes a variable is declared as `int', it is assigned only once
- from a value of type `char', and then it is used only by comparison
- against constants. On many machines, better code would result if
- the variable had type `char'. If the compiler could detect this
- case, it could change the declaration of the variable and change
- all the places that use it.
- * Order of subexpressions.
- It might be possible to make better code by paying attention
- to the order in which to generate code for subexpressions of an expression.
- * Better code for switch statements.
- If a switch statement has only a few cases, a sequence of conditional
- branches is generated for it, rather than a jump table. It would
- be better to output a binary tree of branches.
- * Distributive law.
- The C expression *(X + 4 * (Y + C)) compiles better on certain
- machines if rewritten as *(X + 4*C + 4*Y) because of known addressing
- modes. It may be tricky to determine when, and for which machines, to
- use each alternative.
- * Jump-execute-next.
- Many recent machines have jumps which optionally execute the following
- instruction before the instruction jumped to, either conditionally or
- unconditionally. To take advantage of this capability requires a new
- compiler pass that would reorder instructions when possible. After
- reload may be a good place for it.
- On some machines, the result of a load from memory is not available
- until after the following instruction. The easiest way to support
- these machines is to output each RTL load instruction as two assembler
- instructions, the second being a no-op. Putting useful instructions
- after the load instructions may be a similar task to putting them
- after jump instructions.
- * Pipeline scheduling.
- On many machines, code gets faster if instructions are reordered
- so that pipelines are kept full. Doing the best possible job of this
- requires knowing which functional units each kind of instruction executes
- in and how long the functional unit stays busy with it. Then the
- goal is to reorder the instructions to keep many functional units
- busy but never feed them so fast they must wait.
- 2. Simpler porting.
- Right now, describing the target machine's instructions is done
- cleanly, but describing its addressing mode is done with several
- ad-hoc macro definitions. Porting would be much easier if there were
- an RTL description for addressing modes like that for instructions.
- Tools analogous to genflags and genrecog would generate macros from
- this description.
- There would be one pattern in the address-description file for each
- kind of addressing, and this pattern would have:
- * the RTL expression for the address
- * C code to verify its validity (since that may depend on
- the exact data).
- * C code to print the address in assembler language.
- * C code to convert the address into a valid one, if it is not valid.
- (This would replace LEGITIMIZE_ADDRESS).
- * Register constraints for all indeterminates that appear
- in the RTL expression.
- 3. Other languages.
- Front ends for Pascal, Fortran, Algol, Cobol, Modula-2 and Ada are
- desirable.
- Pascal, Modula-2 and Ada require the implementation of functions
- within functions. Some of the mechanisms for this already exist.
- 4. Generalize the machine model.
- Some new compiler features may be needed to do a good job on machines
- where static data needs to be addressed using base registers.
- Some machines have two stacks in different areas of memory, one used
- for scalars and another for large objects. The compiler does not
- now have a way to understand this.
|