TODO.txt 15 KB

  1. TODO:
  2. - Make another test program that will test all operations carefully, one by
  3. one, always only using the operations that have already been tested before
  4. (supposing these are working correctly). This should help with making
  5. bugless frontends.
  6. - add profiling (randomly sample lines with debug info)
  7. - COMUN PHASE 2: after self hosted implementation start working on new version
  8. incorporating proposed changes, mainly make the bytecode suck less!
  9. - Maybe for phase 2: simplify the file include system? As related to the issue
  10. of writing the "include" preprocessor that takes all files and resolves the
  11. includes into one big file -- now it either requires a lot of RAM or
  12. an inelegant solution of feeding the files in many passes. Would be ideal if
  13. including could be solved just by appending all files together (now it can't
  14. be done because some pointers may need to be defined before including
  15. some file).
  16. ============================================================================
  17. - COMUN SHELL: part of the future PD computer, now it could be simply written
  18. in C just to define the interface, features:
  19. - running programs (no multitask, just like DOS)
  20. - some basic file operations, support for temporary RAM files?
  21. - basic stuff like getting system time, its name, specs, shutdown, restart
  22. etc. (IRC suggested: command to increment any pointer by stack top of
  23. type env. 0)
  24. - scriptability ofc (or maybe rather not), possibility to execute a script
  25. from file?
  26. - the NEW IDEA:
  27. comun shell will be the shell, the OS and the I/O library, all in one
  28. The shell is a program that reads input (text commands) and
  29. "does things", like switch mode (no screen, text screen, graphics
  30. screen, ...), play sound, write out something, run a program,
  31. list files, communicate over network, get capabilities, ...
  32. In user shell input simply comes from keyboard. As a library the program
  33. simply does the same by sending input characters to the shell.
  34. Maybe there could be a global buffer that will hold latest N characters
  35. output by previous program => could be used for non-multitasking
  36. pipelining.
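The global buffer idea above (holding the latest N characters output by the previous program) could be a plain ring buffer. A minimal sketch in C, assuming a fixed N; the names OutBuffer, outBufferPut, outBufferGet and the size 8 are made up for illustration and are not part of any existing comun shell code:

```c
#include <stddef.h>

#define OUT_BUFFER_SIZE 8 /* N, hypothetical; a real shell would use more */

typedef struct
{
  char data[OUT_BUFFER_SIZE];
  size_t next;  /* index at which the next character will be written */
  size_t count; /* number of valid characters, at most OUT_BUFFER_SIZE */
} OutBuffer;

/* Records one output character, overwriting the oldest one when full. */
void outBufferPut(OutBuffer *b, char c)
{
  b->data[b->next] = c;
  b->next = (b->next + 1) % OUT_BUFFER_SIZE;

  if (b->count < OUT_BUFFER_SIZE)
    b->count++;
}

/* Returns the i-th stored character, 0 being the oldest one still held;
   returns 0 if i is out of range. */
char outBufferGet(const OutBuffer *b, size_t i)
{
  if (i >= b->count)
    return 0;

  size_t start = (b->next + OUT_BUFFER_SIZE - b->count) % OUT_BUFFER_SIZE;
  return b->data[(start + i) % OUT_BUFFER_SIZE];
}
```

The next program would then read its "piped" input with outBufferGet instead of from the keyboard; no multitasking is needed because the buffer outlives the program that filled it.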
  37. ============================================================================
  38. - BC: comparison instrs like greater could prolly be dropped, they can be
  39. replaced by "swap, less comparison"
  40. - possible new syntax for pointers:
  41. - $ptr1:ptr2 instead of $ptr1>ptr2
  42. - $ptr1=ptr2, $ptr1>ptr2, $ptr1<ptr2 for pointer comparison?
  43. - Make some kind of source code simplifier that pre-optimizes a plain text
  44. comun source code? E.g. just regex-style detects functions that are just
  45. constants and replaces them in the source etc. It could allow compilation
  46. of larger source code on weaker computers.
  47. - for comun implementation MAKE A BETTER STRING PSEUDOHASH FUNCTION, this one
  48. fails too much, mostly at strings that are 1 or 2 characters longer than
  49. the max string length, possible improvement: first store n LAST (not first)
  50. chars of the identifier, as the beginnings of identifiers are often the
  51. same ("CMN_getCurrent..." etc.), TEST this on some huge dataset of
  52. identifiers
  53. - uxn interpreter/compiler
  54. - consider: allowing more interactivity between type envs., e.g. adding a
  55. 32 bit value to a pointer in type env. 8
  56. - maybe logical xor would better be ||! than |!!? the latter looks like double
  57. negation or something
  58. - regexp library
  59. - proposed operator: $:, pop1 X, N, sets Nth value below stack top to X
  60. (dual operator to $)
  61. - markdown (simplified) library
  62. - preserve also label names?
  63. - port comun onto other langs by writing the comun bytecode translator to the
  64. language itself in the language itself, i.e. for example python program
  65. that translates comun BC to python
  66. - example program: arrays (with minilib)
  67. - make a tiny toy IDE using SAF
  68. - test out popping vs non-popping variants of various commands (e.g. pointer
  69. ones), see if errors occur where they should
  70. - add warning on ops that do nothing? (e.g. $<1) maybe add a function that
  71. says instr. does nothing, can also be used for optim.
  72. - program: raycasting
  73. - directive to hint on inlining a function (making it kind of a "macro"?)
  74. - directive hinting on minimum stack size
  75. - more optimizations
  76. possible bytecode optimizations:
  77. - remove instructions that do nothing (like no-pop transfer to same env)
  78. - remove NOPs while recomputing addresses <--- DONE
  79. - sequences like "MUC 1, DIC 1" can just be eliminated <--- KINDA DONE (but we can detect more sequences)
  80. - merge multiple pointer increments or pops to a single increment by a constant
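The "remove instructions that do nothing" optimization listed above might look roughly like the sketch below. This is only an illustration under loud assumptions: the Instr struct and the OP_ names are hypothetical and much simpler than the real comun bytecode, only the popping variants (which replace the stack top) are considered, and jump addresses are NOT recomputed here, which a real pass would have to do.

```c
#include <stddef.h>

/* Hypothetical simplified instruction, the real bytecode format differs. */
typedef enum { OP_NOP, OP_ADC, OP_SUC, OP_MUC, OP_DIC, OP_OTHER } Opcode;

typedef struct
{
  Opcode op;
  unsigned int constant;
} Instr;

/* Returns 1 if the instruction leaves the stack top unchanged. */
int instrDoesNothing(Instr i)
{
  switch (i.op)
  {
    case OP_NOP:
      return 1;

    case OP_ADC: /* adding 0 */
    case OP_SUC: /* subtracting 0 */
      return i.constant == 0;

    case OP_MUC: /* multiplying by 1 */
    case OP_DIC: /* dividing by 1 */
      return i.constant == 1;

    default:
      return 0;
  }
}

/* Removes do-nothing instructions in place, returns the new length. */
size_t removeUseless(Instr *code, size_t length)
{
  size_t out = 0;

  for (size_t in = 0; in < length; ++in)
    if (!instrDoesNothing(code[in]))
      code[out++] = code[in];

  return out;
}
```

Keeping the test in a separate instrDoesNothing function matches the note above that such a function could also be reused for the "op does nothing" warning.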
  81. TODO/CONSIDER IN FUTURE VERSIONS:
  82. - SUGGESTION for comun next version (suggested by two people now): allow
  83. adding values to pointers from different TEs (i.e. for example shift TE8
  84. ptr by the stack top value from TE0) -- in BC this could be done by just
  85. adding IC to PAX instruction that would say from which TE the value should
  86. be taken. Just make a nice comun syntax for it.
  87. - program that translates brainfuck to comun
  88. - block comments? could be between ## and ##
  89. - since bit shifts were added, change optimization of mult by pow. of 2 to
  90. a bit shift!
  91. - ADD ACTUAL BIT SHIFT OPERATIONS AND INSTRUCTIONS. Yes, they are needed,
  92. because shifts by variable size can't be easily encoded.
  93. - run on Pokitto etc., test even Arduboy
  94. - make SAF programs with comun
  95. - remove unused functions and labels
  96. - add bit shift operations?!?! no new instructions would be needed (they
  97. translate to MUC/DIC). Normal mul/div are hard to convert to shifts. OR
  98. maybe find a way to optimize e.g. CON 2, MUX to other two instructions with
  99. MUC -- this could be done by making CON popping, i.e. normally CON would be
  100. used more like CON', and CON' 2, MUX could be translated to MUC 2, CON 2
  101. - <-- command to read string from input? Though this isn't nearly as simple
  102. as -->, maybe just don't do this.
  103. - switch statement in future version? would make faster code, but compiler
  104. can do something akin to switch during if optimization (this would need a new
  105. instruction in bytecode). Switch should be just an extension of if branch,
  106. maybe like this:
  107. condition ?
  108. # else
  109. ;
  110. # for condition = 2
  111. ;
  112. # for condition = 1
  113. ;
  114. # for condition = 0
  115. .
  116. As for bytecode: there could be an instruction "jump by" given step, then
  117. a multibranch if would just be this instruction followed by constant jumps
  118. to individual labels, like:
  119. <JUMP BY>
  120. <JUMP TO CASE 0 ADDR>
  121. <JUMP TO CASE 1 ADDR>
  122. ...
  123. - maybe remove logical functions (i.e. ||, &&, ...), they don't serve much
  124. purpose (no short circuit as in C)
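The "jump by" instruction proposed above for the switch statement can be tried out with a toy interpreter. The sketch below is hypothetical: the Ins struct, the I_ opcodes and the rule that an out-of-range condition lands on a final "else" slot are all assumptions made for this example, nothing here is in the bytecode spec.

```c
#include <stddef.h>

typedef enum { I_JUMP_BY, I_JUMP, I_SET, I_END } Op; /* hypothetical opcodes */

typedef struct
{
  Op op;
  int arg;
} Ins;

/* Runs the program with the given switch condition, returns the value set
   by the branch that was taken. */
int run(const Ins *prog, int condition)
{
  int result = -1;
  size_t pc = 0;

  for (;;)
  {
    Ins i = prog[pc];

    switch (i.op)
    {
      case I_JUMP_BY: /* arg = number of cases, else slot follows them */
      {
        int step = (condition >= 0 && condition < i.arg) ? condition : i.arg;
        pc += 1 + (size_t) step; /* land on the constant jump for this case */
        break;
      }

      case I_JUMP: pc = (size_t) i.arg;        break;
      case I_SET:  result = i.arg; pc++;       break;
      case I_END:  return result;
    }
  }
}

/* Cases 0, 1, 2 and else, as in the comun example above. */
static const Ins example[] =
{
  {I_JUMP_BY, 3},             /* 0 */
  {I_JUMP, 5},                /* 1: to case 0 */
  {I_JUMP, 7},                /* 2: to case 1 */
  {I_JUMP, 9},                /* 3: to case 2 */
  {I_JUMP, 11},               /* 4: to else */
  {I_SET, 100}, {I_JUMP, 12}, /* 5, 6: case 0 body */
  {I_SET, 101}, {I_JUMP, 12}, /* 7, 8: case 1 body */
  {I_SET, 102}, {I_JUMP, 12}, /* 9, 10: case 2 body */
  {I_SET, 999},               /* 11: else body */
  {I_END, 0}                  /* 12 */
};
```

Dispatch costs one computed jump plus one constant jump no matter how many cases there are, which is where the speedup over a chain of ifs would come from.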
  125. DONE:
  126. - self hosted: in TE 0 pushing -1 and +xffffffff creates the same BC! By
  127. current BC spec the latter should probably push half and then ADC the rest,
  128. OR specify that CON always pushes UNSIGNED value and implement -1 as CON 0;
  129. SUC 1.
  130. - BUG: -O3 makes comunpress stuck in an infinite loop? doesn't even print
  131. anything
  132. - IN SELF HOSTED IMPLEMENTATION: possibly split the impl. into multiple files
  133. (general, interpreter, compiler, optimized, ...) so that we can make a
  134. MINIMAL COMPILER that is able to compile itself even on low-RAM devices
  135. (a full comun including interpreter and everything may eventually be over
  136. 30 kb which might not fit e.g. on Pokitto).
  137. - programs: bytebeat
  138. - BUG: imagelib testing program transpiled to comun doesn't work!
  139. - INCLUDE ISSUE: if two libraries included at the same level both include the
  140. same library, both will be included and cause name collisions! Includes
  141. should probably always behave as "include once" (is there any scenario in
  142. which we'd want to include the same lib twice? not even preprocessor
  143. templates need this as preproc can simply generate the same code twice in
  144. a loop) -- prolly change this in spec and implementation.
  145. - in comun to comun transpile maybe try to detect what could be a string
  146. literal (DONE) and translate it so, plus try to detect what could be a "-->"
  147. command and translate it so
  148. - create a comun mini library in example programs
  149. - make the C transpiled output nicer (there are weird literal formats etc.)
  150. - add compilation to comun (i.e. when loading bytecode, we can turn it back
  151. to comun)
  152. - fix/improve the vim highlighter, KINDA
  153. - try to make preprocessing stage 1 code smaller (reduce unnecessary spaces
  154. etc.)
  155. - add new possible value to DES that would indicate start of string literal?
  156. ^ rather not now, there are not many free DES values left, plus there would
  157. likely have to be two values (string start and end), plus we would again
  158. make everything more complex... string literals can be guessed just from the
  159. instructions alone anyway
  160. - specify minimum stack size?
  161. - option for non-minimized preprocessor output (can be useful for debugging
  162. or just generating human readable sources)
  163. - raised error param in IC
  164. - test program for gotos
  165. - implement CLI arguments
  166. - test: very big program
  167. - BUG: goto test with -O3 shits itself
  168. - expand big program
  169. - function for basic sanity check of bytecode
  170. - hash collisions happen, e.g. SAF_loop and SAF_COLOR_YELLOW <-- FIXED
  171. - make an uber test, a shell script that tries all the test programs with
  172. different optimization levels etc.
  173. - change findExternalFunc to just findFunc and allow also searching for defined
  174. functions + make a function in compiler for calling such func
  175. - Consider 64 bit support? Currently only 32 bit is supported due to using
  176. uint32_t e.g. in interpreterGetXY etc.
  177. - bash script that takes comun program and makes a syntax-highlighted HTML
  178. - BEFORE RELEASE: try to make small executable, currently smallest one seems
  179. to be produced by gcc -Os, also try to compress it with gzexe, AND make
  180. a statically linked executable and see how much that one takes
  181. - add beautify and minify options (can just use the tokenizer maybe), maybe
  182. create a Formatter "class" that does this automatically, can be used in
  183. preprocessor to minimize the underlying code so that the resulting 1st
  184. stage preprocessing bytecode is smaller
  185. - add measure option (-m, -M ?) that runs the program and writes how many
  186. steps it took, how many symbols were needed to store, the highest address
  187. touched in every type env, bytecode size (before and after optim) etc.
  188. - rename "variables" to "pointers" in source code
  189. - focus on safety between unsigned <-> signed conversions, simple cast is
  190. probably not super portable
  191. - Check out the casts from int64_t * to uint64_t * -- prolly not OK, fix.
  192. ^ dunno maybe it's actually OK
  193. - make doxyfile and test
  194. - unify names in the comun.h library
  195. - minicomun.h: extremely simple minicomun pure text interpretation
  196. - change interpreter to incorporate the separate 0 type environment!
  197. - function to estimate the memory needed for type envs and pointers from
  198. bytecode, use in interpreterInit
  199. - create syntax highlighter for vim
  200. - gotos!
  201. - somehow handle reporting correct error position in code with
  202. includes/preprocess. (with includes push the pos on stack, with prep. prolly
  203. can't easily do this, maybe just don't report pos.)
  204. - inline functions whose bytecode is same or shorter than the shortest call
  205. of that func :) but this shouldn't be default because it drops the info
  206. about func (e.g. bad for transpile), make an option like -O2
  207. - interpreter doesn't have all instructions implemented (those that never
  208. get generated now)
  209. - program "$3" segfaults (should return interpretation error)
  210. - need to also add string output instruction? dunno how to make it with normal
  211. instructions if its non-popping <-- nope, changed specs of string output
  212. - possibly add halt (whole program) and return (from function) commands? halt
  213. could use the END instruction -- the last END instruction would have to have
  214. IC = 0, otherwise 1. <-- done with jump
  215. - if no iofunction provided, ignore it (if interpreter->iofunction == 0 ...)
  216. - add option which enables special external function that cause interpreter
  217. to do various things, e.g. print debug info? something like a small
  218. built-in library. Not sure if this is a good idea tho. <- RATHER NOT
  219. - Test all the sign stuff (especially pushing negative literal) on 64 bit CPU!
  220. - in a sequence of NOPs (and DESs etc.) add jump instead of the first one to skip all the NOPs, easy to do <- BAD IDEA because not well formed
  221. - Specify that the minimum size of type env 0 should be e.g. 16 bits?
  222. Otherwise many programs can't simply be thought of as portable because env
  223. 0 may in theory be just 2 or something.
  224. - C transpile: throw error if goto jumps out of a function
  225. - add CLI option to run with debug (-d?)
  226. - maybe separate comun.c and different frontends, i.e. frontend_c.h,
  227. frontend_py.h etc.
  228. - bytecode optimization
  229. - try if everything works if we increase pseudohash size
  230. - add goto test to main comun test!
  231. - maybe add pointer comparison, like $ptr1=ptr2, is useful for stopping
  232. pointers (maybe returns 0 on equality, 1 if ptr1 > ptr2 else 2?)
  233. - test the !. command in general tests!
  234. - maybe remove the jump offset instrs? they're not really used
  235. - add pointer comparison ($p1=p2) to tests!
  236. - with runtime errors report the number of steps of interpreter
  237. - Throw error (NOT SUPPORTED) when trying to push literal outside 32bit range
  238. (can't be done because internally we use int32_t)
  239. - add convenience function to comun.h that just takes comun string and
  240. interprets it
  241. - self hosted comun: add stack trace to error reports.
  242. - Add debug info to bytecode! Maybe like this: make a new DES type; when the
  243. constant in this is let's say 0, this marks start of a new line in source
  244. code (i.e. no need to record actual line numbers in the instr, its enough
  245. to just count number of these markers since BC start). However an issue is
  246. how to handle different source code files. (now done in BC spec)
  247. - WTF, it seems compiler/interpreter don't take into account sign extension
  248. with signed ops, also if we fix this C transpiler has to be fixed (the
  249. constToC func) to deal with this!
  250. WIDER PROBLEM: All the things with storing consts with taking into account
  251. sign extension is a mess, we'd have to store all constants with 0 at the
  252. beginning because we don't know by what operation (sign or unsig) they'll
  253. be used. We could just say fuck it and only store unsigned consts, but what
  254. if we e.g. need to store const. -1 in type env. 0 in which we don't know
  255. number bit width?!?!?!?! Possible solution:
  256. Just store unsigned bits and only at maximum as many as needed by given
  257. type env. (easily known from instruction), AND for type env. zero just
  258. suppose bit width 32 -- this won't allow for storing some values (e.g. those
  259. outside 32 bit range, or those above range of signed 32 int for signed ops),
  260. but will probably mostly work. I.e. even if int has 64 bits on some
  261. platform, -1 will be stored as 0xffffffff in BC and CMN_instrGetConstSigned
  262. will correctly return -1.
  263. Also mention this in limitations in README.
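The constant storing solution described above (store only unsigned bits, suppose 32 bits for type env. 0, sign extend on reading) can be sketched as follows. This is NOT the actual CMN_instrGetConstSigned code; constToStored and constToSigned are hypothetical helper names and the bits parameter (1 to 32) stands for the bit width assumed for the given type env.

```c
#include <stdint.h>

/* Returns a mask with the lowest `bits` bits set, bits must be 1 to 32. */
static uint32_t lowBitsMask(unsigned int bits)
{
  return bits >= 32 ? 0xffffffffu : ((1u << bits) - 1u);
}

/* Converts a possibly negative value to the unsigned bits that get stored
   in bytecode (conversion to uint32_t is modulo 2^32, well defined in C). */
uint32_t constToStored(int64_t value, unsigned int bits)
{
  return ((uint32_t) value) & lowBitsMask(bits);
}

/* Interprets the lowest `bits` bits of a stored constant as a two's
   complement signed number, independently of the platform's int size. */
int64_t constToSigned(uint32_t stored, unsigned int bits)
{
  uint32_t value = stored & lowBitsMask(bits);
  uint32_t signBit = 1u << (bits - 1);

  if (value & signBit) /* negative: subtract 2^bits */
    return (int64_t) value - ((int64_t) signBit * 2);

  return (int64_t) value;
}
```

With this, -1 in type env. 0 is stored as 0xffffffff and reads back as -1 even where int has 64 bits; values outside the 32 bit range still can't be represented, which is the limitation to mention in README.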