Version 6.1
* Support for newer Pin versions, up to Pin 2.14-71313 / PinPlay 2.1
* Support GCC 4.9
* Integrate the Cheetah cache models
* Remove legacy Graphite core models simple, magic and iocoom
* A new dumpstats.py option (-c|--config) prints configuration parameters one-per-line
* Numerous bug fixes and improvements

Version 6.0
* Add Instruction Window-Centric core model (internal name: ROB core model, use rob.cfg)
* Add SMT support (IWC/ROB core model only, use smt*.cfg)
* Add SniperLite, a fast cache-only model (use nehalem-lite.cfg)
* Add DRAM cache prefetcher
* Add support for non-standard SQLite paths through the SQLITE_PATH environment variable
* Add sniperdiff.py for easy diff-ing of Sniper results and configurations
* Branch predictor accuracy improvements
* New roi-icount script to more easily specify fast-forward/warmup/detailed lengths by instruction count
* Faster mode for running single-program / single-threaded PinBall (--pinball-non-sift)
* Support for newer Pin versions, up to Pin 2.13.65163 / PinPlay 1.3
* Numerous bug fixes and improvements

Version 5.3
* Fast, cache-only one-IPC timing model (-c cacheonly)
* Add energystats script to provide runtime energy counters
* Add Python-based thread scheduler infrastructure and example
* Add support for Query-Based Selection (Jaleel, MICRO 2010)
* Add support for plotting Bottle graphs (Du Bois, OOPSLA 2013)
* Memory tracker infrastructure to measure cache hit rates by allocation site
* Support for McPAT 1.0
* Various bug fixes and improvements

Version 5.2
* Configurable coherency protocol (MSI/MESI/MESIF), made MESI the default
* Add more cache statistics: LRU stack distance histogram, LLC miss latency breakdown
* Implement auxiliary tag directories (ATD) to track constructive/destructive interference in shared caches
* Implement 2-level TLB hierarchy with Nehalem configuration
* New hooks HOOK_APPLICATION_ROI_{BEGIN,END}, called even when ROI markers are not used directly; these hooks can be used to trigger ROI from a script
* Improved stop-by-icount script to support ROI-relative warmup and detailed lengths
* On-demand routine stack printer: configure routine_tracer/type=ondemand, then send a SIGUSR1 to Sniper to get a per-thread application backtrace
* Emulate leaf 11 of the cpuid instruction to pass topology information to runtimes (used by Intel OpenMP)
* Emulation of sched_* system calls, gettimeofday replacement, cpuid in SIFT mode
* Improve handling of LD_LIBRARY_PATH: use SNIPER_SIM_LD_LIBRARY_PATH for the simulator, SNIPER_APP_LD_LIBRARY_PATH for the application
* Re-implemented BigSmall scheduler to use thread affinity calls rather than the low-level (and error-prone) moveThread API
* sim.thread Python interface to interact with threads (get num threads, get appid, get/set affinity)
* Use newest Pin version 2.13.61206
* Numerous bug fixes and improvements

Version 5.1
* New Suggestions for Optimization visualization (--viz-aso)
* KCacheGrind-compatible output for profiling simulated applications (--profile)
* Roaming (equal-time) scheduler allowing for thread migrations (scheduler/type=roaming)
* Support for newest Pin version 2.12.58423
* Various bugfixes and improvements

Version 5.0
* Periodic sampling infrastructure
* Extensible per-thread statistics infrastructure
* Routine tracing infrastructure and per-function statistics
* NUCA cache model
* Distributed tag directories
* sim.mem Python module for reading application memory
* Various other improvements and bugfixes

Version 4.2
* Various accuracy fixes for Nehalem core model
* Add cache replacement policies: NRU, MRU, NMRU, PLRU, S-RRIP, Random
* Add statistical DRAM performance model
* Add syscall enter/exit hooks
* Add topology view to visualization
* Speed up McPAT by caching architecture-specific CACTI results
* Fixes to running multiple multi-threaded workloads
* Multi-programmed mode: end simulation at first/last program end, optional trace/application restart
* PinPlay support

Version 4.1
* Visualization support (--viz)
* Minor cleanups and bug fixes

Version 4.0
* Thread migration and scheduler support
* Pinned (round-robin), static, random thread schedulers
* Heterogeneous configuration files with tags
* Configurable address2set hash functions for non-power-of-two-sized caches
* Various prefetcher improvements
* DRAM cache model
* One-IPC fast-forward model
* Fault injection framework
* New SQLite3-based statistics format
* ROI support for SIFT
* Support for MPI applications (shared-memory backend)
* Limited support for Jikes/DaCapo benchmarks
* Use newest Pin 2.12.53271
* Add script for generating topology images
* Preserve history in Git repository
* Many cleanups and bugfixes

Version 3.07
* Prefetcher improvements, add global history buffer-based prefetcher
* HOOK_PERIODIC_INS: Instruction-based periodic callback
* Implement CLONE_CHILD_CLEARTID syscall interface
* Add example scripts for periodic statistics, periodic McPAT, simulating limited iteration counts
* Support for Pin 2.12
* Fixes to Python environment
* Various bugfixes

Version 3.06
* Fix modeled size of network messages
* Build fixes for 32-bit, compiler overrides

Version 3.05
* Scheduler: expose application ID
* Add example script roi-iter.py to dynamically select ROI based on SimMarkers
* CPI stacks: --aggregate and --partial support, fixes for heterogeneous configurations
* Traces: support for 32-bit executables
* Build fixes for older Linux versions

Version 3.04
* Support for running multiple multi-threaded workloads in a single simulation
* McPAT fixes for heterogeneous configurations
* Build system fixes for newer Linux versions

Version 3.03
* Bugfixes in configuration parser, starting of multi-program workloads

Version 3.02
* Fixes for specifying heterogeneous configurations
* L2 prefetcher improvements
* Perfect cache modeling
* Self-modifying code support
* PyControl scripting interface
* GCC 4.7 support
* McPAT integration for area, power and energy predictions

Version 3.01
* Add heterogeneous cache configuration support
* Emulate pause, sleep system calls
* Improve support for 32-bit applications
* Pin 2.11 support

Version 3.0
* Support for heterogeneous core types
* Separate core microarchitectural characteristics into CoreModel class
* Improve CPI stack detail
* Add initial implementation for basic L2 prefetcher
* Optionally access DRAM directly in configurations with a single LLC
* Deprecate replacement of pthread_* synchronization calls
* Support more SYS_futex options
* Remove unused code for Graphite FULL mode
* Fixes to the build system, including parallel builds (make -j)
* Support for building on 32-bit hosts
* Remove configuration defaults from code, require everything to be specified in a configuration file

Version 2.04
* Fix trace playback for non-predicated instructions

Version 2.03
* Fix record-trace -d 0
* Fix keyboard interrupt behavior

Version 2.02
* [sift_recorder] Add support for -f (number of instructions to fast-forward) and -d (number of instructions to trace in detail) command-line options

Version 2.01
* Compilation fixes for GCC 4.6

Version 2.0
* Multi-program mode
* Instruction trace collection tool
* New ROI-aware sim.out file generation
* CPI components for various SYS_futex system calls
* Improve accuracy of the core interval model
* GCC 4.6 support

Version 1.06
* Fix queue overflow when many TLB misses occur in a single basic block

Version 1.05
* [cpistack] Add --abstime command-line parameter to scale Y axis according to absolute time in seconds
* [interval] Fix branch resolution latency calculation

Version 1.04
* Fix instruction dependencies for LEA and stores
* Support synchronous signal handling (when using the
  -g --general/enable_signals=true command-line parameter)

Version 1.03
* Build changes to improve compatibility with Fedora 16

Version 1.02
* Remove unneeded warnings from CPI stack script

Version 1.0
* Initial public release