\input texinfo
@setfilename ../../info/semantic.info
@set TITLE Semantic Manual
@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
@settitle @value{TITLE}
@include docstyle.texi

@c *************************************************************************
@c @ Header
@c *************************************************************************

@c Merge all indexes into a single index for now.
@c We can always separate them later into two or more as needed.
@syncodeindex vr cp
@syncodeindex fn cp
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp

@c @footnotestyle separate
@c @paragraphindent 2
@c @@smallbook
@c %**end of header

@copying
This manual documents the Semantic library and utilities.

Copyright @copyright{} 1999--2005, 2007, 2009--2016 Free Software
Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below.  A copy of the license
is included in the section entitled ``GNU Free Documentation License.''

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying

@dircategory Emacs misc features
@direntry
* Semantic: (semantic).         Source code parser library and utilities.
@end direntry

@titlepage
@center @titlefont{Semantic}
@sp 4
@center by @value{AUTHOR}
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@page

@macro semantic{}
@i{Semantic}
@end macro

@macro keyword{kw}
@anchor{\kw\}
@b{\kw\}
@end macro

@macro obsolete{old,new}
@sp 1
@strong{Compatibility}:
@code{\new\} introduced in @semantic{} version 2.0 supersedes
@code{\old\} which is now obsolete.
@end macro

@c *************************************************************************
@c @ Document
@c *************************************************************************
@contents

@node top
@top @value{TITLE}

@semantic{} is a suite of Emacs libraries and utilities for parsing
source code.  At its core is a lexical analyzer and two parser
generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
@semantic{} provides a variety of tools for making use of the parser
output, including user commands for code navigation and completion, as
well as enhancements for imenu, speedbar, whichfunc, eldoc,
hippie-expand, and several other parts of Emacs.

To send bug reports, or participate in discussions about semantic,
use the mailing list cedet-semantic@@sourceforge.net via the URL:
@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}

@ifnottex
@insertcopying
@end ifnottex

@menu
* Introduction::
* Using Semantic::
* Semantic Internals::
* Glossary::
* GNU Free Documentation License::
* Index::
@end menu

@node Introduction
@chapter Introduction

This chapter gives an overview of @semantic{} and its goals.

Ordinarily, Emacs uses regular expressions (and syntax tables) to
analyze source code for purposes such as syntax highlighting.  This
approach, though simple and efficient, has its limitations: roughly
speaking, it only ``guesses'' the meaning of each piece of source code
in the context of the programming language, instead of rigorously
``understanding'' it.

@semantic{} provides a new infrastructure to analyze source code using
@dfn{parsers} instead of regular expressions.  It contains two
built-in parser generators (an @acronym{LL} generator named
@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
both written in Emacs Lisp), and parsers for several common
programming languages.  It can also make use of @dfn{external
parsers}---programs such as GNU Global and GNU IDUtils.

@semantic{} provides a uniform, language-independent @acronym{API} for
accessing the parser output.  This output can be used by other Emacs
Lisp programs to implement ``syntax-aware'' behavior.  @semantic{}
itself includes several such utilities, including user-level Emacs
commands for navigating, searching, and completing source code.

The following diagram illustrates the structure of the @semantic{}
package:

@table @strong
@item Please Note:
The words in all capitals are those that @semantic{} itself provides.
Others are current or future languages or applications that are not
distributed along with @semantic{}.
@end table

@example
                                                             Applications
                                                                 and
                                                              Utilities
                                                                -------
                                                               /       \
               +---------------+    +--------+    +--------+
         C --->| C      PARSER |--->|        |    |        |
               +---------------+    |        |    |        |
               +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
      Java --->| JAVA   PARSER |--->| PARSE  |    |        |
               +---------------+    | TREE   |    | PARSE  |<--- SEMANTICDB
               +---------------+    | FORMAT |    | API    |
    Scheme --->| SCHEME PARSER |--->|        |    |        |<--- ecb
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |
   Texinfo --->| TEXI.  PARSER |--->|        |    |        |
               +---------------+    |        |    |        |

                    ...                ...           ...         ...

               +---------------+    |        |    |        |
   Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |<--- app. ?
   Lang. Z --->| Z      Parser |--->|        |    |        |
               +---------------+    +--------+    +--------+
@end example

@menu
* Semantic Components::
@end menu

@node Semantic Components
@section Semantic Components

In this section, we provide a more detailed description of the major
components of @semantic{}, and how they interact with one another.

The first step in parsing a source code file is to break it up into
its fundamental components.  This step is called lexical analysis:

@example
        syntax table, keywords list, and options
                         |
                         |
                         v
    input file  ---->  Lexer   ----> token stream
@end example

@noindent
The output of the lexical analyzer is a list of tokens that make up
the file.  The next step is the actual parsing, shown below:

@example
                    parser tables
                         |
                         v
    token stream --->  Parser  ----> parse tree
@end example

@noindent
The end result, the parse tree, is @semantic{}'s internal
representation of the language grammar.  @semantic{} provides an
@acronym{API} for Emacs Lisp programs to access the parse tree.

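As a rough illustration of these two stages, the following sketch
summarizes a token stream and a list of tags.  It assumes the feature
names used by the copy of @semantic{} bundled with Emacs (for example
@code{semantic/lex}); the @code{my-demo-} helpers are hypothetical
names introduced only for this example.

@example
(require 'semantic)
(require 'semantic/lex)

;; Hypothetical helper: not part of Semantic.
(defun my-demo-describe-tokens (tokens)
  "Return a list of (CLASS . TEXT) pairs for the lexical TOKENS.
TEXT is read from the current buffer using each token's bounds."
  (mapcar (lambda (token)
            (cons (semantic-lex-token-class token)
                  (semantic-lex-token-text token)))
          tokens))

;; Hypothetical helper: not part of Semantic.
(defun my-demo-describe-tags (tags)
  "Return a list of (NAME . CLASS) pairs for the parsed TAGS."
  (mapcar (lambda (tag)
            (cons (semantic-tag-name tag)
                  (semantic-tag-class tag)))
          tags))
@end example

@noindent
In a buffer where @semantic{} is active, evaluating
@code{(my-demo-describe-tokens (semantic-lex (point-min) (point-max)))}
summarizes the token stream, and
@code{(my-demo-describe-tags (semantic-fetch-tags))} summarizes the
top-level tags of the parse tree.
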
Parsing large files can take several seconds or more.  By default,
@semantic{} automatically caches parse trees by saving them in your
@file{.emacs.d} directory.  When you revisit a previously-parsed file,
the parse tree is automatically reloaded from this cache, to save
time.  @xref{SemanticDB}.

@node Using Semantic
@chapter Using Semantic

@include sem-user.texi

@node Semantic Internals
@chapter Semantic Internals

This chapter provides an overview of the internals of @semantic{}.
This information is usually not needed by application developers or
grammar developers; it is useful mostly for the hackers who would like
to learn more about how @semantic{} works.

@menu
* Parser code::          Code used for the parsers
* Tag handling::         Code used for manipulating tags
* Semanticdb Internals:: Code used in the semantic database
* Analyzer Internals::   Code used in the code analyzer
* Tools::                Code used in user tools
* Tests::                Code used for testing
@end menu

@node Parser code
@section Parser code

@semantic{} parsing code is spread across a range of files.

@table @file
@item semantic.el
The core infrastructure sets up buffers for parsing, and has all the
core parsing routines.  Most parsing routines are overloadable, so the
actual implementation may be somewhere else.

@item semantic-edit.el
Incremental reparse based on user edits.

@item semantic-grammar.el
@itemx semantic-grammar.wy
Parser for the different grammar languages, and a major mode for
editing grammars in Emacs.

@item semantic-lex.el
Infrastructure for implementing lexical analyzers.  Provides macros
for creating individual analyzers for specific features, and a way to
combine them together.

@item semantic-lex-spp.el
Infrastructure for a lexical symbolic preprocessor.  This was written
to implement the C preprocessor, but could be used for other lexical
preprocessors.

@item bovine/bovine-grammar.el
@itemx bovine/bovine-grammar-macros.el
@itemx bovine/semantic-bovine.el
The ``bovine'' grammar.  This is the first grammar mode written for
@semantic{} and is useful for creating simple parsers.

@item wisent/wisent.el
@itemx wisent/bison-wisent.el
@itemx wisent/semantic-wisent.el
@itemx wisent/semantic-debug-grammar.el
A port of bison to Emacs.  This infrastructure lets you create
LALR-based parsers for @semantic{}.

@item semantic-ast.el
Manage Abstract Syntax Trees for parsers.

@item semantic-debug.el
Infrastructure for debugging grammars.

@item semantic-util.el
Various utilities for manipulating tags, such as describing the tag
under point, adding labels, and the all-important
@code{semantic-something-to-tag-table}.
@end table

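To give a feel for what ``overloadable'' means in practice, here is a
minimal sketch built on the @file{mode-local} library bundled with
Emacs.  The function @code{my-demo-comment-prefix} is hypothetical and
is not part of @semantic{}; it only shows how a default implementation
and a mode-specific one can coexist.

@example
(require 'mode-local)

;; Hypothetical overloadable function, for illustration only.
;; With no body, the default is to call `my-demo-comment-prefix-default'.
(define-overloadable-function my-demo-comment-prefix ()
  "Return a comment prefix suitable for the current major mode.")

(defun my-demo-comment-prefix-default ()
  "Fallback used when the current mode provides no override."
  "#")

;; Mode-specific implementation, used in `emacs-lisp-mode' buffers.
(define-mode-local-override my-demo-comment-prefix emacs-lisp-mode ()
  "Return the comment prefix used in Emacs Lisp code."
  ";;")
@end example

@noindent
With these definitions, calling @code{(my-demo-comment-prefix)} in an
@code{emacs-lisp-mode} buffer should dispatch to the override, while
buffers in other modes fall back to the @code{-default} function.
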
@node Tag handling
@section Tag handling

A tag represents an individual item found in a buffer, such as a
function or variable.  Tag handling is implemented across several
source files.

@table @file
@item semantic-tag.el
Basic tag creation, queries, cloning, binding, and unbinding.

@item semantic-tag-write.el
Write a tag or tag list to a stream.  These routines are used by
@file{semanticdb-file.el} when saving a list of tags.

@item semantic-tag-file.el
Files associated with tags: going to a tag, finding the file for an
include, and finding the file for a prototype.

@item semantic-tag-ls.el
Language-dependent features of a tag, such as parent calculation, slot
protection, and other states like abstract, virtual, static, and leaf.

@item semantic-dep.el
Include file handling.  Contains the include path concepts, and
routines for looking up file names in the include path.

@item semantic-format.el
Convert a tag into a nicely formatted and colored string.  Use
@code{semantic-test-all-format-tag-functions} to test different output
options.

@item semantic-find.el
Find tags matching different conditions in a tag table.
These routines are used by @file{semanticdb-find.el} once the database
has been converted into a simpler tag table.

@item semantic-sort.el
Sorting lists of tags in different ways.  Includes sorting a plain
list of tags forward or backward, binning tags based on attributes
(bucketize), and tag adoption for multiple references to the same
thing.

@item semantic-doc.el
Capture documentation comments from near a tag.
@end table

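The following sketch shows tag creation, searching, and sorting working
together on a small hand-built tag table.  It assumes the feature names
of the @semantic{} bundled with Emacs (@code{semantic/tag},
@code{semantic/find}, @code{semantic/sort}), which differ from the file
names listed above; @code{my-demo-tags} and
@code{my-demo-function-names} are hypothetical names used only here.

@example
(require 'semantic)
(require 'semantic/tag)
(require 'semantic/find)
(require 'semantic/sort)

;; Hypothetical hand-built tag table; real tables come from a parser.
(defvar my-demo-tags
  (list (semantic-tag-new-function "main" "int" nil)
        (semantic-tag-new-variable "counter" "int" "0")
        (semantic-tag-new-function "helper" "void" nil))
  "Small list of tags used only for illustration.")

(defun my-demo-function-names (table)
  "Return the names of all function tags in TABLE, sorted by name."
  (mapcar #'semantic-tag-name
          (semantic-sort-tags-by-name-increasing
           (semantic-find-tags-by-class 'function table))))
@end example

@noindent
Evaluating @code{(my-demo-function-names my-demo-tags)} keeps only the
two function tags and returns their names in alphabetical order.  In a
parsed buffer, @code{(semantic-fetch-tags)} could be passed instead of
the hand-built list.
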
@node Semanticdb Internals
@section Semanticdb Internals

The complexity of @acronym{Semanticdb} is certainly an issue: it
addresses a rather hairy problem.

@table @file
@item semanticdb.el
Defines a @dfn{database} and a @dfn{table} base class.  You can
instantiate these classes, and use them, but they are not persistent.

This file also provides support for @code{semanticdb-minor-mode},
which automatically associates files with tables in databases so that
tags are @emph{saved} while a buffer is not in memory.

The database and tables both also provide applicable cache information,
and a cache flushing system.  The semanticdb search routines use caches
to save data structures that are complex to calculate.

Lastly, it provides the concept of @dfn{project root}.  It is a system
by which a file can be associated with the root of a project, so if
you have a tree of directories and source files, it can find the root,
and allow a tag-search to span all available databases in that
directory hierarchy.

@item semanticdb-file.el
Provides a subclass of the basic table so that it can be saved to
disk.  Implements all the code needed to unbind/rebind tags to a
buffer and to write them to a file.

@item semanticdb-el.el
Implements a special kind of @dfn{system} database that uses Emacs
internals to perform queries.

@item semanticdb-ebrowse.el
Implements a system database that uses Ebrowse to parse files into a
table that can be queried for tag names.  Successful tag hits during a
find cause @semantic{} to pick up and parse the referenced files to
get the full details.

@item semanticdb-find.el
Infrastructure for searching groups of @semantic{} databases, and
dealing with the search results format.

@item semanticdb-ref.el
Tracks cross-references.  Cross-references are needed when a buffer is
reparsed, so that other tables can be alerted that any dependent
caches may need to be flushed.  References are in the form of include
files.
@end table

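As a small sketch of how the search layer is typically used, the
function below looks up a tag name through the databases reachable from
the current buffer and flattens the results into a plain tag list.  It
assumes the feature names of the @semantic{} bundled with Emacs
(@code{semantic/db}, @code{semantic/db-find});
@code{my-demo-find-tag-everywhere} is a hypothetical name.

@example
(require 'semantic)
(require 'semantic/db)
(require 'semantic/db-find)

;; Hypothetical helper: not part of Semantic.
(defun my-demo-find-tag-everywhere (name)
  "Return a plain list of tags named NAME from reachable databases.
Return nil when semanticdb is not active in the current buffer."
  (when (semanticdb-minor-mode-p)
    ;; The find routines return results grouped by table;
    ;; stripping them yields an ordinary list of tags.
    (semanticdb-strip-find-results
     (semanticdb-find-tags-by-name name))))
@end example

@noindent
The explicit strip step reflects the split described above: the search
results keep track of which table each hit came from until the caller
asks for a simpler tag list.
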
@node Analyzer Internals
@section Analyzer Internals

The @semantic{} analyzer is a complex engine which has been broken
down across several modules.  When the @semantic{} analyzer fails,
start with @code{semantic-analyze-debug-assist}, then dive into some
of these files.

@table @file
@item semantic-analyze.el
The core analyzer for defining the @dfn{current context}.  The
current context is an object that contains references to aspects of
the local context including the current prefix, and a tag list
defining what the prefix means.

@item semantic-analyze-complete.el
Provides @code{semantic-analyze-possible-completions}.

@item semantic-analyze-debug.el
The analyzer debugger.  Useful when attempting to get everything
configured.

@item semantic-analyze-fcn.el
Various support functions needed by the analyzer.

@item semantic-ctxt.el
Local context parser.  Contains overloadable functions used to move
around through different scopes, get local variables, and collect the
current prefix used when doing completion.

@item semantic-scope.el
Calculate @dfn{scope} for a location in a buffer.  The scope includes
local variables, and tag lists in scope for various reasons, such as
C++ using statements.

@item semanticdb-typecache.el
The typecache is part of @code{semanticdb}, but is used primarily by
the analyzer to look up datatypes and complex names.  The typecache is
bound across source files and builds a master lookup table for data
type names.

@item semantic-ia.el
Interactive Analyzer functions.  Simple routines that do completion or
lookups based on the results from the Analyzer.  These routines are
meant as examples for application writers, but are quite useful as
they are.

@item semantic-ia-sb.el
Speedbar support for the analyzer, displaying context info, and
completion lists.
@end table

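A minimal sketch of driving the analyzer from Lisp might look like the
following.  It assumes the feature names of the @semantic{} bundled
with Emacs (@code{semantic/analyze},
@code{semantic/analyze/complete}); @code{my-demo-completion-names} is
a hypothetical helper, and swallowing every error is a simplification
made only for this example.

@example
(require 'semantic)
(require 'semantic/analyze)
(require 'semantic/analyze/complete)

;; Hypothetical helper: not part of Semantic.
(defun my-demo-completion-names ()
  "Return the names of possible completions at point, or nil.
Any error raised by the analyzer is treated as \"no completions\"."
  (condition-case nil
      (let ((context (semantic-analyze-current-context (point))))
        (when context
          (mapcar #'semantic-tag-name
                  (semantic-analyze-possible-completions context))))
    (error nil)))
@end example

@noindent
When this returns nil unexpectedly in a buffer that @semantic{}
supports, running @code{semantic-analyze-debug-assist} at the same
position is the suggested next step.
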
@node Tools
@section Tools

These files contain various tools a user can use.

@table @file
@item semantic-idle.el
Idle scheduler for @semantic{}.  Manages reparsing buffers after
edits, and large work tasks in idle time.  Includes modes for showing
summary help and pop-up completion.

@item senator.el
The @semantic{} navigator.  Provides many ways to move through a
buffer based on the active tag table.

@item semantic-decorate.el
A minor mode for decorating tags based on details from the parser.
Includes overlines for functions, and coloring of class fields based
on protection.

@item semantic-decorate-include.el
A decoration mode for include files, which assists users in setting up
parsing for their includes.

@item semantic-complete.el
Advanced completion prompts for reading tag names in the minibuffer, or
inline in a buffer.

@item semantic-imenu.el
Imenu support for using @semantic{} tags in imenu.

@item semantic-mru-bookmark.el
Automatic bookmarking based on tags.  Jump to locations you've been
before based on tag name.

@item semantic-sb.el
Support for @semantic{} tag usage in Speedbar.

@item semantic-util-modes.el
A collection of small minor modes that expose aspects of the
@semantic{} parser state.  Includes @code{semantic-stickyfunc-mode}.

@item document.el
@itemx document-vars.el
Create and update comments for tags.

@item semantic-adebug.el
Extensions of @file{data-debug.el} for @semantic{}.

@item semantic-chart.el
Draw some charts from statistics generated from parsing.

@item semantic-elp.el
Profiler for helping to optimize the @semantic{} analyzer.
@end table

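Several of these tools are exposed as global minor modes.  As an
illustrative configuration, assuming the @semantic{} bundled with
Emacs, an init file could select a few of them before turning
@semantic{} on; the particular selection below is only an example.

@example
(require 'semantic)

;; Choose which auxiliary modes `semantic-mode' should enable.
(setq semantic-default-submodes
      '(global-semantic-idle-scheduler-mode  ; reparse in idle time
        global-semanticdb-minor-mode         ; cache tags between sessions
        global-semantic-idle-summary-mode    ; summary help for tag at point
        global-semantic-stickyfunc-mode      ; keep function header visible
        global-semantic-mru-bookmark-mode))  ; tag-based bookmarks

;; Turn Semantic on in all supported buffers.
(semantic-mode 1)
@end example
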
@node Tests
@section Tests

@table @file
@item semantic-utest.el
Basic testing of parsing and incremental parsing for most supported
languages.

@item semantic-ia-utest.el
Test the semantic analyzer's ability to provide smart completions.

@item semantic-utest-c.el
Tests for the C parser's lexical pre-processor.

@item semantic-regtest.el
Regression tests from the older Semantic 1.x API.
@end table

@node Glossary
@appendix Glossary

@table @asis
@item BNF
In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
grammar file used for the 1.4 parser generator.  This was a play on
Backus-Naur Form, which proved too confusing.

@item bovinate
A verb representing what happens when a bovine parser parses a file.

@item bovine lambda
In a bovine, or LL parser, the bovine lambda is a function to execute
when a specific set of match rules has succeeded in matching text from
the buffer.

@item bovine parser
A parser using the bovine parser generator.  It is an LL parser
suitable for small simple languages.

@item context

@item LALR

@item lexer
A program which converts text into a stream of tokens by analyzing
it lexically.  Lexers will commonly create strings, symbols,
keywords and punctuation, and strip whitespace and comments.

@item LL

@item nonterminal
A nonterminal symbol or simply a nonterminal stands for a class of
syntactically equivalent groupings.  A nonterminal symbol name is used
in writing grammar rules.

@item overloadable
Some functions are defined via @code{define-overload}.
These can be overloaded via ....

@item parser
A program that converts @b{tokens} to @b{tags}.

@item tag
A tag is a representation of some entity in a language file, such as a
function, variable, or include statement.  In semantic, the word tag is
used the same way it is used for the etags or ctags tools.

A tag is usually bound to a buffer region via an overlay, or it just
specifies character locations in a file.

@item token
A single atomic item returned from a lexer.  It represents some set
of characters found in a buffer.

@item token stream
The output of the lexer as well as the input to the parser.

@item wisent parser
A parser using the wisent parser generator.  It is a port of bison to
Emacs Lisp.  It is an LALR parser suitable for complex languages.
@end table

@node GNU Free Documentation License
@appendix GNU Free Documentation License

@include doclicense.texi

@node Index
@unnumbered Index
@printindex cp

@iftex
@contents
@summarycontents
@end iftex

@bye

@c Following comments are for the benefit of ispell.

@c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
@c LocalWords: bnf bovinate bovinates LALR
@c LocalWords: bovinating bovination bovinator bucketize
@c LocalWords: cb cdr charquote checkcache cindex CLOS
@c LocalWords: concat concocting const ctxt Decl defcustom
@c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
@c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
@c LocalWords: eq Exp EXPANDFULL expression fn foo func funcall
@c LocalWords: ia ids ifinfo imenu imenus init int isearch itemx java kbd
@c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
@c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
@c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
@c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
@c LocalWords: popup positionalonly positiononly positionormarker pre
@c LocalWords: printf printindex Programmatically pt quotemode
@c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
@c LocalWords: scopestart SEmantic semanticdb setfilename setq
@c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
@c LocalWords: streamorbuffer struct subalist submenu submenus
@c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
@c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
@c LocalWords: uref usedb var vskip xref yak