refactoring - Is there a command-line tool to extract a typedef, structure, enumeration, variable, or function from a C or C++ file?
I am looking for a command-line tool that extracts a definition or declaration (typedef, structure, enumeration, variable, or function) from a C or C++ source file. A way to replace an existing definition/declaration would also be handy (after transforming the extracted definition with a user-supplied script). Is there such a generic tool available, or a reasonably close approximation of one?
Scriptability and the ability to hook up user-created scripts or programs are what matter here, although I am academically curious about GUI programs too. Open-source solutions from the Unix/Linux camp are preferred (although I am curious about Windows and OS X tools too). My primary language interests are C and C++, but a more generic solution would be even better (I think we do not need super-accurate parsing capabilities for finding, extracting, and replacing a definition in a program source file).
Sample use cases (extra, for the curious mind):
- Given nested structs and variable (array) initializations of these types, suppose there is a need to change a struct definition by adding or reordering fields, or to rewrite the variable/array definitions in a more readable format, without introducing the errors that come with manual labor. This would work by extracting the old initializations, then using a script/program to write the new initializations that replace the old ones.
- For implementing a code browsing tool: extract a definition.
- Decorative code generation (e.g. logging function entries/returns).
- Scripted code structuring (e.g. extract this thing and put it in a different place without change; the version control commit comment can document the command used to perform the operation, making it evident and verifiable that nothing changed).
Alternative problem: if there were a tool that could tell the location of a definition (the beginning and end line would suffice, assuming the definitions/declarations I am interested in sit on their own lines), it would be an exercise in finger dexterity to write a program to
- extract definitions,
- replace definitions, or
- extract a definition and run a program specified by command-line options (or an editor) to
  - receive the desired extracted definitions on stdin (or in a temporary file),
  - perform the transformation (editing), and
  - output the new definitions to stdout (or save them to the given temporary file),
  to be substituted for the originals by the executing program.
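The extract, transform, and replace workflow in the list above can be sketched in a few lines of Python, assuming the begin and end lines of the definition are already known. The function name `transform_range` and its parameters are hypothetical, chosen only for this illustration; the external command is any filter that reads the old definition on stdin and writes the new one to stdout.

```python
import subprocess


def transform_range(lines, begin, end, command):
    """Replace lines begin..end with the output of an external filter.

    `lines` is a list of strings as returned by readlines(); `begin` and
    `end` are 1-based, inclusive line numbers. `command` is an argument
    list (hypothetical example: ["sed", "s/old/new/"]). The extracted text
    is sent to the command's stdin and its stdout becomes the new text.
    """
    if not (1 <= begin <= end <= len(lines)):
        raise ValueError("line range out of bounds")
    extracted = "".join(lines[begin - 1:end])
    result = subprocess.run(
        command, input=extracted, capture_output=True, text=True, check=True
    )
    new_lines = result.stdout.splitlines(keepends=True)
    # Keep the file well-formed if the filter omitted a trailing newline.
    if new_lines and not new_lines[-1].endswith("\n"):
        new_lines[-1] += "\n"
    # Everything outside the range is passed through untouched.
    return lines[:begin - 1] + new_lines + lines[end:]
```

A caller would read the source file with `readlines()`, pass the result through `transform_range`, and write the returned list back with `writelines()`. Because lines outside the range are copied verbatim, a diff of the result shows only the transformed definition.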
So the major, more challenging problem is finding the begin and end line of a definition.
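For the simplest cases, a rough locator can be written without a real parser. The sketch below is a heuristic only: it finds a `struct`, `union`, or `enum` definition of the form `keyword name { ... };` by counting braces, and it assumes that no braces or semicolons appear in comments, string literals, or preprocessor conditionals. The names `find_definition` and `_end_of_braced_item` are made up for this example.

```python
import re


def _end_of_braced_item(lines, start):
    """Return the index of the line holding the ';' that closes the first
    balanced {...} group found from `start` on, or None if there is none."""
    depth = 0
    seen_brace = False
    for j in range(start, len(lines)):
        for ch in lines[j]:
            if ch == "{":
                depth += 1
                seen_brace = True
            elif ch == "}":
                depth -= 1
            elif ch == ";" and depth == 0:
                # A ';' before any '{' is a forward declaration or a use.
                return j if seen_brace else None
    return None


def find_definition(lines, keyword, name):
    """Return (begin, end) as 1-based inclusive line numbers, or None.

    Heuristic: does not understand comments, strings, or the preprocessor,
    and does not handle function definitions (they do not end in ';').
    """
    head = re.compile(r"\b%s\s+%s\b" % (re.escape(keyword), re.escape(name)))
    for i, line in enumerate(lines):
        if head.search(line):
            end = _end_of_braced_item(lines, i)
            if end is not None:
                return i + 1, end + 1
    return None
```

Nested structs are handled because only a `;` at brace depth zero ends the scan. Anything beyond this (functions, macros, conditional compilation) needs a tool that actually parses the language.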
Note on tags: the more accurate tags code-generation and code-transformation do not exist.
Our DMS Software Reengineering Toolkit is trying to be the tool you are wishing for. It is pushing the state of the art, and isn't a nirvana-style tool. But it does enough real, interesting work.
DMS provides general facilities for parsing, analyzing, and transforming source code.
It uses explicit grammars to define languages (such as C and C++); the grammars drive parsers that build abstract syntax trees (ASTs). A variety of analysis primitives provide a) facilities ("attribute grammars", ATGs) for collecting information along tree-like information flow paths, which match the shape of ASTs nicely, b) construction of symbol-use to symbol-definition maps ("symbol tables"), c) control and data flow analysis using facts extracted by ATGs, d) range analysis, and e) points-to analysis, both local and global. These primitive analyzers can be used to compose facts about the AST and draw conclusions about the code represented by the ASTs (e.g., "this statement modifies these variables"). A language front end packages a grammar and language-specific analyzers in a reusable bundle. DMS has such language front ends, of varying levels of depth and maturity, for a wide variety of languages.
[EDIT 6/27: The C and C++ front ends support specific dialects of C and C++: ANSI C, C99, GCC3/4 C, MS Visual C, ANSI C++98, ANSI C++11, GCC3/4 C++, and MS Visual C++ 2005/2008/2010. If you want accurate analysis of your code, you should use the "right" dialect to process it.]
But "analysis" isn't the point. The purpose of analysis is to drive change. DMS provides additional support to procedurally modify ASTs, to modify ASTs with source-to-source rewrite rules written in the surface syntax of the language (both conditioned by a chosen analysis result), or to group sets of procedural and source-to-source rewrites into compound, complex rewrites that can carry off massive code changes such as re-architecting. After the ASTs are transformed, they can be used to regenerate ("prettyprint") syntactically correct code in the corresponding front-end language/dialect. [By modifying the AST for one language piecewise until you have an AST for another, you can build translators, but this isn't as easy as that sentence implies.]
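The parse, transform, and regenerate cycle described above can be illustrated in miniature. The sketch below is not DMS and does not handle C or C++; as a stand-in it uses Python's standard `ast` module on Python source (`ast.unparse` needs Python 3.9 or later). It performs the "log function entries" style of instrumentation by procedurally editing the tree and then regenerating source text. The names `LogEntry` and `instrument` are invented for this example.

```python
import ast


class LogEntry(ast.NodeTransformer):
    """Insert a print() call at the top of every function body."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # instrument nested functions first
        # Build the new statement by parsing a source fragment.
        call = ast.parse('print("enter %s")' % node.name).body[0]
        node.body.insert(0, call)
        return node


def instrument(source):
    """Parse source, rewrite the tree, and regenerate source text."""
    tree = ast.parse(source)             # text -> AST
    tree = LogEntry().visit(tree)        # procedural tree modification
    ast.fix_missing_locations(tree)      # new nodes need line/column info
    return ast.unparse(tree)             # AST -> text ("prettyprint")
```

Note that the regenerated text comes from the tree, so original comments and layout are lost in this toy version; preserving them is part of what makes production-quality transformation tools hard to build.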
This all works to a considerable degree, yet we are still stymied by language complications. For C and C++, the famous complication is the preprocessor; by editing the program text arbitrarily, preprocessor conditionals can render the source code unparseable by anything resembling standard parsing technology. DMS's C and C++ front ends ameliorate this, and can parse code with well-structured preprocessor directives, including some strange cases that people would not call structured but that commonly occur:
    #if cond
      if (abc) {
    #else
      if (def) {
    #endif
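To see why such a fragment defeats a tool that looks at the raw text, consider a naive brace counter. In the sketch below, the body and closing brace after `#endif` are an assumed continuation added for illustration (the fragment above stops at `#endif`), and `select_branch` is a hypothetical helper that handles only a single, unnested `#if`/`#else`/`#endif`.

```python
# The fragment from the text, plus an ASSUMED body and closing brace.
SNIPPET = """\
#if cond
  if (abc) {
#else
  if (def) {
#endif
    do_something();
  }
"""


def brace_balance(text):
    """Opening braces minus closing braces in the raw text."""
    return text.count("{") - text.count("}")


def select_branch(text, cond):
    """Keep the lines one configuration would compile.

    Handles exactly one unnested #if/#else/#endif; a real preprocessor
    does far more (nesting, #elif, macro expansion).
    """
    out = []
    keep = True
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#if"):
            keep = cond
        elif stripped.startswith("#else"):
            keep = not cond
        elif stripped.startswith("#endif"):
            keep = True
        elif keep:
            out.append(line)
    return "\n".join(out) + "\n"
```

Here `brace_balance(SNIPPET)` is 1, because the unpreprocessed text contains two opening braces and one closing brace, while `brace_balance(select_branch(SNIPPET, True))` and `brace_balance(select_branch(SNIPPET, False))` are both 0. Each configuration is well-formed, but the text that a refactoring tool must edit is not.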
We are making interesting progress on parsing code with arbitrary placement of preprocessor conditionals. Once you do that, all of the analyzers have to take the preprocessor conditionals into account, and we're on turf that compiler people have not visited.
DMS has been used to make major architectural shifts in large C++ programs (converting non-CORBA style to CORBA style, with an immense amount of code shuffling), to extract code along arbitrary control flow paths to generate SOA-style APIs from existing C code, to insert instrumentation in large C programs to detect pointer errors, etc. [It has been applied to other tasks in many other languages.]
In our own experience, it is still pretty hard to use. In our opinion, that is in the same sense that democracy is the worst of all systems of government except for all the rest; YMMV. The website has lots of DMS-derived tools and discussions.
It has in fact been used to extract functions (the SOA exercise was more general than that) and to insert functions (this is just a generalized case of instrumentation).
Tools like GCC-XML are shadows of DMS's capabilities. GCC-XML parses, builds symbol tables, and dumps data declarations (not code), but can't make any code changes. Clang is better; it parses C and C++ to ASTs, can do some analyses on the LLVM intermediate representation, and has some kind of mechanism for spitting out to-be-applied-later patches to the source text, inspired by a desired tree change. I don't know if Clang can carry out massive code transformations in which one transformation's result is transformed again (how do you modify a tree with a delayed text patch?). DMS can do this all day long, can do it for many languages other than C and C++, and can do it for an arbitrary mixture of the languages it knows.
Until the problem of preprocessor conditionals gets solved, analyzing/transforming C and C++ code will not be easy. We succeed in these tasks on these languages by sheer willpower and by using the strongest tools we can build. (Java doesn't have these problems, and DMS is correspondingly better at analyzing/transforming it.)
At severe risk of hubris, I believe DMS is the best of the tools out there for general-purpose analysis and transformation. As its architect, I view it as my long-term job to make it ever stronger for the task.