I don't think your expectations are reasonable. You can't really parse without knowing certain things about the data you're parsing. About all you can do in a generic sense is break the data stream into tokens using some defined delimiter(s) (for example, whitespace characters). Even then, there will be gotchas. For example, a C++ parser has to treat whitespace differently depending on the context: a #define directive is terminated by a newline, whereas in normal code a newline is no different from any other whitespace character (of course, if the preprocessor is a separate unit with its own parser, that accounts for the difference). Spaces inside quoted strings must be preserved. Multi-token constructs are another gotcha: long and long int are semantically the same, but the second is two tokens.
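To make the "generic tokenizing" point concrete, here is a rough sketch of a whitespace splitter that keeps quoted strings intact. The function name tokenize is made up for illustration, and it assumes double quotes only, with no escape sequences and no splitting on punctuation, so treat it as a starting point rather than a real lexer.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits `input` on whitespace, but keeps a
// double-quoted string (quotes included) together as part of one token.
// Assumption: no escape sequences, no special handling of punctuation.
std::vector<std::string> tokenize(const std::string& input) {
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;

    for (char c : input) {
        if (c == '"') {
            // Toggle quoted mode; the quote character stays in the token.
            in_quotes = !in_quotes;
            current += c;
        } else if (!in_quotes && std::isspace(static_cast<unsigned char>(c))) {
            // Whitespace outside quotes ends the current token, if any.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        } else {
            // Ordinary character, or whitespace inside a quoted string.
            current += c;
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

Note that this happily reports long int as two separate tokens and leaves a trailing semicolon glued to whatever precedes it. Deciding what those mean is exactly the part that needs knowledge of the language being parsed.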
Even given all that, a tree is not necessarily the ideal structure to represent the parsed data. You might want a stack-based implementation for processing mathematical expressions. It really depends on what you need to do with the result.
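As an illustration of the stack-based alternative, here is a minimal sketch that evaluates a postfix (reverse Polish) expression without building any tree. The name eval_postfix and the choice of postfix input are my own assumptions for the example; infix input would need a conversion step first.

```cpp
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Hypothetical helper: evaluates a whitespace-separated postfix (RPN)
// expression such as "3 4 + 2 *" using a stack of operands.
double eval_postfix(const std::string& expr) {
    std::stack<double> operands;
    std::istringstream in(expr);
    std::string tok;

    while (in >> tok) {
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            if (operands.size() < 2) throw std::runtime_error("missing operand");
            // The right-hand operand was pushed last, so it comes off first.
            double rhs = operands.top(); operands.pop();
            double lhs = operands.top(); operands.pop();
            if (tok == "+")      operands.push(lhs + rhs);
            else if (tok == "-") operands.push(lhs - rhs);
            else if (tok == "*") operands.push(lhs * rhs);
            else                 operands.push(lhs / rhs);
        } else {
            operands.push(std::stod(tok));  // throws on a non-numeric token
        }
    }
    if (operands.size() != 1) throw std::runtime_error("malformed expression");
    return operands.top();
}
```

For example, eval_postfix("3 4 + 2 *") works out (3 + 4) * 2 with nothing but pushes and pops.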
All in all, a 'one size fits all' approach for this sort of thing just isn't feasible. It is possible to do without subclassing, but then the user of your class would need to pass in all kinds of data to tell it how to accomplish the parsing. That seems like a throwback to the C way of doing things, rather than an object-oriented approach.